Skip to content

Conversation

@hevinhsu
Copy link
Contributor

What changes were proposed in this pull request?

Add S3 request body SHA-256 verification to support single-chunk payload validation.

Please describe your PR in detail:

  • Compute the SHA-256 hash of the request body and verify it matches the x-amz-content-sha256 header.
  • Validate x-amz-content-sha256 for GET requests when the header is present.
  • Add unit tests, integration tests, and end-to-end tests.
    • Introduce a new Python helper function to generate PUT presigned URLs, since the AWS CLI only supports generating GET presigned URLs.

What is the link to the Apache JIRA

HDDS-13668

How was this patch tested?

https://github.com/hevinhsu/ozone/actions/runs/19321794081

This patch also tested using the MinIO Mint S3 compatibility suite, focusing only on the test cases that previously failed in HDDS-12871.

After applying this patch:

  • aws-sdk-go: Now passes after this update. The previous unmarshalling xml failed message (introduced on 2025/Oct/03) no longer appears.
  • aws-sdk-ruby: This test case still fails. See comment below for detailed analysis.

@hevinhsu
Copy link
Contributor Author

Investigation: why aws-sdk-ruby test still fails

In the AWS documentation — for example, IAM, S3 query parameter and S3 Signature Calculations for the Authorization Header — it is stated that all x-amz-* headers must be included in the signature calculation.

However, in the Mint Ruby testcase,it appears that the x-amz-acl header is not added to the signature calculation.
As a result, the test currently fails.

In the MinIO implementation, only the x-amz-meta-* headers are validated to exist in the signedHeaders.
Therefore, the Ruby test — which does not include x-amz-acl in the signed headers — can still pass successfully.

@hevinhsu
Copy link
Contributor Author

During the implementation process, I noticed that Ozone doesn’t seem to validate mismatched values between signed headers and signed query parameters.
According to the AWS spec:

If you add a signed header that is also a signed query parameter, and they differ in value, you will receive an InvalidRequest error as the input is conflicting.

To clarify my understanding: Ozone currently relies only on the query parameters for signature validation and does not compare them against the corresponding request headers. Because of this, a mismatch would never trigger an InvalidRequest error in the current implementation.

Please let me know if I’m missing something.

@ivandika3 ivandika3 added the s3 S3 Gateway label Nov 14, 2025
@ivandika3
Copy link
Contributor

During the implementation process, I noticed that Ozone doesn’t seem to validate mismatched values between signed headers and signed query parameters.

Could you give some examples here? Currently I think we only check that all the headers in X-Amz-SignedHeaders should exist in the request headers. Regarding the actual HTTP header value, I think it should be calculated based on the AWS4 signature?

@hevinhsu
Copy link
Contributor Author

Could you give some examples here? Currently I think we only check that all the headers in X-Amz-SignedHeaders should exist in the request headers. Regarding the actual HTTP header value, I think it should be calculated based on the AWS4 signature?

Hi @ivandika3, I added a test (link) to check if requests with conflicting X-Amz-Security-Token in both header and query are properly rejected. According to the S3 spec, this should fail, but the request currently succeeds. Please let me know if you have any suggestions!

@swamirishi
Copy link
Contributor

@yandrey321 Do you wanna review this change?

@hevinhsu hevinhsu marked this pull request as draft November 19, 2025 10:24
@hevinhsu
Copy link
Contributor Author

hevinhsu commented Dec 5, 2025

Hi @ivandika3,

I tried a new implementation that introduces a runnable preCommit interface to perform validation before commitKey.
This approach avoids adding S3-specific logic into client.io, so the validation can be set from ObjectEndpoint via setPreCommit.

However, the current implementation feels suboptimal. To avoid changing existing method signatures, I throw IllegalArgumentException when validation fails, and then transform that exception into an O3SException in ObjectEndpoint if the error message matches.

I have a couple of questions:

  • Is this design acceptable (i.e., introducing preCommit in KeyOutputStream and exposing it via a setter)?
  • Converting IllegalArgumentException into O3SException feels like an awkward way to handle the error — is there a better approach you’d recommend?

PTAL when you have time, thanks.

Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hevinhsu for iterating on this.

Is this design acceptable (i.e., introducing preCommit in KeyOutputStream and exposing it via a setter)?

The preCommit hook is a good approach. We can maybe make it an array/list to accomodate additional precommit validation like in #8506, but we can address this in the future.

However, the current implementation feels suboptimal. To avoid changing existing method signatures, I throw IllegalArgumentException when validation fails, and then transform that exception into an O3SException in ObjectEndpoint if the error message matches.

Yes, this seems a bit awkward and prone to error. Instead of Runnable, we might be able to use Ratis CheckedRunnable<IOException> as the preCommit (similar to Validator interface in XceiverClientSpi)? Afterwards, we can simply throw OS3Exception that will be automatically handled by OS3ExceptionMapper?

@hevinhsu
Copy link
Contributor Author

Thanks @ivandika3 for the feedback.
I’ll switch to using a list of preCommit handlers instead and address the remaining follow-up items over the weekend, as I’m a bit busy recently. Sorry for the delay.

we might be able to use Ratis CheckedRunnable<IOException> as the preCommit (similar to the Validator interface in XceiverClientSpi)? Afterwards, we can simply throw OS3Exception that will be automatically handled by OS3ExceptionMapper?

Sorry, I may have misunderstood this point.

I did try a similar approach while exploring the solution, but I ran into a limitation:
OS3Exception does not extend IOException, so it cannot be thrown directly from a CheckedRunnable. It’s possible that I’m not applying this pattern in the intended way.

I also considered introducing a new IOException wrapper to carry an OS3Exception, but that felt somewhat over-engineered, so I decided not to go down that path for now.

I’d appreciate any clarification on how you envisioned using CheckedRunnable<IOException> here.

@ivandika3
Copy link
Contributor

ivandika3 commented Dec 17, 2025

@hevinhsu Yes, you are correct. Since OS3Exception doesn't extend IOException, it's not feasible to use it without doing any extra logic (e.g. wrapping logic you mentioned). Let me think about it.

I wonder whether there will be any effects if we simply change OS3Exception to extends IOException instead, we can then simply throw OS3Exception in preCommit.

@hevinhsu
Copy link
Contributor Author

@ivandika3 Got it — I’ll give this approach a try and evaluate the impact.

This was actually my initial thought when exploring possible solutions. However, after some consideration, I hesitated because I wasn’t fully convinced that OS3Exception semantically fits as an IOException. It represents an S3-level error rather than a pure I/O failure, so extending IOException felt slightly questionable from a modeling perspective, which is why I initially dropped this approach.

That said, given the benefit of simplifying the preCommit flow and avoiding extra wrapping logic, I’m happy to revisit this direction and see how it behaves in practice.

@hevinhsu
Copy link
Contributor Author

Hi @ivandika3,

I’ve updated this PR based on your feedback.

Separately, I also experimented with making O3SException extend IOException, so it can be thrown directly from preCommit.
Initially, this change caused some tests to fail, because there are existing error-handling paths that broadly catch IOException, for example:

After adding this additional handling logic:
https://github.com/hevinhsu/ozone/blob/3d749523500d076ec9f794eb8f57697eabefb62c/hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/BucketEndpoint.java#L509-L511
the tests passed again.(CI)

Due to this behavioral impact and the need to adjust existing exception-handling logic, I did not include the O3SException extends IOException change in this PR.
I think this direction is still possible, but it likely requires a more careful review of the current exception-handling paths before proceeding.

Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hevinhsu Thanks for the updating the patch and testing the workarounds. Sorry the review takes a lot longer than expected.

Regarding updating the BucketEndpoint issue, could you try changing the catch (IOException) to catch(OMException) instead and let the other IOException including OS3Exception to IOException.

Another possible alternative is simply make OS3Exception to be unchecked exception (i.e. RuntimeException). This might be more appropriate compared to IOException.

FYI, if you want to go through with the previous wrapping solution, we have HddsClientUtils#containsException to find the exception we want, but might have some other issues.

# Conflicts:
#	hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/endpoint/TestObjectPut.java
#	hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/endpoint/TestObjectTaggingDelete.java
@hevinhsu
Copy link
Contributor Author

Hi @ivandika3,

Thank you for taking the time to review this fairly large change set and for the detailed suggestions — I really appreciate it.

Regarding updating the BucketEndpoint issue, could you try changing the catch (IOException) to catch (OMException) instead…

I tried changing catch (IOException) to catch (OMException), but this did not work for me.
The root cause of the failing tests comes from the bucket ownership check, where the exception is thrown from S3Owner.
To address this, I added an explicit catch (O3SException ex) block in BucketEndpoint.
With this change, the CI passed. This approach avoids using instanceof checks while still handling the error correctly.

Another possible alternative is simply make OS3Exception to be unchecked (RuntimeException).

I also tried making O3SException an unchecked exception, and so far everything looks good — all tests passed without requiring additional catch logic. (CI)
From an implementation-simplicity perspective, this option seems quite attractive, and I think it’s worth considering.

Wrapper solution / HddsClientUtils#containsException

I also tried the wrapper-based approach by introducing S3IOExceptionWrapper to carry the original O3SException.

I did not end up using HddsClientUtils#containsException, mainly because this solution felt a bit complex for the use case here.
By directly adding a cache(S3IOExceptionWrapper wrapper) block, I was able to achieve the intended behavior in a more straightforward way.(CI)

If there are any edge cases or design considerations that I might have overlooked with this simplified approach, please let me know — I’d be happy to revisit it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

s3 S3 Gateway

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants