-
Notifications
You must be signed in to change notification settings - Fork 141
fix(table): prevent index out of range error in buildManifestEvaluator #692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
zeroshade
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be relatively easy to construct a contrived test that would have triggered the out of range error? Just to ensure we don't introduce this back later
0737398 to
50e72f9
Compare
|
@zeroshade I've added a short unit test. |
fixes [[apache#691](https://github.com/ErenDursun/iceberg-go/issues/691)](https://github.com/ErenDursun/iceberg-go/issues/691) Previously, the buildManifestEvaluator method directly accessed the PartitionSpecs array by index without validating that the requested specID actually exists in the partition specs. This could cause an index out of bounds error. This change adds a proper lookup using slices.IndexFunc to find the partition spec by its ID and returns an error if the spec is not found, improving error handling and robustness.
4f3e817 to
6fb1100
Compare
|
Also apache/iceberg#15108 (comment) suggests the partition specs should be tracked in a Line 92 in ad086d2
|
We are still pre-version 1. Thus I'm okay with a breaking change here. These sort of situations are precisely why we haven't hit version 1.0 and called it stable yet. Just make sure we're very clear in the commit notice about the breaking change (so the release notes make it clear) |
|
Alternately, we could store it as a map internally and just construct the slice in the function on the fly ( |
|
Just had a closer look into it and it seems the field Line 1210 in 3ce3b4b
because it is defined as such in the Iceberg docs:
Changing its type in the Lines 1316 to 1324 in 3ce3b4b
and Lines 1430 to 1439 in 3ce3b4b
could use the map for better performance. If we'd also add the new internal lazy-loading method to the Metadata interface, users wouldn't be affected but the method from this PR with the bugfix could also work with the map. This method here could too: iceberg-go/table/update_spec.go Lines 407 to 411 in 3ce3b4b
Buuuut... I'm not sure if the minor performance boost justifies the additional lazy-loading complexity 😅 there's the risk of future changes to the code leading to diverging states between the partition-spec list and map. Should I give it a try or do we keep the lists and just prevent the out of range errors for now? |
fixes #691
Previously, the buildManifestEvaluator method directly accessed the PartitionSpecs array by index without validating that the requested specID actually exists in the partition specs. This could cause an index out of bounds error.
This change adds a proper lookup using slices.IndexFunc to find the partition spec by its ID and returns an error if the spec is not found, improving error handling and robustness.