
Conversation

@ErenDursun

fixes #691

Previously, the buildManifestEvaluator method indexed directly into the PartitionSpecs slice without validating that the requested specID actually exists in the partition specs. This could cause an index-out-of-range panic.

This change adds a proper lookup using slices.IndexFunc to find the partition spec by its ID and returns an error if the spec is not found, improving error handling and robustness.
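For context, the failing pattern was essentially the following (a sketch of the failure mode; the exact original line is paraphrased, not quoted from the diff):

// Before (sketch): specID is used directly as a slice position. Spec IDs are
// not guaranteed to be dense 0..n-1 positions, so this panics with
// "index out of range" whenever specID >= len(PartitionSpecs()).
spec := scan.metadata.PartitionSpecs()[specID]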

Member

@zeroshade zeroshade left a comment


Would it be relatively easy to construct a contrived test that would have triggered the out-of-range error? Just to ensure we don't reintroduce this later.

@ErenDursun ErenDursun force-pushed the fix-index-out-of-range branch 2 times, most recently from 0737398 to 50e72f9 on January 23, 2026 09:39
@ErenDursun
Author

ErenDursun commented Jan 23, 2026

@zeroshade I've added a short unit test.
We also received an answer on apache/iceberg#15108 confirming that metadata is not required to contain the complete history of all partition specs. Therefore I also removed the wrapped ErrInvalidMetadata from the error I added, since it might have been misleading. Hope you agree.
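For readers following along, the contrived case boils down to something like this standalone sketch (illustrative only; the type names and the test are stand-ins, not the actual test added in this PR):

package table

import (
	"slices"
	"testing"
)

// partitionSpec is a minimal stand-in for iceberg.PartitionSpec.
type partitionSpec struct{ id int }

func (s partitionSpec) ID() int { return s.id }

func TestMissingSpecIDIsDetected(t *testing.T) {
	// Metadata may legally retain only the latest spec (ID 5); the history
	// of specs 0..4 has been dropped, so len(specs) == 1 and specs[5]
	// would panic with an index-out-of-range error.
	specs := []partitionSpec{{id: 5}}

	lookup := func(specID int) (partitionSpec, bool) {
		i := slices.IndexFunc(specs, func(s partitionSpec) bool {
			return s.ID() == specID
		})
		if i < 0 {
			return partitionSpec{}, false
		}
		return specs[i], true
	}

	if _, ok := lookup(5); !ok {
		t.Fatal("spec 5 should be found")
	}
	if _, ok := lookup(3); ok {
		t.Fatal("spec 3 was dropped from metadata and must not be found")
	}
}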

fixes [apache#691](https://github.com/ErenDursun/iceberg-go/issues/691)

Previously, the buildManifestEvaluator method indexed directly into the PartitionSpecs
slice without validating that the requested specID actually exists in the
partition specs. This could cause an index-out-of-range panic.

This change adds a proper lookup using slices.IndexFunc to find the partition spec
by its ID and returns an error if the spec is not found, improving error handling
and robustness.
@ErenDursun ErenDursun force-pushed the fix-index-out-of-range branch from 4f3e817 to 6fb1100 on January 23, 2026 10:19
@ErenDursun
Author

Also apache/iceberg#15108 (comment) suggests the partition specs should be tracked in a map<spec id, spec>. But that would necessitate a change in the Metadata interface, which is user-facing and thus a breaking change.

PartitionSpecs() []iceberg.PartitionSpec
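The breaking variant under discussion would presumably change that signature to something like this (hypothetical, not in the codebase):

PartitionSpecs() map[int]iceberg.PartitionSpec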

@zeroshade
Member

> But that would necessitate a change in the Metadata interface, which is user-facing and thus a breaking change.

We are still pre-version 1, so I'm okay with a breaking change here. These sorts of situations are precisely why we haven't hit version 1.0 and called it stable yet. Just make sure we're very clear in the commit notice about the breaking change (so the release notes make it clear).

@zeroshade
Member

Alternatively, we could store it as a map internally and just construct the slice in the function on the fly (maps.Values), which might make sense to avoid the breaking change and ensure that a user can't change the internal map on us.
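A minimal sketch of that shape (assuming Go 1.23+ for maps.Values and slices.Collect; the type and field names here are illustrative stand-ins, not the actual ones in the codebase):

package table

import (
	"maps"
	"slices"
)

// partitionSpec stands in for iceberg.PartitionSpec.
type partitionSpec struct{ id int }

func (s partitionSpec) ID() int { return s.id }

type commonMetadata struct {
	// specs is keyed by spec ID and never handed out directly,
	// so callers cannot mutate the internal map.
	specs map[int]partitionSpec
}

// PartitionSpecs rebuilds the slice on the fly, keeping the
// user-facing signature unchanged.
func (c *commonMetadata) PartitionSpecs() []partitionSpec {
	return slices.Collect(maps.Values(c.specs))
}

One caveat with this shape: map iteration order is unspecified in Go, so the returned slice would no longer preserve the order of the partition-specs list from the metadata file.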

@ErenDursun
Copy link
Author

ErenDursun commented Jan 23, 2026

Just had a closer look into it, and it seems the field Specs is a list

Specs []iceberg.PartitionSpec `json:"partition-specs"`

because it is defined as such in the Iceberg docs:

> partition-specs: A list of partition specs, stored as full partition spec objects.

Changing its type in the commonMetadata struct would interfere with the (un)marshalling logic, and changing it in the (user-facing) Metadata interface would let it diverge from the Iceberg spec. I could instead add an internal field specsByID map[int]iceberg.PartitionSpec and an internal method partitionSpecsByID() map[int]iceberg.PartitionSpec with lazy-loading logic (see the sketch after the snippets below). That way the methods

iceberg-go/table/metadata.go

Lines 1316 to 1324 in 3ce3b4b

func (c *commonMetadata) PartitionSpec() iceberg.PartitionSpec {
	for _, s := range c.Specs {
		if s.ID() == c.DefaultSpecID {
			return s
		}
	}

	return *iceberg.UnpartitionedSpec
}

and

iceberg-go/table/metadata.go

Lines 1430 to 1439 in 3ce3b4b

func (c *commonMetadata) checkPartitionSpecs() error {
	for _, spec := range c.Specs {
		if spec.ID() == c.DefaultSpecID {
			return nil
		}
	}

	return fmt.Errorf("%w: default-spec-id %d can't be found",
		ErrInvalidMetadata, c.DefaultSpecID)
}

could use the map for better performance.
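The lazy-loading accessor I have in mind would look roughly like this (a sketch; it assumes the specsByID field proposed above and is not goroutine-safe as written):

func (c *commonMetadata) partitionSpecsByID() map[int]iceberg.PartitionSpec {
	// Build the lookup map from the authoritative Specs list the first
	// time it is needed; Specs itself stays the (un)marshalled form.
	if c.specsByID == nil {
		c.specsByID = make(map[int]iceberg.PartitionSpec, len(c.Specs))
		for _, s := range c.Specs {
			c.specsByID[s.ID()] = s
		}
	}

	return c.specsByID
}

(If concurrent reads of the metadata are a concern, eager construction during unmarshalling or a sync.Once would avoid the race in the lazy path.)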

If we also added the new internal lazy-loading method to the Metadata interface, users wouldn't be affected, but the method from this PR with the bugfix

func (scan *Scan) buildManifestEvaluator(specID int) (func(iceberg.ManifestFile) (bool, error), error) {
	i := slices.IndexFunc(scan.metadata.PartitionSpecs(), func(spec iceberg.PartitionSpec) bool {
		return spec.ID() == specID
	})
	if i < 0 {
		return nil, fmt.Errorf("partition spec with id %d not found in metadata", specID)
	}

	spec := scan.metadata.PartitionSpecs()[i]

	return newManifestEvaluator(spec, scan.metadata.CurrentSchema(),
		scan.partitionFilters.Get(specID), scan.caseSensitive)
}

could also work with the map. This method here could too:

func (us *UpdateSpec) isNewPartitionSpec(newSpecId int) bool {
	return !slices.ContainsFunc(us.txn.tbl.Metadata().PartitionSpecs(), func(s iceberg.PartitionSpec) bool {
		return s.ID() == newSpecId
	})
}
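With the map in place, both lookups would collapse to a direct access, along these lines (sketch, assuming the partitionSpecsByID accessor above):

spec, ok := scan.metadata.partitionSpecsByID()[specID]
if !ok {
	return nil, fmt.Errorf("partition spec with id %d not found in metadata", specID)
}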

Buuuut... I'm not sure the minor performance boost justifies the additional lazy-loading complexity 😅 There's also the risk of future changes to the code leading to diverging state between the partition-spec list and the map. Should I give it a try, or do we keep the lists and just prevent the out-of-range errors for now?


