Skip to content

Conversation

@hamersaw
Copy link

@hamersaw hamersaw commented Jan 15, 2026

Previously, "take*" operations did not support _rowid, _rowoffset, _row_created_at_version, and _row_last_updated_at_version. In this PR we add support for all of these columns.

We preserve these system columns through the initial schema projection so that they can be used to populate the correct flags when building the ProjectionPlan and PhysicalProjection structs.

  • _rowid / _rowaddr: persisting these through to ProjectionPlan fields was enough to make them work
  • _rowoffset: required additionally (1) stripping ROW_OFFSET field from ProjectionPlan requested_output_expr and (2) manually injecting column using AddRowOffsetExec (after exposing some methods publicly)
  • _row_created_at_version / _row_last_updated_at_version: required piping through flags to Fragment readers.

Addresses #5615.

Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
@github-actions
Copy link
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

@hamersaw hamersaw changed the title bug: support system columns in take_scan fix: support system columns in take_scan Jan 15, 2026
@github-actions github-actions bot added the bug Something isn't working label Jan 15, 2026
Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
@codecov
Copy link

codecov bot commented Jan 15, 2026

Codecov Report

❌ Patch coverage is 85.10638% with 7 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance-core/src/datatypes/schema.rs 84.09% 5 Missing and 2 partials ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Collaborator

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @hamersaw for working on this! Only have a question.

Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
@hamersaw hamersaw marked this pull request as draft January 19, 2026 15:34
Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
@hamersaw hamersaw changed the title fix: support system columns in take_scan fix: support system columns in dataset::take* operations Jan 20, 2026
Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
@hamersaw hamersaw changed the title fix: support system columns in dataset::take* operations fix: support system columns in dataset.take* operations Jan 20, 2026
Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
Comment on lines +348 to +354
let mut stripped_projection = projection.as_ref().clone();
stripped_projection.requested_output_expr = stripped_projection
.requested_output_expr
.clone()
.into_iter()
.filter(|e| e.name != ROW_OFFSET)
.collect();
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty unsure about this (feels hacky). I first attempted to inject the ROW_OFFSET field into the schema within project_batch and inject the AddRowOffsetExec into the ExecutionPlan but was getting "index out of bounds" with len 2. I think this may be because we expect the ROW_ADDR column (required for ROW_OFFSET compute) but we manually inject it for take* operations.

My other idea is to strip this field from the requested_output_expr in the TakeBuilder when we convert into the ProjectionPlan. I'm unsure of the implications there.

@hamersaw hamersaw marked this pull request as ready for review January 20, 2026 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants