@shanthoosh
Collaborator

Problem

The session config spark.sql.iceberg.split-size was honored in some Spark read paths but ignored in others. SparkReadConf.splitSizeOption() checked only the read option (SparkReadOptions.SPLIT_SIZE) and not the session config, which caused inconsistent behavior: SparkStagedScan and SparkMicroBatchStream use the session config (SparkSQLProperties.SPLIT_SIZE), while SparkScanBuilder.configureSplitPlanning() did not respect it.

Fix

This PR fixes the inconsistency by updating splitSizeOption() to also consider the session configuration (SparkSQLProperties.SPLIT_SIZE), ensuring consistent split-size handling across all Spark reader control flows.
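As a rough illustration of the intended lookup order (not the actual patch), the sketch below resolves a split size from two plain maps: the per-read option is checked first, then the session config, and the result is empty if neither is set. The class `SplitSizeResolver`, its `resolve` method, and the read option key `"split-size"` are assumptions made for this example; the real change lives in SparkReadConf and goes through Iceberg's own config parsing.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; this is not Iceberg's SparkReadConf.
final class SplitSizeResolver {
  // Assumed key for the per-read option (SparkReadOptions.SPLIT_SIZE).
  static final String READ_OPTION_KEY = "split-size";
  // Session config key named in the PR description (SparkSQLProperties.SPLIT_SIZE).
  static final String SESSION_CONF_KEY = "spark.sql.iceberg.split-size";

  private SplitSizeResolver() {}

  // Returns the split size in bytes: the read option wins, the session config
  // is the fallback, and the result is empty when neither source is set.
  static Optional<Long> resolve(
      Map<String, String> readOptions, Map<String, String> sessionConf) {
    String value = readOptions.get(READ_OPTION_KEY);
    if (value == null) {
      value = sessionConf.get(SESSION_CONF_KEY);
    }
    return value == null ? Optional.empty() : Optional.of(Long.parseLong(value));
  }
}
```

With this ordering, a job that sets only the session config gets the same split size on every read path, while an explicit read option still overrides it for a single read.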

Testing

  1. Added unit tests
  2. Verified with ./gradlew clean && ./gradlew build

@github-actions github-actions bot added the SPARK label Jan 20, 2026
.parse();
}

public Long splitSizeOption() {
Contributor

The method name implies that it reads from the read option alone.

Collaborator Author

@shanthoosh shanthoosh Jan 20, 2026

Contributor

@sumedhsakdeo sumedhsakdeo left a comment

We should take a deeper look at why only the read options and table properties are consulted in the alternate split-size path.

Changing a method whose name says it looks at Spark options alone so that it also reads the session conf, table properties, etc. seems incorrect.

@shanthoosh shanthoosh closed this Jan 20, 2026