@michaeljmarshall
Member

What is the issue

Fixes: https://github.com/riptano/cndb/issues/16484

What does this PR fix and why was it fixed

When a partition restriction narrows the search to rows that have no vectors, the list of candidate rows can be empty. We then attempt to create a bounded heap of size 0, which is invalid. The added test reproduces this error. The fix is to return early from the method when the SegmentRowIdOrdinalPairs object has zero length.

java.lang.IllegalArgumentException: initialSize must be > 0 and < 2147483630; got: 0
	at io.github.jbellis.jvector.util.AbstractLongHeap.<init>(AbstractLongHeap.java:52)
	at io.github.jbellis.jvector.util.BoundedLongHeap.<init>(BoundedLongHeap.java:47)
	at io.github.jbellis.jvector.util.BoundedLongHeap.<init>(BoundedLongHeap.java:43)
	at org.apache.cassandra.index.sai.disk.v2.V2VectorIndexSearcher.orderByBruteForce(V2VectorIndexSearcher.java:328)
	at org.apache.cassandra.index.sai.disk.v2.V2VectorIndexSearcher.orderByBruteForce(V2VectorIndexSearcher.java:283)
	at org.apache.cassandra.index.sai.disk.v2.V2VectorIndexSearcher.searchInternal(V2VectorIndexSearcher.java:246)
	at org.apache.cassandra.index.sai.disk.v2.V2VectorIndexSearcher.orderBy(V2VectorIndexSearcher.java:172)
	at org.apache.cassandra.index.sai.disk.v1.Segment.orderBy(Segment.java:160)
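
The early-exit pattern described above can be sketched in isolation. This is not the actual patch: `BruteForceGuardSketch`, `ScoredRow`, and `topK` are hypothetical names, and `java.util.PriorityQueue` stands in for jvector's `BoundedLongHeap` because it likewise throws `IllegalArgumentException` when constructed with a capacity below 1. The sketch assumes the heap is sized from the number of candidate rows, which is what makes an empty candidate list a problem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; none of these names come from Cassandra or jvector.
class BruteForceGuardSketch {
    /** Stand-in for a (row id, similarity score) pair. */
    static final class ScoredRow {
        final int rowId;
        final double score;

        ScoredRow(int rowId, double score) {
            this.rowId = rowId;
            this.score = score;
        }
    }

    // Orders rows by ascending score, so the heap root is the weakest kept row.
    private static final Comparator<ScoredRow> BY_SCORE =
            Comparator.comparingDouble(r -> r.score);

    /** Returns up to {@code limit} rows with the highest scores, best first. */
    static List<ScoredRow> topK(List<ScoredRow> candidates, int limit) {
        // Guard: with no candidates the capacity below would be 0, and
        // PriorityQueue rejects an initial capacity < 1.
        if (candidates.isEmpty() || limit <= 0)
            return Collections.emptyList();

        int capacity = Math.min(limit, candidates.size());
        PriorityQueue<ScoredRow> heap = new PriorityQueue<>(capacity, BY_SCORE);
        for (ScoredRow row : candidates) {
            if (heap.size() < capacity) {
                heap.add(row);
            } else if (row.score > heap.peek().score) {
                heap.poll(); // evict the current weakest row
                heap.add(row);
            }
        }

        List<ScoredRow> best = new ArrayList<>(heap);
        best.sort(BY_SCORE.reversed());
        return best;
    }

    public static void main(String[] args) {
        // Empty input returns an empty list instead of throwing.
        System.out.println(topK(new ArrayList<>(), 10).size());

        List<ScoredRow> rows = Arrays.asList(
                new ScoredRow(1, 0.2), new ScoredRow(2, 0.9), new ScoredRow(3, 0.5));
        for (ScoredRow row : topK(rows, 2))
            System.out.println(row.rowId + " " + row.score);
    }
}
```

Placing the check before the heap is constructed keeps the empty case out of the scoring loop entirely, so callers get an empty result rather than an exception.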

@github-actions

github-actions bot commented Jan 20, 2026

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@cassci-bot

❌ Build ds-cassandra-pr-gate/PR-2202 rejected by Butler


3 regressions found
See build details here


Found 3 new test failures

| Test | Explanation | Runs | Upstream |
| --- | --- | --- | --- |
| test_unicode.TestCqlshUnicode.test_unicode_identifier (python3.8.jdk11.no-cython.x86_64) | NEW | 🔴 | 1 / 20 |
| o.a.c.index.sai.cql.VectorCompaction100dTest.testZeroOrOneToManyCompaction[db true] | REGRESSION | 🔴 | 0 / 20 |
| o.a.c.index.sai.cql.VectorSiftSmallTest.testMultiSegmentBuild[db false] | REGRESSION | 🔴 | 0 / 20 |

No known test failures found

@ekaterinadimitrova2 ekaterinadimitrova2 left a comment

All brute force code paths are now adequately protected:

✅ The fix in this commit protects the main entry point (V2VectorIndexSearcher.java:272)
✅ The two overloaded variants in V2VectorIndexSearcher are indirectly protected (only callable after the size check)
✅ Both VectorMemtableIndex brute force paths already have explicit empty checks at their call sites

There is one cqlsh test failure that appears to be new but also appears unrelated. Please run it locally, just in case, and we can open a follow-up ticket if it is confirmed to be unrelated. The other PR also looks good.

@michaeljmarshall
Member Author

There is one cqlsh test failure that appears to be new but also appears unrelated. Please run it locally, just in case, and we can open a follow-up ticket if it is confirmed to be unrelated. The other PR also looks good.

The cqlsh failure is from dtest and is not related to this PR:

self = <cqlshlib.test.test_unicode.TestCqlshUnicode testMethod=test_unicode_identifier>

    def test_unicode_identifier(self):
        col_name = 'テスト'
        with testrun_cqlsh(tty=True, env=self.default_env) as c:
>           c.cmd_and_response('ALTER TABLE t ADD "%s" int;' % (col_name,))

test/test_unicode.py:51: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
test/run_cqlsh.py:356: in cmd_and_response
    output = self.read_to_next_prompt()
test/run_cqlsh.py:345: in read_to_next_prompt
    return self.read_until(self.prompt, timeout=timeout, ptty_timeout=3, replace=[DEFAULT_SMM_SEQUENCE,])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <cqlshlib.test.run_cqlsh.CqlshRunner object at 0x7f8ffb7ea160>
until = re.compile('\n(\\S+@)?cqlsh(:\\S+)?> '), blksize = 4096, timeout = 10.0
flags = 0, ptty_timeout = 3, replace = ['\x1b[?1034h']

    def read_until(self, until, blksize=4096, timeout=None,
                   flags=0, ptty_timeout=None, replace=[]):
        if not isinstance(until, Pattern):
            until = re.compile(until, flags)
    
        cqlshlog.debug("Searching for %r" % (until.pattern,))
        got = self.readbuf
        self.readbuf = ''
        empty_reads = 0
        with timing_out(timeout):
            while True:
                val = self.read(blksize, ptty_timeout)
                for replace_target in replace:
                    if (replace_target != ''):
                        val = val.replace(replace_target, '')
                cqlshlog.debug("read %r from subproc" % (val,))
                if val == '':
                    empty_reads += 1
                    if empty_reads > 1:
>                       raise EOFError("'until' pattern %r not found" % (until.pattern,))
E                       EOFError: 'until' pattern '\n(\\S+@)?cqlsh(:\\S+)?> ' not found

test/run_cqlsh.py:263: EOFError

I'm not sure how to run that dtest suite locally. I don't see how this could be related, since the Python test does not appear to touch any vector code. I am not concerned by this test and feel comfortable merging without re-testing. Would you prefer that I at least re-run the tests on this PR to see whether it fails again?

@ekaterinadimitrova2

ekaterinadimitrova2 commented Jan 21, 2026

I also don't expect it to be related, but since Cassandra keeps surprising me, I quickly ran it locally and it passes with your branch. Opened a ticket for the flaky test: #16502

Just FYI:

I'm not sure how to run that dtest suite locally.

There is a README: https://github.com/datastax/cassandra/blob/main/pylib/README.asc

But more or less:

  • Enter the pylib directory.
  • Run ./cassandra-cqlsh-tests.sh. It sets up a virtual environment and everything else for you, then runs the cqlshlib tests. The run takes less than 5 minutes.

@michaeljmarshall
Member Author

Thanks @ekaterinadimitrova2! Now I know for next time.

@michaeljmarshall michaeljmarshall merged commit 336d180 into main Jan 21, 2026
490 of 500 checks passed
@michaeljmarshall michaeljmarshall deleted the cndb-16484 branch January 21, 2026 17:13