perf: faster PSQL ingestion and queries via parallelism#686

Merged
william-silversmith merged 13 commits into master from psql_perf
Feb 26, 2026
Conversation

ranlu (Contributor) commented Feb 23, 2026

Improve psql performance for ingestion and for querying all spatial indices by multithreading with an optimized psql command, so psql can process data on multiple cores. Extended the query() call with an nthread parameter to specify the number of threads used by psql on the fast path. This change speeds up spatial index ingestion and mesh merge task generation from > 8 hours to < 2 hours.

…d distinct labels

Replace the single-threaded named-cursor SELECT DISTINCT with a
parallel range-partitioned approach for the Postgres fast path in
query():

- Add PG_RANGE_DISTINCT_SQL template that scans a [low, high) slice
  of the PK B-tree with hash aggregation for dedup within each range.
- Add _parse_pg_binary_copy_bigint() to parse PG binary COPY output
  for a single BIGINT column into a numpy uint64 array.
- Add _pg_parallel_distinct_labels() which splits the label keyspace
  into 8 non-overlapping ranges via MIN/MAX + np.linspace, queries
  each on a separate Postgres connection (= separate backend process
  = separate CPU core) using ThreadPoolExecutor, then concatenates
  the already-sorted results with no final sort step.
- Update query() Postgres fast path to delegate to the new parallel
  function instead of using a single named cursor.

Each worker sets work_mem=256MB to keep hash aggregation in-memory.
Ranges are non-overlapping so the concatenated result is globally
sorted without an extra sort pass.
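The range-partitioning scheme described above can be sketched in a self-contained way. The real code issues PG_RANGE_DISTINCT_SQL over one Postgres connection per range; here each "worker" just filters a numpy array so the example runs standalone. The 8-way split and the no-final-sort property follow the commit message; the function names and bodies below are illustrative, not the merged code.

```python
# Sketch of the idea behind _pg_parallel_distinct_labels(): split the label
# keyspace into non-overlapping [lo, hi) ranges, dedup each range in parallel,
# and concatenate the already-sorted chunks with no final sort pass.
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def range_bounds(low, high, nthread):
    """Split [low, high] into nthread non-overlapping [lo, hi) slices."""
    edges = np.linspace(low, high + 1, nthread + 1).astype(np.uint64)
    return list(zip(edges[:-1], edges[1:]))

def distinct_in_range(labels, lo, hi):
    # Stand-in for one worker's range-restricted SELECT DISTINCT; np.unique
    # returns sorted output, as each Postgres worker does in the real path.
    mask = (labels >= lo) & (labels < hi)
    return np.unique(labels[mask])

def parallel_distinct(labels, nthread=8):
    low, high = int(labels.min()), int(labels.max())
    bounds = range_bounds(low, high, nthread)
    with ThreadPoolExecutor(max_workers=nthread) as pool:
        chunks = list(pool.map(lambda b: distinct_in_range(labels, *b), bounds))
    # Ranges are disjoint and each chunk is sorted, so plain concatenation
    # is already globally sorted.
    return np.concatenate(chunks)
```

In the real implementation each range maps to its own Postgres connection, i.e. its own backend process and CPU core; the thread pool only exists to drive the connections concurrently.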
…aints and binary COPY

Significantly reduce Postgres ingestion time by eliminating the main
bottlenecks: incremental B-tree maintenance and text-based COPY.

- Defer PK/FK constraints: create file_lookup without PRIMARY KEY or
  FOREIGN KEY during bulk load. After all data is inserted, add them
  via ALTER TABLE which builds the B-tree in a single sort+bulk-load
  pass (~10x faster than incremental insertion).
- Replace text COPY with binary COPY: add _build_pg_binary_copy_two_bigints()
  that constructs the PG binary COPY buffer using a numpy structured
  array (single tobytes() call). This replaces the old StringIO-based
  text COPY that did millions of f-string formats and .write() calls.
- Vectorize value construction: build flat numpy arrays (labels_arr,
  fids_arr) instead of a list of (int, int) tuples, reducing memory
  allocations and GIL contention.
- Skip redundant file_lbl index for Postgres: the PK (label, fid)
  leading column already covers label-only lookups.
- Add ANALYZE after index creation to ensure the planner has accurate
  statistics for hash aggregation.
- Remove unused AUTOINC/INTEGER variables from insert_index_files()
  that were only needed by the old text COPY path.
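The binary COPY construction described above can be sketched with a numpy structured array. The layout comes from the PostgreSQL binary COPY file format (signature, 4-byte flags, 4-byte extension length, then per tuple an int16 field count and an int32 length before each field, all big-endian, with an int16 -1 trailer). The function name follows the commit message; the body is an illustrative reconstruction, not the merged code.

```python
# Sketch of _build_pg_binary_copy_two_bigints(): pack (label, fid) pairs into
# the PostgreSQL binary COPY format with a single structured-array tobytes()
# call, replacing per-row f-string formatting into a StringIO.
import struct

import numpy as np

PG_BINARY_COPY_SIGNATURE = b'PGCOPY\n\377\r\n\0'

def build_pg_binary_copy_two_bigints(labels, fids):
    # Per-tuple layout: int16 field count, then (int32 length, payload) per
    # field. All integers are big-endian, hence the '>' dtypes.
    row = np.dtype([
        ('nfields', '>i2'),
        ('len1', '>i4'), ('label', '>i8'),
        ('len2', '>i4'), ('fid', '>i8'),
    ])
    rows = np.empty(len(labels), dtype=row)
    rows['nfields'] = 2
    rows['len1'] = 8
    rows['len2'] = 8
    rows['label'] = labels
    rows['fid'] = fids

    header = PG_BINARY_COPY_SIGNATURE + struct.pack('>ii', 0, 0)  # flags, ext len
    trailer = struct.pack('>h', -1)  # end-of-data marker
    return header + rows.tobytes() + trailer
```

The resulting buffer can be streamed to `COPY file_lookup FROM STDIN (FORMAT binary)` via `copy_expert`, avoiding millions of small Python-level writes.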
…st path

Add a 'nthread' keyword argument (default=1) to query() that controls
the number of threads used by _pg_parallel_distinct_labels(). Print a
warning when nthread > 1 but the fast path or Postgres is not in use.
william-silversmith (Contributor) commented:

This looks pretty good except for a few bits of missing documentation and the exclusion of other databases from logic that might benefit them.

Did Claude one shot this?

print("Creating filename index...")
cur.execute("CREATE INDEX fname ON file_lookup (fid)")

if db_type == DbType.POSTGRES:
ranlu (Contributor, Author) replied:
Postgres needs ANALYZE to optimize hash aggregation when fetching all indices out of the database. ANALYZE TABLE in MySQL does not help performance for our access pattern. Not sure about SQLite.

william-silversmith changed the title from "Psql perf" to "perf: faster PSQL ingestion and queries via parallelism" on Feb 23, 2026
william-silversmith added the spatial-index label (Relates to the CloudVolume spatial index usually constructed for meshes and skeletons) on Feb 23, 2026
ranlu added 10 commits February 24, 2026 12:56
…y_copy_bigint

Add a reference to the PostgreSQL COPY file format documentation
in the docstring, as requested in PR review.
Replace byte slicing with memoryview slicing to avoid an intermediate
copy when parsing PG binary COPY output. np.frombuffer already accepts
memoryview, so the body slice is now zero-copy.
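The memoryview change can be sketched end to end. Slicing a memoryview yields another view with no byte copy, and np.frombuffer accepts it directly, so the only real copy is the final endianness conversion. This assumes every tuple is exactly one non-NULL BIGINT field, per the binary COPY layout; the function name follows the commit, but the body is an illustrative reconstruction.

```python
# Sketch of _parse_pg_binary_copy_bigint() after the memoryview change:
# parse PG binary COPY output for a single BIGINT column into uint64.
import struct

import numpy as np

PG_BINARY_COPY_SIGNATURE = b'PGCOPY\n\377\r\n\0'
HEADER_LEN = len(PG_BINARY_COPY_SIGNATURE) + 8  # signature + flags + ext length
ROW = np.dtype([('nfields', '>i2'), ('len', '>i4'), ('value', '>i8')])

def parse_pg_binary_copy_bigint(buf):
    mv = memoryview(buf)
    assert mv[:len(PG_BINARY_COPY_SIGNATURE)] == PG_BINARY_COPY_SIGNATURE
    body = mv[HEADER_LEN:-2]  # zero-copy slice; last 2 bytes are the \xff\xff trailer
    rows = np.frombuffer(body, dtype=ROW)
    # astype() is where the single real copy happens: big-endian int64 column
    # converted to a native uint64 array.
    return rows['value'].astype(np.uint64)
```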
Unify the duplicated b'PGCOPY\n\377\r\n\0' magic string into a single
PG_BINARY_COPY_SIGNATURE constant to prevent future inconsistencies.
Extend the deferred constraint pattern (previously Postgres-only) to
MySQL and SQLite. All DB types now create file_lookup without PK/FK
during bulk insert, then add constraints post-load:
- Postgres/MySQL: ALTER TABLE ADD PRIMARY KEY + ADD FOREIGN KEY
- SQLite: CREATE UNIQUE INDEX (SQLite lacks ALTER TABLE ADD PRIMARY KEY)
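The SQLite branch of this pattern can be sketched with the stdlib sqlite3 module: bulk-load file_lookup with no constraints, then build a UNIQUE index in one pass, since SQLite lacks ALTER TABLE ADD PRIMARY KEY. The table and column names follow the PR; the helper name, index name, and data are made up for illustration.

```python
# Sketch of deferred index creation for SQLite: no PK/FK during bulk insert,
# then CREATE UNIQUE INDEX + ANALYZE after all rows are loaded.
import sqlite3

def bulk_load_sqlite(rows):
    conn = sqlite3.connect(':memory:')
    cur = conn.cursor()
    # No PRIMARY KEY / FOREIGN KEY during the bulk insert.
    cur.execute("CREATE TABLE file_lookup (label INTEGER, fid INTEGER)")
    cur.executemany("INSERT INTO file_lookup VALUES (?, ?)", rows)
    # Build the unique index in a single pass after the load.
    cur.execute("CREATE UNIQUE INDEX file_lookup_pk ON file_lookup (label, fid)")
    cur.execute("ANALYZE")  # refresh planner statistics, as in the PR
    conn.commit()
    return conn
```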
The deferred PK/FK constraint creation was running unconditionally,
ignoring the create_indices parameter. Move all post-load constraint
and index creation under the if create_indices guard.
ANALYZE was previously only run for Postgres. MySQL and SQLite also
benefit from updated table statistics for query planning. MySQL uses
ANALYZE TABLE syntax; Postgres and SQLite use ANALYZE.
Add a comment explaining that numpy basic slicing returns views
(no data copy), and that the actual copy occurs inside
_build_pg_binary_copy_two_bigints when building the structured array.
MySQL's ANALYZE TABLE returns a result set (Table, Op, Msg_type,
Msg_text columns). mysql.connector requires this to be consumed
before calling conn.commit(), otherwise it raises
'InternalError: Unread result found'. Add cur.fetchall() to drain it.
The deferred constraint pattern (create table without PK/FK, then
ALTER TABLE ADD PRIMARY KEY post-load) causes a ~2x slowdown on MySQL
compared to the original inline PK/FK at CREATE TABLE time.

Root cause: InnoDB uses a clustered index where the primary key IS the
table's physical row storage order. When PK is defined at CREATE TABLE
time, InnoDB inserts each row directly into its sorted position in the
clustered B-tree -- this is incremental but efficient because the
B-tree is maintained as rows arrive.

When PK is deferred and added via ALTER TABLE ADD PRIMARY KEY after
bulk loading, InnoDB must rebuild the entire table: it copies all rows,
sorts them by the new PK columns, and reconstructs the clustered index
from scratch. This is effectively a full table copy + sort, which is
significantly slower than incremental insertion for large tables.

This is the opposite of PostgreSQL, where heap tables are unordered
(rows are appended to the heap in insertion order) and the PK is a
separate B-tree structure. Building that B-tree in one bulk pass after
all data is loaded avoids the overhead of incremental B-tree splits
during insertion, and the UNLOGGED table mode eliminates WAL overhead
entirely.

SQLite also benefits from deferred indexing because its tables use a
rowid-based heap similar to PostgreSQL, and CREATE UNIQUE INDEX after
bulk load is faster than maintaining a UNIQUE constraint per-row.

Also skips ANALYZE for MySQL -- InnoDB's ANALYZE TABLE performs random
index dives but does not materially improve query plans for this
workload and adds unnecessary overhead. ANALYZE is kept for Postgres
and SQLite where it meaningfully helps the query planner.

Also skips the redundant file_lbl index for MySQL (the PK leading
column already covers label-only lookups, same as Postgres).
ranlu (Contributor, Author) commented Feb 25, 2026

@william-silversmith, addressed most of the review points. I tested the deferred constraints approach for MariaDB, but it made ingestion a lot slower, so that change was reverted, with detailed comments from Claude in the commit message. I have not tested SQLite, but I suppose we will not use SQLite to support large datasets anyway.

william-silversmith (Contributor) commented Feb 25, 2026 via email

william-silversmith (Contributor) commented:
I think this looks good to me. If you have the time to run a sqlite test, that would be great. Otherwise LGTM.

ranlu (Contributor, Author) commented Feb 26, 2026

Tested SQLite; did not see a significant change in performance based on a 5 GB test, which took ~10 minutes with or without the PR.

william-silversmith (Contributor) commented:
Thank you for testing!

@william-silversmith william-silversmith merged commit fcc90e3 into master Feb 26, 2026
@william-silversmith william-silversmith deleted the psql_perf branch February 26, 2026 18:39
Labels

spatial-index Relates to the CloudVolume spatial index usually constructed for meshes and skeletons.
