@phacops phacops commented Jan 30, 2026

Summary

Add infrastructure for using the official clickhouse crate with RowBinary format for improved insert performance. The native client is fully wired up and can be enabled per-storage via configuration.

Key Changes

  • Native client infrastructure (src/strategies/clickhouse/client.rs):

    • InsertableRows trait for type-erased row insertion with type-aware batch merging
    • TypedRows<T> wrapper implementing InsertableRows for any T: clickhouse::Row
    • NativeClickhouseClient with retry logic matching HTTP client behavior
  • Dual-path batch support (src/types.rs):

    • Add typed_rows: Option<Arc<dyn InsertableRows>> to InsertBatch and BytesInsertBatch
    • Type-aware merge in BytesInsertBatch::merge() preserves typed rows when types match
    • InsertBatch::from_rows_with_typed() for processors to provide both JSON and typed rows
  • Native writer step (src/strategies/clickhouse/writer_v2.rs):

    • NativeClickhouseWriterStep uses native client when typed_rows available
    • Falls back to HTTP/JSON when no typed rows (backward compatibility)
  • Configuration (src/config.rs, src/factory_v2.rs):

    • Add use_native_client: bool option to ClickhouseConfig
    • Factory conditionally creates native or HTTP writer based on config
  • Processor updates:

    • eap_items.rs - Refactored to remove #[serde(flatten)] (incompatible with Row derive)
    • Add clickhouse::Row derive to EAPItem struct
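
The type-erased, type-aware merge described above can be sketched with `std::any::Any`. This is a minimal illustration, not the code in this PR: the method names (`len`, `as_any`, `merge`), the `merge_typed` helper, and the `Clone` bound are assumptions, and the real `TypedRows<T>` is bounded by `clickhouse::Row` rather than the std-only bounds used here. Dropping the typed rows when either side lacks them is also an assumption about the intended behavior.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical stand-in for the PR's `InsertableRows` trait.
trait InsertableRows: Send + Sync {
    fn len(&self) -> usize;
    /// Exposes the concrete type so a merge can check whether types match.
    fn as_any(&self) -> &dyn Any;
    /// Returns a merged batch if `other` holds the same row type, else `None`.
    fn merge(&self, other: &dyn InsertableRows) -> Option<Arc<dyn InsertableRows>>;
}

/// Hypothetical stand-in for `TypedRows<T>`.
struct TypedRows<T> {
    rows: Vec<T>,
}

impl<T: Clone + Send + Sync + 'static> InsertableRows for TypedRows<T> {
    fn len(&self) -> usize {
        self.rows.len()
    }

    fn as_any(&self) -> &dyn Any {
        self
    }

    fn merge(&self, other: &dyn InsertableRows) -> Option<Arc<dyn InsertableRows>> {
        // The downcast succeeds only when both sides wrap the same `T`.
        let other = other.as_any().downcast_ref::<TypedRows<T>>()?;
        let mut rows = self.rows.clone();
        rows.extend(other.rows.iter().cloned());
        Some(Arc::new(TypedRows { rows }))
    }
}

/// Merges the optional typed rows of two batches. Typed rows survive only
/// when both batches carry them and the row types match (assumed policy).
fn merge_typed(
    a: Option<Arc<dyn InsertableRows>>,
    b: Option<Arc<dyn InsertableRows>>,
) -> Option<Arc<dyn InsertableRows>> {
    match (a, b) {
        (Some(a), Some(b)) => a.merge(&*b),
        _ => None,
    }
}
```

Returning `None` on a type mismatch means the merged batch can only be written through the JSON path, which matches the fallback behavior described for the writer step.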

How to Enable

To use the native RowBinary client for a processor:

  1. Add clickhouse::Row derive to the row struct
  2. Change from InsertBatch::from_rows() to InsertBatch::from_rows_with_typed()
  3. Set use_native_client: true in the storage's ClickHouse config
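
How steps 2 and 3 interact can be sketched as follows. The types below are simplified, hypothetical stand-ins: the real `InsertBatch` stores `Option<Arc<dyn InsertableRows>>`, and the actual signature of `from_rows_with_typed()` may differ. `WritePath` and `select_path` are illustrative names that do not appear in this PR.

```rust
/// Hypothetical, simplified stand-in for the PR's `InsertBatch`.
struct InsertBatch<T> {
    json_rows: Vec<String>,
    typed_rows: Option<Vec<T>>,
}

impl<T> InsertBatch<T> {
    /// JSON-only batch: the existing behavior.
    fn from_rows(json_rows: Vec<String>) -> Self {
        InsertBatch { json_rows, typed_rows: None }
    }

    /// Batch carrying both encodings, so either writer can consume it.
    fn from_rows_with_typed(json_rows: Vec<String>, typed_rows: Vec<T>) -> Self {
        InsertBatch { json_rows, typed_rows: Some(typed_rows) }
    }

    fn len(&self) -> usize {
        self.json_rows.len()
    }
}

#[derive(Debug, PartialEq)]
enum WritePath {
    NativeRowBinary,
    HttpJson,
}

/// The native path needs both the config flag and typed rows;
/// anything else falls back to HTTP/JSON.
fn select_path<T>(use_native_client: bool, batch: &InsertBatch<T>) -> WritePath {
    if use_native_client && batch.typed_rows.is_some() {
        WritePath::NativeRowBinary
    } else {
        WritePath::HttpJson
    }
}
```

Under these assumptions, a processor that still calls `from_rows()` keeps working when `use_native_client` is set, because its batches carry no typed rows and take the HTTP/JSON path.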

Test plan

  • All 97 tests pass
  • Build succeeds with no errors
  • Type-aware merge tests verify batch accumulation works correctly

🤖 Generated with Claude Code

…pport

Add infrastructure for using the official clickhouse crate with RowBinary
format for improved insert performance:

- Add clickhouse crate dependency (v0.13) with uuid and native-tls features
- Create NativeClickhouseClient wrapper with retry logic
- Add clickhouse::Row derive to processor row structs:
  - outcomes, profile_chunks, profiles, functions
  - release_health_metrics, replays, errors, eap_items
- Refactor profiles and eap_items to remove #[serde(flatten)] which is
  incompatible with Row derive

The native client is not yet wired up for actual use - this prepares the
structs and client infrastructure for a future migration from JSONEachRow
to RowBinary format.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@phacops phacops requested review from a team as code owners January 30, 2026 23:08
@phacops phacops marked this pull request as draft January 30, 2026 23:19
phacops and others added 2 commits January 30, 2026 15:19
… flatten

Complete the migration of remaining processors that used #[serde(flatten)]:

- generic_metrics.rs: Use define_metric_row! macro to inline CommonMetricFields
  into CountersRawRow, SetsRawRow, DistributionsRawRow, and GaugesRawRow.
  Add clickhouse::Row derive to all metric row types.

- querylog.rs: Inline fields from Request, Timing, and QueryList directly
  into QuerylogMessage. Add clickhouse::Row derive.

All processors now have Row derive and are ready for RowBinary format.
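
A rough sketch of how a macro can inline shared fields in place of #[serde(flatten)] is shown below. It is not the actual define_metric_row! implementation: the common field names and types here are hypothetical, and the real macro also applies the serde and clickhouse::Row derives, which are omitted to keep the sketch std-only.

```rust
/// Hypothetical sketch of a `define_metric_row!`-style macro.
macro_rules! define_metric_row {
    ($name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        #[derive(Debug, Default, Clone, PartialEq)]
        struct $name {
            // Common fields are written out in every generated struct
            // instead of being pulled in through a flattened nested struct.
            org_id: u64,
            project_id: u64,
            metric_id: u64,
            timestamp: u32,
            // Fields specific to this row type.
            $($field: $ty,)*
        }
    };
}

define_metric_row!(CountersRawRow { count_value: f64 });
define_metric_row!(SetsRawRow { set_values: Vec<u64> });
```

Each generated struct is flat, so a derive that needs to see every column as a direct field has nothing nested to resolve.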

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…Binary support

Add the infrastructure to support native ClickHouse client with RowBinary format
for improved insert performance. Key changes:

- Add `InsertableRows` trait for type-erased row insertion with type-aware merging
- Add `TypedRows<T>` wrapper implementing `InsertableRows` for typed row storage
- Add `typed_rows` field to `InsertBatch` and `BytesInsertBatch` for dual-path support
- Create `NativeClickhouseWriterStep` that uses native client when typed rows available
- Add `use_native_client` config option to enable native client per storage
- Factory conditionally selects HTTP/JSON or native/RowBinary writer based on config

The native client path is opt-in via config and falls back to HTTP/JSON when
typed rows are not available, ensuring backward compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@phacops phacops changed the title from "ref(rust_snuba): Add clickhouse crate and Row derive for RowBinary support" to "feat(rust_snuba): Add native ClickHouse client with RowBinary support" Jan 30, 2026

2 participants