Feature/ducklake crude operations and data preview #197

Open
Nuri1977 wants to merge 26 commits into dev from feature/ducklake-crud-operations

Conversation

@Nuri1977 (Contributor) commented Feb 6, 2026

Summary by CodeRabbit

  • New Features

    • Table schema & data operations (rename, add/drop/rename/alter columns, set partitioning, update/delete/upsert rows, delete/restore snapshots).
    • New table-level UI tabs: Data, Schema, Statistics, Files, Partitions, History, and Tags with in-place actions, export, and snapshot restore.
    • Import wizard: optional partitioning step.
    • File dialogs for import/export workflows.
  • Refactor

    • Table details UI split into focused tab components.
  • Helpers

    • Added formatting utilities for bytes, numbers, and safe string rendering.

- Implement DROP TABLE deletion for DuckDB/SQLite/PostgreSQL adapters
- Finish DuckLakeService.deleteTable with validation + existence checks
- Add DataLake tables UI delete action with confirmation dialog
- Reuse existing React Query mutation/toast and invalidate tables cache
…PC channels

- Implement CatalogAdapter updateRows/deleteRows/upsertRows in duckdb/sqlite/postgresql adapters
- Return {rowsAffected} for DML operations
- Ensure IPC channel typing includes ducklake:table:updateRows/deleteRows/upsertRows
- Add addColumn/dropColumn to catalog adapter contract and implement for duckdb/sqlite/postgresql
- Wire through DuckLakeService, IPC channels, renderer service, and React Query hooks
- Add Schema tab UI in DataLakeTableDetails (add column dialog + drop column confirmation)
- Prevent dropping partition columns; default value treated as raw SQL expression
- Add CatalogAdapter.renameColumn and implement in duckdb/sqlite/postgresql adapters
- Add DuckLakeService.renameColumn with validation and partition-column guard
- Wire IPC channel ducklake:table:renameColumn and renderer service method
- Add React Query mutation useRenameDuckLakeColumn with cache invalidations
- Add Schema-tab per-column Rename action + dialog in DataLakeTableDetails
- extend CatalogAdapter with alterColumnType and implement for duckdb/sqlite/postgresql
- add DuckLakeService.alterColumnType with table/column validation and partition-column guard
- wire IPC channel ducklake:table:alterColumnType + typed DuckLakeChannels update
- add renderer service + React Query mutation with cache invalidations
- add Schema tab UI action + dialog; fix related lint formatting
- add Data tab to DataLakeTableDetails with dialogs for UPDATE/DELETE/UPSERT SQL
- wire dialogs to existing React Query mutations (useUpdateDuckLakeRows/useDeleteDuckLakeRows/useUpsertDuckLakeRows)
- disable actions while mutations are running
- fix tab index ordering after inserting Data tab
- add CatalogAdapter.setPartitionedBy and implement for duckdb/sqlite/postgresql
- add DuckLakeService.setPartitionedBy with table/column validation
- wire IPC channel ducklake:table:setPartitionedBy + typed channel updates
- add renderer service + React Query mutation hook with cache invalidations
- extend Import Data wizard with Partitioning step and apply partitioning after import
- Add DuckLake CDC types (DuckLakeTableChange)
- Implement queryTableChanges in adapter + service and expose via IPC
- Add renderer service + React Query hook for table changes
- Add CDC UI (snapshot range selects + changes table) to table details
- Fix TS/ESLint issues introduced during integration
…complete, utilizing a newly defined list of DuckLake supported types.
- Extracted specific tab logic from `DataLakeTableDetails.tsx` into individual child components (Overview, Schema, Data Rows, Statistics, Data Files, Partitions, History, Tags) located in `src/renderer/components/dataLake/tableDetails/`.
- Moved utility functions (`formatBytes`, `formatNumber`, `safeToString`) to the global `src/renderer/helpers/utils.ts` to promote reuse and reduce duplication.
- Simplified `DataLakeTableDetails` to act as a container component, improving maintainability and readability.
- Cleaned up imports and removed the temporary local `utils.ts` file.
@Nuri1977 Nuri1977 self-assigned this Feb 6, 2026
@Nuri1977 Nuri1977 added the enhancement New feature or request label Feb 6, 2026
@coderabbitai bot commented Feb 6, 2026

📝 Walkthrough

Adds end-to-end DuckLake table/column/row mutation and snapshot restore capabilities: new renderer UI/tabs and hooks, new IPC channels and main-process handlers, extended DuckLakeService and adapter implementations (DuckDB/Postgres/SQLite), and supporting types/utilities.

Changes

  • IPC Handlers & Utils (src/main/ipcHandlers/duckLake.ipcHandlers.ts, src/main/ipcHandlers/utils.ipcHandlers.ts):
    Registered new ducklake:table:* IPC handlers (rename, add/drop/rename/alter column, setPartitionedBy, update/delete/upsert rows) and dialog handlers (dialog:showOpenDialog, dialog:showSaveDialog) that forward to service/Electron APIs.
  • Main Services (src/main/services/duckLake.service.ts, src/renderer/services/duckLake.service.ts):
    Added DuckLakeService methods exposed via IPC and renderer wrappers (renameTable, add/drop/rename/alter column, setPartitionedBy, update/delete/upsert rows, deleteTable, restoreSnapshot) with validation, connectivity checks, and adapter delegation.
  • Adapter Base & Implementations (src/main/services/duckLake/adapters/base.adapter.ts, src/main/services/duckLake/adapters/duckdb.adapter.ts, .../postgresql.adapter.ts, .../sqlite.adapter.ts):
    Expanded the CatalogAdapter abstract contract with table/column/row mutation APIs and implemented them per adapter (DDL/DML assembly, escaping/sanitization, rowsAffected reporting, restore/delete flows, snapshot/time-travel and optional metadata handling). The DuckDB adapter also improved executeQuery pagination/total counting.
  • Renderer Controllers / Hooks (src/renderer/controllers/duckLake.controller.ts):
    Added React Query mutation hooks for all new operations (add/drop/rename/alter column, setPartitionedBy, renameTable, update/delete/upsert rows, restoreSnapshot) and a tableChanges query key; hooks invalidate relevant cache keys and show toasts.
  • Table Details UI & Tabs (src/renderer/components/dataLake/DataLakeTableDetails.tsx, src/renderer/components/dataLake/tableDetails/*):
    Refactored table details into per-tab components and added Overview, Schema, DataRows, Statistics, DataFiles, Partitions, History, and Tags tabs. DataRows implements update/delete/upsert dialogs, export, pagination, and live SQL previews.
  • Tables List & Import Flow (src/renderer/components/dataLake/DataLakeTables.tsx, src/renderer/components/dataLake/DataLakeTablesView.tsx, src/renderer/components/dataLake/DataLakeTableImportWizard.tsx):
    Added Delete/Rename dialogs to the tables list; the import wizard gained a Partitioning step and its onImport signature now accepts optional partitionColumns; DataLakeTablesView triggers the set-partition mutation post-import when provided. Removed obsolete preview/query props.
  • Helpers & Config (src/renderer/helpers/utils.ts, src/renderer/config/ducklake.ts):
    Added formatBytes, formatNumber, and safeToString utilities and exported DUCKLAKE_SUPPORTED_COLUMN_TYPES.
  • Types / IPC Signatures (src/types/duckLake.ts, src/types/ipc.ts):
    Added DuckLakeChangeOperation, DuckLakeTableChange, and snapshot author/commit fields; extended the IPC channel union types with new ducklake:table:* channels and dialog channels.
  • Small UI Adjustments (src/renderer/components/dataLake/DataLakeInstanceDetails.tsx, src/renderer/screens/dataLake/index.tsx):
    Snapshot listings now include Author and Message; removed no-op preview/query handlers from screens.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant UI as Renderer (UI)
  participant Rsvc as Renderer DuckLakeService
  participant IPC as Electron IPC
  participant MainHandler as Main IPC Handler
  participant MSvc as Main DuckLakeService
  participant Adapter as CatalogAdapter
  participant DB as Database

  UI->>Rsvc: trigger mutation (e.g. updateRows)
  Rsvc->>IPC: invoke 'ducklake:table:updateRows'(instanceId, table, query)
  IPC->>MainHandler: deliver channel call
  MainHandler->>MSvc: DuckLakeService.updateRows(...)
  MSvc->>Adapter: adapter.updateRows(table, updateQuery)
  Adapter->>DB: execute SQL (UPDATE / CTAS / RETURNING)
  DB-->>Adapter: result / rows affected
  Adapter-->>MSvc: { rowsAffected }
  MSvc-->>MainHandler: return result
  MainHandler-->>IPC: IPC response
  IPC-->>Rsvc: resolve promise { rowsAffected }
  Rsvc-->>UI: onSuccess -> invalidate queries / refresh UI

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • nbesimi
  • jasir99

Poem

🐰
I nibble on schemas, tidy each row,
Rename a column, watch partitions grow,
From UI to adapter my little hops call,
Mutations parade—one rabbit, one call,
Cheer for DuckLake: fresh changes, hop hooray!

🚥 Pre-merge checks: ✅ 1 passed, ❌ 2 failed

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 11.54%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them to satisfy the coverage threshold.
  • Title Check (❓ Inconclusive): The title 'Feature/ducklake crude operations and data preview' is only partially related to the changeset. It says 'crude operations' where the PR actually implements CRUD operations (Create, Read, Update, Delete), and while it mentions 'data preview', the changes go far beyond that: extensive schema mutations (add/drop/rename columns, alter types, partitioning), table operations (rename/delete tables, restore snapshots), and row-level data operations (update/delete/upsert rows) across service layers, adapters, IPC handlers, and UI components. Resolution: consider a more accurate title such as 'Add DuckLake CRUD operations (schema, table, row mutations) and data preview UI'.

✅ Passed checks (1 passed)

  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.


@coderabbitai bot left a comment

Actionable comments posted: 17

🤖 Fix all issues with AI agents
In `@src/main/ipcHandlers/duckLake.ipcHandlers.ts`:
- Around line 182-216: The ipc handlers pass raw SQL strings to
DuckLakeService.updateRows/deleteRows/upsertRows without protection against
statement chaining; add validation inside those service methods (or a shared
validator called by them) to trim the query, ensure it matches the expected
single-statement prefix (case-insensitive), reject any query containing a
semicolon (or any additional SQL after the first statement) and/or explicitly
check there is only one statement token, and optionally migrate these methods to
parameterized execution to eliminate chaining entirely; update
DuckLakeService.updateRows, DuckLakeService.deleteRows and
DuckLakeService.upsertRows to perform this validation before executing the SQL
and return a clear error when rejected.
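
A minimal sketch of the single-statement guard this prompt describes; the helper name, signature, and error handling are assumptions, not the merged code:

// Hypothetical validator illustrating the guard described above.
// Rejects statement chaining and enforces the expected statement prefix.
function validateSingleStatement(
  query: string,
  expectedPrefix: 'UPDATE' | 'DELETE' | 'INSERT' | 'MERGE',
): string {
  // Allow a single terminal semicolon, then check nothing else remains.
  const trimmed = query.trim().replace(/;\s*$/, '');
  if (!new RegExp(`^${expectedPrefix}\\b`, 'i').test(trimmed)) {
    throw new Error(`Query must be a single ${expectedPrefix} statement`);
  }
  if (trimmed.includes(';')) {
    // Any remaining semicolon implies a second chained statement.
    throw new Error('Multiple SQL statements are not allowed');
  }
  return trimmed;
}

Passing the expected prefix explicitly would also cover the UPDATE-only check flagged below for updateRows.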

In `@src/main/services/duckLake.service.ts`:
- Around line 53-101: Add a strict format check for SQL type and default value
strings to prevent injection: in addColumn (and similarly in alterColumnType for
newType) validate columnType (and newType) against an allowlist regex such as
/^[A-Za-z_][A-Za-z0-9_() ,]*$/ after trimming and throw
DuckLakeError.validation('Invalid column type format') if it fails; also
validate defaultValue (when provided) with a conservative pattern appropriate
for literals (or whitelist simple numeric/quoted literal forms) and throw
DuckLakeError.validation('Invalid default value format') if it fails; place
these checks right after the existing non-empty checks in addColumn and in
alterColumnType before calling adapter.addColumn/alterColumnType.
- Around line 406-414: The validation in updateRows incorrectly allows CREATE
statements; change the check on normalizedQuery inside updateRows to only permit
UPDATE statements (remove the CREATE branch) so that any non-UPDATE query throws
DuckLakeError.validation('Query must be an UPDATE statement'); keep using the
existing normalizedQuery variable and the same error helper
(DuckLakeError.validation) and adjust the error message accordingly.
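
A hedged sketch of the allowlist checks suggested above; the regexes mirror the prompt's suggestions and are illustrative, not exhaustive:

// Illustrative allowlist validation for type and default-value strings.
const TYPE_PATTERN = /^[A-Za-z_][A-Za-z0-9_() ,]*$/;
const DEFAULT_PATTERN = /^(NULL|TRUE|FALSE|[+-]?\d+(\.\d+)?|'(?:[^']|'')*')$/i;

function assertSafeColumnType(columnType: string): void {
  if (!TYPE_PATTERN.test(columnType.trim())) {
    throw new Error('Invalid column type format');
  }
}

function assertSafeDefaultValue(defaultValue: string): void {
  if (!DEFAULT_PATTERN.test(defaultValue.trim())) {
    throw new Error('Invalid default value format');
  }
}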

In `@src/main/services/duckLake/adapters/duckdb.adapter.ts`:
- Around line 779-817: The pagination count logic can both mis-handle zero
results and accidentally re-execute DML; modify the block that builds countQuery
so it only runs when the base SQL is a read/select statement (e.g., test the
original query/request.sql with a regex for /^\s*SELECT\b/i and skip counting
for INSERT|UPDATE|DELETE|MERGE|CREATE|ALTER|DROP), and change the falsy-zero
extraction to use nullish coalescing when reading countRow (replace
countRow.total || Object.values(countRow)[0] with countRow.total ??
Object.values(countRow)[0]). Ensure you still set totalRows from
Number(countVal) and keep the existing try/catch logging around the counting
code (refer to variables countQuery, countResult, countRows, countRow,
totalRows, and request.limit/request.offset).
- Around line 265-351: The affected-row counting in updateRows, deleteRows, and
upsertRows is unreliable because it uses result.getRows() (requires RETURNING);
replace that logic by executing a follow-up "SELECT changes() AS changes" on the
same DuckDB connection (this.connectionInfo.connection) after the initial run,
parse the first row's changes value into rowsAffected (fallback to 0 if
missing), and return that value; update the functions updateRows, deleteRows,
and upsertRows to perform this SELECT changes() query instead of trying to read
result.getRows().
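
For the pagination-count item above, a sketch under the assumption of a minimal connection interface (the real DuckDB Node API result shape may differ):

// Only derive a COUNT(*) total for plain SELECTs, and read the value with
// ?? so a legitimate zero count is not discarded by a falsy check.
interface MinimalConnection {
  run(sql: string): Promise<{ getRowObjects(): Promise<Record<string, unknown>[]> }>;
}

async function countTotalRows(
  connection: MinimalConnection,
  baseSql: string,
): Promise<number | undefined> {
  if (!/^\s*SELECT\b/i.test(baseSql)) return undefined; // never re-run DML to count
  const result = await connection.run(
    `SELECT COUNT(*) AS total FROM (${baseSql}) AS sub`,
  );
  const rows = await result.getRowObjects();
  const row = rows?.[0] ?? {};
  const value = row.total ?? Object.values(row)[0];
  const total = Number(value);
  return Number.isFinite(total) ? total : undefined;
}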

In `@src/main/services/duckLake/adapters/postgresql.adapter.ts`:
- Around line 190-214: The ALTER column DDL in alterColumnType interpolates
newType directly into SQL (same risk exists for columnType in addColumn); add a
validation step inside alterColumnType (and reuse for addColumn) that checks
newType against an allowlist or strict regex of permitted PostgreSQL type tokens
(e.g., ^(SMALLINT|INTEGER|BIGINT|BOOLEAN|TEXT|DATE|TIMESTAMP|REAL|DOUBLE
PRECISION|VARCHAR\(\d+\)|CHAR\(\d+\))$ with case-insensitive matching),
normalize/canonicalize the token (uppercase) and reject anything that doesn't
match by throwing a clear error before running connection.run, so only
validated, safe type strings are ever interpolated into the ALTER TABLE
statement.
- Around line 109-139: The addColumn implementation interpolates defaultValue
directly into SQL, enabling injection; in addColumn (and the same method across
other adapters) validate and sanitize defaultValue before building
defaultClause: reject or strip any semicolons or SQL comment markers, allow only
simple literal patterns (e.g. numeric, boolean) or treat as a string by escaping
single quotes and surrounding with single quotes, and then use that sanitized
token when constructing the ALTER TABLE ... ADD COLUMN statement via
connectionInfo.connection.run; alternatively, enforce that callers pass
null/undefined for complex defaults and centralize validation so defaultClause
is only set to a safe, escaped literal.
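
A sketch of the literal-only sanitization described above; this is one assumed normalization strategy, not the adapter's actual code:

// Pass numeric/boolean literals through, quote-escape everything else,
// and reject obvious statement-breaking tokens outright.
function toSafeDefaultLiteral(defaultValue: string): string {
  const trimmed = defaultValue.trim();
  if (/^[+-]?\d+(\.\d+)?$/.test(trimmed)) return trimmed; // numeric literal
  if (/^(true|false|null)$/i.test(trimmed)) return trimmed.toUpperCase();
  if (/;|--|\/\*/.test(trimmed)) {
    throw new Error('Default value contains disallowed SQL tokens');
  }
  return `'${trimmed.replace(/'/g, "''")}'`; // escaped string literal
}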

In `@src/renderer/components/dataLake/DataLakeTablesView.tsx`:
- Around line 45-60: The partition mutation call using
setPartitionedByMutation.mutate lacks an onError handler so when it fails the
import wizard stays open; update the mutate call for setPartitionedByMutation
(the block calling setPartitionedByMutation.mutate with { instanceId, tableName,
columnNames: partitionColumns }) to include an onError callback that at minimum
calls setImportWizardOpen(false) (and also calls invalidateTables(instanceId) to
keep state consistent) so the wizard is closed and tables refreshed even when
the mutation errors.
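
A hedged fragment of the onError handling the prompt asks for; the declared identifiers are placeholders for the component's real mutation and state setters:

// Placeholders standing in for the component's actual members; a sketch,
// not the merged code.
declare const setPartitionedByMutation: {
  mutate: (
    vars: { instanceId: string; tableName: string; columnNames: string[] },
    opts?: { onSuccess?: () => void; onError?: () => void },
  ) => void;
};
declare function setImportWizardOpen(open: boolean): void;
declare function invalidateTables(instanceId: string): void;

function applyPartitioning(
  instanceId: string,
  tableName: string,
  partitionColumns: string[],
) {
  setPartitionedByMutation.mutate(
    { instanceId, tableName, columnNames: partitionColumns },
    {
      onSuccess: () => setImportWizardOpen(false),
      onError: () => {
        setImportWizardOpen(false); // don't leave the wizard stuck open
        invalidateTables(instanceId); // keep the tables list consistent
      },
    },
  );
}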

In `@src/renderer/components/dataLake/tableDetails/TableDataFilesTab.tsx`:
- Around line 22-24: The prop TableDataFilesTabProps currently types
tableDetails as any and accesses tableDetails.dataFiles without guards; change
the prop type to a proper interface (e.g., TableDataFilesTabProps {
tableDetails: { dataFiles?: DuckLakeDataFileInfo[] } }) and update the
TableDataFilesTab component to use a safe access pattern (optional chaining or
defaulting to an empty array) when checking length and mapping (references:
TableDataFilesTabProps, tableDetails, dataFiles, TableDataFilesTab). Ensure all
occurrences replace raw .dataFiles usage with tableDetails.dataFiles ?? [] or
tableDetails.dataFiles?.length checks so no runtime .length on undefined occurs.

In `@src/renderer/components/dataLake/tableDetails/TableDataRowsTab.tsx`:
- Around line 242-255: Remove the useEffect that watches
updateRowsMutation.isSuccess / deleteRowsMutation.isSuccess /
upsertRowsMutation.isSuccess and calls fetchData(), and keep the explicit
fetchData() calls in the onSuccess handlers (handleConfirmUpdateRows,
handleConfirmDeleteRows, handleConfirmUpsertRows) so each mutation only triggers
one fetch; also reset each mutation's status after handling success (e.g., call
updateRowsMutation.reset(), deleteRowsMutation.reset(),
upsertRowsMutation.reset() in the onSuccess flows) to prevent stale isSuccess
from causing future redundant fetches when fetchData reference changes.
- Line 139: The export SQL builds identifiers using `"${tableName}"` without
escaping double-quotes inside tableName, which breaks SQL for names containing
quotes; update the construction in TableDataRowsTab (the exportQuery variable
and the other similar instances flagged) to escape double quotes in tableName by
replacing every " with "" before embedding (e.g., use tableName.replace(/"/g,
'""') when interpolating), so the identifier becomes a safely quoted string;
apply this same change to the other occurrences that build SELECT ... FROM
"${tableName}" in this file.
- Around line 280-348: The generated CTAS and delete query construction is
vulnerable to SQL injection because numeric/boolean inputs for updateFields and
whereConditions are interpolated raw when needsQuotes(...) is false; update the
React.useEffect logic that builds whereClauses and caseStatements to validate
and sanitize non-quoted values: for each updateFields[].value and
whereConditions[].value for columns where needsQuotes(col) === false, confirm
they parse as a valid number/boolean (e.g., Number(value) isFinite or
/^[+-]?\d+(\.\d+)?$/.test(value) for numbers, and strict match for true/false),
and if invalid either reject/skip the field (set generated query to empty and
surface an error) or coerce to a safe NULL/validated literal before
interpolation; also add the same validation/sanitization to the delete query
builder code path so both CREATE OR REPLACE TABLE generation (caseStatements,
updatedCols, whereClauses) and delete query builders never interpolate
unvalidated raw inputs.
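
Two small helpers sketching the identifier-escaping and literal-validation fixes asked for in the items above (names are illustrative):

// Escape embedded double quotes so the interpolated identifier stays valid.
// quoteIdent('my"table') returns "my""table" (including the outer quotes).
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Only clean numeric/boolean tokens may be interpolated unquoted; anything
// else should be rejected before query generation.
function isSafeUnquotedLiteral(value: string): boolean {
  const trimmed = value.trim();
  return /^[+-]?\d+(\.\d+)?$/.test(trimmed) || /^(true|false)$/i.test(trimmed);
}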

In `@src/renderer/components/dataLake/tableDetails/TableOverviewTab.tsx`:
- Around line 96-133: The JSX in TableOverviewTab reads
tableDetails.stats.recordCount, fileSizeBytes and nextRowId directly which can
throw if stats is null/undefined; update the TableOverviewTab render to use safe
access and defaults (e.g., tableDetails.stats?.recordCount ?? 0 for
formatNumber, tableDetails.stats?.fileSizeBytes ?? 0 for formatBytes, and
tableDetails.stats?.nextRowId ?? 0) and also defensively handle arrays like
tableDetails.columns?.length ?? 0 and tableDetails.dataFiles?.length ?? 0 so the
component never dereferences undefined.

In `@src/renderer/components/dataLake/tableDetails/TablePartitionsTab.tsx`:
- Around line 34-70: tableDetails.partitionInfo may be truthy while its arrays
are undefined, causing .length and .map to crash; update all usages inside
TablePartitionsTab (references: tableDetails.partitionInfo,
partitionInfo.columns, partitionInfo.filePartitionValues, and
DuckLakePartitionColumn mapping) to guard with optional chaining or default
empty arrays (e.g., use partitionInfo?.columns ?? [] for .map and
(partitionInfo?.columns ?? []).length for counts) so safeToString and the row
render logic never receive undefined.

In `@src/renderer/helpers/utils.ts`:
- Around line 423-429: formatBytes currently passes negative numbers to Math.log
and can index past the sizes array for very large values; change formatBytes to
record the original sign, work with absBytes = Math.abs(bytes) (handle
non-finite/NaN by returning '0 Bytes' or similar), compute i =
Math.floor(Math.log(absBytes) / Math.log(k)) but clamp i to sizes.length - 1,
then format using sizes[i] and reapply the negative sign prefix if the original
bytes was negative; ensure all references are to formatBytes, sizes, i, k and
guard against division by zero when absBytes === 0.

In `@src/types/duckLake.ts`:
- Around line 544-548: The DuckLakeIpcChannels interface is missing type
signatures for eight IPC channels declared elsewhere; update the
DuckLakeIpcChannels interface (the type that currently contains
'ducklake:table:setPartitionedBy') to include the following method signatures
with matching parameter shapes: 'ducklake:table:rename' (instanceId: string,
oldName: string, newName: string) => Promise<void>, 'ducklake:table:addColumn'
(instanceId: string, tableName: string, columnDef: any) => Promise<void>,
'ducklake:table:dropColumn' (instanceId: string, tableName: string, columnName:
string) => Promise<void>, 'ducklake:table:renameColumn' (instanceId: string,
tableName: string, oldName: string, newName: string) => Promise<void>,
'ducklake:table:alterColumnType' (instanceId: string, tableName: string,
columnName: string, newType: string) => Promise<void>,
'ducklake:table:updateRows' (instanceId: string, tableName: string, filter: any,
updates: any) => Promise<number>, 'ducklake:table:deleteRows' (instanceId:
string, tableName: string, filter: any) => Promise<number>, and
'ducklake:table:upsertRows' (instanceId: string, tableName: string, rows: any[])
=> Promise<number> so the interface matches the IPC channel declarations.

In `@src/types/ipc.ts`:
- Around line 264-273: The union includes the channel
'ducklake:table:queryChanges' but there is no service/controller/IPC handler for
it; either remove 'ducklake:table:queryChanges' from the IPC union in the types,
or implement it by adding a service method (e.g., add queryChanges(...) to
DuckLakeService), a controller handler (e.g.,
duckLakeController.handleQueryChanges that calls DuckLakeService.queryChanges),
and wire the controller to the IPC dispatcher/registration so the
'ducklake:table:queryChanges' channel is handled consistently with other table
mutation channels like updateRows/deleteRows/upsertRows.
🧹 Nitpick comments (18)
src/renderer/config/ducklake.ts (1)

13-29: Hardcoded parameterized types limit user flexibility.

decimal(18,3), list(INTEGER), struct(name VARCHAR), and map(VARCHAR, INTEGER) are single fixed instances rather than templates. If this drives a type-selection dropdown, users can't choose, e.g., decimal(10,2) or list(VARCHAR). Consider either allowing free-text input alongside the dropdown, or separating base types (decimal, list, struct, map) and prompting for parameters in a follow-up input.

src/renderer/components/dataLake/DataLakeTableImportWizard.tsx (1)

114-141: Partition-column parsing is duplicated.

The .split(',').map(…).filter(…) logic appears identically in handleImport (Lines 132-135) and renderReview (Lines 317-320). Extract it once (e.g., a useMemo or a local helper) to keep them in sync and avoid subtle divergence.

♻️ Suggested refactor

Add a helper or memo near the top of the component:

+ const parsedPartitionColumns = useMemo(
+   () => partitionColumnsText.split(',').map((c) => c.trim()).filter((c) => c.length > 0),
+   [partitionColumnsText],
+ );

Then reference parsedPartitionColumns in both handleImport and renderReview.

src/main/ipcHandlers/utils.ipcHandlers.ts (2)

42-58: Consider passing the parent BrowserWindow to make dialogs modal.

dialog.showOpenDialog(options) and dialog.showSaveDialog(options) without a BrowserWindow argument create non-modal dialogs that can appear behind the application window on some platforms. Passing the focused window ensures correct modal behavior:

-    async (_event, options: OpenDialogOptions) => {
-      const result = await dialog.showOpenDialog(options);
+    async (event, options: OpenDialogOptions) => {
+      const win = BrowserWindow.fromWebContents(event.sender);
+      const result = win
+        ? await dialog.showOpenDialog(win, options)
+        : await dialog.showOpenDialog(options);

Same pattern for showSaveDialog.


43-49: IPC dialog options are passed through without sanitization.

The renderer sends arbitrary OpenDialogOptions/SaveDialogOptions to the main process. While Electron's dialog APIs are inherently safe (native OS dialogs), consider allowlisting properties (e.g., filters, defaultPath, properties) to follow the principle of least privilege for IPC boundaries.

Also applies to: 52-58
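
A sketch of the property allowlisting idea; Electron's OpenDialogOptions type is real, but the chosen pick list here is an assumption:

import type { OpenDialogOptions } from 'electron';

// Forward only a known-safe subset of dialog options across the IPC boundary.
function sanitizeOpenDialogOptions(options: OpenDialogOptions): OpenDialogOptions {
  const { title, defaultPath, filters, properties } = options;
  return { title, defaultPath, filters, properties };
}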

src/renderer/components/dataLake/tableDetails/TableOverviewTab.tsx (1)

18-20: Use DuckLakeTableDetails instead of any for tableDetails prop.

All tab components (TableOverviewTab, TableTagsTab, TableSchemaTab, TableHistoryTab, etc.) type tableDetails as any, losing compile-time safety. The DuckLakeTableDetails interface already exists in src/types/duckLake.ts and matches the shape being accessed. This would catch property access errors at build time.

♻️ Proposed fix
+import { DuckLakeTableDetails } from '../../../../types/duckLake';
+
 interface TableOverviewTabProps {
-  tableDetails: any;
+  tableDetails: DuckLakeTableDetails;
 }
src/renderer/components/dataLake/tableDetails/TableTagsTab.tsx (1)

74-95: Column Tags table omits Snapshot Range, inconsistent with Table Tags.

The Table Tags section displays beginSnapshot/endSnapshot in a "Snapshot Range" column, but the Column Tags section omits it even though DuckLakeColumnTag has both fields. Consider adding the snapshot range column to Column Tags for consistency.

♻️ Proposed fix — add Snapshot Range to Column Tags
                  <TableHead>
                    <TableRow>
                      <TableCell>Column</TableCell>
                      <TableCell>Key</TableCell>
                      <TableCell>Value</TableCell>
+                     <TableCell>Snapshot Range</TableCell>
                    </TableRow>
                  </TableHead>
                  <TableBody>
                    {tableDetails.columnTags.map(
                      (tag: DuckLakeColumnTag, idx: number) => (
                        <TableRow key={idx}>
                          <TableCell>
                            <strong>{tag.columnName}</strong>
                          </TableCell>
                          <TableCell>{tag.key}</TableCell>
                          <TableCell>{tag.value}</TableCell>
+                         <TableCell>
+                           {safeToString(tag.beginSnapshot)}
+                           {tag.endSnapshot
+                             ? ` - ${safeToString(tag.endSnapshot)}`
+                             : ' (current)'}
+                         </TableCell>
                        </TableRow>
                      ),
                    )}
src/renderer/components/dataLake/tableDetails/TableSchemaTab.tsx (2)

76-83: Memoize the partition column ID set to avoid re-creating it on every row render.

isPartitionColumnId builds a new Set on each invocation. Since it's called once per column row in the render loop, the Set is needlessly rebuilt O(N) times. Extract it with useMemo.

♻️ Proposed fix
+import { useMemo } from 'react';
+
+  const partitionColumnIds = useMemo(() => {
+    return new Set<number>(
+      (tableDetails?.partitionInfo?.columns || []).map((c: any) =>
+        Number(c.columnId),
+      ),
+    );
+  }, [tableDetails?.partitionInfo?.columns]);
+
-  const isPartitionColumnId = (columnId: number) => {
-    const ids = new Set<number>(
-      (tableDetails?.partitionInfo?.columns || []).map((c: any) =>
-        Number(c.columnId),
-      ),
-    );
-    return ids.has(Number(columnId));
-  };
+  const isPartitionColumnId = (columnId: number) =>
+    partitionColumnIds.has(Number(columnId));

92-108: Dialog closes before mutation result — mutation errors won't re-open the dialog.

handleConfirmAddColumn (and the other confirm handlers) close the dialog and reset state immediately after calling mutate(). If the mutation fails, the user sees a toast but must re-open the dialog and re-enter all fields. Consider keeping the dialog open until onSuccess/onSettled, or use mutateAsync with try/catch.

This applies to all four confirm handlers (add, drop, rename, alter type).
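
A sketch of the mutateAsync pattern the nitpick suggests; the declared names are placeholders for the component's real handlers:

// Placeholders for the component's real mutation and dialog state.
declare const addColumnMutation: {
  mutateAsync: (vars: { name: string; type: string }) => Promise<void>;
};
declare function setAddColumnDialogOpen(open: boolean): void;

async function handleConfirmAddColumn(vars: { name: string; type: string }) {
  try {
    await addColumnMutation.mutateAsync(vars);
    setAddColumnDialogOpen(false); // close only after the mutation succeeds
  } catch {
    // keep the dialog open so the user's input survives a failure
  }
}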

src/renderer/components/dataLake/tableDetails/TablePartitionsTab.tsx (2)

82-94: Prefer a composite key over array index for file partition values.

DuckLakeFilePartitionValue has dataFileId and partitionKeyIndex which together form a natural unique key. Using the array index (idx) can cause React reconciliation issues if the list is ever reordered or filtered.

♻️ Suggested key
-                    (val: DuckLakeFilePartitionValue, idx: number) => (
-                      <TableRow key={idx}>
+                    (val: DuckLakeFilePartitionValue) => (
+                      <TableRow key={`${val.dataFileId}-${val.partitionKeyIndex}`}>

21-23: Consider stronger typing for tableDetails.

tableDetails: any loses all type safety. If there's a shared DuckLakeTableDetails interface (or similar) used by sibling tab components, using it here would catch shape mismatches at compile time.

src/renderer/components/dataLake/DataLakeTables.tsx (2)

102-115: Delete and rename mutations fire-and-forget: dialog closes before result.

handleConfirmDelete (and similarly handleConfirmRename) closes the dialog and clears state immediately after calling mutate, not in the mutation's onSuccess. If the mutation is slow or fails, the user has already lost the dialog context. Consider moving the dialog-close logic into the onSuccess/onError callbacks of the .mutate() call.

♻️ Example for delete
  const handleConfirmDelete = () => {
    if (!tableToDelete) {
      setDeleteDialogOpen(false);
      return;
    }

    deleteTableMutation.mutate(
      {
        instanceId: tableToDelete.instanceId,
        tableName: tableToDelete.name,
      },
+     {
+       onSettled: () => {
+         setDeleteDialogOpen(false);
+         setTableToDelete(null);
+       },
+     },
    );
-
-   setDeleteDialogOpen(false);
-   setTableToDelete(null);
  };

123-138: Rename allows submitting the same name and doesn't trim whitespace.

The disabled check on Line 385 blocks empty/whitespace-only names, but newTableName is submitted untrimmed (Line 132). Also, renaming to the same name isn't blocked—this will hit the backend and return an error. A quick client-side guard would improve UX.

♻️ Proposed fix
  const handleConfirmRename = () => {
    if (!tableToRename) {
      setRenameDialogOpen(false);
      return;
    }

+   const trimmedName = newTableName.trim();
+   if (trimmedName === tableToRename.name) return;
+
    renameTableMutation.mutate({
      instanceId: tableToRename.instanceId,
      oldName: tableToRename.name,
-     newName: newTableName,
+     newName: trimmedName,
    });
src/renderer/components/dataLake/tableDetails/TableDataRowsTab.tsx (1)

526-791: Significant duplication between the Update and Delete dialog WHERE-condition builders.

The WHERE section UI (column select, operator select, value input, add/remove row) is nearly identical between the Update dialog (Lines 638–752) and Delete dialog (Lines 803–917). Extracting a shared WhereConditionsBuilder component would eliminate ~120 lines of duplication and keep the two dialogs in sync for future changes.

src/renderer/controllers/duckLake.controller.ts (1)

54-396: Extract shared cache invalidation logic to reduce duplication.

Nine of the new hooks contain the identical four-line invalidation block for tableDetails, snapshots, table, and tables. A small helper would DRY this up and ensure consistency if keys are added/changed later.

♻️ Suggested helper
function invalidateTableCaches(
  queryClient: ReturnType<typeof useQueryClient>,
  instanceId: string,
  tableName: string,
) {
  queryClient.invalidateQueries(duckLakeKeys.tableDetails(instanceId, tableName));
  queryClient.invalidateQueries(duckLakeKeys.snapshots(instanceId, tableName));
  queryClient.invalidateQueries(duckLakeKeys.table(instanceId, tableName));
  queryClient.invalidateQueries(duckLakeKeys.tables(instanceId));
}

Then in each hook's onSuccess:

-      queryClient.invalidateQueries(duckLakeKeys.tableDetails(instanceId, tableName));
-      queryClient.invalidateQueries(duckLakeKeys.snapshots(instanceId, tableName));
-      queryClient.invalidateQueries(duckLakeKeys.table(instanceId, tableName));
-      queryClient.invalidateQueries(duckLakeKeys.tables(instanceId));
+      invalidateTableCaches(queryClient, instanceId, tableName);
src/main/services/duckLake/adapters/postgresql.adapter.ts (1)

267-349: updateRows, deleteRows, and upsertRows are nearly identical — extract shared logic.

All three methods share the same pattern: connection check → run query → SELECT changes() → parse hugeint → return rowsAffected. Only the error message differs. Consider extracting a private helper.

Proposed helper extraction
+  private async executeAndCountChanges(
+    query: string,
+    operationName: string,
+  ): Promise<{ rowsAffected: number }> {
+    if (!this.connectionInfo) {
+      throw new Error('No active connection');
+    }
+    try {
+      await this.connectionInfo.connection.run(query);
+      const changesResult = await this.connectionInfo.connection.run(
+        'SELECT changes() as changes',
+      );
+      const rows = await changesResult.getRows();
+      const value = rows?.[0]?.[0] ?? 0;
+      const numeric =
+        typeof value === 'object' && value?.hugeint !== undefined
+          ? Number(value.hugeint)
+          : Number(value);
+      return { rowsAffected: Number.isFinite(numeric) ? numeric : 0 };
+    } catch (error) {
+      console.error(`Failed to ${operationName} rows:`, error);
+      throw error;
+    }
+  }

Then each method becomes a one-liner delegating to this helper. This could also be lifted to the base adapter since SQLite uses the same pattern.

src/main/services/duckLake/adapters/sqlite.adapter.ts (1)

88-348: All mutation methods are duplicated verbatim across DuckDB, PostgreSQL, and SQLite adapters.

Since all three adapters execute DuckDB SQL through the same DuckDB engine (only the attach/connect path differs), the mutation methods (restoreSnapshot, addColumn, dropColumn, renameColumn, alterColumnType, setPartitionedBy, renameTable, deleteTable, updateRows, deleteRows, upsertRows) are identical across adapters. Only the error log prefix changes (e.g., "SQLite" vs "PostgreSQL").

Consider implementing these once in the base CatalogAdapter class as concrete (non-abstract) methods, since they all share the same this.connectionInfo.connection.run(...) pattern.

Sketch: lift common mutations to the base class
// In base.adapter.ts — replace the abstract declarations with concrete implementations:

- abstract deleteTable(tableName: string): Promise<void>;
+ async deleteTable(tableName: string): Promise<void> {
+   if (!this.connectionInfo) throw new Error('No active connection');
+   const escaped = tableName.replace(/"/g, '""');
+   await this.connectionInfo.connection.run(`DROP TABLE IF EXISTS "${escaped}"`);
+ }

// Same for addColumn, dropColumn, renameColumn, alterColumnType,
// setPartitionedBy, renameTable, restoreSnapshot, and the row mutation methods.

This eliminates ~700+ lines of near-identical code and ensures behavioral consistency across adapters. Any adapter needing custom behavior can override the method.

src/main/services/duckLake/adapters/base.adapter.ts (2)

542-568: mapResultRow — currently unused; adapters perform domain-specific row mapping elsewhere.

This utility centralizes array-to-object row normalization, but it is not called anywhere in the codebase. The adapters (duckdb, postgresql, sqlite) inline equivalent logic extensively—checking Array.isArray(row) and accessing row[0], row[1], etc. to build typed domain objects (SnapshotMetadata, ColumnStatistics, DataFileMetadata, etc.).

Consider adding a TODO comment if this is intended for future use, or refactor adapters to use mapResultRow if its generic column-mapping behavior aligns with adapter needs.


517-540: getMetadataDatabaseName — apply to adapters to eliminate duplication.

The extracted helper method is defined but not used. All three adapter implementations (DuckDB, PostgreSQL, SQLite) inline the same metadata database lookup logic across multiple methods instead of calling this helper. This creates ~10+ duplicate query patterns throughout the codebase.

Comment on lines +265 to +351
  async updateRows(
    _tableName: string,
    updateQuery: string,
  ): Promise<{ rowsAffected: number }> {
    try {
      if (!this.connectionInfo) {
        throw new Error('No active connection');
      }

      const result = await this.connectionInfo.connection.run(updateQuery);

      // Try to get rows to determine affected count (requires RETURNING clause in query)
      let rowsAffected = 0;
      try {
        const rows = await result.getRows();
        if (rows && rows.length > 0) {
          rowsAffected = rows.length;
        }
      } catch (e) {
        // Ignore errors if result doesn't support getRows
      }

      return { rowsAffected };
    } catch (error) {
      // eslint-disable-next-line no-console
      console.error('Failed to update rows:', error);
      throw error;
    }
  }

  async deleteRows(
    _tableName: string,
    deleteQuery: string,
  ): Promise<{ rowsAffected: number }> {
    try {
      if (!this.connectionInfo) {
        throw new Error('No active connection');
      }

      const result = await this.connectionInfo.connection.run(deleteQuery);

      let rowsAffected = 0;
      try {
        const rows = await result.getRows();
        if (rows && rows.length > 0) {
          rowsAffected = rows.length;
        }
      } catch (e) {
        // Ignore
      }

      return { rowsAffected };
    } catch (error) {
      // eslint-disable-next-line no-console
      console.error('Failed to delete rows:', error);
      throw error;
    }
  }

  async upsertRows(
    _tableName: string,
    upsertQuery: string,
  ): Promise<{ rowsAffected: number }> {
    try {
      if (!this.connectionInfo) {
        throw new Error('No active connection');
      }

      const result = await this.connectionInfo.connection.run(upsertQuery);

      let rowsAffected = 0;
      try {
        const rows = await result.getRows();
        if (rows && rows.length > 0) {
          rowsAffected = rows.length;
        }
      } catch (e) {
        // Ignore
      }

      return { rowsAffected };
    } catch (error) {
      // eslint-disable-next-line no-console
      console.error('Failed to upsert rows:', error);
      throw error;
    }
  }

⚠️ Potential issue | 🟠 Major

DuckDB rowsAffected counting is unreliable compared to PostgreSQL/SQLite adapters.

The DuckDB adapter counts affected rows via result.getRows().length, which only works if the query includes a RETURNING clause. Without it, rowsAffected will always be 0. In contrast, the PostgreSQL and SQLite adapters use SELECT changes() which reliably returns the count regardless of the query shape.

Since all three adapters run queries through DuckDB's engine (DuckLake), SELECT changes() should work here too.

Proposed fix: use `changes()` consistently
   async updateRows(
     _tableName: string,
     updateQuery: string,
   ): Promise<{ rowsAffected: number }> {
     try {
       if (!this.connectionInfo) {
         throw new Error('No active connection');
       }
 
-      const result = await this.connectionInfo.connection.run(updateQuery);
-
-      // Try to get rows to determine affected count (requires RETURNING clause in query)
-      let rowsAffected = 0;
-      try {
-        const rows = await result.getRows();
-        if (rows && rows.length > 0) {
-          rowsAffected = rows.length;
-        }
-      } catch (e) {
-        // Ignore errors if result doesn't support getRows
-      }
-
-      return { rowsAffected };
+      await this.connectionInfo.connection.run(updateQuery);
+      const changesResult = await this.connectionInfo.connection.run(
+        'SELECT changes() as changes',
+      );
+      const rows = await changesResult.getRows();
+      const value = rows?.[0]?.[0] ?? 0;
+      const numeric =
+        typeof value === 'object' && value?.hugeint !== undefined
+          ? Number(value.hugeint)
+          : Number(value);
+      return { rowsAffected: Number.isFinite(numeric) ? numeric : 0 };
     } catch (error) {
       // eslint-disable-next-line no-console
       console.error('Failed to update rows:', error);
       throw error;
     }
   }

Apply the same pattern to deleteRows and upsertRows.


Comment on lines +96 to +133
<Table size="small">
  <TableBody>
    <TableRow>
      <TableCell>
        <strong>Record Count</strong>
      </TableCell>
      <TableCell>
        {formatNumber(tableDetails.stats.recordCount)}
      </TableCell>
    </TableRow>
    <TableRow>
      <TableCell>
        <strong>File Size</strong>
      </TableCell>
      <TableCell>
        {formatBytes(tableDetails.stats.fileSizeBytes)}
      </TableCell>
    </TableRow>
    <TableRow>
      <TableCell>
        <strong>Next Row ID</strong>
      </TableCell>
      <TableCell>
        {formatNumber(tableDetails.stats.nextRowId)}
      </TableCell>
    </TableRow>
    <TableRow>
      <TableCell>
        <strong>Columns</strong>
      </TableCell>
      <TableCell>{tableDetails.columns.length}</TableCell>
    </TableRow>
    <TableRow>
      <TableCell>
        <strong>Data Files</strong>
      </TableCell>
      <TableCell>{tableDetails.dataFiles.length}</TableCell>
    </TableRow>

⚠️ Potential issue | 🟡 Minor

Potential runtime error if tableDetails.stats is missing or has unexpected shape.

formatNumber and formatBytes are called directly on tableDetails.stats.recordCount, tableDetails.stats.fileSizeBytes, and tableDetails.stats.nextRowId. If stats is ever null/undefined (e.g., newly created table with no stats yet), this will throw. Given the any typing, the compiler won't catch this.

Consider adding a guard or defaulting: tableDetails.stats?.recordCount ?? 0.


Comment on lines +34 to +70
{tableDetails.partitionInfo ? (
  <Box>
    <Typography variant="subtitle1" gutterBottom>
      Partition Columns ({tableDetails.partitionInfo.columns.length})
    </Typography>
    <TableContainer sx={{ mb: 3 }}>
      <Table>
        <TableHead>
          <TableRow>
            <TableCell>Key Index</TableCell>
            <TableCell>Column Name</TableCell>
            <TableCell>Transform</TableCell>
          </TableRow>
        </TableHead>
        <TableBody>
          {tableDetails.partitionInfo.columns.map(
            (col: DuckLakePartitionColumn) => (
              <TableRow
                key={`${col.partitionId}-${safeToString(col.partitionKeyIndex)}`}
              >
                <TableCell>
                  {safeToString(col.partitionKeyIndex)}
                </TableCell>
                <TableCell>
                  <strong>{col.columnName}</strong>
                </TableCell>
                <TableCell>{col.transform || 'identity'}</TableCell>
              </TableRow>
            ),
          )}
        </TableBody>
      </Table>
    </TableContainer>

    <Typography variant="subtitle1" gutterBottom>
      File Partition Values (
      {tableDetails.partitionInfo.filePartitionValues.length})

⚠️ Potential issue | 🟡 Minor

Guard against missing columns or filePartitionValues when partitionInfo is truthy.

If partitionInfo exists but columns or filePartitionValues is undefined/null, accessing .length and .map will throw at runtime (Lines 37, 49, 70, 82). Consider optional chaining or default arrays.

🛡️ Proposed safeguard
-        {tableDetails.partitionInfo ? (
+        {tableDetails.partitionInfo ? (
           <Box>
             <Typography variant="subtitle1" gutterBottom>
-              Partition Columns ({tableDetails.partitionInfo.columns.length})
+              Partition Columns ({tableDetails.partitionInfo.columns?.length ?? 0})
             </Typography>
             <TableContainer sx={{ mb: 3 }}>
               <Table>
                 ...
                 <TableBody>
-                  {tableDetails.partitionInfo.columns.map(
+                  {(tableDetails.partitionInfo.columns ?? []).map(

Apply the same pattern to filePartitionValues.


Comment on lines +423 to +429
export const formatBytes = (bytes: number) => {
  if (bytes === 0) return '0 Bytes';
  const k = 1024;
  const sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'];
  const i = Math.floor(Math.log(bytes) / Math.log(k));
  return `${Math.round((bytes / k ** i) * 100) / 100} ${sizes[i]}`;
};

⚠️ Potential issue | 🟡 Minor

formatBytes breaks on negative or extremely large values.

Math.log(negative) returns NaN, causing the output to be "NaN undefined". Similarly, values beyond the TB range will index past the sizes array, producing undefined as the unit.

🛡️ Proposed defensive fix
 export const formatBytes = (bytes: number) => {
   if (bytes === 0) return '0 Bytes';
+  if (bytes < 0) return '-';
   const k = 1024;
   const sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'];
-  const i = Math.floor(Math.log(bytes) / Math.log(k));
+  const i = Math.min(Math.floor(Math.log(bytes) / Math.log(k)), sizes.length - 1);
   return `${Math.round((bytes / k ** i) * 100) / 100} ${sizes[i]}`;
 };

Comment on lines +544 to +548
'ducklake:table:setPartitionedBy': (
  instanceId: string,
  tableName: string,
  columnNames: string[],
) => Promise<void>;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if the other new ducklake:table:* IPC channels are declared in ipc.ts or elsewhere
rg -n "ducklake:table:rename\b|ducklake:table:addColumn|ducklake:table:dropColumn|ducklake:table:renameColumn|ducklake:table:alterColumnType|ducklake:table:updateRows|ducklake:table:deleteRows|ducklake:table:upsertRows" --type=ts -g '*.ts' -g '!*.ipcHandlers.ts'



🏁 Script executed:

# Also check duckLake.ts to see all declared channels
cat -n src/types/duckLake.ts | grep -A 50 "DuckLakeIpcChannels"



🏁 Script executed:

# Check ipc.ts for DuckLake channel declarations
cat -n src/types/ipc.ts | head -100



🏁 Script executed:

# Search for where IPC handlers are defined
fd -e ts -x grep -l "ducklake:table:" {}



🏁 Script executed:

# View the context around lines 264-272 in ipc.ts
sed -n '250,280p' src/types/ipc.ts



🏁 Script executed:

# Also check if there are more DuckLake channels after line 272
sed -n '260,290p' src/types/ipc.ts



🏁 Script executed:

# View more of the DuckLakeIpcChannels interface in duckLake.ts to see all declared channels
sed -n '512,600p' src/types/duckLake.ts | head -100



🏁 Script executed:

# Check if there are more table management channels in the duckLake.ts interface
rg -A 100 "export interface DuckLakeIpcChannels" src/types/duckLake.ts | grep "ducklake:table"



Add missing DuckLake IPC channel type signatures to the interface.

Eight of the nine new IPC channels are declared in src/types/ipc.ts but missing from the DuckLakeIpcChannels interface in src/types/duckLake.ts. Only setPartitionedBy has been added to the interface. The missing channels are: ducklake:table:rename, ducklake:table:addColumn, ducklake:table:dropColumn, ducklake:table:renameColumn, ducklake:table:alterColumnType, ducklake:table:updateRows, ducklake:table:deleteRows, and ducklake:table:upsertRows. Add their type signatures to the interface to keep all DuckLake channel declarations consistent in one location.


Comment on lines +264 to +273
| 'ducklake:table:rename'
| 'ducklake:table:addColumn'
| 'ducklake:table:dropColumn'
| 'ducklake:table:renameColumn'
| 'ducklake:table:alterColumnType'
| 'ducklake:table:setPartitionedBy'
| 'ducklake:table:updateRows'
| 'ducklake:table:deleteRows'
| 'ducklake:table:upsertRows'
| 'ducklake:table:queryChanges'

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for all occurrences of queryChanges
echo "=== All occurrences of 'queryChanges' ==="
rg 'queryChanges' --type=ts -i -n

echo ""
echo "=== Checking src/types/ipc.ts for context ==="
rg -B5 -A5 'queryChanges' src/types/ipc.ts

echo ""
echo "=== Looking for handler registration/wiring ==="
rg 'on.*queryChanges|queryChanges.*handler|ipcMain.*queryChanges' --type=ts -n

echo ""
echo "=== Checking for service implementation files ==="
find . -name '*duckLake*service*' -o -name '*ipc*handler*' -o -name '*channel*handler*' | grep -v node_modules



🏁 Script executed:

#!/bin/bash
echo "=== Main service file ==="
cat -n src/main/services/duckLake.service.ts | grep -i 'changes\|query' || echo "No matches found"

echo ""
echo "=== Renderer service file ==="
cat -n src/renderer/services/duckLake.service.ts | grep -i 'changes\|query' || echo "No matches found"

echo ""
echo "=== Check surrounding channels for implementation pattern ==="
echo "Checking if updateRows/deleteRows/upsertRows have handlers:"
rg 'updateRows|deleteRows|upsertRows' --type=ts -n | head -20



Remove ducklake:table:queryChanges from the channel union or implement its handler and service method.

This channel is declared in the type union but has no corresponding service method, controller handler, or IPC wiring anywhere in the codebase. All other table mutation channels (updateRows, deleteRows, upsertRows) have complete implementations with service methods in src/renderer/services/duckLake.service.ts and handlers in src/renderer/controllers/duckLake.controller.ts. Either remove this dead channel or implement it fully.


@Nuri1977 Nuri1977 changed the title from "Feature/ducklake crud operations" to "Feature/ducklake crude operations" on Feb 6, 2026
@Nuri1977 Nuri1977 changed the title from "Feature/ducklake crude operations" to "Feature/ducklake crude operations nad data preview" on Feb 6, 2026
@Nuri1977 Nuri1977 changed the title from "Feature/ducklake crude operations nad data preview" to "Feature/ducklake crude operations and data preview" on Feb 6, 2026
…aining and injection, replacing inline query validation in `updateTable`, `deleteTable`, and `upsertTable`.
…led file/row statistics in the DuckDB adapter and UI.
@coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/main/services/duckLake.service.ts`:
- Around line 75-88: The SQL comment detection currently only matches `--` at
line ends; update the suspiciousPatterns/check to catch inline `--` comments too
by either adding a pattern like /--/m or /--(?=\s|$)/ to the suspiciousPatterns
array or, preferably, strip SQL comments (both `--` single-line and `/*...*/`
multi-line) from queryWithoutTerminalSemicolon before running the
prefix/validation checks; make sure the code that throws
DuckLakeError.validation('Query contains suspicious patterns that may indicate
SQL injection') still references the same suspiciousPatterns and
queryWithoutTerminalSemicolon so all inline `--` occurrences are detected and
cause the validation error.

In `@src/main/services/duckLake/adapters/duckdb.adapter.ts`:
- Around line 87-105: The restoreSnapshot implementation uses invalid DuckDB
syntax and will lose DuckLake lineage by recreating the table; update the
restore logic in the restoreSnapshot method (and the analogous method in
postgresql.adapter.ts) to use DuckLake time-travel syntax (SELECT ... AT
(VERSION => snapshotId)) instead of `FOR SYSTEM_VERSION AS OF`, and restore data
by inserting the time-traveled rows into the existing table to preserve row
lineage (e.g., run a transaction that truncates or deletes current rows from the
target table and then runs an INSERT INTO "<escapedName>" SELECT * FROM
"<escapedName>" AT (VERSION => snapshotId) via connectionInfo.connection.run),
ensuring you keep the escapedName handling and error handling around
connectionInfo and rethrow on failure.
🧹 Nitpick comments (3)
src/main/services/duckLake.service.ts (1)

112-174: Significant code duplication across DDL methods.

Every method repeats the same pattern: validate inputs → listTables → check existence → getTableDetails → check columns → call adapter. Consider extracting shared helpers like ensureTableExists(adapter, tableName) and getActiveColumn(adapter, tableName, columnName) to reduce boilerplate and make future changes easier.

Also applies to: 176-229, 231-302, 304-367, 369-425, 427-465
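
A minimal sketch of the suggested helpers. Method names and result shapes on CatalogAdapter (listTables returning objects with a name field, getTableDetails exposing a columns array) follow this review's description and are assumptions, not confirmed repository signatures:

// Illustrative extraction of the repeated validate → listTables → getTableDetails boilerplate.
async function ensureTableExists(
  adapter: CatalogAdapter,
  tableName: string,
): Promise<void> {
  const tables = await adapter.listTables(); // assumed: [{ name: string, ... }]
  if (!tables.some((t) => t.name === tableName)) {
    throw DuckLakeError.validation(`Table "${tableName}" does not exist`);
  }
}

async function getActiveColumn(
  adapter: CatalogAdapter,
  tableName: string,
  columnName: string,
) {
  const details = await adapter.getTableDetails(tableName); // assumed: { columns: [...] }
  const column = details.columns.find((c) => c.name === columnName);
  if (!column) {
    throw DuckLakeError.validation(
      `Column "${columnName}" does not exist on table "${tableName}"`,
    );
  }
  return column;
}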

src/renderer/components/dataLake/DataLakeInstanceDetails.tsx (1)

763-784: Local safeToString duplicates the shared utility in src/renderer/helpers/utils.ts.

This component defines its own safeToString with a slightly different implementation (it additionally handles BigInt). Consider consolidating into the shared utility — add the BigInt handling there and import it here to avoid divergence.
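
A sketch of a consolidated helper for src/renderer/helpers/utils.ts, assuming the existing utility's behavior and folding in the BigInt handling this comment mentions:

// Shared stringifier: handles null/undefined, BigInt, and objects safely.
export function safeToString(value: unknown): string {
  if (value === null || value === undefined) return '';
  if (typeof value === 'bigint') return value.toString();
  if (typeof value === 'object') {
    try {
      return JSON.stringify(value);
    } catch {
      return String(value); // e.g., circular structures
    }
  }
  return String(value);
}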

src/renderer/components/dataLake/tableDetails/TableHistoryTab.tsx (1)

28-32: tableDetails prop is typed as any.

Consider using the DuckLakeTableDetails type from src/types/duckLake.ts for better type safety, which would catch issues like accessing tableDetails.snapshots when it might not exist.
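
A sketch of the typed prop, with the import path assumed from the review's reference to src/types/duckLake.ts:

// Replaces the any-typed prop so accesses like tableDetails.snapshots are checked.
import type { DuckLakeTableDetails } from '../../../types/duckLake';

interface TableHistoryTabProps {
  tableDetails: DuckLakeTableDetails;
  // ...remaining props unchanged
}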

Comment on lines +75 to +88
// Additional check: ensure no suspicious patterns that could indicate injection
// Check for common SQL injection patterns after semicolon removal
const suspiciousPatterns = [
  /--\s*$/m, // SQL comments at the end
  /\/\*.*\*\//s, // Multi-line comments
];

suspiciousPatterns.forEach((pattern) => {
  if (pattern.test(queryWithoutTerminalSemicolon)) {
    throw DuckLakeError.validation(
      'Query contains suspicious patterns that may indicate SQL injection',
    );
  }
});

⚠️ Potential issue | 🟡 Minor

SQL comment detection is incomplete — inline -- comments are not caught.

The regex /--\s*$/m only matches a bare -- at the end of a line. A query like UPDATE t SET x=1 -- comment\nDROP TABLE t would pass all checks, because the terminal semicolon was already stripped by the earlier logic and the -- is followed by comment text rather than sitting at EOL. Consider matching -- anywhere (not just at line end), or stripping comments before prefix validation.
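
A minimal, quote-aware sketch of the comment-stripping option (the function name stripSqlComments is illustrative). Run it on queryWithoutTerminalSemicolon before the prefix and suspicious-pattern checks, so an inline -- cannot hide a second statement while -- inside a string literal is left alone:

function stripSqlComments(sql: string): string {
  let out = '';
  let i = 0;
  let inString = false;
  while (i < sql.length) {
    const ch = sql[i];
    const next = sql[i + 1];
    if (inString) {
      out += ch;
      if (ch === "'") {
        if (next === "'") {
          out += next; // escaped quote ('') stays inside the literal
          i += 2;
          continue;
        }
        inString = false;
      }
      i += 1;
    } else if (ch === "'") {
      inString = true;
      out += ch;
      i += 1;
    } else if (ch === '-' && next === '-') {
      while (i < sql.length && sql[i] !== '\n') i += 1; // drop single-line comment
    } else if (ch === '/' && next === '*') {
      const end = sql.indexOf('*/', i + 2);
      i = end === -1 ? sql.length : end + 2; // drop block comment
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}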

🤖 Prompt for AI Agents
In `@src/main/services/duckLake.service.ts` around lines 75 - 88, The SQL comment
detection currently only matches `--` at line ends; update the
suspiciousPatterns/check to catch inline `--` comments too by either adding a
pattern like /--/m or /--(?=\s|$)/ to the suspiciousPatterns array or,
preferably, strip SQL comments (both `--` single-line and `/*...*/` multi-line)
from queryWithoutTerminalSemicolon before running the prefix/validation checks;
make sure the code that throws DuckLakeError.validation('Query contains
suspicious patterns that may indicate SQL injection') still references the same
suspiciousPatterns and queryWithoutTerminalSemicolon so all inline `--`
occurrences are detected and cause the validation error.

Comment on lines +87 to +105
async restoreSnapshot(tableName: string, snapshotId: number): Promise<void> {
  try {
    if (!this.connectionInfo) {
      throw new Error('No active connection');
    }

    const escapedName = tableName.replace(/"/g, '""');
    await this.connectionInfo.connection.run(
      `CREATE OR REPLACE TABLE "${escapedName}" AS SELECT * FROM "${escapedName}" FOR SYSTEM_VERSION AS OF ${snapshotId}`,
    );
  } catch (error) {
    // eslint-disable-next-line no-console
    console.error(
      `Failed to restore DuckDB table ${tableName} to snapshot ${snapshotId}:`,
      error,
    );
    throw error;
  }
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

DuckLake snapshot restore preserve lineage metadata

💡 Result:

In DuckLake, “restoring” a snapshot is typically done via time travel (querying or attaching the catalog at an older snapshot), not by rewriting the current head in-place. When you do this, you’re reading the exact historical metadata + data state recorded by that snapshot. [1]

  • Read/query an old snapshot (preserves the snapshot’s state/metadata):

    • Table time travel: SELECT * FROM my_table AT (VERSION => <snapshot_id>); [1]
    • Catalog attach at a snapshot: ATTACH 'ducklake:metadata.duckdb' (SNAPSHOT_VERSION <snapshot_id>); [1]
      Attaching with SNAPSHOT_VERSION/SNAPSHOT_TIME is read-only, so it won’t “move HEAD”; it just exposes that historical state for querying/exporting. [2]
  • Row lineage (“rowid”) preservation:
    DuckLake assigns each row a unique rowid on first insert and preserves it when rows are moved between files (e.g., UPDATE/compaction). So if you access an older snapshot via time travel, the rowid lineage for rows in that table history is preserved. [3]

  • What breaks lineage/history:
    If you “restore” by re-inserting data into a new table (or rewriting tables as new inserts), those rows get new rowids (new lineage). (Because rowid is assigned on first insert.) [3]
    If you expire snapshots, you delete historical snapshots (and thus the ability to time travel to them / retain their historical view). [4]

Rule of thumb: use time travel/attach-at-snapshot to preserve historical state and lineage; use “copy data out then reinsert” only if you accept losing original row lineage. [1][3][2]

Sources: [1] Time Travel – DuckLake (ducklake.select), [2] DuckLake attachment access modes (snapshot attach forces read-only) (deepwiki.com), [3] Row Lineage – DuckLake (ducklake.select), [4] Expire Snapshots – DuckLake (ducklake.select)


🌐 Web query:

DuckDB CREATE OR REPLACE TABLE metadata system_version_as_of

💡 Result:

DuckDB doesn’t have a special SYSTEM_VERSION_AS_OF / system_version_as_of clause for CREATE OR REPLACE TABLE. If your goal is to materialize an Iceberg table “as of” a snapshot into a (local) DuckDB table, do it with CTAS + Iceberg time-travel parameters:

-- optional: keep these tables in a separate schema
CREATE SCHEMA IF NOT EXISTS metadata;

-- pick a snapshot_id (see iceberg_snapshots() below), then materialize it
CREATE OR REPLACE TABLE metadata.system_version_as_of AS
SELECT *
FROM iceberg_scan('path_or_table', snapshot_from_id = 7635660646343998149);

To discover available snapshots:

SELECT *
FROM iceberg_snapshots('path_or_table');

Notes:

  • CREATE OR REPLACE TABLE ... AS SELECT ... overwrites the table if it already exists. [1]
  • Iceberg “as of” is done via snapshot_from_id or snapshot_from_timestamp parameters (not a SQL ... AS OF ... clause). [2]

Sources: DuckDB CREATE OR REPLACE TABLE docs [1], DuckDB Iceberg extension docs (snapshot parameters + iceberg_snapshots) [2].

[1] (duckdb.org)
[2] (duckdb.org)


🏁 Script executed:

# Search for DuckLake-related code to understand metadata/lineage handling
rg -i "duckLake|lineage|metadata" src/main/services/duckLake --type ts -A 3 -B 1

Repository: rosettadb/dbt-studio

Length of output: 50377


🏁 Script executed:

# Look for snapshot-related code and tests
find . -type f -name "*snapshot*" | head -20

Repository: rosettadb/dbt-studio

Length of output: 46


🏁 Script executed:

# Check for DuckLake documentation or comments about restore behavior
rg -i "restore|snapshot" src/main/services/duckLake --type ts -B 2 -A 5

Repository: rosettadb/dbt-studio

Length of output: 50377


Invalid DuckDB syntax and wrong restore semantics — fix required.

The FOR SYSTEM_VERSION AS OF syntax is not valid DuckDB SQL. DuckDB doesn't support temporal table syntax; it uses the DuckLake extension with time-travel semantics. The correct approach is to use SELECT * FROM table_name AT (VERSION => snapshot_id) or attach at a specific snapshot version.

Additionally, CREATE OR REPLACE TABLE ... AS SELECT * loses DuckLake row lineage — each re-inserted row gets a new rowid rather than preserving the original. Per DuckLake design, snapshot restore should use time-travel queries instead of table replacement.

This pattern appears in both duckdb.adapter.ts and postgresql.adapter.ts.
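
A sketch of the restore flow that the prompt below describes, keeping the adapter's existing connection API and escaping. Whether DELETE-then-reinsert preserves original rowids is governed by DuckLake's lineage rules (see the web query result above), so treat this as illustrative rather than definitive:

async restoreSnapshot(tableName: string, snapshotId: number): Promise<void> {
  if (!this.connectionInfo) {
    throw new Error('No active connection');
  }
  if (!Number.isInteger(snapshotId)) {
    throw new Error('snapshotId must be an integer');
  }
  const escapedName = tableName.replace(/"/g, '""');
  const conn = this.connectionInfo.connection;
  await conn.run('BEGIN TRANSACTION');
  try {
    // Clear the current head, then re-insert the snapshot's rows via time
    // travel; the historical version stays readable inside the transaction.
    await conn.run(`DELETE FROM "${escapedName}"`);
    await conn.run(
      `INSERT INTO "${escapedName}" SELECT * FROM "${escapedName}" AT (VERSION => ${snapshotId})`,
    );
    await conn.run('COMMIT');
  } catch (error) {
    await conn.run('ROLLBACK');
    throw error;
  }
}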

🤖 Prompt for AI Agents
In `@src/main/services/duckLake/adapters/duckdb.adapter.ts` around lines 87 - 105,
The restoreSnapshot implementation uses invalid DuckDB syntax and will lose
DuckLake lineage by recreating the table; update the restore logic in the
restoreSnapshot method (and the analogous method in postgresql.adapter.ts) to
use DuckLake time-travel syntax (SELECT ... AT (VERSION => snapshotId)) instead
of `FOR SYSTEM_VERSION AS OF`, and restore data by inserting the time-traveled
rows into the existing table to preserve row lineage (e.g., run a transaction
that truncates or deletes current rows from the target table and then runs an
INSERT INTO "<escapedName>" SELECT * FROM "<escapedName>" AT (VERSION =>
snapshotId) via connectionInfo.connection.run), ensuring you keep the
escapedName handling and error handling around connectionInfo and rethrow on
failure.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/services/duckLake/adapters/postgresql.adapter.ts (1)

901-941: ⚠️ Potential issue | 🟡 Minor

PostgreSQL adapter missing author/commitMessage columns in listInstanceSnapshots.

The DuckDB adapter dynamically detects and includes optional author, commit_message, and commit_extra_info columns via getAvailableSnapshotColumns(), but the PostgreSQL adapter's listInstanceSnapshots and getTableDetails snapshot queries omit them entirely. This creates an inconsistent API response shape across adapters.

Consider adding the same getAvailableSnapshotColumns helper and optional column detection to keep the adapter outputs uniform.
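
A sketch of reusing the DuckDB adapter's detection logic (quoted later in this review), assuming the PostgreSQL adapter reaches the metadata catalog through the same DuckDB connection API and metadataDatabase naming; the single-quote escaping fix from the comment below is applied:

// Mirrors getAvailableSnapshotColumns from the DuckDB adapter.
const safeName = metadataDatabase.replace(/'/g, "''");
const result = await this.connectionInfo.connection.run(
  `PRAGMA table_info('${safeName}.main.ducklake_snapshot_changes')`,
);
const rows = await result.getRows();
const columnNames = rows
  .map((row: any) => (Array.isArray(row) ? row[1] : row.name))
  .map((name: string) => name.toLowerCase());
const hasAuthor = columnNames.includes('author');
const hasCommitMessage = columnNames.includes('commit_message');
const hasCommitExtraInfo = columnNames.includes('commit_extra_info');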

🤖 Fix all issues with AI agents
In `@src/main/services/duckLake/adapters/base.adapter.ts`:
- Around line 547-576: sanitizeDefaultValue currently uses case-sensitive
substring checks and an overly broad blocklist (e.g., includes('DROP'),
includes('CREATE')) which can be bypassed or wrongly reject benign defaults;
update sanitizeDefaultValue to either (A) perform case-insensitive,
word-boundary checks (e.g., compare defaultValue.toLowerCase() against regexes
like /\b(drop|delete|insert|update|create|alter)\b/) to avoid matching benign
substrings, or (preferred) remove the manual blocklist and instead strictly
validate against the allowed literal pattern used in validateColumnType (reuse
or mirror the simpleLiteralPattern logic) and reject anything that does not
match the expected literal form; ensure that when treating values as string
literals you still escape single quotes and wrap in quotes as currently done in
sanitizeDefaultValue.

In `@src/main/services/duckLake/adapters/duckdb.adapter.ts`:
- Around line 365-405: The PRAGMA call in getAvailableSnapshotColumns
interpolates metadataDatabase directly into a single-quoted string, risking SQL
injection if the name contains a quote; change the call to use the safe metadata
DB naming used elsewhere (e.g., getMetadataDatabaseName()) or escape/quote the
identifier with double-quotes instead of single quotes, and build the PRAGMA
string using the safe identifier so table_info gets called on
"metadataDatabase". Ensure you update the variable used in the PRAGMA query
inside getAvailableSnapshotColumns and keep the rest of the result handling
unchanged.
🧹 Nitpick comments (1)
src/main/services/duckLake/adapters/base.adapter.ts (1)

517-540: getMetadataDatabaseName — good extraction of repeated logic.

This helper consolidates the metadata database lookup pattern that's duplicated across many methods in the concrete adapters. However, note that most methods in the concrete adapters still inline this same lookup rather than calling this helper.

Comment on lines +547 to +576
protected sanitizeDefaultValue(defaultValue: string): string {
  // Check for dangerous SQL patterns
  if (
    defaultValue.includes(';') ||
    defaultValue.includes('--') ||
    defaultValue.includes('/*') ||
    defaultValue.includes('*/') ||
    defaultValue.includes('DROP') ||
    defaultValue.includes('DELETE') ||
    defaultValue.includes('INSERT') ||
    defaultValue.includes('UPDATE') ||
    defaultValue.includes('CREATE') ||
    defaultValue.includes('ALTER')
  ) {
    throw new Error(
      'Invalid default value: contains potentially dangerous SQL patterns',
    );
  }

  // Check if it's a simple literal (number, boolean, NULL)
  const simpleLiteralPattern =
    /^(NULL|TRUE|FALSE|-?\d+(\.\d+)?|CURRENT_TIMESTAMP|CURRENT_DATE|CURRENT_TIME)$/i;
  if (simpleLiteralPattern.test(defaultValue.trim())) {
    return defaultValue.trim();
  }

  // Treat as string literal: escape single quotes and wrap in quotes
  const escapedValue = defaultValue.replace(/'/g, "''");
  return `'${escapedValue}'`;
}

⚠️ Potential issue | 🟠 Major

sanitizeDefaultValue blocklist checks are case-sensitive — easily bypassed and overly broad.

The keyword checks on lines 554–559 use exact-case includes('DROP'), includes('DELETE'), etc., so drop table... in lowercase passes through. Meanwhile validateColumnType correctly uses .toLowerCase(). Additionally, the blocklist approach will reject legitimate default values containing these substrings (e.g., a string default of "CREATED_AT" is blocked by includes('CREATE')).

Consider either switching to .toLowerCase() for consistency and using word-boundary checks, or better yet, relying on the same pattern as validateColumnType — validate that the value matches an expected literal pattern and reject everything else:

Proposed fix sketch
   protected sanitizeDefaultValue(defaultValue: string): string {
-    // Check for dangerous SQL patterns
-    if (
-      defaultValue.includes(';') ||
-      defaultValue.includes('--') ||
-      defaultValue.includes('/*') ||
-      defaultValue.includes('*/') ||
-      defaultValue.includes('DROP') ||
-      defaultValue.includes('DELETE') ||
-      defaultValue.includes('INSERT') ||
-      defaultValue.includes('UPDATE') ||
-      defaultValue.includes('CREATE') ||
-      defaultValue.includes('ALTER')
-    ) {
+    // Check for dangerous SQL patterns (case-insensitive, word-boundary)
+    const dangerousPatterns = /[;]|--|\/\*|\*\/|\b(drop|delete|insert|update|create|alter)\b/i;
+    if (dangerousPatterns.test(defaultValue)) {
       throw new Error(
         'Invalid default value: contains potentially dangerous SQL patterns',
       );
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
protected sanitizeDefaultValue(defaultValue: string): string {
  // Check for dangerous SQL patterns
  if (
    defaultValue.includes(';') ||
    defaultValue.includes('--') ||
    defaultValue.includes('/*') ||
    defaultValue.includes('*/') ||
    defaultValue.includes('DROP') ||
    defaultValue.includes('DELETE') ||
    defaultValue.includes('INSERT') ||
    defaultValue.includes('UPDATE') ||
    defaultValue.includes('CREATE') ||
    defaultValue.includes('ALTER')
  ) {
    throw new Error(
      'Invalid default value: contains potentially dangerous SQL patterns',
    );
  }

  // Check if it's a simple literal (number, boolean, NULL)
  const simpleLiteralPattern =
    /^(NULL|TRUE|FALSE|-?\d+(\.\d+)?|CURRENT_TIMESTAMP|CURRENT_DATE|CURRENT_TIME)$/i;
  if (simpleLiteralPattern.test(defaultValue.trim())) {
    return defaultValue.trim();
  }

  // Treat as string literal: escape single quotes and wrap in quotes
  const escapedValue = defaultValue.replace(/'/g, "''");
  return `'${escapedValue}'`;
}
protected sanitizeDefaultValue(defaultValue: string): string {
  // Check for dangerous SQL patterns (case-insensitive, word-boundary)
  const dangerousPatterns =
    /[;]|--|\/\*|\*\/|\b(drop|delete|insert|update|create|alter)\b/i;
  if (dangerousPatterns.test(defaultValue)) {
    throw new Error(
      'Invalid default value: contains potentially dangerous SQL patterns',
    );
  }

  // Check if it's a simple literal (number, boolean, NULL)
  const simpleLiteralPattern =
    /^(NULL|TRUE|FALSE|-?\d+(\.\d+)?|CURRENT_TIMESTAMP|CURRENT_DATE|CURRENT_TIME)$/i;
  if (simpleLiteralPattern.test(defaultValue.trim())) {
    return defaultValue.trim();
  }

  // Treat as string literal: escape single quotes and wrap in quotes
  const escapedValue = defaultValue.replace(/'/g, "''");
  return `'${escapedValue}'`;
}
🤖 Prompt for AI Agents
In `@src/main/services/duckLake/adapters/base.adapter.ts` around lines 547 - 576,
sanitizeDefaultValue currently uses case-sensitive substring checks and an
overly broad blocklist (e.g., includes('DROP'), includes('CREATE')) which can be
bypassed or wrongly reject benign defaults; update sanitizeDefaultValue to
either (A) perform case-insensitive, word-boundary checks (e.g., compare
defaultValue.toLowerCase() against regexes like
/\b(drop|delete|insert|update|create|alter)\b/) to avoid matching benign
substrings, or (preferred) remove the manual blocklist and instead strictly
validate against the allowed literal pattern used in validateColumnType (reuse
or mirror the simpleLiteralPattern logic) and reject anything that does not
match the expected literal form; ensure that when treating values as string
literals you still escape single quotes and wrap in quotes as currently done in
sanitizeDefaultValue.

Comment on lines +365 to +405
private async getAvailableSnapshotColumns(metadataDatabase: string): Promise<{
  hasAuthor: boolean;
  hasCommitMessage: boolean;
  hasCommitExtraInfo: boolean;
}> {
  try {
    // Check available columns in ducklake_snapshot_changes
    // We use the database name directly in the PRAGMA string
    if (!this.connectionInfo) {
      return {
        hasAuthor: false,
        hasCommitMessage: false,
        hasCommitExtraInfo: false,
      };
    }
    const tableInfoQuery = `PRAGMA table_info('${metadataDatabase}.main.ducklake_snapshot_changes')`;
    const result = await this.connectionInfo.connection.run(tableInfoQuery);
    const rows = await result.getRows();

    const columnNames = rows
      .map((row: any) => (Array.isArray(row) ? row[1] : row.name))
      .map((name: string) => name.toLowerCase());

    return {
      hasAuthor: columnNames.includes('author'),
      hasCommitMessage: columnNames.includes('commit_message'),
      hasCommitExtraInfo: columnNames.includes('commit_extra_info'),
    };
  } catch (error) {
    // eslint-disable-next-line no-console
    console.warn(
      'Failed to check table info for ducklake_snapshot_changes',
      error,
    );
    return {
      hasAuthor: false,
      hasCommitMessage: false,
      hasCommitExtraInfo: false,
    };
  }
}

⚠️ Potential issue | 🟡 Minor

getAvailableSnapshotColumns — potential SQL injection in PRAGMA call.

Line 380: the metadataDatabase value is interpolated into a single-quoted string passed to PRAGMA table_info(...). If the database name contains a single quote, this would break. Use the same double-quote escaping approach used elsewhere, or use the getMetadataDatabaseName() helper.

Proposed fix
-      const tableInfoQuery = `PRAGMA table_info('${metadataDatabase}.main.ducklake_snapshot_changes')`;
+      const safeName = metadataDatabase.replace(/'/g, "''");
+      const tableInfoQuery = `PRAGMA table_info('${safeName}.main.ducklake_snapshot_changes')`;
🤖 Prompt for AI Agents
In `@src/main/services/duckLake/adapters/duckdb.adapter.ts` around lines 365 -
405, The PRAGMA call in getAvailableSnapshotColumns interpolates
metadataDatabase directly into a single-quoted string, risking SQL injection if
the name contains a quote; change the call to use the safe metadata DB naming
used elsewhere (e.g., getMetadataDatabaseName()) or escape/quote the identifier
with double-quotes instead of single quotes, and build the PRAGMA string using
the safe identifier so table_info gets called on "metadataDatabase". Ensure you
update the variable used in the PRAGMA query inside getAvailableSnapshotColumns
and keep the rest of the result handling unchanged.
