Skip to content

Conversation

@sqlerrorthing
Copy link
Owner

Closes #38

@sqlerrorthing sqlerrorthing added this to the 1.0.0 milestone Jun 2, 2025
@sqlerrorthing sqlerrorthing added the core The stealer core functionality label Jun 2, 2025
@sqlerrorthing
Copy link
Owner Author

@sourcery-ai review

@sourcery-ai
Copy link

sourcery-ai bot commented Jul 29, 2025

Reviewer's Guide

This PR replaces the previous C-bindings based SQLite reader with a self-contained, pure Rust implementation under a new sqlite module, refactors core database types to use Arc-backed storage, updates the DatabaseReader API and downstream browser extractors accordingly, and adds necessary dependencies and tests.

ER diagram for new SQLite table metadata extraction

erDiagram
    TABLE_METADATA {
        name string
        columns ColumnDef[]
        first_page int
    }
    COLUMN_DEF {
        name string
        col_type Type
    }
    TABLE_METADATA ||--o{ COLUMN_DEF : has
    COLUMN_DEF }o--|| TYPE : uses
Loading

Class diagram for new SQLite pure Rust implementation

classDiagram
    class SqliteDatabase {
        - tables_metadata: Vec<TableMetadata>
        - pager: Pager
        + scanner(page: usize) Scanner
        + read_table<S>(table_name: S) Option<SqliteTableIterator>
    }
    class TableMetadata {
        + name: Arc<str>
        + columns: Vec<ColumnDef>
        + first_page: usize
        + from_cursor(cursor: Cursor) -> Option<TableMetadata>
    }
    class SqliteTableIterator {
        - fields: Vec<usize>
        - scanner: Scanner
        - row_buffer: Vec<Value>
        + next() Option<SqliteTableRecord>
    }
    class SqliteTableRecord {
        - row: Arc<[Value]>
        + get_value(key: usize) Option<Value>
    }
    class Pager {
        + read_page(n: usize) Arc<Page>
        + read_overflow(n: usize) Arc<OverflowPage>
    }
    class Scanner {
        + next_record() Option<Cursor>
    }
    class Cursor {
        + field(n: usize) Option<Value>
    }
    class Value {
        + String(Arc<str>)
        + Integer(i64)
        + Float(f64)
        + Blob(Arc<[u8]>)
        + Null
        + as_str() Option<Arc<str>>
        + as_integer() Option<i64>
        + as_blob() Option<Arc<[u8]>>
    }
    SqliteDatabase --> TableMetadata
    SqliteDatabase --> Pager
    SqliteDatabase --> SqliteTableIterator
    SqliteTableIterator --> Scanner
    SqliteTableIterator --> SqliteTableRecord
    SqliteTableRecord --> Value
    Pager --> Page
    Pager --> OverflowPage
    Scanner --> Cursor
    Cursor --> Value
    TableMetadata --> ColumnDef
    ColumnDef --> Type
Loading

Class diagram for refactored Value and browser data types

classDiagram
    class Value {
        + String(Arc<str>)
        + Integer(i64)
        + Float(f64)
        + Blob(Arc<[u8]>)
        + Null
        + as_str() Option<Arc<str>>
        + as_integer() Option<i64>
        + as_blob() Option<Arc<[u8]>>
    }
    class Cookie {
        + host_key: Arc<str>
        + name: Arc<str>
        + value: Arc<str>
        + path: Arc<str>
        + expires_utc: i64
    }
    class AutoFill {
        + name: Arc<str>
        + value: Arc<str>
        + last_used: i64
    }
    class CreditCard {
        + name_on_card: Arc<str>
        + expiration_month: i64
        + expiration_year: i64
        + card_number: Arc<str>
        + use_count: i64
    }
    class Download {
        + saved_as: Arc<str>
        + url: Arc<str>
    }
    class Password {
        + origin: Option<Arc<str>>
        + username: Option<Arc<str>>
        + password: Option<Arc<str>>
    }
    class History {
        + url: Arc<str>
        + title: Arc<str>
        + last_visit_time: i64
    }
Loading

Class diagram for new half-io VecReader utility

classDiagram
    class VecReader {
        - buffer: Vec<u8>
        - pos: usize
        + new(data: Vec<u8>)
        + read(buf: &mut [u8]) -> usize
        + seek(pos: SeekFrom) -> u64
    }
    VecReader ..|> Read
    VecReader ..|> Seek
    VecReader ..|> ErrorType
Loading

File-Level Changes

Change Details Files
Introduce pure Rust SQLite reader
  • Add sqlite module with db, pager, page, cursor, and sql submodules implementing SQLite file parsing and iteration
  • Implement SqliteDatabase::try_from(Vec<u8>) and remove C bindings
  • Use half-io and embedded-io for in-memory I/O
database/src/lib.rs
database/src/sqlite/db.rs
database/src/sqlite/pager.rs
database/src/sqlite/page.rs
database/src/sqlite/cursor.rs
database/src/sqlite/sql/parser.rs
database/src/sqlite/sql/tokenizer.rs
database/src/sqlite/sql/ast.rs
half-io/src/lib.rs
Refactor Value enum to use Arc
  • Change Value::String and Value::Blob to Arc<str> and Arc<[u8]>
  • Update as_str/as_blob methods to return owned Arc clones
  • Simplify Display implementation using Rust 1.58+ formatting
database/src/lib.rs
Update DatabaseReader and TableRecord APIs
  • Remove from_bytes and from_path from DatabaseReader
  • Change read_from_bytes to consume Vec<u8> and return anyhow::Error
  • Modify TableRecord::get_value to return Option<Value>
database/src/lib.rs
Adjust browser extractors for new API and Arc types
  • Replace .as_string() with .as_str() and wrap results in Arc
  • Pass slice references to decrypt_data instead of owned Vec<u8>
  • Change browser record struct fields from String to Arc<str>
shadowsniff/browsers/src/lib.rs
shadowsniff/browsers/src/chromium/cookies.rs
shadowsniff/browsers/src/chromium/passwords.rs
shadowsniff/browsers/src/chromium/creditcards.rs
shadowsniff/browsers/src/chromium/autofill.rs
shadowsniff/browsers/src/chromium/downloads.rs
shadowsniff/browsers/src/chromium/history.rs
Update dependencies and workspace configuration
  • Add anyhow, embedded-io, and half-io to database crate
  • Introduce new half-io crate and include it in workspace
  • Remove unused C build dependencies
database/Cargo.toml
Cargo.toml
half-io/Cargo.toml
Add tests and update documentation
  • Add integration test in database/src/lib.rs for loading test.db
  • Update README.MD stub size badge and credit the new SQLite reader
database/src/lib.rs
README.MD

Assessment against linked issues

Issue Objective Addressed Explanation
#38 Implement a custom SQLite reader in Rust, without relying on the sqlite3.c C implementation.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @sqlerrorthing - I've reviewed your changes and found some issues that need to be addressed.

Blocking issues:

  • SqliteTableIterator::next() always returns None. (link)
  • read_i24_at uses i32::from_be_bytes on a 3-byte slice, which will panic. (link)
  • read_i48_at uses i64::from_be_bytes on a 6-byte slice, which will panic. (link)

General comments:

  • SqliteTableIterator.next currently returns None unconditionally—update it to return Some(SqliteTableRecord) after filling the row_buffer.
  • row_buffer is allocated with length zero (you capture its length before pushing fields) so size it using the number of fields (e.g. fields.len()) to match the number of columns.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- SqliteTableIterator.next currently returns None unconditionally—update it to return Some(SqliteTableRecord) after filling the row_buffer.
- row_buffer is allocated with length zero (you capture its length before pushing fields) so size it using the number of fields (e.g. fields.len()) to match the number of columns.

## Individual Comments

### Comment 1
<location> `database/src/sqlite/db.rs:83` </location>
<code_context>
-impl Iterator for SqliteIterator {
-    type Item = SqliteRow;
-
-    fn next(&mut self) -> Option<Self::Item> {
-        self.rows.next()
-    }
</code_context>

<issue_to_address>
SqliteTableIterator::next() always returns None.

Currently, next() does not yield any items, resulting in an empty iterator. Consider returning Some(SqliteTableRecord { ... }) after populating row_buffer.
</issue_to_address>

### Comment 2
<location> `half-io/src/lib.rs:42` </location>
<code_context>
+
+impl Read for VecReader {
+    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error> {
+        let available = self.buffer.len().saturating_add(self.pos);
+        let to_read = core::cmp::min(available, buf.len());
+        
</code_context>

<issue_to_address>
Incorrect calculation of available bytes in VecReader::read().

The calculation should use buffer.len().saturating_sub(self.pos) to correctly determine the remaining bytes to read.
</issue_to_address>

### Comment 3
<location> `half-io/src/lib.rs:49` </location>
<code_context>
+            return Ok(0);
+        }
+        
+        buf[..to_read].copy_from_slice(&self.buffer[self.pos..self.pos + to_read]);
+        self.pos += to_read;
+        
</code_context>

<issue_to_address>
Potential out-of-bounds access in VecReader::read().

Ensure that self.pos + to_read does not exceed self.buffer.len() to prevent panics. Correcting the available bytes calculation should address this, but always validate slice bounds.
</issue_to_address>

### Comment 4
<location> `database/src/sqlite/cursor.rs:174` </location>
<code_context>
+    i16::from_be_bytes(input[offset..offset + 2].try_into().unwrap()) as i64
+}
+
+fn read_i24_at(input: &[u8], offset: usize) -> i64 {
+    (i32::from_be_bytes(input[offset..offset + 3].try_into().unwrap()) & 0x00FFFFFF) as i64
+}
</code_context>

<issue_to_address>
read_i24_at uses i32::from_be_bytes on a 3-byte slice, which will panic.

i32::from_be_bytes requires a 4-byte array; passing a 3-byte slice will cause a panic. Manually assemble the 24-bit value from the bytes instead.
</issue_to_address>

### Comment 5
<location> `database/src/sqlite/cursor.rs:182` </location>
<code_context>
+    i32::from_be_bytes(input[offset..offset + 4].try_into().unwrap()) as i64
+}
+
+fn read_i48_at(input: &[u8], offset: usize) -> i64 {
+    i64::from_be_bytes(input[offset..offset + 6].try_into().unwrap()) & 0x0000FFFFFFFFFFFF
+}
</code_context>

<issue_to_address>
read_i48_at uses i64::from_be_bytes on a 6-byte slice, which will panic.

You must pad the 6-byte slice to 8 bytes before calling i64::from_be_bytes to avoid a runtime panic.
</issue_to_address>

### Comment 6
<location> `database/src/sqlite/cursor.rs:148` </location>
<code_context>
+                record_field.offset,
+            ))),
+            RecordFieldType::String(length) => {
+                let value = str::from_utf8(
+                    &self.payload[record_field.offset..record_field.offset + length],
+                )
</code_context>

<issue_to_address>
No bounds check for payload slice in String/Blob field extraction.

Add a check to ensure record_field.offset + length does not exceed payload length to prevent panics with malformed data.
</issue_to_address>

### Comment 7
<location> `database/src/sqlite/pager.rs:124` </location>
<code_context>
+    }
+
+    fn load_raw(&self, n: usize) -> anyhow::Result<Vec<u8>> {
+        let offset = n.saturating_sub(1) * self.header.page_size as usize;
+
+        let mut input_guard = self
</code_context>

<issue_to_address>
Pager::load_raw() does not check for out-of-bounds page numbers.

Validate that n and offset are within valid bounds to prevent panics or invalid data access.
</issue_to_address>

### Comment 8
<location> `database/src/sqlite/sql/parser.rs:10` </location>
<code_context>
+use anyhow::{bail, Context};
+
+#[derive(Debug)]
+struct ParserState {
+    tokens: Vec<Token>,
+    pos: usize,
</code_context>

<issue_to_address>
Consider replacing the custom ParserState logic with a concise regex-based approach for parsing CREATE TABLE statements.

Here’s a sketch of how you can drop the entire `ParserState` machinery and replace it with a ~30 LOC regex + split approach. It covers the same functionality (extract table name + cols + types) and will be far easier to maintain:

```rust
use crate::sqlite::sql::ast::{ColumnDef, CreateTableStatement, Type};
use anyhow::{bail, Context, Result};
use regex::Regex;

lazy_static::lazy_static! {
    // case‐insensitive CREATE TABLE (name) (col1 TYPE, col2 TYPE, …)
    static ref CREATE_RE: Regex =
        Regex::new(r"(?i)^\s*CREATE\s+TABLE\s+([A-Za-z_]\w*)\s*\((.+)\)\s*$").unwrap();
}

pub fn parse_create_statement(input: &str) -> Result<CreateTableStatement> {
    let caps = CREATE_RE
        .captures(input)
        .context("not a CREATE TABLE statement")?;

    let name = caps[1].to_string();
    let cols = &caps[2];
    let columns = cols
        .split(',')
        .map(str::trim)
        .map(|col_def| {
            let mut parts = col_def.split_whitespace();
            let col_name = parts
                .next()
                .context("missing column name")?
                .to_string();
            let typ = parts
                .next()
                .map(str::to_lowercase)
                .context("missing column type")?;
            let col_type = match typ.as_str() {
                "integer" => Type::Integer,
                "real"    => Type::Real,
                "blob"    => Type::Blob,
                "text" | "string" => Type::Text,
                other => bail!("unsupported type: {}", other),
            };
            Ok(ColumnDef {
                name: col_name,
                col_type,
            })
        })
        .collect::<Result<Vec<_>>>()?;

    Ok(CreateTableStatement { name, columns })
}
```

Steps to adopt:
1. Remove `ParserState`, tokenizer imports and all its helper methods.
2. Add the `CREATE_RE` regex above.
3. Replace your existing `parse_create_statement` with the snippet above.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

impl Iterator for SqliteTableIterator {
type Item = SqliteTableRecord;

fn next(&mut self) -> Option<Self::Item> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): SqliteTableIterator::next() always returns None.

Currently, next() does not yield any items, resulting in an empty iterator. Consider returning Some(SqliteTableRecord { ... }) after populating row_buffer.


impl Read for VecReader {
fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error> {
let available = self.buffer.len().saturating_add(self.pos);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Incorrect calculation of available bytes in VecReader::read().

The calculation should use buffer.len().saturating_sub(self.pos) to correctly determine the remaining bytes to read.

return Ok(0);
}

buf[..to_read].copy_from_slice(&self.buffer[self.pos..self.pos + to_read]);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Potential out-of-bounds access in VecReader::read().

Ensure that self.pos + to_read does not exceed self.buffer.len() to prevent panics. Correcting the available bytes calculation should address this, but always validate slice bounds.

i16::from_be_bytes(input[offset..offset + 2].try_into().unwrap()) as i64
}

fn read_i24_at(input: &[u8], offset: usize) -> i64 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): read_i24_at uses i32::from_be_bytes on a 3-byte slice, which will panic.

i32::from_be_bytes requires a 4-byte array; passing a 3-byte slice will cause a panic. Manually assemble the 24-bit value from the bytes instead.

i32::from_be_bytes(input[offset..offset + 4].try_into().unwrap()) as i64
}

fn read_i48_at(input: &[u8], offset: usize) -> i64 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): read_i48_at uses i64::from_be_bytes on a 6-byte slice, which will panic.

You must pad the 6-byte slice to 8 bytes before calling i64::from_be_bytes to avoid a runtime panic.

record_field.offset,
))),
RecordFieldType::String(length) => {
let value = str::from_utf8(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): No bounds check for payload slice in String/Blob field extraction.

Add a check to ensure record_field.offset + length does not exceed payload length to prevent panics with malformed data.

}

fn load_raw(&self, n: usize) -> anyhow::Result<Vec<u8>> {
let offset = n.saturating_sub(1) * self.header.page_size as usize;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Pager::load_raw() does not check for out-of-bounds page numbers.

Validate that n and offset are within valid bounds to prevent panics or invalid data access.

@sqlerrorthing sqlerrorthing modified the milestones: 1.0.0, 2.0.0, 2.1.0 Aug 4, 2025
@sqlerrorthing sqlerrorthing modified the milestones: 2.1.0, 2.2.0 Aug 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core The stealer core functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Selfcode sqlite reader

2 participants