Skip to content

Conversation

@mchataigner
Copy link
Contributor

Reference: https://arrow.apache.org/docs/format/CanonicalExtensions.html#uuid

Rely on FixedSizeBinary(16) as the arrow format as mentionned in the spec

@gooroo-dev
Copy link

gooroo-dev bot commented Nov 6, 2025

Please double check the following review of the pull request:

🐞Mistake 🤪Typo 🚨Security 🚀Performance 💪Best Practices 📖Readability ❓Others
1 0 0 0 1 1 0

Changes in the diff

  • ➕ Added support for uuid::Uuid in ArrowDeserialize trait.
  • ➕ Added support for uuid::Uuid in ArrowSerialize trait.
  • ➕ Added ArrowField implementation for uuid::Uuid.
  • ➕ Added uuid dependency to Cargo.toml and arrow_convert/Cargo.toml.
  • ➕ Added comprehensive tests for serialization and deserialization of Uuid and Option<Uuid>.
  • 💪 Improved error handling practice is recommended in deserialization (currently uses unwrap()).
  • 📖 Added use uuid; imports explicitly for clarity.

Identified Issues

ID Type Details Severity Confidence
001 🐞Mistake arrow_deserialize implementation for Uuid uses unwrap() on Uuid::from_slice which can panic if input slice is invalid. See arrow_convert/src/deserialize/mod.rs lines 314-319. 🔴High 🟠Medium
002 💪Best Practices use uuid; imports are added multiple times in different modules; consider consistent import style or aliasing for clarity. See arrow_convert/src/deserialize/mod.rs, field.rs, serialize/mod.rs. 🟡Low 🟡Low
003 📖Readability The tests in test_serialize.rs and test_uuid.rs have some duplication; consider consolidating tests or sharing common test utilities. 🟡Low 🟡Low

Issue 001: Unsafe unwrap in arrow_deserialize for Uuid

The implementation of ArrowDeserialize for uuid::Uuid calls uuid::Uuid::from_slice(t).unwrap() inside arrow_deserialize. If the byte slice t is not exactly 16 bytes or invalid, this will panic, which is undesirable in deserialization.

Fix

Replace unwrap() with proper error handling. Since arrow_deserialize returns Option<Self>, a safe approach is to return None if deserialization fails.

Suggested code change:

impl ArrowDeserialize for uuid::Uuid {
    type ArrayType = FixedSizeBinaryArray;

    #[inline]
    fn arrow_deserialize(v: Option<&[u8]>) -> Option<Self> {
        v.and_then(|t| uuid::Uuid::from_slice(t).ok())
    }
}

Explanation

This change avoids panics by returning None when the slice is invalid, making deserialization safer and more robust.


Issue 002: Multiple use uuid; imports

The uuid crate is imported explicitly in several modules (deserialize/mod.rs, field.rs, serialize/mod.rs). While not a critical issue, it is better to maintain consistent import style and possibly alias the import for clarity.

Fix

Consider importing uuid once at the crate root or consistently aliasing it, e.g.:

use uuid as uuid_crate;

and then use uuid_crate::Uuid everywhere.

Explanation

This improves code clarity and maintainability.


Issue 003: Test code duplication

The tests for UUID serialization and deserialization appear both in test_serialize.rs and the new test_uuid.rs file with overlapping content.

Fix

Consolidate tests into a single file or extract common test utilities to reduce duplication.

Explanation

This reduces maintenance effort and keeps tests DRY (Don't Repeat Yourself).


Missing tests

The PR includes comprehensive tests for serialization and deserialization of Uuid and Option<Uuid>, including iterator deserialization. No additional tests are strictly necessary.


Summary

The PR is well done and adds useful support for Uuid serialization/deserialization with Arrow. The main concern is the unsafe unwrap() in deserialization, which should be replaced with safe error handling. Minor improvements in import style and test consolidation are recommended but not critical.

Summon me to re-review when updated! Yours, Gooroo.dev
I'd love a reaction or reply to know your thoughts.

Reference: https://arrow.apache.org/docs/format/CanonicalExtensions.html#uuid

Rely on FixedSizeBinary(16) as the arrow format as mentionned in the spec
glam = { workspace = true, optional = true }
rust_decimal = { workspace = true, optional = true }
tinystr = { workspace = true, optional = true }
uuid = { workspace = true }
Copy link

@redox redox Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mchataigner we might want to set this optional = true if part of this block

@Swoorup
Copy link
Owner

Swoorup commented Nov 7, 2025

@mchataigner Thanks for the changes. Happy to have uuid support. However we want to keep core with fewer dependencies are possible. So as @redox mentioned this should an optional feature.

Take a look at how glam, rust-decimal are implemented here, if you need an example.

@mchataigner
Copy link
Contributor Author

@Swoorup and @redox Thanks for your feedback, I applied them in the latest commit.

@Swoorup Swoorup requested a review from Copilot November 7, 2025 10:52
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds UUID support to the arrow_convert library, enabling serialization and deserialization of UUID values to/from Apache Arrow's FixedSizeBinary(16) format.

Key changes:

  • Implementation of ArrowField, ArrowSerialize, and ArrowDeserialize traits for uuid::Uuid
  • Comprehensive test suite covering serialization, deserialization, null handling, and iterator-based operations
  • Feature-gated dependency on the uuid crate (version 1.18.1)

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
arrow_convert/src/features/uuid.rs Implements trait support for UUID type using FixedSizeBinary(16) Arrow representation
arrow_convert/src/features/mod.rs Registers the uuid feature module under conditional compilation
arrow_convert/tests/test_uuid.rs Comprehensive test suite for UUID serialization/deserialization including null handling
arrow_convert/Cargo.toml Adds optional uuid dependency and corresponding feature flag
Cargo.toml Declares uuid 1.18.1 as a workspace dependency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Default was required for the builder, so I created a thin wrapper on FixedSizeBinaryBuilder adding Default trait on it
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

arrow_convert/Cargo.toml:44

  • The uuid dependency is missing from [dev-dependencies]. Other optional feature dependencies (glam, rust_decimal, tinystr) are included in dev-dependencies to support their respective tests. Add uuid = { workspace = true } to the dev-dependencies section to be consistent with the pattern and ensure tests can run.
[dev-dependencies]
arrow = { workspace = true }
arrow_convert_derive = { workspace = true }
criterion = { workspace = true }
glam = { workspace = true }
pretty_assertions = { workspace = true }
rust_decimal = { workspace = true }
tinystr = { workspace = true }
trybuild = { workspace = true }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@mchataigner mchataigner requested a review from Copilot November 7, 2025 18:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Swoorup Swoorup merged commit 4ed3ad6 into Swoorup:main Nov 9, 2025
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants