@Richard-31415 (Contributor) commented Jan 31, 2026

Summary by CodeRabbit

  • New Features

    • Added JSON validation for uploads to prevent invalid data from being stored.
    • Added structured data models to improve schema validation and data handling.
  • Chores

    • Added Pydantic dependency to support validation.
    • Reorganized package layout and imports for improved maintainability.

@vercel bot commented Jan 31, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| cmumaps | Ready | Preview | Jan 31, 2026 10:57pm |
| cmumaps-visualizer | Ready | Preview | Jan 31, 2026 10:57pm |

@coderabbitai bot commented Jan 31, 2026

📝 Walkthrough

Adds a new Pydantic-based models package, wires those models into the S3Client JSON validation and upload flow, and updates project metadata to add pydantic as a dependency and include the models package in the wheel. Also replaces TYPE_CHECKING-guarded imports with unconditional runtime imports in two utility modules and moves get_api_client_singleton() below the ApiClient class.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Configuration & Packaging**<br>`apps/dataflow/pyproject.toml`, `apps/dataflow/requirements.txt` | Added dependency `pydantic>=2.12.5` and included `src/models` in wheel distribution; updated requirements manifest. |
| **Models Package**<br>`apps/dataflow/src/models/__init__.py`, `apps/dataflow/src/models/_common.py`, `apps/dataflow/src/models/buildings.py`, `apps/dataflow/src/models/floorplans.py`, `apps/dataflow/src/models/graph.py`, `apps/dataflow/src/models/placements.py` | New package exposing Pydantic schemas: common types (`GeoCoordinate`, `LocalPosition`, `Floor`), `Buildings`, `Floorplans`, `Graph`, and `Placements`. Models use strict validation (`extra="forbid"`) and `RootModel` wrappers for JSON roots. |
| **S3 Client Validation & Singletons**<br>`apps/dataflow/src/clients/_s3_client.py` | Added `SCHEMA_VALIDATORS` mapping, `_get_validator()` and `_validate_json_data()` helpers, JSON pre-upload validation in `upload_json_file()`, and `get_s3_client_singleton()` cached factory; docstring and initialization notes added. |
| **API Client Placement**<br>`apps/dataflow/src/clients/_api_client.py` | Moved `get_api_client_singleton()` definition to after the `ApiClient` class; behavior and signature unchanged. |
| **Import Pattern Updates**<br>`apps/dataflow/src/logger/_utils.py`, `apps/dataflow/src/deserializer/main.py` | Replaced `TYPE_CHECKING`-guarded imports with unconditional runtime imports from `collections.abc` (e.g., `Callable`, `Generator`) to simplify annotations. |
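
To illustrate the import pattern change in the last row, here is a minimal sketch of what unconditional runtime imports from `collections.abc` allow. The `retry` helper is hypothetical and does not come from this PR; the point is only that the annotation names exist at runtime, so they can be written unquoted and resolved with `typing.get_type_hints`.

```python
import typing
from collections.abc import Callable, Generator  # unconditional runtime import


def retry(fn: Callable[[], int], attempts: int = 3) -> Generator[int, None, None]:
    """Hypothetical helper: call fn once per attempt and yield each result."""
    for _ in range(attempts):
        yield fn()


# Callable and Generator are real objects at runtime, so the annotations above
# are evaluated when the function is defined and can be inspected afterwards.
hints = typing.get_type_hints(retry)
results = list(retry(lambda: 7, attempts=2))
```

With a `TYPE_CHECKING` guard, the same annotations would need to be quoted or deferred, and tools that resolve annotations at runtime could not look the names up.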

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant S3Client
    participant Validator as Pydantic Validator
    participant FileSystem
    participant S3 as AWS S3

    Client->>S3Client: upload_json_file(local_path, s3_name)
    S3Client->>FileSystem: read JSON file
    FileSystem-->>S3Client: json_data
    S3Client->>S3Client: _get_validator(s3_name)
    alt validator found
        S3Client->>Validator: validate(json_data)
        alt validation passes
            Validator-->>S3Client: valid
            S3Client->>S3: upload object
            S3-->>S3Client: upload success
            S3Client-->>Client: True
        else validation fails
            Validator-->>S3Client: ValidationError
            S3Client->>S3Client: log error
            S3Client-->>Client: False
        end
    else no validator
        S3Client->>S3: upload object (no validation)
        S3-->>S3Client: upload success
        S3Client-->>Client: True
    end
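
The flow in the diagram can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `SketchS3Client`, `upload_json_text`, `get_sketch_client`, and the `Building` fields are hypothetical, the upload target is an in-memory dict rather than S3, and the JSON is passed as text instead of being read from a local path. Only the name `SCHEMA_VALIDATORS` and the use of `RootModel` with `extra="forbid"` are taken from the summary above.

```python
import json
import logging
from functools import cache

from pydantic import BaseModel, ConfigDict, RootModel, ValidationError

logger = logging.getLogger("s3_sketch")


class Building(BaseModel):
    model_config = ConfigDict(extra="forbid")  # unknown keys are rejected
    name: str  # assumed field, for illustration only


class Buildings(RootModel[dict[str, Building]]):
    """JSON root: a mapping of building code to Building."""


# Maps an S3 object name to the model that validates it (contents assumed).
SCHEMA_VALIDATORS: dict[str, type[RootModel]] = {"buildings.json": Buildings}


class SketchS3Client:
    def __init__(self) -> None:
        self.uploaded: dict[str, str] = {}  # stands in for the real bucket

    def upload_json_text(self, text: str, s3_name: str) -> bool:
        validator = SCHEMA_VALIDATORS.get(s3_name)
        if validator is not None:
            try:
                validator.model_validate(json.loads(text))
            except ValidationError as exc:
                # Validation failed: log and skip the upload.
                logger.error("Validation failed for %s: %s", s3_name, exc)
                return False
        # Either validation passed or no validator is registered.
        self.uploaded[s3_name] = text
        return True


@cache
def get_sketch_client() -> SketchS3Client:
    """Cached factory: every call returns the same client instance."""
    return SketchS3Client()


client = get_sketch_client()
ok = client.upload_json_text('{"GHC": {"name": "Gates"}}', "buildings.json")
bad = client.upload_json_text('{"GHC": {"name": "Gates", "oops": 1}}', "buildings.json")
unvalidated = client.upload_json_text("[1, 2]", "notes.json")
```

Note that files without a registered validator are uploaded unchecked, which matches the "no validator" branch of the diagram.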

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰🌿 I nibble schemas, neat and small,
Pydantic carrots for one and all,
Floors and graphs in tidy rows,
I validate before S3 goes —
Hop, upload, and softly thump for joy! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 77.78%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title accurately describes the main change: introducing Pydantic models for validating S3 uploads, which is the primary focus across multiple modified files. |

@coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@apps/dataflow/src/clients/_s3_client.py`:
- Line 46: In the S3 client constructor (e.g., class S3Client.__init__), remove
the redundant duplicate assignment "self.logger = get_app_logger()" (the second
instance) since self.logger is already set earlier; keep the first
initialization and delete the later one so the logger is only assigned once.

In `@apps/dataflow/src/models/buildings.py`:
- Line 22: The hitbox field on the building model is declared as "hitbox:
list[GeoCoordinate] | None" but has no default, so JSON that omits the key will
fail validation; update the model to provide a default (e.g., set hitbox to None
by default or use a field/default_factory as your model style uses) so omitted
hitbox is treated as None, matching how code and entrances are handled and
ensuring optional input is accepted.
🧹 Nitpick comments (1)
apps/dataflow/src/models/graph.py (1)

17-18: Consider defining a typed model for to_floor_info.

Using dict[str, Any] reduces type safety. If the structure of floor transition info is well-defined, consider creating a dedicated Pydantic model (e.g., FloorTransitionInfo) to enable validation of its contents.

@romycyy romycyy merged commit c68bfe9 into staging Jan 31, 2026
10 checks passed

2 participants