-
Notifications
You must be signed in to change notification settings - Fork 2
1.8.2 #217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1.8.2 #217
Conversation
WalkthroughVersion bumped to v1.8.2 across CI/CD, build and packaging files. Data pipeline adds a Series.logo field, changes extractor/load signatures to accept inputs, and improves pricing-name normalization. SQL view and CLI formatting adjusted; new tests added and .gitignore/dev deps updated. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
card_data/pipelines/defs/load/load_data.py (1)
4-8: Remove unused imports that shadow parameter names.After refactoring to Dagster's parameter-based dependency injection, these imports are no longer needed and create naming conflicts. Dagster resolves asset dependencies by matching parameter names to asset names in the graph, not through imports.
Apply this diff to remove the unused imports:
-from ..extract.extract_data import ( - extract_series_data, - extract_set_data, - create_card_dataframe, -)This resolves the F811 redefinition warnings from static analysis.
🧹 Nitpick comments (2)
cmd/card/serieslist.go (1)
66-66: LGTM!The addition of "Sword & Shield" follows the existing pattern correctly and the mapping to "swsh" already exists in
seriesIDMap(line 11), ensuring proper functionality when selected.Optional: Consider chronological ordering.
If list order matters for UX, you might consider arranging series chronologically by release: Mega Evolution → Sword & Shield → Scarlet & Violet. The current order places Sword & Shield (2019) after Scarlet & Violet (2022).
card_data/pipelines/defs/extract/testing.py (1)
6-15: Clarify the filtering criteria in comments.The comment on line 14 states that
sm"Should be filtered out" but attributes this tologo: None. However, examining theextract_series_dataimplementation shows filtering is based onidbeing in["swsh", "sv", "me"], not on logo presence. The item is filtered becauseid="sm"is not in the allowed list.Apply this diff to clarify:
{"id": "xy", "name": "XY", "logo": "https://example.com/xy.png"}, # Should be filtered out {"id": "me", "name": "McDonald's Collection", "logo": "https://example.com/me.png"}, - {"id": "sm", "name": "Sun & Moon", "logo": None}, # Should be filtered out + {"id": "sm", "name": "Sun & Moon", "logo": None}, # Should be filtered out (id not in allowed list)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
card_data/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (18)
.github/workflows/ci.yml(1 hunks).gitignore(1 hunks).goreleaser.yml(1 hunks)Dockerfile(2 hunks)README.md(2 hunks)card_data/pipelines/defs/extract/extract_data.py(8 hunks)card_data/pipelines/defs/extract/extract_pricing_data.py(4 hunks)card_data/pipelines/defs/extract/testing.py(1 hunks)card_data/pipelines/defs/load/load_data.py(3 hunks)card_data/pipelines/poke_cli_dbt/dbt_project.yml(1 hunks)card_data/pipelines/poke_cli_dbt/macros/create_view.sql(1 hunks)card_data/pipelines/soda/checks.yml(1 hunks)card_data/pyproject.toml(2 hunks)cmd/card/cardlist.go(1 hunks)cmd/card/serieslist.go(1 hunks)go.mod(1 hunks)nfpm.yaml(1 hunks)testdata/main_latest_flag.golden(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
card_data/pipelines/defs/extract/testing.py (1)
card_data/pipelines/defs/extract/extract_data.py (1)
extract_series_data(31-49)
card_data/pipelines/defs/load/load_data.py (1)
card_data/pipelines/defs/extract/extract_data.py (3)
extract_series_data(31-49)extract_set_data(53-87)create_card_dataframe(138-191)
🪛 Ruff (0.14.8)
card_data/pipelines/defs/load/load_data.py
21-21: Redefinition of unused extract_series_data from line 5
(F811)
75-75: Redefinition of unused extract_set_data from line 6
(F811)
95-95: Redefinition of unused create_card_dataframe from line 7
(F811)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: tests
🔇 Additional comments (27)
nfpm.yaml (1)
4-4: LGTM!Version bump is consistent with the PR objectives and release cycle.
testdata/main_latest_flag.golden (1)
5-5: LGTM!The golden file correctly displays v1.8.1 as the latest available release. Since this tests fetching the latest release from GitHub, it would show v1.8.1 until v1.8.2 is actually published.
.github/workflows/ci.yml (1)
33-33: LGTM!Version environment variable properly updated to v1.8.2, consistent with the release objectives.
.goreleaser.yml (1)
17-17: LGTM!Version string in ldflags correctly updated to v1.8.2 for binary embedding.
card_data/pipelines/poke_cli_dbt/dbt_project.yml (2)
2-2: LGTM!Version bump is consistent with the release.
21-21: LGTM!Adding explicit table materialization for the poke_cli_dbt models is a good practice that ensures consistent model behavior.
card_data/pyproject.toml (1)
34-35: Development dependencies verified as valid and stable releases.Both pytest 9.0.2 and responses 0.25.8 exist on PyPI and are confirmed stable releases. These versions are the latest in their respective minor version series (9.0.x and 0.25.x).
go.mod (1)
3-3: Go 1.24.11 is a valid official release (released 2025-12-02).No action needed—the version is current and stable according to official Go releases.
.gitignore (1)
66-66: LGTM!The addition of
card_data/storage/to the ignore patterns is appropriate for excluding local storage artifacts from version control.README.md (1)
5-5: LGTM!The Docker image version references are correctly updated to v1.8.2, ensuring consistency between the badge display and usage examples.
Also applies to: 98-98, 102-102
Dockerfile (1)
11-11: LGTM!The version string is correctly updated to v1.8.2, consistent with the PR objectives.
cmd/card/cardlist.go (1)
122-132: LGTM with a note on price semantics.The conditional formatting improves UX by displaying fallback messages when pricing or illustrator data is unavailable. The zero-check for
MarketPriceworks for this context, though it technically conflates "no data" with "zero value". For TCG pricing, this is acceptable since legitimate $0.00 market prices are rare.card_data/pipelines/defs/extract/extract_pricing_data.py (2)
12-49: LGTM!The expanded
SET_PRODUCT_MATCHINGmapping adds new McDonald's Collection sets and improves organization with clear series groupings. The structure is maintainable and aligns with the TCG data model.
164-167: LGTM!The multi-line error message formatting is cleaner and more readable than the previous backslash-continued approach.
card_data/pipelines/poke_cli_dbt/macros/create_view.sql (3)
9-9: LGTM!The addition of the
illustratorcolumn and consistent LPAD formatting for card numbers improve data presentation. The 3-digit padding (e.g., "001") ensures proper string sorting and aligns with TCG display conventions.Also applies to: 12-12, 30-30, 33-33
20-20: LGTM!The
REGEXP_REPLACEpattern removes trailing parenthetical content to normalize card names for matching with pricing data. This aligns with the Python normalization logic in the extraction pipeline and ensures consistent name matching between the cards and pricing tables.
36-39: LGTM!The change from
INNER JOINtoLEFT JOINensures all cards appear in the view even without pricing data, improving data completeness. TheORDER BY c."localId"::integercast ensures proper numeric sorting (1, 2, 3, 10...) rather than lexicographic sorting (1, 10, 2, 3...).The NULL pricing values from the LEFT JOIN are correctly handled by the Go code, which treats zero values as "not available".
card_data/pipelines/defs/extract/testing.py (2)
17-35: LGTM!The test properly mocks the API response, validates the filtering logic, and confirms the DataFrame structure and content. The use of
responses.activateand assertions is appropriate.
1-4: No changes needed to the import statement.The bare import
from extract_data import extract_series_dataworks correctly since bothtesting.pyandextract_data.pyare in the same directory and Python resolves the module successfully. The test collection error is due to missing dependencies (dagster), not the import statement itself.Likely an incorrect or invalid review comment.
card_data/pipelines/defs/extract/extract_data.py (5)
14-17: LGTM!The addition of the
logofield to theSeriesmodel with proper typing (Optional[HttpUrl]) enables logo data to flow through the pipeline. This aligns with the test expectations and enhances the data model.
46-48: LGTM!The updated filtering logic is clear and explicit. The URL reordering and formatting improvements enhance maintainability without changing behavior.
Also applies to: 55-58
147-159: LGTM!The multi-line formatting of the
scalar_keyslist improves readability and makes it easier to add, remove, or review individual keys.
92-92: Verify the scope reduction from multiple sets to single set is intentional.The URL list has been progressively reduced from two sets (sv01, sv02 in the initial commit) to swsh3, and now to only me02. However, the commit message for the current change ("moving testing libraries to dev dependency group") does not explain this URL modification, and the earlier reduction commit message ("updating asset names") similarly does not document the scope change. Clarify whether this single-set configuration is intentional for v1.8.2 or if the URL should still include multiple sets.
119-120: LGTM! Signature refactoring improves composability.The changes to accept data as parameters instead of directly invoking upstream functions improve the pipeline's modularity and testability, aligning with Dagster best practices for explicit asset dependencies. The parameter naming correctly enables Dagster's dependency injection:
extract_card_inforeceivesextract_card_url_from_set_data,create_card_dataframereceivesextract_card_info, andload_card_datareceivescreate_card_dataframe, forming a complete and properly composed pipeline.card_data/pipelines/defs/load/load_data.py (3)
21-25: LGTM! Proper Dagster dependency injection pattern.The refactoring correctly shifts from direct function invocation to parameter-based dependency injection. Dagster automatically resolves the
extract_series_dataparameter by finding the asset with that name in the graph and passing its output DataFrame.
75-79: Same pattern asload_series_data.
95-99: Same pattern asload_series_data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
README.md (1)
195-195: Roadmap checkbox updated correctly; minor indentation style note.The "add sword & shield data" item is correctly marked as completed. However, markdownlint flags the indentation on this line (and its sibling list items under the
cardcommand) as using 4 spaces instead of the preferred 2 spaces per MD007. This is a pre-existing style inconsistency in the nested list. Consider standardizing all nested list indentation to 2 spaces for consistency with linting rules.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
README.md(3 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
README.md
195-195: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
🔇 Additional comments (1)
README.md (1)
5-5: Version bumps look good.Docker image badge and usage examples correctly updated to v1.8.2 to align with the release version.
Also applies to: 98-98, 102-102
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
card_data/pipelines/defs/extract/extract_pricing_data.py (1)
75-100: Previous concern about hyphenated Pokémon names remains unaddressed.The logic at line 78 still splits on the first dash, which will incorrectly truncate legitimate hyphenated Pokémon names like "Porygon-Z" → "Porygon", "Ho-Oh" → "Ho", and "Kommo-o" → "Kommo". The suggested fix from the previous review to check against known variant suffixes before splitting hasn't been applied.
🧹 Nitpick comments (2)
cmd/card/cardlist_test.go (1)
259-317: Excellent test coverage for success and fallback scenarios.This test comprehensively validates:
- URL construction with correct set ID and field selection
- PriceMap and IllustratorMap fallback behavior when data is missing or zero
- ImageMap population
- View rendering with no row selected
Optional: Simplify the view rendering check (lines 312-316).
The conditional logic could be clearer. Consider testing view rendering unconditionally:
- if row := model.Table.SelectedRow(); len(row) == 0 { - if model.View() == "" { - t.Error("model view should render even if no row is selected") - } + // View should render regardless of selection state + if model.View() == "" { + t.Error("model view should render even if no row is selected") }card_data/pipelines/defs/extract/extract_pricing_data.py (1)
111-116: Consider adding HTTP response validation.The API requests don't validate the response status before parsing JSON. If the API returns an error status (4xx/5xx), calling
.json()may raise an unclear exception or return unexpected data.# Fetch product data products_url = f"https://tcgcsv.com/tcgplayer/3/{product_id}/products" - products_data = requests.get(products_url, timeout=30).json() + products_response = requests.get(products_url, timeout=30) + products_response.raise_for_status() + products_data = products_response.json() # Fetch pricing data prices_url = f"https://tcgcsv.com/tcgplayer/3/{product_id}/prices" - prices_data = requests.get(prices_url, timeout=30).json() + prices_response = requests.get(prices_url, timeout=30) + prices_response.raise_for_status() + prices_data = prices_response.json()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.gitignore(1 hunks)card_data/pipelines/defs/extract/extract_pricing_data.py(5 hunks)card_data/pipelines/defs/extract/testing.py(1 hunks)cmd/card/cardlist.go(3 hunks)cmd/card/cardlist_test.go(2 hunks)cmd/card/serieslist_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- card_data/pipelines/defs/extract/testing.py
- .gitignore
- cmd/card/cardlist.go
🧰 Additional context used
🧬 Code graph analysis (2)
cmd/card/cardlist_test.go (1)
cmd/card/cardlist.go (1)
CardsList(102-160)
cmd/card/serieslist_test.go (1)
cmd/card/serieslist.go (1)
SeriesList(62-81)
🔇 Additional comments (8)
cmd/card/cardlist_test.go (4)
4-4: LGTM: Import added for test error handling.The
errorspackage is appropriately imported to support error simulation in the new test stubs.
319-331: LGTM: Fetch error handling validated.The test correctly verifies that network errors are propagated from
getCardDatato the caller.
333-345: LGTM: Invalid JSON handling verified.The test appropriately validates that malformed JSON payloads result in an error.
347-367: LGTM: Empty result handling validated.The test ensures that empty API responses are handled gracefully, with empty maps and a functional view.
card_data/pipelines/defs/extract/extract_pricing_data.py (4)
1-3: LGTM!The new imports for
reandunicodedataare appropriately added for the regex and Unicode normalization operations inextract_card_name.
12-49: LGTM!Good organization with section comments and the addition of new set identifiers (me01, me02).
129-136: LGTM!Good addition to filter out ball pattern variants specific to Prismatic Evolutions. The logic is clear and the variable is properly reused for both the skip check and name extraction.
166-171: LGTM!Good improvement switching to parenthesized multi-line string for better readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
card_data/pipelines/tests/extract_series_test.py (2)
1-4: Consider using proper package discovery instead of sys.path manipulation.While this works, manipulating
sys.pathis fragile and can cause issues with tooling. Consider alternatives:
- Install the package in development mode (
pip install -e .)- Configure pytest's
pythonpathinpytest.iniorpyproject.toml- Use relative imports if the test structure allows
22-40: Consider adding error case tests.The test validates the happy path effectively. To improve coverage, consider adding tests for:
- API failures (non-200 status codes, network errors)
- Pydantic validation failures (malformed data, invalid URLs)
- Empty API responses
These would help ensure robustness when the external API behaves unexpectedly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
card_data/pipelines/tests/extract_series_test.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
card_data/pipelines/tests/extract_series_test.py (2)
card_data/pipelines/definitions.py (1)
defs(12-15)card_data/pipelines/defs/extract/extract_data.py (1)
extract_series_data(31-49)
🔇 Additional comments (1)
card_data/pipelines/tests/extract_series_test.py (1)
36-40: The aikido bot flags are false positives for test code.The security scanner flagged these
assertstatements as dangerous in optimized mode. However, these are pytest test assertions, which is the standard and recommended approach for writing tests in Python. Tests are not executed in production withPYTHONOPTIMIZEenabled, so this concern doesn't apply here.No action needed—this is idiomatic pytest code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
card_data/pipelines/tests/extract_series_test.py (2)
1-10: Avoid per‑filesys.pathmutation for importsMutating
sys.pathinside individual tests is brittle and can mask packaging/import issues. Consider instead importing via the canonical package path (e.g., from the project root package) or configuring this once in a shared test helper (likeconftest.py) or via an editable install, so tests don’t need to tweaksys.paththemselves.
22-40: Test covers core behavior; optional tightening of assertionsThe test verifies type, row count, allowed ids, and presence of key columns, which is sufficient to guard the current behavior. If you want to harden it further, you could also assert that no unexpected extra rows/columns are present or that the order of ids is deterministic, but that’s optional.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
card_data/pipelines/tests/extract_series_test.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
card_data/pipelines/tests/extract_series_test.py (2)
card_data/pipelines/definitions.py (1)
defs(12-15)card_data/pipelines/defs/extract/extract_data.py (1)
extract_series_data(31-49)
🔇 Additional comments (1)
card_data/pipelines/tests/extract_series_test.py (1)
11-20: Fixture data nicely exercises both kept and filtered seriesThe fixture mixes ids that should be kept (
swsh,sv,me) with others that should be filtered (xy,sm), which aligns well with the filtering logic inextract_series_dataand keeps the test data realistic.
Summary by CodeRabbit
New Features
Improvements
Chores
Tests
✏️ Tip: You can customize this high-level summary in your review settings.