Add index creation script for onsides PostgreSQL schema #79
Open: wolfderby wants to merge 17 commits into tatonetti-lab:main from dbmi-pitt:load_110425
Conversation
…-run/commit switch)
…g DDL (csv columns, types, awk_nl_count removed)
… comments, script, and README (increase length to 32)
…lper; document usage
…d integrity check
- load_missing_product_adverse_effect.sh: Stage CSV and insert missing product_adverse_effect rows
- load_missing_product_to_rxnorm.sh: Stage CSV and insert missing product_to_rxnorm rows
- patch_missing_vocab_rxnorm_from_staging.sh: Insert placeholder RxNorm vocab entries and reinsert mappings
- patch_missing_vocab_meddra_from_staging.sh: Insert placeholder MedDRA vocab entries and reinsert adverse effects
- integrity_check_and_cleanup.sh: Verify no unmatched staging rows and drop staging tables
- Added is_placeholder boolean column to vocab tables and marked placeholders
- For vocab tables with is_placeholder column, count only non-placeholder rows
- Checks column existence to avoid errors on tables without it
- Fixes csv_count_diff mismatches for vocab_meddra_adverse_effect and vocab_rxnorm_product
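The placeholder-aware count described in this commit could look roughly like the sketch below. It is not the script from the PR: the function names (build_count_sql, has_placeholder_column, count_real_rows) are hypothetical, the schema and table names are assumed to be trusted identifiers, and treating NULL as "not a placeholder" is an assumption. Connection settings are expected to come from the usual PG* environment variables.

```shell
# Hypothetical helper: print a COUNT(*) query for schema.table.
# When the third argument is 1, placeholder rows are excluded.
# Assumption: a NULL is_placeholder value counts as a real row.
build_count_sql() {
  _schema=$1
  _table=$2
  _has_placeholder=$3
  if [ "$_has_placeholder" = "1" ]; then
    printf 'SELECT count(*) FROM %s.%s WHERE is_placeholder IS NOT TRUE;\n' "$_schema" "$_table"
  else
    printf 'SELECT count(*) FROM %s.%s;\n' "$_schema" "$_table"
  fi
}

# Hypothetical helper: ask the catalog whether the table has an
# is_placeholder column. Prints 1 if it exists, 0 otherwise.
has_placeholder_column() {
  psql -At -v ON_ERROR_STOP=1 -c "SELECT count(*) FROM information_schema.columns WHERE table_schema = '$1' AND table_name = '$2' AND column_name = 'is_placeholder';"
}

# Hypothetical helper: check for the column first, then run the matching
# count, so tables without is_placeholder do not raise an error.
count_real_rows() {
  _flag=$(has_placeholder_column "$1" "$2") || return 1
  psql -At -v ON_ERROR_STOP=1 -c "$(build_count_sql "$1" "$2" "$_flag")"
}
```

With a reachable database, `count_real_rows onsides vocab_rxnorm_product` would print a single number that can be compared against the CSV record count.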
… restore script
- database/qa/run_qa_bulk.sh: Script to run QA logging for all CSVs in a directory
- .github/: GitHub Actions workflows for CI/CD
- database/schema/postgres_restore_constraints.sql: Script to restore FK/PK constraints
…n QA script
- Explain that wc -l may overcount due to embedded newlines
- Clarify that csv_count_diff is the key metric for import accuracy
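To illustrate why wc -l can overcount, here is a minimal sketch that counts logical CSV records instead of physical lines. The function name count_csv_records is hypothetical and this is not the QA script from the PR. It assumes RFC 4180-style quoting, where a field containing a newline is wrapped in double quotes and a literal quote is written as two quotes.

```shell
# Hypothetical helper: count logical CSV records in the file given as $1.
# A physical line ends a record only when the running number of double
# quotes is even, meaning we are outside any quoted field.
count_csv_records() {
  awk '
    {
      # gsub returns how many double quotes appear on this physical line.
      quotes += gsub(/"/, "&")
      if (quotes % 2 == 0) { records++; quotes = 0 }
    }
    END { print records + 0 }
  ' "$1"
}
```

For a file holding a header row, one row whose quoted field spans two physical lines, and one plain row, wc -l reports 4 while count_csv_records reports 3 (header included).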
- Documents wc_l_count, csv_record_count, select_count_on_domain
- Explains select_count_diff vs csv_count_diff
- Includes examples and troubleshooting tips
- Added about table DDL to postgres.sql with metadata columns
- Created populate_about_table.sql to insert version, description, counts, etc.
- Created run_populate_about.sh to execute the population
- Populated the table in cem_development_2025 with current stats
- Remove defaults for PGHOST, PGPORT, PGUSER, PGDATABASE
- Add checks and prompts for each if not set in environment
- Password can still be handled via PGPASSWORD or .pgpass
- Table names in SQL already prefixed with onsides.
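The "check, then prompt" behaviour described in this commit can be sketched as below. The function name require_pg_var is hypothetical and the real script may differ. The sketch assumes a POSIX shell and deliberately leaves the password to PGPASSWORD or .pgpass, as the commit notes.

```shell
# Hypothetical helper: make sure the environment variable named in $1 is
# set. If it is empty or unset, prompt on stderr and read a value from
# stdin. Returns non-zero when no value can be obtained.
require_pg_var() {
  _pg_name=$1
  # Indirect lookup of the variable whose name is stored in _pg_name.
  eval "_pg_current=\${$_pg_name:-}"
  if [ -z "$_pg_current" ]; then
    printf '%s is not set. Enter a value: ' "$_pg_name" >&2
    IFS= read -r _pg_current || return 1
    if [ -z "$_pg_current" ]; then
      echo "$_pg_name is required" >&2
      return 1
    fi
  fi
  export "$_pg_name=$_pg_current"
}
```

A wrapper script might call it once per setting before invoking psql, for example `for v in PGHOST PGPORT PGUSER PGDATABASE; do require_pg_var "$v" || exit 1; done`.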
- populate_about_metadata.sql: Inserts row counts and attributes for OnSIDES tables
- run_populate_about_metadata.sh: Shell script to execute with prompts for env vars
- Populated the table with current stats (row counts, version, data sources)
No description provided.