Automation Issues/Improvements (version 2) #29

@KELSEYDOWLING7

Creating a new issue based on the original issue, #28.

Items are now ranked by priority level.

These reports run locally in R without problems but fail in various ways when automated in GCP.

Issues

  • Weekly Module Metrics (GCP job ccc-weekly-module-metrics)
    • Jake, Michelle mentioned that you started working on this. The job reports success, but the report it posts doesn't match the R script at all. It looks like a much older version of the report rather than the current code, and I'm unsure how that's possible. (I moved the outputs out of the ALL folder to avoid confusion.)
  • Cancer Screening History: fixed
  • (NEW) Recruitment Custom QC: scheduled job ccc-recruitment-qc-metrics reports success in GCP but doesn't post to Box. It also takes only about one minute to render in GCP, whereas the data pull is massive and typically takes about 10 minutes in RStudio, so I doubt the render is actually running correctly.
  • (NEW) Biospecimen prod folders (the two remaining CSVs that do not have their own scheduled queries): GCP job ccc-weekly-biospecimen-metrics-csvs
  • (NEW) Adhoc refusal and withdrawal report: I've added a scheduled job adhoc_refwd_csv and confirmed that the name adhoc_refwd_csv matches and was added to the api.R file in that repo (https://github.com/Analyticsphere/weekly_ccc_report_gcp_pipeline/tree/main/prod). There's no clear error in the logs (no red error or yellow warning), so I'm not sure where it's failing. I confirmed that nothing was uploaded to Box.
  • (NEW) Monthly Data Destruction: this issue started in February, but there has been no code change.
  • RCA Metrics: fixed
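
Several of the items above share a pattern: the job reports success, but the output is stale, missing, or produced implausibly fast. One way to surface that is a post-render sanity check that makes the job fail loudly. The following is only a sketch, written in Python for illustration (the pipeline itself is R, so equivalent checks would have to be ported into api.R). The function name, the 300-second threshold, and the file path are hypothetical and are not taken from the repos.

```python
import os


def check_render_output(path, started_at, finished_at, min_seconds=300):
    """Return a list of problems with a rendered report; an empty list means it looks plausible.

    started_at / finished_at are Unix timestamps for the render step.
    min_seconds is an assumed lower bound on a realistic render time.
    """
    problems = []

    # A render that finishes far faster than the local run suggests the data pull was skipped.
    elapsed = finished_at - started_at
    if elapsed < min_seconds:
        problems.append(
            f"render finished in {elapsed:.0f}s, expected at least {min_seconds}s"
        )

    # Nothing to upload if the file was never written.
    if not os.path.exists(path):
        problems.append(f"output file missing: {path}")
        return problems

    if os.path.getsize(path) == 0:
        problems.append(f"output file is empty: {path}")

    # A file last modified before this run started was not produced by this run.
    if os.path.getmtime(path) < started_at:
        problems.append(f"output file is older than this run: {path}")

    return problems
```

If the returned list is non-empty, the job could exit with a non-zero status before attempting the Box upload, so that GCP shows a failure instead of a success.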

Improvements

  • Biospecimen Prod Files (GCP job ccc-weekly-biospecimen-metrics-csvs)
    • The first two files generated by this code (flatBoxes and KitAssembly) are just BQ queries. I'd like to take them out of the R code and run them as scheduled queries every Monday at 11:30, with the outputs posted to Box (file names unchanged) in the folder below.
    • Code: https://github.com/Analyticsphere/ccc_biospecimen_metrics_gcp_pipeline/blob/main/Weekly%20Biospecimen%20CSV%20Outputs.R
    • Box Folder: https://nih.app.box.com/folder/221280601453
    • Output: Formatted_prod_flatBoxes_{currentDate}boxfolder{boxfolder}.xlsx,
      Connect_prod_KitAssembly_Table_{currentDate}boxfolder{boxfolder}.xlsx,
      Connect_prod_recr_veriBiospe_Formats_{currentDate}boxfolder{boxfolder}.xlsx,
      Connect_prod_Biospe_Formats_{currentDate}boxfolder{boxfolder}.xlsx
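
To keep the file names identical after the first two outputs move to scheduled queries, the name patterns listed above could be filled in from one shared place. This is a minimal sketch in Python for illustration only (the existing code is R). The helper name is hypothetical, and the date format and the use of the Box folder ID as the `boxfolder` value are assumptions that should be checked against the current R script.

```python
# Name patterns copied from the Output list above.
TEMPLATES = [
    "Formatted_prod_flatBoxes_{currentDate}boxfolder{boxfolder}.xlsx",
    "Connect_prod_KitAssembly_Table_{currentDate}boxfolder{boxfolder}.xlsx",
    "Connect_prod_recr_veriBiospe_Formats_{currentDate}boxfolder{boxfolder}.xlsx",
    "Connect_prod_Biospe_Formats_{currentDate}boxfolder{boxfolder}.xlsx",
]


def output_names(current_date, boxfolder, templates=TEMPLATES):
    """Fill each pattern with the given date string and Box folder value.

    current_date is passed as an already formatted string because the
    date format used by the R script is not stated in this issue.
    """
    return [t.format(currentDate=current_date, boxfolder=boxfolder) for t in templates]


# Example with an assumed ISO date and the folder ID from the Box link above.
names = output_names("2024-01-08", "221280601453")
```

The first two entries of the result are the names the scheduled queries would need to reproduce; the last two stay with the R job.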
