GitHub - KhazenLab/covid19-genomics-scripts

Our comparative genomics work on COVID-19, which complements the testing statistics time series (see the covid19-testing-data repository).

How to use

The pipeline can be run as follows:

The first set of scripts processes sequences from GISAID and ultimately yields a VCF file containing the mutations:

- l1a_parse_gisaid_fasta.R
- l1b_mummer.sh
- l1c_stats.R
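
As an illustration of the VCF output mentioned above, here is a minimal Python sketch that reads the mutations from a VCF stream. The function name `read_vcf_mutations`, the reference name and the two records are assumptions made for the example; they are not taken from the pipeline's real output.

```python
import io

# Hypothetical two-record VCF, only to illustrate the layout; not real results.
EXAMPLE_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
NC_045512.2\t241\t.\tC\tT\t.\t.\t.
NC_045512.2\t23403\t.\tA\tG\t.\t.\t.
"""


def read_vcf_mutations(handle):
    """Yield (chrom, pos, ref, alt) for each data line of a VCF stream."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information lines and the header
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):  # ALT may list several alleles
            yield chrom, int(pos), ref, allele


mutations = list(read_vcf_mutations(io.StringIO(EXAMPLE_VCF)))
for chrom, pos, ref, alt in mutations:
    print(f"{chrom}:{pos} {ref}>{alt}")
```

To read a file on disk instead, pass an open file handle, for example `read_vcf_mutations(open(path))`, where `path` is whatever name the pipeline gave the VCF.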

The scripts assume the following files are in the same directory as the code:

The second set of scripts annotates the mutations with snpEff and prepares the files for upload to the dashboard:

- l2a_snpeff.sh
- l2b_files4dashboard.R
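
snpEff writes its annotations into the `ANN` key of the VCF INFO column, as comma-separated entries whose fields are separated by `|`. The following Python sketch shows one way to unpack that field. The helper name `parse_ann`, the dictionary keys and the example INFO string (including its gene and transcript IDs) are assumptions for illustration, not code or data from this repository.

```python
# Keys for the 16 ANN subfields, in the order snpEff documents them.
# The key spellings are our own choice for this sketch.
ANN_FIELDS = [
    "allele", "annotation", "impact", "gene_name", "gene_id",
    "feature_type", "feature_id", "transcript_biotype", "rank",
    "hgvs_c", "hgvs_p", "cdna_pos", "cds_pos", "aa_pos",
    "distance", "messages",
]


def parse_ann(info):
    """Return one dict per ANN entry found in a VCF INFO string."""
    for item in info.split(";"):
        if item.startswith("ANN="):
            entries = item[len("ANN="):].split(",")
            return [dict(zip(ANN_FIELDS, entry.split("|"))) for entry in entries]
    return []  # no snpEff annotation present


# Hypothetical INFO value; gene-S and tx-S are placeholder IDs.
EXAMPLE_INFO = (
    "ANN=G|missense_variant|MODERATE|S|gene-S|transcript|tx-S|"
    "protein_coding|1/1|c.1841A>G|p.Asp614Gly|1841/3822|1841/3822|614/1273||"
)

annotations = parse_ann(EXAMPLE_INFO)
for ann in annotations:
    print(ann["gene_name"], ann["annotation"], ann["hgvs_p"])
```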

The third step is a script that postprocesses the dashboard file a bit more:

- l3b_postprocess_genomics.py: by Halim, used before uploading to arcgis.com
  - The Jupyter notebook from which this script was derived is l3a_postprocess_genomics.ipynb
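
The three steps above can be sketched as a small Python driver that runs the scripts in the order suggested by their file names. This is only a sketch under assumptions: the README does not state how each script is invoked or whether it takes arguments, so the interpreters (`Rscript`, `bash`, `python3`) and the argument-free calls are guesses, and `build_commands` and `run_pipeline` are hypothetical helper names.

```python
import subprocess
from pathlib import Path

# Order follows the l1/l2/l3 prefixes of the file names (an assumption).
PIPELINE = [
    "l1a_parse_gisaid_fasta.R",
    "l1b_mummer.sh",
    "l1c_stats.R",
    "l2a_snpeff.sh",
    "l2b_files4dashboard.R",
    "l3b_postprocess_genomics.py",
]

# Assumed interpreter for each file extension.
INTERPRETERS = {".R": "Rscript", ".sh": "bash", ".py": "python3"}


def build_commands(scripts=PIPELINE):
    """Return one argument list per script, e.g. ["Rscript", "l1c_stats.R"]."""
    return [[INTERPRETERS[Path(script).suffix], script] for script in scripts]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)  # stop at the first failure


run_pipeline(dry_run=True)
```

The dry run only prints the commands, so the order can be checked before anything is executed.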

Repository files:

- .gitignore
- README.md
- l1a_parse_gisaid_fasta.R
- l1b_mummer.sh
- l1c_stats.R
- l2a_snpeff.sh
- l2b_files4dashboard.R
- l3a_postprocess_genomics.ipynb
- l3b_postprocess_genomics.py