Add guide on how to estimate clade frequencies #53

Open: huddlej wants to merge 1 commit into master from document-frequencies

huddlej commented Mar 15, 2021

Description of proposed changes

Adds the Jupyter notebook and corresponding reStructuredText version of a how-to guide to estimate clade frequencies from SARS-CoV-2 data.

An open question with this guide (and others like it in the future) is where we should source the data. The benefit of the current approach is that it does not require users to prepare any data in advance; data are fetched from the live Nextstrain builds. The disadvantages are that the guide's static figures quickly diverge from current data and that we don't show users how to load their own local data, which may be much more relevant.
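
One way to soften this trade-off would be a single loader that accepts either a URL or a local path, so the guide could keep its live-data default while still showing how to plug in local files. The following is only a sketch: `load_json` is a hypothetical helper, not something from the notebook, and the gzip check is a defensive assumption about how a server might return the file.

```python
import gzip
import json
import urllib.request
from pathlib import Path


def load_json(source):
    """Load JSON from an http(s) URL or from a local file path.

    Hypothetical helper for illustration; not part of the guide itself.
    """
    if str(source).startswith(("http://", "https://")):
        with urllib.request.urlopen(source) as response:
            raw = response.read()
    else:
        raw = Path(source).read_bytes()

    # urllib does not decode gzip-compressed bodies on its own, so check
    # for the gzip magic bytes and decompress if they are present.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)

    return json.loads(raw)
```

With this, a reader could call `load_json("my_build_tip-frequencies.json")` on a local file (the filename here is made up) or pass one of the guide's URLs unchanged.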

Another potential issue is how we should maintain guides like this that we generate directly from a notebook environment. To prepare this guide for the docs, I had to manually copy images into the images directory and rename them for clarity. The HTML/CSS presentation of tables is also not ideal. We might want to standardize these steps for future guides, even if the standards are a checklist in the documentation's documentation.
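
As a starting point for standardizing the image step, the copy-and-rename could be scripted rather than done by hand. This is a rough sketch under stated assumptions: `publish_images`, the directory arguments, and the filenames in the mapping are all hypothetical and would need to match whatever the notebook export actually produces.

```python
import shutil
from pathlib import Path


def publish_images(generated_dir, images_dir, renames):
    """Copy generated figures into the docs images directory under clearer names.

    `renames` maps a generated filename to the descriptive name used in the docs.
    Returns the destination paths and raises if an expected source file is missing.
    """
    generated_dir, images_dir = Path(generated_dir), Path(images_dir)
    images_dir.mkdir(parents=True, exist_ok=True)

    published = []
    for source_name, target_name in renames.items():
        source = generated_dir / source_name
        if not source.is_file():
            raise FileNotFoundError(f"Expected generated image is missing: {source}")
        # copyfile returns the destination path it wrote to
        published.append(Path(shutil.copyfile(source, images_dir / target_name)))
    return published
```

Keeping the mapping in one place would also double as the checklist of figures the guide depends on.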

Testing

The initial guide was tested by @kistlerk and this version reflects edits based on (most of) her comments. One comment I did not address here was a suggestion to allow users to source their own local data for the guide instead of fetching the live Nextstrain data (see discussion above).

The guide is available through this PR's RTD build.

Adds the Jupyter notebook and corresponding restructured text version of
a how-to guide to estimate clade frequencies from SARS-CoV-2 data.
@huddlej huddlej requested review from jameshadfield and trvrb March 15, 2021 19:16
Comment on lines +73 to +74
tree_url = "https://data.nextstrain.org/ncov_global.json"
frequencies_url = "https://data.nextstrain.org/ncov_global_tip-frequencies.json"

Suggested change
- tree_url = "https://data.nextstrain.org/ncov_global.json"
- frequencies_url = "https://data.nextstrain.org/ncov_global_tip-frequencies.json"
+ tree_url = "https://data.nextstrain.org/ncov_open_global.json"
+ frequencies_url = "https://data.nextstrain.org/ncov_open_global_tip-frequencies.json"
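
For context on how the two files relate, here is a minimal sketch of combining a tree with tip frequencies by summing tip frequencies per clade at each pivot. It is not taken from the notebook. It assumes an Auspice-style tree (a single root under `"tree"`, nested `"children"`, clade labels under `node_attrs.clade_membership.value`) and a tip-frequencies JSON with a `"pivots"` list plus one `{"frequencies": [...]}` entry per strain. The function name and the toy data are made up for illustration.

```python
from collections import defaultdict


def clade_frequencies(tree_json, frequencies_json, attr="clade_membership"):
    """Sum tip frequencies per clade at each pivot (assumed JSON layout)."""
    pivots = frequencies_json["pivots"]
    totals = defaultdict(lambda: [0.0] * len(pivots))

    # Walk the tree iteratively; only tips (nodes without children) carry frequencies.
    stack = [tree_json["tree"]]
    while stack:
        node = stack.pop()
        children = node.get("children", [])
        if children:
            stack.extend(children)
            continue

        tip = frequencies_json.get(node["name"])
        if tip is None:
            continue  # tip has no frequency entry; skip it

        clade = node.get("node_attrs", {}).get(attr, {}).get("value", "unassigned")
        for i, freq in enumerate(tip["frequencies"]):
            totals[clade][i] += freq

    return pivots, dict(totals)


# Toy inputs in the assumed layout (not real Nextstrain data).
tree = {"tree": {"name": "root", "children": [
    {"name": "A", "node_attrs": {"clade_membership": {"value": "19A"}}},
    {"name": "B", "node_attrs": {"clade_membership": {"value": "20A"}}},
    {"name": "C", "node_attrs": {"clade_membership": {"value": "20A"}}},
]}}
freqs = {
    "pivots": [2020.0, 2020.5],
    "A": {"frequencies": [0.5, 0.25]},
    "B": {"frequencies": [0.25, 0.5]},
    "C": {"frequencies": [0.25, 0.25]},
}

pivots, by_clade = clade_frequencies(tree, freqs)
# by_clade == {"19A": [0.5, 0.25], "20A": [0.5, 0.75]}
```

If the real files differ from this layout, only the key lookups should need to change.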

@@ -0,0 +1,758 @@
========================================================
Estimate frequencies of phylogenetic clades through time

> Another potential issue is how we should maintain guides like this that we generate directly from a notebook environment.

What about an extension like nbsphinx or MyST-NB? We'd still have to run the notebook to generate new plots but less manual work than maintaining the same content in both .ipynb and .rst.
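
For reference, a minimal sketch of what the nbsphinx route could look like in the Sphinx `conf.py`. It assumes nbsphinx is installed in the docs build environment and that the notebook is committed with its outputs. The real `conf.py` presumably already defines `extensions` and `exclude_patterns`, so these entries would be merged into the existing lists rather than replacing them.

```python
# docs/conf.py (fragment, for illustration only)

# Let Sphinx read .ipynb files directly instead of a hand-maintained .rst copy.
extensions = [
    "nbsphinx",
]

# Render the outputs already stored in the notebook rather than re-executing
# it during the docs build, so plots are regenerated only when someone reruns
# the notebook and commits the result.
nbsphinx_execute = "never"

# Keep Jupyter checkpoint copies out of the build.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]
```

The notebook would then be listed in a toctree like any other source file.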

huddlej commented Apr 27, 2022

During issue triage we also realized that this guide can be updated to use Nextstrain open data.
