Adds the Jupyter notebook and corresponding restructured text version of a how-to guide to estimate clade frequencies from SARS-CoV-2 data.
victorlin
reviewed
Apr 13, 2022
Comment on lines +73 to +74
tree_url = "https://data.nextstrain.org/ncov_global.json"
frequencies_url = "https://data.nextstrain.org/ncov_global_tip-frequencies.json"
Member
Suggested change
- tree_url = "https://data.nextstrain.org/ncov_global.json"
- frequencies_url = "https://data.nextstrain.org/ncov_global_tip-frequencies.json"
+ tree_url = "https://data.nextstrain.org/ncov_open_global.json"
+ frequencies_url = "https://data.nextstrain.org/ncov_open_global_tip-frequencies.json"
@@ -0,0 +1,758 @@
========================================================
Estimate frequencies of phylogenetic clades through time
Member
Another potential issue is how we should maintain guides like this that we generate directly from a notebook environment.
What about an extension like nbsphinx or MyST-NB? We'd still have to run the notebook to generate new plots, but that would be less manual work than maintaining the same content in both .ipynb and .rst.
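As a rough illustration of the nbsphinx option, a Sphinx conf.py could be extended along these lines. This is a minimal sketch, not the project's actual configuration: it assumes nbsphinx is installed in the docs environment and that the notebook is committed with its outputs already generated. MyST-NB would be enabled through its own extension instead, and the two should not be enabled together because both register a parser for .ipynb files.

```python
# conf.py (sketch): render committed notebooks directly with nbsphinx.
# Assumption: nbsphinx is installed alongside Sphinx in the docs environment.
extensions = [
    "nbsphinx",
]

# Use the outputs stored in the .ipynb file instead of re-running the
# notebook on every docs build, so builds do not depend on live data.
nbsphinx_execute = "never"

# Keep Jupyter checkpoint copies out of the build.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]
```

With a setup like this, the notebook itself would be listed in a toctree and the separate .rst copy would no longer be needed.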
Contributor
Author
During issue triage we also realized that this guide can be updated to use Nextstrain open data.
Description of proposed changes
Adds the Jupyter notebook and corresponding restructured text version of a how-to guide to estimate clade frequencies from SARS-CoV-2 data.
An open question with this guide (and others like it in the future) is where we should source the data. The benefit of the current approach is that it does not require users to prepare any data in advance; data are fetched from the live Nextstrain builds. The disadvantages are that the guide's static figures quickly diverge from current data and that we don't show users how to load their own local data, which may be much more relevant to them.
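One way to support both data sources would be a small loader that accepts either a URL or a local path. The sketch below is only an illustration of that idea: the helper name load_json is hypothetical and not part of the guide, and the gzip check is a defensive assumption for sources that store or serve the JSON compressed.

```python
import gzip
import json
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def load_json(source):
    """Load JSON from an http(s) URL or from a local file path.

    Hypothetical helper for illustration; not part of the guide.
    """
    source = str(source)
    if urlparse(source).scheme in ("http", "https"):
        # Remote source, such as a live Nextstrain build.
        with urlopen(source) as response:
            raw = response.read()
    else:
        # Anything else is treated as a path to a local file.
        raw = Path(source).read_bytes()

    # Decompress if the payload starts with the gzip magic bytes.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)

    return json.loads(raw.decode("utf-8"))


# Example usage (the local path is a placeholder):
# tree = load_json("https://data.nextstrain.org/ncov_global.json")
# frequencies = load_json("path/to/local_tip-frequencies.json")
```

The rest of the guide could then work from the returned dictionaries regardless of where the data came from.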
Another potential issue is how we should maintain guides like this that we generate directly from a notebook environment. To prepare this guide for the docs, I had to manually copy images into the images directory and rename them for clarity. The HTML/CSS presentation of tables is also not ideal. We might want to standardize these steps for future guides, even if the standards are a checklist in the documentation's documentation.
Testing
The initial guide was tested by @kistlerk and this version reflects edits based on (most of) her comments. One comment I did not address here was a suggestion to allow users to source their own local data for the guide instead of fetching the live Nextstrain data (see discussion above).
The guide is available through this PR's RTD build.