This study compares the effectiveness of the Oxford-AstraZeneca vaccine versus the Pfizer-BNT vaccine in vaccinated health and social care workers.
- If you are interested in how we defined our codelists, look in the `codelists/` directory.
- Analysis scripts are in the `analysis/` directory.
  - The instructions used to extract data from the OpenSAFELY-TPP database are specified in the study definition; this is written in Python, but non-programmers should be able to understand what is going on there.
  - The `lib/` directory contains useful functions and look-up tables.
  - The remaining folders mostly contain the R scripts that process, describe, and analyse the extracted database data.
- Non-disclosive model outputs, including tables, figures, etc, are in the `released_outputs/` directory.
- The `project.yaml` defines run-order and dependencies for all the analysis scripts. This file should not be edited directly. To make changes to the yaml, edit and run the `create-project.R` script instead.
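
To illustrate what "run-order and dependencies" means here, the following is a minimal Python sketch of how a valid run order can be derived from declared dependencies. It is not how OpenSAFELY itself schedules actions; the action names and the dependency mapping are hypothetical, loosely modelled on the script names in this repository.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical actions: each key is an action, each value lists the
# actions it depends on (i.e. that must finish before it can start).
actions = {
    "extract": [],
    "data_process": ["extract"],
    "data_selection": ["data_process"],
    "table1": ["data_selection"],
    "model_plr": ["data_selection"],
    "report_plr": ["model_plr"],
}

# static_order() yields each action only after all of its dependencies.
run_order = list(TopologicalSorter(actions).static_order())
print(run_order)
```

Any ordering that respects the declared dependencies is valid, so independent actions such as `table1` and `model_plr` may appear in either order.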
Scripts are organised into five directories:
-
  - `data_makedummy.R` contains the script used to generate dummy data. This is used instead of the usual dummy data specified in the study definition, because it is then possible to impose some more useful structure in the data, such as ensuring nobody has a first dose of both the Pfizer and AstraZeneca vaccines. If the study definition is updated, this script must also be updated to ensure variable names and types match.
-
  - `design.R` defines some common design elements used throughout the study, such as follow-up dates, model outcomes, and covariates.
  - `data_process.R` imports the extracted database data (or dummy data), standardises some variables and derives some new ones.
  - `data_selection.R` filters out participants who should not be included in the main analysis, and creates a small table used for the inclusion/exclusion flowchart.
  - `data_properties.R` tabulates and summarises the variables in the processed data for inspection / sense-checking.
-
  - `table1.R` creates a "table 1" table describing cohort characteristics at baseline, stratified by vaccine type.
  - `table1_allvax.R`: same as `table1.R`, but on the pre-exclusion cohort.
  - `table_irr.R` calculates unadjusted incidence rates for various outcomes, stratified by vaccine type and time since vaccination.
  - `km.R`: unadjusted Kaplan-Meier plots for outcomes, by vaccine type.
  - `seconddose.R`: cumulative incidence of second dose coverage, by vaccine type.
  - `vaxdate.R`: cumulative coverage of first vaccine dose over calendar time.
-
  - `model_plr.R` fits the pooled logistic regression models. This script takes four arguments:
    - `outcome`, for example `postest` for positive SARS-CoV-2 test or `covidadmitted` for COVID-19 hospitalisation
    - `timescale`, either `calendar` for calendar-time or `timesincevax` for vaccination-time
    - `censor_seconddose`, whether (1) or not (0) to censor follow-up at the second dose
    - `samplesize_nonoutcomes_n`, to reduce computation time, the size of the sample of those who did not experience the outcome of interest. All those who experienced the outcome were included.
  - `report_plr.R` outputs summary information, effect estimates, and marginalised cumulative incidence estimates for the pooled logistic regression models from `model_plr.R`. This script takes the `outcome`, `timescale`, and `censor_seconddose` arguments of the `model_plr.R` script to pick up the correct models.
  - `model_cox.R` fits Cox models with time-varying effects, which were used to check consistency with the pooled logistic regression models.
  - `report_cox.R` outputs summary information for the Cox models from `model_cox.R`.
-
  - `report_objects.R` collates some of the baseline data and model outputs and saves to file, to make it easier to incorporate outputs across different actions in the main R Markdown script that generates the study manuscript.
  - `effectiveness_report.rmd` is an R Markdown file that puts a lot of the outputs together in one file for easy checking and distribution. A precursor to the manuscript.
  - `effectiveness_report_comparemodels.rmd` makes it easy to compare Cox versus PLR models, and calendar-time or vaccination-time timescales.
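
The sampling step described for the `samplesize_nonoutcomes_n` argument of `model_plr.R` can be sketched roughly as follows. This is an illustrative Python sketch, not the repository's R code: the function name, the record layout, and the `sample_weight` field are hypothetical, and the inverse-probability weighting is an assumption about how such a sample would typically be handled rather than something stated in this README.

```python
import random

def sample_nonoutcomes(patients, samplesize_nonoutcomes_n, seed=0):
    """Keep everyone with the outcome plus a random sample of everyone else."""
    rng = random.Random(seed)
    cases = [p for p in patients if p["outcome"]]
    noncases = [p for p in patients if not p["outcome"]]

    # Never ask for more non-outcome patients than exist.
    n = min(samplesize_nonoutcomes_n, len(noncases))
    sampled_noncases = rng.sample(noncases, n)

    # Assumption: weight sampled non-outcomes by the inverse of their
    # sampling probability so they stand in for the full group.
    weight = len(noncases) / n if n else 0.0

    result = [dict(p, sample_weight=1.0) for p in cases]
    result += [dict(p, sample_weight=weight) for p in sampled_noncases]
    return result

# Toy cohort: 100 patients, every tenth one experiences the outcome.
patients = [{"patient_id": i, "outcome": i % 10 == 0} for i in range(100)]
sampled = sample_nonoutcomes(patients, samplesize_nonoutcomes_n=30)
print(len(sampled))
```

With the toy cohort above, all 10 patients with the outcome are retained alongside 30 sampled patients without it, which is the trade-off the argument controls: fewer rows to fit, at the cost of some precision.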
Materials for the manuscript are in the `manuscript/` directory. This includes a bibliography, author list, citation style, the R Markdown document where the manuscript is authored, and rendered copies of the latest version of the manuscript itself.
Figures, tables, and inline numbers in the R Markdown manuscript are taken from non-disclosive, released materials from the OpenSAFELY platform.
The OpenSAFELY framework is a secure analytics platform for electronic health records research in the NHS.
Instead of requesting access to slices of patient data and transporting them elsewhere for analysis, the framework supports developing analytics against dummy data, and then running them against the real data within the same infrastructure where the data is stored. Read more at OpenSAFELY.org.
Developers and epidemiologists interested in the framework should review the OpenSAFELY documentation.