Trends and Variations in Associations Between Survey-Derived Individual Characteristics and Opioid-Related Adverse Events in Community-Dwelling Ontarians: 2013-2024
This project aims to develop a predictive algorithm for the risk of drug overdose following the prescription of narcotics. It utilizes a survival-analysis approach, drawing on administrative and survey-based predictors from the Canadian Community Health Survey (CCHS).
The project is organized into the following directories:
- Data/: Contains the raw CCHS data for the 2013-2014, 2015-2016, and 2017-2018 survey cycles.
- R/: Houses all the R scripts for data loading, processing, analysis, and utility functions.
- worksheets/: Includes supplementary files like variable lists and details in CSV and Excel formats.
/Users/karimhalal/Desktop/The worlds greatest thesis/Thesis/
├───.gitignore
├───config.yml
├───README.md
├───Thesis.Rproj
├───Data/
│ ├───cchs2013_2014.RData
│ ├───cchs2015_2016.RData
│ └───cchs2017_2018.RData
├───R/
│ ├───dependency_table.R
│ ├───harmonized.R
│ ├───load_dependencies.R
│ ├───loadData.R
│ ├───special_functions.R
│ ├───table-1-a.R
│ └───testing.R
└───worksheets/
  ├───cchsflow_variables_details1.csv
  ├───deptable.xlsx
  ├───masterfilesheet.xlsx
  └───od_variables.csv
- .gitignore: Specifies files and directories to be ignored by Git.
- config.yml: The main configuration file that defines paths to data and variable sheets, ensuring a centralized and easily manageable setup.
- README.md: Provides a brief introduction to the project.
- Thesis.Rproj: An RStudio project file that helps in managing the project's context.
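As an illustration of how a script might consume config.yml, here is a minimal sketch. The keys (data_dir, cycles, variables_sheet) are hypothetical, since the real file's contents are not shown here, and reading it with the yaml package is an assumption about the tooling rather than a description of what loadData.R does.

```r
# Hypothetical config.yml contents; the real file's keys may differ.
config_text <- c(
  "data_dir: Data",
  "cycles:",
  "  - cchs2013_2014.RData",
  "  - cchs2015_2016.RData",
  "  - cchs2017_2018.RData",
  "variables_sheet: worksheets/od_variables.csv"
)
config_path <- file.path(tempdir(), "config.yml")
writeLines(config_text, config_path)

# Assumption: the yaml package is installed; skip quietly if it is not.
if (requireNamespace("yaml", quietly = TRUE)) {
  cfg <- yaml::read_yaml(config_path)
  # Build one path per survey cycle from the configured directory.
  data_files <- file.path(cfg$data_dir, unlist(cfg$cycles))
  print(data_files)
}
```

Keeping paths in one file means a change of data location touches only the configuration, not the analysis scripts.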
The Data/ directory stores the CCHS datasets for three survey cycles:
- cchs2013_2014.RData
- cchs2015_2016.RData
- cchs2017_2018.RData
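Because .RData files restore objects under whatever names they were saved with, one cautious way to read them is into a separate environment per file. The sketch below uses a small stand-in file rather than real CCHS data, and load_cycle is a hypothetical helper, not a function from this project.

```r
# Load one .RData file into its own environment so objects from different
# cycles cannot overwrite each other in the global workspace.
load_cycle <- function(path) {
  env <- new.env()
  loaded <- load(path, envir = env)  # load() returns the restored object names
  mget(loaded, envir = env)          # named list of the restored objects
}

# Demonstration with a stand-in file (not real CCHS data).
demo_path <- file.path(tempdir(), "cchs_demo.RData")
cchs_demo <- data.frame(id = 1:3, age_group = c("20-24", "25-29", "30-34"))
save(cchs_demo, file = demo_path)

objects <- load_cycle(demo_path)
names(objects)
nrow(objects$cchs_demo)
```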
The R/ directory is where the core logic of the project resides.
- load_dependencies.R: Loads all the necessary R packages required for the project.
- loadData.R: The main script for data handling. It reads the configuration from config.yml, loads the CCHS data, and then harmonizes it using functions from the cchsflow and recodeflow packages to create a unified study dataset.
- harmonized.R: Contains scripts for creating and manipulating the harmonized dataset.
- special_functions.R: A collection of custom R functions tailored for specific data transformations and derivations needed in the analysis.
- dependency_table.R: A utility script that generates a table of all package dependencies for the project, which is useful for reproducibility.
- table-1-a.R: Generates a summary table (Table 1) of the dataset's characteristics.
- testing.R: Includes experimental or test code, such as a function for imputing single-year age from categorical age data.
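To make the age-imputation idea mentioned for testing.R concrete, here is one possible approach: draw a single-year age uniformly within each category's bounds. This is a sketch only. The function name, the "low-high" label format, and the uniform assumption are all illustrative and may not match what testing.R does.

```r
# Hypothetical helper: draw a single-year age uniformly within each category.
# The "low-high" label format is an assumption made for this sketch.
impute_single_year_age <- function(age_group, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  bounds <- strsplit(as.character(age_group), "-", fixed = TRUE)
  low  <- as.integer(vapply(bounds, `[`, character(1), 1))
  high <- as.integer(vapply(bounds, `[`, character(1), 2))
  # Uniform integer draw on [low, high]; missing labels stay NA.
  low + floor(runif(length(low)) * (high - low + 1))
}

ages <- impute_single_year_age(c("20-24", "25-29", NA), seed = 42)
ages
```

A uniform draw ignores the true age distribution within a category, so a weighted draw may be preferable where reference distributions are available.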
The worksheets/ directory contains human-readable files that provide metadata for the analysis.
- od_variables.csv & cchsflow_variables_details1.csv: These files define the variables to be used in the analysis, their roles (e.g., predictor, outcome), and other details.
- deptable.xlsx: An Excel file containing the dependency table generated by dependency_table.R.
- masterfilesheet.xlsx: A master sheet for variables.
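A dependency table of the kind stored in deptable.xlsx could be assembled by scanning scripts for library() and require() calls. The sketch below is an assumption about how such a table might be built, not a description of dependency_table.R. The helper find_script_packages is hypothetical, catches only the first call on each line, and does not detect pkg::fun() usage.

```r
# Collect package names from library()/require() calls in a set of R scripts.
# Simplification: only the first call on each line is detected.
find_script_packages <- function(script_paths) {
  pattern <- "(library|require)\\(\\s*[\"']?([A-Za-z][A-Za-z0-9.]*)[\"']?"
  rows <- lapply(script_paths, function(path) {
    lines <- readLines(path, warn = FALSE)
    hits <- regmatches(lines, regexec(pattern, lines))
    # A successful match has three parts: full match, call name, package.
    pkgs <- unlist(lapply(hits, function(m) if (length(m) == 3) m[3]))
    if (length(pkgs) == 0) return(NULL)
    data.frame(script = basename(path), package = unique(pkgs),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, Filter(Negate(is.null), rows))
}

# Demonstration on a stand-in script.
demo_script <- file.path(tempdir(), "demo_script.R")
writeLines(c("library(survival)", "require(\"dplyr\")", "x <- 1"), demo_script)

deps <- find_script_packages(demo_script)
deps
```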
The project follows a modular, configuration-driven architecture that promotes clarity and reproducibility.
- Configuration: The config.yml file acts as the single source of truth for file paths and parameters.
- Data Loading & Harmonization: loadData.R reads the configuration and orchestrates the data loading and harmonization process. It iterates through the specified CCHS datasets, applies transformations using recodeflow and custom functions from special_functions.R, and combines them into a single harmonized_data frame.
- Analysis: Once the data is prepared, other scripts like table-1-a.R are used to perform the actual analysis and generate results.
- Utilities: Scripts like dependency_table.R provide helpful utilities for managing the project.
This structured approach ensures that the analysis is easy to understand, modify, and reproduce.
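The iterate, transform, and combine flow described above can be sketched in base R as follows. This does not call cchsflow or recodeflow. The raw variable names (AGE_A, SEX_B, and so on), the name maps, and harmonize_cycle are hypothetical stand-ins for the rules defined in the worksheets.

```r
# Hypothetical per-cycle name maps: harmonized name = raw name in that cycle.
name_maps <- list(
  cchs2013_2014 = c(age = "AGE_A", sex = "SEX_A"),
  cchs2015_2016 = c(age = "AGE_B", sex = "SEX_B")
)

# Select the mapped columns, rename them, and tag rows with the source cycle.
harmonize_cycle <- function(data, name_map, cycle) {
  out <- data[, unname(name_map), drop = FALSE]
  names(out) <- names(name_map)
  out$cycle <- cycle
  out
}

# Stand-in data frames (not real CCHS records).
raw_cycles <- list(
  cchs2013_2014 = data.frame(AGE_A = c(34, 51), SEX_A = c("F", "M")),
  cchs2015_2016 = data.frame(AGE_B = 29, SEX_B = "M")
)

# Apply the matching map to each cycle, then stack into one data frame.
pieces <- Map(harmonize_cycle, raw_cycles,
              name_maps[names(raw_cycles)], names(raw_cycles))
harmonized_data <- do.call(rbind, unname(pieces))
rownames(harmonized_data) <- NULL
harmonized_data
```

Recording the source cycle as a column keeps cycle-specific effects available to later analysis steps.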