Skip to content

hookcpt/Advanced-predictive-Modelling

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 

Repository files navigation

Logistic Regression

The core of the script is the development of a logistic regression model. Logistic regression is used when the dependent variable is binary; in this case, whether an infant's birth weight is classified as low (1) or not low (0). Unlike linear regression, which predicts a continuous outcome, logistic regression predicts the probability of an instance (e.g., an infant) belonging to a particular category (e.g., LBW or not).

Model Prediction: The logistic regression model effectively predicts the probability of an event (low birth weight) occurring. It uses the logistic function (also known as the sigmoid function) to model this probability, which ranges between 0 and 1. The logistic function ensures that the model's output can be interpreted as a probability, despite potentially having linear predictors (independent variables) that range across all real numbers.

Independent Variables

The model incorporates various independent variables (predictors) such as age of the mother (AGE), weight at the last menstrual period (LWT), race (RACE), smoking status during pregnancy (SMOKE), history of premature labor (PTL), history of hypertension (HT), presence of uterine irritability (UI), and number of physician visits during the first trimester (FTV). These variables are selected based on prior research or hypotheses regarding factors that might influence birth weight.

Model Coefficients and Interpretation

  • Coefficients: The model estimates coefficients for each independent variable, which quantify the relationship between each predictor and the log odds of the outcome (LBW). A positive coefficient indicates that as the predictor increases, the log odds of LBW (and thus the probability of LBW) increase, holding all other predictors constant. Conversely, a negative coefficient suggests a decrease in the probability of LBW as the predictor increases.
  • Significance: Statistical tests determine the significance of each coefficient, helping to identify which variables are strong predictors of LBW. This is crucial for understanding risk factors and for informing medical and public health interventions.

Classification Thresholds and Evaluation

The script further evaluates the model by applying different classification thresholds (cut-off values) to the predicted probabilities. This process converts probabilities into binary predictions (LBW or not), which are then compared to the actual outcomes to assess the model's performance.

  • Sensitivity and Specificity: These metrics evaluate the model's ability to correctly identify LBW cases (sensitivity) and its ability to correctly identify non-LBW cases (specificity). The choice of threshold affects these metrics, reflecting a trade-off between them.
  • Misclassification Rate: This metric provides an overall measure of the model's accuracy, accounting for both false positives and false negatives.

ROC Curve and AUC

Finally, the script generates a Receiver Operating Characteristic (ROC) curve and calculates the Area Under the Curve (AUC). The ROC curve visualizes the trade-off between sensitivity and specificity at various thresholds, while the AUC provides a single metric summarizing the model's ability to discriminate between LBW and non-LBW cases across all possible thresholds.

  • ROC Curve: Plots sensitivity (true positive rate) against 1-specificity (false positive rate) at various threshold settings.
  • AUC: A higher AUC indicates better model performance, with a score of 1 representing perfect discrimination and a score of 0.5 indicating no better than random guessing.

In Summary: The logistic regression model built in the script aims to understand and predict the risk of low birth weight based on several predictors. By evaluating the model's performance and identifying significant predictors, researchers and healthcare professionals can gain insights into the factors influencing LBW, guiding preventive measures and interventions to mitigate this risk.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages