Releases · EttoreRocchi/ResPredAI
v1.6.1
Fixed
- `train` subcommand now applies probability calibration (`CalibratedClassifierCV`) when `calibrate_probabilities = true`
- `train` subcommand now supports the CV threshold method (`TunedThresholdClassifierCV`) in addition to OOF (see the sketch below)
- Reproducibility manifest now includes probability calibration parameters
Changed
- `create-config` template now includes `threshold_objective`, `vme_cost`, `me_cost` parameters
- `validate-config` summary table now displays probability calibration and threshold objective details
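For reference, scikit-learn's `TunedThresholdClassifierCV` (the class requiring `scikit-learn>=1.5.0`, per v1.5.1 below) tunes the decision threshold via internal cross-validation. A minimal sketch of the pattern, not ResPredAI's actual code; the estimator, data, and scoring choice are illustrative:

```python
# Sketch: CV-based decision threshold tuning with TunedThresholdClassifierCV.
# Estimator, data, and scoring are illustrative placeholders.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TunedThresholdClassifierCV

X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)

base = LogisticRegression(max_iter=1000)
tuned = TunedThresholdClassifierCV(
    base,
    scoring="balanced_accuracy",  # objective used to pick the threshold
    cv=5,                         # tuned per fold, averaged internally
)
tuned.fit(X, y)
print(f"tuned decision threshold: {tuned.best_threshold_:.3f}")
```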
v1.6.0
Added
- Probability Calibration (see the sketch below):
  - Optional post-hoc probability calibration on the best estimator per outer CV fold
  - Supports `sigmoid` (Platt scaling) and `isotonic` calibration methods
  - Applied after hyperparameter tuning and before threshold tuning
- Calibration Diagnostics:
  - Brier Score: Mean squared error of probability predictions (lower is better)
  - ECE (Expected Calibration Error): Weighted average of calibration error across bins
  - MCE (Maximum Calibration Error): Maximum calibration error across any bin
  - Reliability curves (calibration plots) per outer CV fold and aggregate
- Repeated Stratified Cross-Validation (see the second sketch below):
  - `outer_cv_repeats` config option (default: `1`)
  - Set `>1` for repeated CV with different shuffles for more robust performance estimates
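A minimal sketch of the calibrate-then-diagnose flow with scikit-learn, including hand-rolled ECE/MCE binning. The estimator, data, and bin count are illustrative assumptions, not ResPredAI's internals:

```python
# Sketch: post-hoc calibration of a tuned estimator, then Brier/ECE/MCE.
# Estimator, data, and n_bins are illustrative, not ResPredAI's actual code.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Stand-in for the best estimator found by hyperparameter tuning.
best = RandomForestClassifier(random_state=0)

# method="sigmoid" is Platt scaling; "isotonic" is the non-parametric option.
calibrated = CalibratedClassifierCV(best, method="sigmoid", cv=5)
calibrated.fit(X_tr, y_tr)
proba = calibrated.predict_proba(X_te)[:, 1]

print(f"Brier score: {brier_score_loss(y_te, proba):.4f}")

def ece_mce(y_true, p, n_bins=10):
    """Expected/Maximum Calibration Error over equal-width probability bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece, mce = 0.0, 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (p >= lo) & (p <= hi) if hi == 1.0 else (p >= lo) & (p < hi)
        if mask.sum() == 0:
            continue
        gap = abs(p[mask].mean() - y_true[mask].mean())
        ece += mask.mean() * gap   # weighted by bin occupancy
        mce = max(mce, gap)
    return ece, mce

ece, mce = ece_mce(y_te, proba)
print(f"ECE: {ece:.4f}  MCE: {mce:.4f}")
```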
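The `outer_cv_repeats` option maps naturally onto scikit-learn's `RepeatedStratifiedKFold`; that mapping is an assumption here, and the estimator and data are placeholders:

```python
# Sketch: repeated stratified CV for more robust performance estimates.
# The outer_cv_repeats -> n_repeats mapping is assumed for illustration.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

X, y = make_classification(n_samples=400, random_state=0)

outer_cv_repeats = 3  # a config value > 1 triggers repeated CV
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=outer_cv_repeats,
                             random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="roc_auc")
print(f"AUC: {scores.mean():.3f} ± {scores.std():.3f} over {len(scores)} folds")
```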
Changed
- `metric_dict()` now includes Brier Score, ECE, and MCE by default
- HTML report includes new calibration diagnostics section
- Output folder now includes `calibration/` directory with reliability curve images
v1.5.1
Added
- `OneHotEncoder` `min_frequency` parameter to reduce noise from rare categorical values (see the sketch below)
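For context, `min_frequency` groups categories rarer than the cutoff into a single infrequent bucket. A quick illustration on toy data (the cutoff value is arbitrary):

```python
# Sketch: collapsing rare categories with OneHotEncoder's min_frequency.
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X = np.array([["A"]] * 50 + [["B"]] * 45 + [["C"]] * 3 + [["D"]] * 2)

# Categories seen fewer than 5 times collapse into one "infrequent" column.
enc = OneHotEncoder(min_frequency=5, handle_unknown="infrequent_if_exist",
                    sparse_output=False)
enc.fit(X)
print(enc.get_feature_names_out())  # ['x0_A' 'x0_B' 'x0_infrequent_sklearn']
```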
Changed
- Updated `requirements.txt` with explicit version constraints for all dependencies
- `scikit-learn>=1.5.0` required for `TunedThresholdClassifierCV`
v1.5.0
Added
- VME/ME report:
  - VME (Very Major Error): Predicted susceptible when actually resistant
  - ME (Major Error): Predicted resistant when actually susceptible
- Flexible threshold objectives (see the sketch after this list):
  - `threshold_objective` config option: `youden` (default), `f1`, `f2`, `cost_sensitive`
  - Cost-sensitive optimization with configurable `vme_cost` and `me_cost` weights
- Per-prediction uncertainty quantification to flag uncertain predictions near the decision threshold:
  - `uncertainty_margin` config option (default: `0.1`) defines a margin around the threshold
  - Predictions within the margin are flagged as uncertain in evaluation output
  - Uncertainty scores (0-1) provided for each prediction
- Reproducibility manifest (`reproducibility.json`) generated by the `run` and `train` commands with environment info, data fingerprint, and full configuration settings
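A minimal sketch of what a cost-sensitive threshold sweep could look like. Treating resistant as the positive class, a VME is a false negative and an ME is a false positive (per the definitions above); the helper, cost weights, and toy data are illustrative assumptions, not ResPredAI's implementation:

```python
# Sketch: cost-sensitive threshold selection over out-of-fold probabilities.
# vme_cost/me_cost weights are illustrative; resistant = positive class,
# so a VME is a false negative and an ME is a false positive.
import numpy as np

def cost_sensitive_threshold(y_true, proba, vme_cost=5.0, me_cost=1.0):
    """Pick the threshold minimizing weighted VME/ME counts (hypothetical helper)."""
    thresholds = np.linspace(0.01, 0.99, 99)
    costs = []
    for t in thresholds:
        pred = (proba >= t).astype(int)
        vme = np.sum((pred == 0) & (y_true == 1))  # missed resistant
        me = np.sum((pred == 1) & (y_true == 0))   # falsely called resistant
        costs.append(vme_cost * vme + me_cost * me)
    return thresholds[int(np.argmin(costs))]

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 500)
proba = np.clip(y * 0.4 + rng.normal(0.3, 0.2, 500), 0, 1)  # toy OOF scores
print(f"chosen threshold: {cost_sensitive_threshold(y, proba):.2f}")
```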
Changed
- HTML report framework summary now displays threshold objective and cost weights (when applicable)
- Evaluation output now includes `uncertainty` and `is_uncertain` columns (see the sketch below)
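One plausible formulation of the margin-based flagging. The normalized distance-to-threshold score below is an assumption for illustration, not necessarily ResPredAI's exact scoring function:

```python
# Sketch: flag predictions near the decision threshold as uncertain.
# The 0-1 score (normalized distance to threshold) is an assumed
# formulation, not necessarily ResPredAI's exact one.
import numpy as np

def uncertainty_columns(proba, threshold=0.5, uncertainty_margin=0.1):
    dist = np.abs(proba - threshold)
    is_uncertain = dist < uncertainty_margin
    # 1.0 exactly at the threshold, decaying to 0.0 at the margin edge
    uncertainty = np.clip(1.0 - dist / uncertainty_margin, 0.0, 1.0)
    return uncertainty, is_uncertain

proba = np.array([0.05, 0.48, 0.55, 0.90])
u, flag = uncertainty_columns(proba)
print(u)     # [0.  0.8 0.5 0. ]
print(flag)  # [False  True  True False]
```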
v1.4.1
Changed
- Migrated documentation from MkDocs to Sphinx
- Documentation dependencies now loaded dynamically from `docs-requirements.txt`
- Development dependencies now loaded dynamically from `dev-requirements.txt`
v1.4.0
Added
- K-Nearest Neighbors (KNN) classifier support
- Missing data imputation with configurable methods (see the sketch after this list):
  - `SimpleImputer` (`mean`, `median`, `most_frequent` strategies)
  - `KNNImputer` for k-nearest neighbors imputation
  - `IterativeImputer` with `BayesianRidge` or `RandomForest` estimator
- Comprehensive HTML report generation with run metadata and framework summary tables, results tables with 95% confidence intervals, and confusion matrices
- Ruff linter integration in CI workflow for code quality
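A minimal sketch of the three imputer families in scikit-learn; the toy data and the choice of `RandomForestRegressor` as the iterative estimator are illustrative:

```python
# Sketch: the three imputation families named above, on toy data.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer
from sklearn.ensemble import RandomForestRegressor

X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, np.nan], [4.0, 5.0]])

simple = SimpleImputer(strategy="median")  # also: mean, most_frequent
knn = KNNImputer(n_neighbors=2)
iterative = IterativeImputer(              # default estimator is BayesianRidge
    estimator=RandomForestRegressor(random_state=0),
    random_state=0,
)

for imp in (simple, knn, iterative):
    print(type(imp).__name__, imp.fit_transform(X).round(2), sep="\n")
```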
Changed
- Bootstrap confidence intervals now use sample-level predictions instead of fold-level metrics for more reliable statistical inference (see the sketch after this list)
- Updated CI workflow to include lint checks before tests
- Added Python 3.13 to CI test matrix
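A minimal sketch of sample-level bootstrapping (resampling prediction-label pairs rather than per-fold metric values); the metric, iteration count, and data are illustrative:

```python
# Sketch: bootstrap a 95% CI by resampling sample-level predictions,
# not per-fold metric values. Metric and n_boot are illustrative.
import numpy as np
from sklearn.metrics import balanced_accuracy_score

def bootstrap_ci(y_true, y_pred, metric=balanced_accuracy_score,
                 n_boot=1000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample pairs with replacement
        stats.append(metric(y_true[idx], y_pred[idx]))
    return np.percentile(stats, [2.5, 97.5])

y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 0] * 20)
y_pred = np.array([0, 1, 0, 0, 1, 0, 1, 1, 1, 0] * 20)
lo, hi = bootstrap_ci(y_true, y_pred)
print(f"95% CI: [{lo:.3f}, {hi:.3f}]")
```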
v1.3.1
Changed
- Reorganized package structure into sub-packages for clarity:
  - `respredai/core/` - Pipeline, metrics, models, and ML utilities
  - `respredai/io/` - Configuration and data handling
  - `respredai/visualization/` - Plotting and visualization
Documentation
- Created `docs/` structure with MkDocs
v1.3.0
Added
- `train` command for model training on the entire dataset (cross-dataset validation)
- `evaluate` command to apply trained models to new data
- Automatic summary report after the `run` command
- SHAP-based feature importance as fallback for models without native importance (see the sketch after this list):
  - Supports MLP, RBF_SVC, and TabPFN via `KernelExplainer`
  - `--seed` flag for reproducible SHAP computations
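For reference, a minimal sketch of the `KernelExplainer` fallback pattern for a model without built-in importances; the MLP, background size, and sample counts are illustrative assumptions:

```python
# Sketch: SHAP KernelExplainer as a fallback for models lacking native
# feature importances (e.g. an MLP). Model and sizes are illustrative.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
model = MLPClassifier(max_iter=500, random_state=0).fit(X, y)

# Summarize the background set to keep KernelExplainer tractable.
background = shap.kmeans(X, 10)
explainer = shap.KernelExplainer(lambda d: model.predict_proba(d)[:, 1],
                                 background)
shap_values = explainer.shap_values(X[:20], nsamples=100)

# Mean |SHAP| per feature as a global importance proxy.
print(np.abs(shap_values).mean(axis=0))
```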
Documentation
- Added
docs/train-command.md - Added
docs/evaluate-command.md - Updated
docs/feature-importance-command.mdwith SHAP fallback details
v1.2.0
Added
- `validate-config` command to validate configuration files without running the pipeline
- Optional `--check-data` flag to also verify data file existence and column validity
- CLI override options for the `run` command: `--models`, `--targets`, `--output`, `--seed`
- CONTRIBUTING.md with development setup guide and contribution workflow
Changed
- Bootstrap confidence intervals in metrics output
- User-friendly error messages for missing config files or data paths
Documentation
- Added `docs/validate-config-command.md`
- Updated `docs/run-command.md` with CLI overrides section
v1.1.0
Added
- Threshold calibration with dual methods (OOF and CV) using Youden's J statistic (see the sketch after this list):
  - OOF method: Global optimization on concatenated out-of-fold predictions
  - CV method: Per-fold optimization with threshold averaging
  - Auto selection based on dataset size (n < 1000: OOF, otherwise: CV)
- Grouped cross-validation (`StratifiedGroupKFold`) to prevent data leakage in clinical datasets
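A minimal sketch combining both items above: grouped, stratified out-of-fold probabilities, then a global Youden's J threshold (the OOF variant). The estimator, group layout, and data are illustrative:

```python
# Sketch: OOF threshold calibration with Youden's J over grouped,
# stratified out-of-fold probabilities. Data and estimator are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import StratifiedGroupKFold, cross_val_predict

X, y = make_classification(n_samples=600, random_state=0)
groups = np.repeat(np.arange(120), 5)  # e.g. 5 samples per patient

cv = StratifiedGroupKFold(n_splits=5, shuffle=True, random_state=0)
oof = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                        groups=groups, cv=cv, method="predict_proba")[:, 1]

# Youden's J = TPR - FPR; pick the threshold maximizing it globally.
fpr, tpr, thresholds = roc_curve(y, oof)
best = thresholds[np.argmax(tpr - fpr)]
print(f"OOF-calibrated threshold: {best:.3f}")
```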
Changed
- Expanded hyperparameter grids
- Enhanced CLI information display
Fixed
- XGBoost feature naming issue with special characters
- Color scheme in feature importance plots
Documentation
- Added comprehensive command documentation:
  - `docs/run-command.md`
  - `docs/create-config-command.md`
  - `docs/feature-importance-command.md`
- Updated README with logo, quick start guide, and output structure