Code for the contribution of Diffused Burgers to the Kaggle challenge Classification of lymphocytosis from white blood cells (see https://www.kaggle.com/competitions/dlmi-lymphocytosis-classification).
- Install Python 3.11
- Install the package
pip install -e .
- Unzip the raw data in data/raw
The main function is in src/dlmi/__init__.py
- Start mlflow: run the command
mlflow_run
(you can access the web app at http://localhost:5001/)
- Launch the program (training + inference on test) from the main folder of the project:
dlmi_train
  - You can specify the config file:
  dlmi_train --config-name [train_mlp,train_moe]
  (some examples are in src/dlmi/conf)
- Results are in the outputs folder; the prediction for the test set is in submission_test_final.csv
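Once a run has finished, a quick sanity check of the prediction file can catch an empty or one-class submission before uploading it. The sketch below is not part of the repository: the function name is hypothetical, and it assumes the CSV has a header row and that the last column holds the predicted label.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_submission(csv_path):
    """Return (column names, row count, value counts of the last column)."""
    with Path(csv_path).open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # assumption: the first line is a header
        rows = [row for row in reader if row]  # skip blank lines
    # Assumption: the last column holds the predicted label.
    label_counts = Counter(row[-1] for row in rows)
    return header, len(rows), dict(label_counts)
```

Call it with the path to submission_test_final.csv inside the run directory created under outputs; the exact sub-folder depends on the run and is not fixed here.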
Tested on Windows + Nvidia RTX 4080 12GB and Linux + Nvidia V100S 32GB. Training time of less than 30 minutes on V100S.
- Simple MLP on clinical attributes
- Mixture Of Experts MLP on clinical attributes + CNN on images
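To illustrate the second model, here is a minimal sketch of how a mixture of experts can blend the outputs of two experts with a softmax gate. This README does not describe how the experts are combined in src/dlmi/models/moe_model.py, so the function names, the gating scheme, and the numbers below are assumptions for illustration only, not the project's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def mixture_of_experts(expert_probs, gate_logits):
    """Blend per-expert positive-class probabilities with softmax gate weights."""
    if len(expert_probs) != len(gate_logits):
        raise ValueError("one gate logit is needed per expert")
    weights = softmax(gate_logits)
    return sum(w * p for w, p in zip(weights, expert_probs))


# Hypothetical outputs for one patient: a probability from the clinical MLP
# and a probability from the image CNN (values are illustrative only).
p_clinical, p_images = 0.9, 0.5
# Equal gate logits give equal weights, so the result is the plain average.
blended = mixture_of_experts([p_clinical, p_images], gate_logits=[0.0, 0.0])
```

In a trained model the gate logits would themselves be produced by a small network, so the weighting between clinical attributes and images can change from patient to patient.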
See the report and the leaderboard (tied for #2 on the private leaderboard).
- MiniDataset has not been updated to the latest version of the code (with Stratified K-Fold). Please use only MILDataset in the configs.
- ResNet is the default backbone in the code. Switching to ViT can be done by uncommenting code in src/dlmi/models/moe_model.py. ViT was more computationally intensive and did not provide better results, hence our choice.
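For readers unfamiliar with the Stratified K-Fold mentioned above, the idea can be sketched in a few lines of standard-library Python: indices are grouped by class and dealt out to the folds in turn, so each validation fold keeps roughly the same class ratio as the full set. This is an illustrative sketch under the assumption of one label per patient; the function name is hypothetical, and the repository may well rely on a library implementation such as scikit-learn's StratifiedKFold instead.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, n_splits=5, seed=0):
    """Yield (train_indices, val_indices) with class ratios kept per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_splits)]
    cursor = 0  # keeps dealing where the previous class stopped
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share.
        for index in indices:
            folds[cursor % n_splits].append(index)
            cursor += 1

    for k in range(n_splits):
        val = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, val


labels = [0] * 6 + [1] * 4  # toy patient-level labels, illustrative only
splits = list(stratified_k_fold(labels, n_splits=2))
```

With these toy labels each validation fold holds three patients of class 0 and two of class 1, matching the 60/40 ratio of the full set.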