This repository contains materials from the Spectroscopy Masterclass tutorial delivered at the CECAM workshop, held at the Rutherford Appleton Laboratory, UK (home of the Diamond Light Source synchrotron).
The tutorial provides a hands-on, end-to-end workflow for going from raw X-ray Absorption Spectroscopy (XAS) data to trustworthy machine learning models.
- Notebooks
- Database
- Data quality checks and preprocessing
- Comparison with different XAS simulation codes
- Generating simulated spectra with FDMNES
- Feature engineering for spectroscopy data
- ML pipelines with CDF+XGBoost, and PCA+MLP/1D-CNN
- Model validation and transfer to experimental spectra
- Hyperparameter tuning
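
As a rough illustration of the CDF step named in the pipeline list above, here is a minimal sketch. It assumes "CDF" means the normalised cumulative integral of the absorption over energy; the notebooks may define it differently. The function name `cdf_features`, the `n_points` parameter, and the synthetic spectrum are all hypothetical and are not taken from the tutorial code.

```python
import numpy as np


def cdf_features(energy, mu, n_points=50):
    """Normalised cumulative integral of one spectrum (assumed CDF definition)."""
    energy = np.asarray(energy, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # Trapezoidal cumulative integral of the absorption over energy.
    areas = 0.5 * (mu[1:] + mu[:-1]) * np.diff(energy)
    cumulative = np.concatenate([[0.0], np.cumsum(areas)])
    # Normalise so the curve runs from 0 to 1.
    cdf = cumulative / cumulative[-1]
    # Resample on a fixed grid so every spectrum yields the same feature length.
    grid = np.linspace(energy[0], energy[-1], n_points)
    return np.interp(grid, energy, cdf)


# Synthetic, illustrative spectrum (an edge step plus one peak); not real data.
energy = np.linspace(7100.0, 7200.0, 201)
mu = 1.0 / (1.0 + np.exp(-(energy - 7120.0) / 2.0))
mu += 0.5 * np.exp(-(((energy - 7130.0) / 3.0) ** 2))

features = cdf_features(energy, mu)
print(features.shape)
```

A fixed-length vector like this could then be passed to a tabular model such as XGBoost. A cumulative curve is also smoother than the raw spectrum, which is one plausible reason to use it as a feature.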
Required
- `numpy`
- `matplotlib`
- `scikit-learn`
- `scipy`
- `xgboost` (on macOS also install `libomp`; with conda: `conda install -c conda-forge libomp xgboost`)
- `torch` (PyTorch)
- `pymatgen`
Optional (for code in the markdown cells)
- `optuna` or `scikit-optimize` (Bayesian HPO)
- `shap` (feature attributions for tree models)
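
Before opening the notebooks, it can help to confirm which of the packages above are importable. The sketch below is not part of the repository, and the helper name `missing_packages` is hypothetical. It uses only the standard library and maps each distribution name to the module name it is usually imported under (for example `scikit-learn` imports as `sklearn`, and `scikit-optimize` as `skopt`).

```python
from importlib.util import find_spec

# Distribution name -> import name, following the lists above.
REQUIRED = {
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "scipy": "scipy",
    "xgboost": "xgboost",
    "torch": "torch",
    "pymatgen": "pymatgen",
}
OPTIONAL = {
    "optuna": "optuna",
    "scikit-optimize": "skopt",
    "shap": "shap",
}


def missing_packages(packages):
    """Return the distribution names whose import module cannot be found."""
    return [name for name, module in packages.items() if find_spec(module) is None]


print("Missing required packages:", missing_packages(REQUIRED) or "none")
print("Missing optional packages:", missing_packages(OPTIONAL) or "none")
```

Because `optuna` and `scikit-optimize` are alternatives, seeing one of them reported as missing is fine as long as the other is installed.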