Paper Link: Preprint
This repository contains descriptions of the 12 ERP datasets and the code for the 15 methods used in the paper Benchmarking ERP Analysis: Manual Features, Deep Learning, and Foundation Models. In this paper, we conduct a comprehensive benchmark study that systematically compares traditional manual features (followed by a linear classifier), deep learning models, and pre-trained EEG foundation models for ERP analysis. We establish a unified data preprocessing and training pipeline and evaluate these approaches on two representative tasks, ERP stimulus classification and ERP-based brain disease detection, across 12 publicly available datasets. Furthermore, we investigate various patch-embedding strategies within advanced Transformer architectures to identify embedding designs that better suit ERP data. Our goal is to provide a reference framework to guide method selection and tailored model design for future ERP analysis.
Input raw EEG data are preprocessed with a unified pipeline to get ERP trials,
including removal of non-EEG channels, notch and band-pass filtering, bad channel interpolation,
average re-referencing, artifact removal, resampling, baseline correction,
trial epoching, and Z-score normalization.
The processed ERP trials are loaded and passed to various models for training and classification,
including manual feature extraction, supervised deep learning trained from scratch,
and foundation models with pre-trained weights.
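Below is a minimal sketch of this preprocessing pipeline using MNE. The file name, event extraction, filter bands, and epoch window are illustrative placeholders rather than the settings used in the paper; the per-dataset parameters (and the artifact-removal step, which is omitted here) are defined in the notebooks under data_preprocessing/.

```python
import mne
import numpy as np

# Illustrative sketch of the unified preprocessing pipeline.
# "sub-01_raw.fif" and all numeric parameters below are placeholders.
raw = mne.io.read_raw_fif("sub-01_raw.fif", preload=True)

raw.pick("eeg")                          # remove non-EEG channels
raw.notch_filter(freqs=60.0)             # notch filtering
raw.filter(l_freq=0.1, h_freq=45.0)      # band-pass filtering
raw.interpolate_bads()                   # bad-channel interpolation (bads marked beforehand)
raw.set_eeg_reference("average")         # average re-referencing
raw.resample(200)                        # resample to 200 Hz
# Artifact removal (e.g., ICA) would happen here; see the per-dataset notebooks.

# Trial epoching around stimulus onsets with baseline correction
events, _ = mne.events_from_annotations(raw)
epochs = mne.Epochs(raw, events, tmin=-0.2, tmax=0.8,
                    baseline=(None, 0), preload=True)

# Z-score normalization (per trial and channel here; the exact axes follow the notebooks),
# then reorder to [N, T, C]
x = epochs.get_data()                                                    # [N, C, T]
x = (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + 1e-8)
x = x.transpose(0, 2, 1)                                                 # [N, T, C]
```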
We benchmark 15 representative methods for ERP analysis, including 2 manual feature-based methods, 10 deep learning methods trained from scratch, and 3 foundation models with pre-trained weights. The two manual feature-based methods are implemented from scratch. The original code for each deep learning and foundation model method can be found at the following links: TCN, ModernTCN, TimesNet, PatchTST, iTransformer, Medformer, MedGNN, EEGNet, EEGInception, EEGConformer, BIOT, LaBraM, CBraMod. We modified the original code to adapt it to our ERP data and benchmark settings. All methods are trained under the same data preprocessing and training pipelines for fair comparison.
The pre-trained weights for BIOT can be downloaded here
and should be put at path checkpoints/BIOT/pretrain_biot/BIOT/EEG-PREST-16-channels.ckpt.
The pre-trained weights for LaBraM can be downloaded here
and should be put at path checkpoints/LaBraM/pretrain_labram/LaBraM/labram-base.pth.
The pre-trained weights for CBraMod can be downloaded here
and should be put at path checkpoints/CBraMod/pretrain_cbramod/CBraMod/pretrained_weights.pth.
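Before launching a foundation-model run, a quick sanity check along these lines (not part of the repository) may help confirm the downloaded files are in place and readable; the paths mirror the instructions above:

```python
import os
import torch

# Expected checkpoint locations from the download instructions above.
WEIGHTS = {
    "BIOT": "checkpoints/BIOT/pretrain_biot/BIOT/EEG-PREST-16-channels.ckpt",
    "LaBraM": "checkpoints/LaBraM/pretrain_labram/LaBraM/labram-base.pth",
    "CBraMod": "checkpoints/CBraMod/pretrain_cbramod/CBraMod/pretrained_weights.pth",
}

for name, path in WEIGHTS.items():
    if not os.path.exists(path):
        print(f"[missing] {name}: {path}")
        continue
    # Load on CPU just to confirm the file is a readable PyTorch checkpoint.
    state = torch.load(path, map_location="cpu", weights_only=False)
    print(f"[ok] {name}: {len(state)} top-level entries")
```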
In light of the limited availability of public ERP datasets,
we systematically searched all accessible resources known to the authors,
including platforms such as OpenNeuro, FigShare, and PhysioNet.
We identified 12 public datasets, each with a sufficient sample size (40+ subjects) to ensure robust statistical evaluation.
They are CESCA-AODD, CESCA-VODD, CESCA-FLANKER, mTBI-ODD, NSERP-MSIT, NSERP-ODD, PD-SIM, PD-ODD, ADHD-WMRI, SCPD, RLPD, and AOPD.
Note that CESCA-AODD, CESCA-VODD, and CESCA-FLANKER are three sub-datasets of the same dataset CESCA,
and NSERP-MSIT and NSERP-ODD are two sub-datasets of the same dataset NSERP.
Preprocessing files for each dataset are provided in the data_preprocessing/ folder, with readme files for instructions.
- Preprocessing Raw Data. Download the raw data from the links above in Data Selection and run the notebooks in data_preprocessing/ for each raw dataset to get the processed dataset. Remember to change the root path of the raw data in each notebook before running if necessary.
- Dataset Statistics. The statistics of the processed datasets are shown in the figure above. Baseline shows the time duration (in seconds) before stimulus onset used for baseline correction. Epoch shows the time duration (in seconds) before and after stimulus onset used for trial epoching.
- Processed Dataset Folder Paths. The folder for processed datasets has two directories: Feature/ and Label/. The folder Feature/ contains files named in the format feature_ID.npy for all the subjects, where ID is the subject ID. Each feature_ID.npy file contains the ERP trials belonging to the same subject, stacked into a 3-D array with shape [N, T, C], where N denotes the total trials for a subject, T denotes the total timestamps for a trial, and C denotes the number of channels. For each dataset, T can be calculated as epoch length (in seconds) * sampling rate (in Hz), where the sampling rate is 200 Hz for all datasets after resampling. The folder Label/ contains files named in the format label_ID.npy for all the subjects, where ID is the subject ID. Each label_ID.npy file is a 2-D array with shape [N, X], where N denotes the total trials for a subject (same as the feature file) and each column in X denotes a label type. The details of the label types for each dataset can be found in the readme files in the data_preprocessing/ folder. The label types used in this paper are specified in the classes column of Table 1 in the figure above. The processed data should be put into dataset/200Hz/DATA_NAME/ so that each subject file can be located at dataset/200Hz/DATA_NAME/Feature/feature_ID.npy and each label file at dataset/200Hz/DATA_NAME/Label/label_ID.npy (see the loading sketch after this list).
- Processed Datasets Download Link. For users' convenience, we also provide the processed datasets, which can be manually downloaded at the following link: https://drive.google.com/drive/folders/1pVUmPlsQN9j5HD5YJSeiDrBAUAKBCQA5?usp=drive_link
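The following sketch shows how a processed subject file can be loaded and its shape checked against the layout described above; the dataset name and subject ID are illustrative placeholders.

```python
import numpy as np

# Hypothetical example: subject 1 of PD-SIM under the expected folder layout.
data_root = "dataset/200Hz/PD-SIM"
features = np.load(f"{data_root}/Feature/feature_1.npy")   # [N, T, C]
labels = np.load(f"{data_root}/Label/label_1.npy")         # [N, X]

n_trials, n_timestamps, n_channels = features.shape
assert labels.shape[0] == n_trials                         # one label row per trial

# With a 200 Hz sampling rate, T equals epoch length (s) * 200.
print(f"{n_trials} trials, {n_timestamps} timestamps "
      f"({n_timestamps / 200:.2f} s per epoch), {n_channels} channels, "
      f"{labels.shape[1]} label columns")
```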
The recommended requirements are specified as follows:
- Python 3.10
- Jupyter Notebook
- einops==0.4.0
- matplotlib==3.7.0
- numpy==1.23.5
- pandas==1.5.3
- patool==1.12
- reformer-pytorch==1.4.4
- scikit-learn==1.2.2
- scipy==1.10.1
- sympy==1.13.1
- torch==2.5.1+cu121
- tqdm==4.64.1
- natsort~=8.4.0
- mne==1.9.0
- mne-icalabel==0.7.0
- h5py==3.13.0
- pyedflib==0.1.40
- linear_attention_transformer==0.19.1
- timm~=0.6.13
- transformers~=4.57.1
The dependencies can be installed by:
pip install -r requirements.txt
Before running, make sure you have all the processed datasets put under dataset/.
You can see the scripts in scripts/ as a reference.
You can also run all the experiments at once with a meta script such as the meta_run_dl_methods file.
Make sure you have the pre-trained weights downloaded and placed at the proper paths before running the foundation models BIOT, LaBraM, and CBraMod.
The GPU device IDs can be specified with the command-line flag --devices (e.g., --devices 0,1,2,3).
You also need to set the visible GPU devices in the script file via export CUDA_VISIBLE_DEVICES (e.g., export CUDA_VISIBLE_DEVICES=0,1,2,3).
The GPUs specified on the command line should be a subset of the visible GPUs.
Given the parser arguments --method, --task_name, --model, and --model_id in run.py,
the saved model can be found in checkpoints/method/task_name/model/model_id/,
and the results can be found in results/method/task_name/model/model_id/.
The default number of iterations is set to 5 by the command-line flag --itr 5, with runs using random seeds 41 through 45.
The averaged result and standard deviation over the 5 runs are appended at the end of the result file.
You can modify other parameters by changing the command line.
The meaning and explanation of each command-line parameter can be found in the run.py file.
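For example, with the sample Medformer run shown below, the output folders follow the checkpoints/method/task_name/model/model_id/ and results/method/task_name/model/model_id/ layout described above; this small sketch only reconstructs the folder paths, and the exact file names inside them may differ.

```python
import os

# Output folders for the sample Medformer run below.
method, task_name, model, model_id = "Medformer", "supervised", "Medformer", "S-PD-SIM"

checkpoint_dir = os.path.join("checkpoints", method, task_name, model, model_id)
result_dir = os.path.join("results", method, task_name, model, model_id)

print(checkpoint_dir)  # checkpoints/Medformer/supervised/Medformer/S-PD-SIM
print(result_dir)      # results/Medformer/supervised/Medformer/S-PD-SIM
```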
Sample running script:
python -u run.py --method Medformer --task_name supervised --is_training 1 --root_path ./dataset/200Hz/ --model_id S-PD-SIM --model Medformer --data MultiDatasets --training_datasets PD-SIM --testing_datasets PD-SIM --e_layers 6 --batch_size 128 --n_heads 8 --d_model 128 --d_ff 256 --patch_len_list 25,50,100 --use_subject_vote --swa --des 'Exp' --itr 5 --learning_rate 0.0001 --train_epochs 200 --patience 15
If you find this repo useful, please star our project and cite our paper:
@article{wang2026benchmarking,
title={Benchmarking ERP Analysis: Manual Features, Deep Learning, and Foundation Models},
author={Wang, Yihe and Kang, Zhiqiao and Chen, Bohan and Zhang, Yu and Zhang, Xiang},
journal={arXiv preprint arXiv:2601.00573},
year={2026}
}
We want to thank the authors of the datasets used in this paper for generously sharing their data. Their efforts and contributions have been invaluable in advancing the field of EEG and ERP.
