[🎥 ICCV2025] ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning
Jongseo Lee¹\*, Kyungho Bae²\*, Kyle Min³, Gyeong-Moon Park⁴†, Jinwoo Choi¹†
* Equal contribution, † Corresponding author
This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
- Accepted at ICCV 2025 (Highlight Presentation)
- Proposes ESSENTIAL, a framework inspired by human memory that integrates episodic and semantic memory for video class-incremental learning (VCIL).
- Achieves a favorable trade-off between memory efficiency and recognition performance compared to prior VCIL methods.
- Code release includes training, evaluation, and visualization tools.
We recommend using conda to create a clean environment.
The code has been tested with Python 3.8, PyTorch 2.0.1, and CUDA 11.7.
```shell
conda create -n ESSENTIAL python=3.8 -y
conda activate ESSENTIAL
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt
```

We provide the annotation files for ESSENTIAL on the Hugging Face Hub.
Please download the annotation files from the link above.
Place the downloaded files under the ./data/ directory as follows:
```
ESSENTIAL/
└── data/
    ├── clip_temporal.pth
    ├── TCD/
    │   └── ...
    ├── vCLIMB/
    │   └── ...
    └── ...
```
- The benchmark datasets (e.g., Kinetics-400) should be downloaded separately.
We provide training scripts for two representative benchmarks:
- TCD (Something-Something V2 based): `scripts/ssv2_final.sh`
- vCLIMB (UCF101 based): `scripts/ucf_final.sh`
These scripts contain the recommended hyperparameters and configurations for each benchmark.
For other datasets, please adapt the script by changing the following arguments:
- `--data_set`: the dataset name (e.g., `SSV2`, `UCF101`, `Kinetics400`)
- `--anno_path`: the path to the corresponding annotation file (e.g., `ESSENTIAL/data/TCD/...pkl`)
- `--num_tasks`: the number of incremental tasks for the experiment
By modifying these options, the same framework can be applied to various datasets under different class-incremental learning scenarios.
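As a concrete illustration, a run on a different dataset might look like the following. Note that this is a sketch: the entry point `main.py` and the annotation filename are placeholders, not names taken from this repository — please copy the actual command and remaining hyperparameters from `scripts/ssv2_final.sh` or `scripts/ucf_final.sh`.

```shell
# Hypothetical adaptation to Kinetics-400 with 10 incremental tasks.
# "main.py" and the annotation path below are placeholders — take the
# real entry point and file names from the provided training scripts.
python main.py \
  --data_set Kinetics400 \
  --anno_path ./data/vCLIMB/kinetics400_annotation.pkl \
  --num_tasks 10
```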
To evaluate a trained model, specify the path to the experiment folder using:
- `--fine_tune_path`: path to the folder containing the trained checkpoints
For evaluation-only mode, enable the following flags:
- `--no_training`: disable further training
- `--no_rehearsal`: disable rehearsal during evaluation
With these options, ESSENTIAL will load the trained checkpoints and report performance without performing additional training or rehearsal.
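Put together, a minimal evaluation-only command could look like the sketch below. As above, `main.py` and the experiment folder name are assumptions for illustration; substitute the repository's actual entry point and your own checkpoint directory.

```shell
# Evaluation-only mode: load trained checkpoints and report performance
# without additional training or rehearsal. "main.py" and
# "./experiments/ssv2_run1" are placeholders for this example.
python main.py \
  --data_set SSV2 \
  --fine_tune_path ./experiments/ssv2_run1 \
  --no_training \
  --no_rehearsal
```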
For the convenience of follow-up research and reproducibility, ESSENTIAL is implemented to store tokens in memory during training and evaluation rather than writing them directly to files.
This design choice makes it easier for others to adapt the codebase for new experiments and extend it to different research directions.
