In Situ Training of Implicit Neural Compressors for Scientific Simulations via Sketch-Based Regularization
Cooper Simpson, Stephen Becker, Alireza Doostan
Submitted to Journal of Computational Physics
Focusing on implicit neural representations, we present a novel in situ training protocol that employs limited memory buffers of full and sketched data samples, where the sketched data are leveraged to prevent catastrophic forgetting. The theoretical motivation for our use of sketching as a regularizer is presented via a simple Johnson-Lindenstrauss-informed result. While our methods may be of wider interest in the field of continual learning, we specifically target in situ neural compression using implicit neural representation-based hypernetworks. We evaluate our method on a variety of complex simulation data in two and three dimensions, over long time horizons, and across unstructured grids and non-Cartesian geometries. On these tasks, we show strong reconstruction performance at high compression rates. Most importantly, we demonstrate that sketching enables the presented in situ scheme to approximately match the performance of the equivalent offline method.
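As a rough illustration of the Johnson-Lindenstrauss idea behind the sketch-based regularizer, the following standalone Python snippet (not part of this repository, and not the paper's exact construction) projects high-dimensional samples with a scaled Gaussian matrix and checks that pairwise distances are approximately preserved, which is what makes sketched buffers useful stand-ins for full data:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, n = 2048, 256, 8           # ambient dim, sketch dim, number of samples
X = rng.standard_normal((n, d))  # stand-in for full data samples

# Gaussian sketching matrix; the 1/sqrt(k) scaling makes sketched
# squared norms unbiased estimates of the originals
S = rng.standard_normal((k, d)) / np.sqrt(k)
Y = X @ S.T                      # sketched samples, shape (n, k)

# Pairwise distances before and after sketching
full = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
sket = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)

mask = ~np.eye(n, dtype=bool)    # ignore zero self-distances
rel_err = np.abs(sket[mask] - full[mask]) / full[mask]
print(rel_err.max())             # distortion stays small even though k << d
```

Dimensions and the Gaussian sketch choice here are illustrative; see the paper for the actual result and its assumptions.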
All source code is made available under an MIT license. You can freely use and modify the code, without warranty, so long as you provide attribution to the authors. See LICENSE for the full text.
Our work can be cited with the following BibTeX entry:
@article{simpson2025insitu,
title = {{In Situ Training of Implicit Neural Compressors for Scientific Simulations via Sketch-Based Regularization}},
author = {Simpson, Cooper and Becker, Stephen and Doostan, Alireza},
year = {2025},
journal = {arXiv},
url = {https://arxiv.org/abs/2511.02659}
}

The repository is organized as follows:

- `core`: Model architecture, data loading, utilities, and any core operations
- `data`: Data folders
- `experiments`: Experiment configuration files
  - `template.yaml`: Detailed experiment template
- `lightning_logs`: Experiment logs
- `job_scripts`: SLURM job scripts
- `run.py`: Model training and testing script
The file `environment.yaml` contains a list of dependencies, and it can be used to generate an Anaconda environment with the following command:
```
conda env create --file=environment.yaml --name=compression
```

This installs all necessary packages for this repository into the conda environment `compression`.
For local development, and if you want to be able to run the notebooks, it is easiest to install `core` as a pip package in editable mode by running the following command from the top level of this repository:
```
pip install -e .
```

The main experiment script can still be run without this step.
The Ignition and Neuron Transport datasets are not publicly available, and the Channel Flow dataset is a trimmed variant of the full version from the JHU Turbulence Database. Please reach out if you would like access or further information.
Use the following command to run an experiment:
```
python run.py --mode train --config <path to YAML file within ./experiments> --data_dir <path to data>
```

If `logger` is set to `True` in the YAML config file, then the results of this experiment will be saved to `lightning_logs/<path to YAML file within ./experiments>`. Use this command to test a logged run:

```
python run.py --mode test --config <path to version inside ./lightning_logs> --data_dir <path to data>
```

To visualize the logging results saved to `lightning_logs/` using TensorBoard, run the following command:

```
tensorboard --logdir=lightning_logs/
```

The Jupyter notebook Paper Results can be used to generate all paper figures and table details.