Almost-Linear RNNs Yield Highly Interpretable Symbolic Codes in DSR [NeurIPS 2024 Poster]

Introduction

This repository provides an implementation of the Almost-Linear RNN (AL-RNN) used for dynamical systems (DS) reconstruction and time series forecasting.


The AL-RNN automatically and robustly generates piecewise linear representations of DS from time series data and inherently provides a symbolic encoding of the underlying DS, while preserving key topological properties. AL-RNNs are trained by backpropagation through time with a sparse teacher forcing protocol.
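In brief, and as a hedged paraphrase of the paper's formulation (up to details of the exact parameterization of the weight matrices $A$, $W$ and bias $h$), the AL-RNN updates an $M$-dimensional latent state $z_t$ by applying a ReLU nonlinearity only to the last $P$ units:

$$z_t = A\,z_{t-1} + W\,\phi^{*}(z_{t-1}) + h, \qquad \phi^{*}(z) = \big(z_1, \dots, z_{M-P}, \max(0, z_{M-P+1}), \dots, \max(0, z_M)\big)^{\top}$$

Since only the $P$ nonlinear units can switch between their linear and clipped regimes, the latent space decomposes into at most $2^P$ linear subregions, and the sequence of subregions visited by a trajectory provides the symbolic encoding.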

The implementation of the AL-RNN is available in Julia and Python.

1. Julia implementation

To install the package, clone the repository, cd into the project folder, and instantiate the package in a new Julia environment:

julia> ]
(@v1.10) pkg> activate .
pkg> instantiate

Running the Code

Single Runs

To start a single training, execute the main.jl file; arguments can be passed via the command line. For example, to train an AL-RNN with 20 latent dimensions and 2 PWL units for 2000 epochs using 4 threads, while keeping all other training parameters at their default settings, call

$ julia -t4 --project main.jl --model ALRNN --latent_dim 20 --P 2 --epochs 2000

in your terminal of choice (bash/cmd). The default settings can also be adjusted directly in the code, in which case no arguments need to be passed at the call site. All arguments are listed in the argtable() function.

Multiple Runs + Grid Search

To run multiple trainings in parallel, e.g. when grid searching hyperparameters, the ubermain.jl file is used. Currently, arguments that should differ from the default settings, as well as arguments to be grid searched, have to be adjusted in the ubermain function itself. This is as simple as adding an Argument to the ArgVec vector, which is passed the hyperparameter name (e.g. latent_dim), the desired value, and an identifier for discernibility and documentation purposes. If the value is a vector of values, a grid search over these values is triggered.

function ubermain(n_runs::Int)
    # load defaults with correct data types
    defaults = parse_args([], argtable())

    # list arguments here
    args = SymbolicDSR.ArgVec([
        Argument("experiment", "ALRNN_Lorenz63"),
        Argument("model", "ALRNN"),
        Argument("pwl_units", [0,1,2,3,4,5], "P"),
    ])

    [...]
end

This will run a grid search over pwl_units, i.e. the number of PWL units, using the AL-RNN.

The identifier (e.g. "P" in the snippet above) is only mandatory for arguments subject to grid search. Once the Arguments are specified, call the ubermain file with the desired number of parallel worker processes (plus the number of threads per worker) and the number of runs per task/setting, e.g.

$ julia -t2 --project ubermain.jl -p 20 -r 10

will queue 10 runs for each setting and use 20 parallel workers with 2 threads each.

Evaluating Models

Trained models can be evaluated via evaluate.jl. Here, the settings and arguments (as specified in the default settings) of a completed experiment (like the ones shown above) need to be provided.

Specifics

Model Architecture

Latent/Dynamics model

  • ALRNN → ALRNN, where pwl_units controls the number of PWL units
  • Identity mapping → Identity, used to generate observations

Data Format

Data for the algorithm is expected to be a single trajectory in the form of a $T \times N$ matrix (file format: .npy), where $T$ is the total number of time steps and $N$ is the data dimensionality. Examples are provided.
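As a minimal sketch of this format (the file name and the synthetic data below are placeholders, not the provided examples), such a trajectory can be written with numpy:

```python
# Minimal sketch: store a T x N trajectory in the expected .npy format.
# The synthetic data and file name are placeholders, not the repository's example datasets.
import numpy as np

T, N = 10000, 3                                            # time steps, data dimensionality
rng = np.random.default_rng(0)
trajectory = rng.standard_normal((T, N)).cumsum(axis=0)    # dummy T x N time series

np.save("example_trajectory.npy", trajectory)
print(np.load("example_trajectory.npy").shape)             # (10000, 3)
```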

Training method

The AL-RNN is trained by backpropagation through time using sparse teacher forcing. The forcing interval is controlled by teacher_forcing_interval, which specifies the interval at which the latent state is forced to the observations in order to prevent exploding/vanishing gradients.
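The snippet below is an illustrative, framework-agnostic sketch of this idea, not the repository's training code; step and force are hypothetical helpers standing in for the learned latent update and the forcing of the latent state by an observation:

```python
# Illustrative sketch of sparse teacher forcing (not the repository's implementation).
# `step` is a hypothetical one-step latent update and `force` a hypothetical helper
# that writes the current observation into the read-out part of the latent state.
import numpy as np

def sparse_tf_rollout(z0, observations, step, force, teacher_forcing_interval):
    """Roll out latent states, forcing them to the data every `teacher_forcing_interval` steps."""
    z, latents = z0, []
    for t, x_t in enumerate(observations):
        if t % teacher_forcing_interval == 0:
            z = force(z, x_t)   # reset (part of) the latent state to the observation
        z = step(z)             # otherwise evolve freely under the learned dynamics
        latents.append(z)
    return np.stack(latents)
```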

Versions

  • Julia 1.10.1

  • Flux 0.14.16

2. Python Implementation

A basic implementation of the model and training routine is also available in Python (see ALRNN_python folder).

Model and Training routine

The model, including its training algorithm, is implemented in the → ALRNN_Tutorial notebook. The AL-RNN, which uses an identity mapping to generate observations, is defined in the model class AL_RNN. The latent dimension $M$ and the number of PWL units $P$ need to be specified. Training is performed through the training_sh routine using backpropagation through time with a sparse teacher forcing protocol, forcing the latent states based on the observations in order to prevent exploding/vanishing gradients.
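For orientation, the following is a hypothetical PyTorch-style sketch of such a model; the class name ALRNNSketch, the parameterization of A, W, and h, and the identity read-out are illustrative assumptions, and the AL_RNN class in the tutorial notebook remains the authoritative implementation:

```python
# Hypothetical sketch of an AL-RNN-style latent model in PyTorch (illustration only).
import torch
import torch.nn as nn

class ALRNNSketch(nn.Module):
    def __init__(self, M: int, P: int, N: int):
        super().__init__()
        self.M, self.P, self.N = M, P, N                  # latent dim, # PWL units, obs dim
        self.A = nn.Parameter(0.9 * torch.ones(M))        # diagonal linear part
        self.W = nn.Parameter(0.01 * torch.randn(M, M))   # connectivity
        self.h = nn.Parameter(torch.zeros(M))             # bias

    def latent_step(self, z: torch.Tensor) -> torch.Tensor:
        # ReLU is applied only to the last P latent units; the rest stay linear.
        if self.P > 0:
            phi = torch.cat([z[..., : self.M - self.P],
                             torch.relu(z[..., self.M - self.P:])], dim=-1)
        else:
            phi = z
        return self.A * z + phi @ self.W.T + self.h

    def observe(self, z: torch.Tensor) -> torch.Tensor:
        # Identity mapping: observations are read out from the first N latent units.
        return z[..., : self.N]
```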

Dataset

Data for the algorithm is expected to be a single trajectory in the form of a $T \times N$ matrix (file format: .npy), where $T$ is the total number of time steps and $N$ is the data dimensionality. To generate a dataset for training, the TimeSeriesDataset class in the → dataset file can be used.
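For illustration only (this is not the TimeSeriesDataset API; it merely shows the kind of sliding-window samples such a dataset provides), a trajectory can be split into fixed-length training sequences like this:

```python
# Illustrative only: split a T x N trajectory into overlapping training sequences.
# This is not the repository's TimeSeriesDataset class.
import numpy as np

def sliding_windows(trajectory: np.ndarray, seq_len: int) -> np.ndarray:
    """Return an array of shape (T - seq_len, seq_len + 1, N): inputs plus one-step targets."""
    T = trajectory.shape[0]
    return np.stack([trajectory[t : t + seq_len + 1] for t in range(T - seq_len)])

trajectory = np.random.default_rng(1).standard_normal((1000, 3)).cumsum(axis=0)
windows = sliding_windows(trajectory, seq_len=30)
print(windows.shape)  # (970, 31, 3)
```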

Example Models and Evaluation

Trained example models are provided in ALRNN_models. They can be evaluated using the simple evaluation provided in the → ALRNN_Tutorial notebook. The linear subregion analysis functions are provided in the → linear_region_functions file. To quantify the dynamical systems reconstruction quality, the state space distance $D_{stsp}$ and the Hellinger distance $D_H$ can be computed to evaluate geometric and temporal agreement, respectively. These functions are provided in → performance measures.
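As a rough, non-authoritative illustration of the kind of comparison these measures perform (the actual $D_{stsp}$ and $D_H$ implementations in the performance-measures file differ in detail), a Hellinger distance between two binned, normalized distributions can be computed as follows:

```python
# Illustrative sketch: Hellinger distance between two binned distributions
# (e.g. state-space occupancies or spectra of ground-truth vs. generated trajectories).
# The repository's D_stsp / D_H implementations are the authoritative versions.
import numpy as np

def hellinger(p: np.ndarray, q: np.ndarray) -> float:
    """Hellinger distance between two discrete probability distributions."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

# Example: compare binned occupancies of one observed dimension.
rng = np.random.default_rng(0)
x_true = rng.standard_normal(5000)
x_gen = rng.standard_normal(5000) + 0.1
bins = np.linspace(-4.0, 4.0, 41)
h_true, _ = np.histogram(x_true, bins=bins)
h_gen, _ = np.histogram(x_gen, bins=bins)
print(hellinger(h_true.astype(float) + 1e-12, h_gen.astype(float) + 1e-12))
```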

Citation

If you find the repository and/or paper helpful for your own research, please cite our work.

@inproceedings{brenner_almost_2024,
 author = {Brenner, Manuel and Hemmer, Christoph J\"{u}rgen and Monfared, Zahra and Durstewitz, Daniel},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang},
 pages = {36829--36868},
 publisher = {Curran Associates, Inc.},
 title = {Almost-Linear RNNs Yield Highly Interpretable Symbolic Codes in Dynamical Systems Reconstruction},
 url = {https://proceedings.neurips.cc/paper_files/paper/2024/file/40cf27290cc2bd98a428b567ba25075c-Paper-Conference.pdf},
 volume = {37},
 year = {2024}
}

Acknowledgements

This work was funded by the Federal Ministry of Science, Education, and Culture (MWK) of the state of Baden-Württemberg within the AI Health Innovation Cluster Initiative, by the German Research Foundation (DFG) within Germany’s Excellence Strategy EXC 2181/1 – 390900948 (STRUCTURES), and through DFG individual grant Du 354/15-1 to DD. ZM was funded by the Federal Ministry of Education and Research (BMBF) through project OIDLITDSM, 01IS24061.
