# SYNAPSE - SYNthesizable APproach for Scalable neural nEtworks
An experimental framework to convert quantized ONNX (QONNX) neural networks into synthesizable VHDL, with complete tooling for ROM generation, testbench creation, stimuli preparation, and cycle-accurate hardware simulation.
Developed by: Institute of Embedded Systems, Zurich University of Applied Sciences (ZHAW)
For recent news, check out our blog.
Part of: REBECCA Project - This work was developed as part of the REBECCA project (Reconfigurable Heterogeneous Highly Parallel Processing Platform for Safe and Secure AI), funded by the European Union under Grant Agreement No. 101097224.
- Overview
- Features
- Project Structure
- Prerequisites
- Installation
- Quick Start
- Core Components
- File Formats & Configuration
- Simulation & Waveform Tips
- Validation
- Developer Workflow
- Troubleshooting
- Contributing
- Limitations
- References
This framework bridges the gap between trained quantized neural networks and FPGA implementations by:
- Converting QONNX models to synthesizable VHDL
- Generating memory initialization files (ROMs) for weights and biases
- Creating testbenches and stimuli for verification
- Providing cycle-accurate simulation capabilities
Supported layer types: Dense, 1D Convolution, MaxPooling, AveragePooling, Argmax, Thresholding
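For reference, the integer arithmetic behind two of these layer types can be sketched in NumPy. This is a software model of the computation only, not the VHDL implementation, and the function names are illustrative:

```python
import numpy as np

def dense_int(x, w, b):
    """Integer dense layer: y = W @ x + b with a wide accumulator,
    matching the 8-bit-input / wide-output pattern used by the framework."""
    return w.astype(np.int32) @ x.astype(np.int32) + b.astype(np.int32)

def multithreshold(x, thresholds):
    """Multi-threshold activation: each output counts how many
    thresholds the corresponding input value meets or exceeds."""
    x = np.asarray(x)
    t = np.asarray(thresholds)
    return (x[:, None] >= t[None, :]).sum(axis=1)
```

For example, `multithreshold([5], [1, 4, 9])` yields `[2]`, because 5 clears the first two thresholds but not the third.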
- QONNX to VHDL conversion with automatic network generation
- Multiple layer architectures: Dense, Conv1D, pooling, argmax, multi-threshold
- Automatic ROM generation for weights, biases, and thresholds
- Testbench generation with customizable stimuli
- Audio preprocessing pipeline (MFCC extraction for keyword spotting)
- Cycle-accurate simulation support for ModelSim/Questa
- Synthesizable VHDL validated on Xilinx Zynq Ultrascale+ (ZU9EG)
```
synapse/
├── source/                  # VHDL source files (see source/README.md)
│   ├── *.vhd                # Layer implementations, network templates
│   └── wb_rom_*.vhd         # Generated weight/bias ROMs
├── simulation/              # Testbenches and simulation scripts
│   ├── scripts/             # ModelSim/Questa DO files
│   ├── data/                # Test data and results
│   └── *_tb.vhd             # VHDL testbenches
├── images/                  # Architecture diagrams
│
│  (Python tooling)
├── QONNX_Network_gen.py     # Main network generator
├── convert_qonnx.py         # QONNX model converter
├── prepare_stim.py          # Stimuli preparation utilities
├── helper.py                # Utility functions
├── string_templates.py      # VHDL code generation templates
├── usage_example.py         # Complete usage example
│
│  (Configuration)
├── .editorconfig            # Editor settings (indentation, encoding)
├── .prettierrc              # Prettier formatter config
├── .dir-locals.el           # Emacs VHDL formatting config
├── Makefile                 # Build and formatting targets
└── pyproject.toml           # Python dependencies & Ruff config
```
Required:
- Python ~3.12
- ModelSim or Questa (for simulation)
- FINN dependencies (see FINN installation guide)
Optional:
- Xilinx Vivado (for FPGA synthesis)
- GSCV2 dataset (for keyword spotting validation)
1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd synapse
   ```

2. Install Python dependencies:

   ```bash
   # Using pip (with dev tools)
   pip install -e ".[dev]"

   # Or using uv (recommended)
   uv pip install -e ".[dev]"
   ```

3. Install FINN:

   Follow the FINN installation guide.

   Note: FINN requires Docker for full functionality.

4. Verify the installation:

   ```bash
   python -c "import qonnx, onnx, numpy; print('Dependencies OK')"
   ```
Here's a minimal example to convert a QONNX model and generate VHDL:

```python
import os

from QONNX_Network_gen import NetworkGenerator
from convert_qonnx import ConvertQONNX
from prepare_stim import StimuliPrep

PATH = os.path.dirname(os.path.realpath(__file__))

# 1. Convert and clean the QONNX model
conv = ConvertQONNX(PATH + "/your_model.onnx")
conv.cleanup_qonnx()

# 2. Generate the VHDL network and ROMs
net_gen = NetworkGenerator("config.json", model_path=PATH + "/network.onnx")
net_gen.run()

# 3. Prepare stimuli (for audio models)
stim = StimuliPrep(
    model_path=PATH + "/network.onnx",
    config_file=PATH + "/config.json",
    n_samples=10,
)
stim.prepare_all_audio_samples()
stim.create_data_file()
```

See usage_example.py for a complete working example.
To run the simulation:

```
cd simulation/scripts
# Run in ModelSim/Questa:
do sim_qonnxtest.do
```

convert_qonnx.py - QONNX Model Converter

- Class: `ConvertQONNX`
- Key method: `cleanup_qonnx()` - runs FINN/ONNX transformations to prepare models
QONNX_Network_gen.py - Network Generator

- Class: `NetworkGenerator`
- Main workflow: `run()` - executes the complete generation pipeline
- Key methods:
  - `create_config()` - parses the QONNX model and creates the configuration
  - `generate_roms()` - generates weight/bias ROM files
  - `generate_network()` - creates the VHDL network from templates
  - `generate_tb()` - generates testbench files
prepare_stim.py - Stimuli Preparation

- Class: `StimuliPrep`
- Key methods:
  - `py_speech_preprocessing()` - MFCC extraction for audio
  - `create_data_file()` - generates test stimuli files
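The flow from features to a stimuli file can be sketched roughly as follows. This is a hypothetical quantize-and-serialize helper, not the actual `create_data_file()` implementation, and the one-binary-word-per-line format is an assumption:

```python
import numpy as np

def features_to_stimuli(features, bit_width=8):
    """Quantize float features to signed integers and render them as
    two's-complement binary strings, one per line (illustrative only)."""
    lo, hi = -(1 << (bit_width - 1)), (1 << (bit_width - 1)) - 1
    # Scale so the largest magnitude maps to the positive full-scale value
    scale = hi / max(np.max(np.abs(features)), 1e-12)
    q = np.clip(np.round(features * scale), lo, hi).astype(int)
    mask = (1 << bit_width) - 1
    return "\n".join(f"{(int(v) & mask):0{bit_width}b}" for v in q.ravel())
```

For instance, `features_to_stimuli(np.array([1.0, -1.0]))` produces the two lines `01111111` and `10000001` (+127 and -127 in two's complement).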
string_templates.py - VHDL Templates

- Contains string templates for VHDL code generation
- Templates: `PT_TEMPLATE`, `KS_TEMPLATE`, ROM templates
helper.py - Utility Functions
- Binary file operations, data structure helpers
For detailed information about VHDL layer architectures, hardware implementation details, timing diagrams, and delay calculations, see source/README.md.
Topics covered in source/README.md:
- Layer architectures (Dense, Convolutional, Threshold, Argmax)
- Ping-pong buffers and data fetchers
- Hardware timing and delay calculations
- Network controller design
- ROM generation details
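To give a flavour of what ROM generation involves, quantized weights can be rendered as a VHDL constant array. This is a simplified sketch; the actual generator uses the templates in string_templates.py and its output format may differ:

```python
def weights_to_vhdl_rom(name, weights, bit_width=8):
    """Emit a simple VHDL ROM constant from signed integer weights.

    Illustrative sketch only; not the project's generate_roms() output.
    """
    depth = len(weights)
    lines = [
        f"type {name}_rom_t is array (0 to {depth - 1}) of "
        f"std_logic_vector({bit_width - 1} downto 0);",
        f"constant {name}_ROM : {name}_rom_t := (",
    ]
    # Encode each weight as a two's-complement bit string
    mask = (1 << bit_width) - 1
    entries = [f'  "{(w & mask):0{bit_width}b}"' for w in weights]
    lines.append(",\n".join(entries))
    lines.append(");")
    return "\n".join(lines)
```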
- Testbenches: simulation/*_tb.vhd
- Scripts: simulation/scripts/*.do - ModelSim/Questa simulation scripts
- Data: simulation/data/ - test inputs and results
- Results: simulation/data/results.csv - simulation outputs
Configuration file: config_example.json

The network configuration is a JSON file describing the network architecture:

```json
{
  "name": "network_name",
  "inputs": 1,
  "width": 490,
  "bit_width": 8,
  "precision": 0,
  "layers": [
    {
      "size": 490,
      "type": "dense",
      "activation": "linear",
      "input_width": 8,
      "output_width": 32
    },
    ...
  ]
}
```

Supported layer types:

- `dense` - fully connected layer
- `conv` - convolutional layer (1D)
- `thresh` - multi-threshold activation
- `TopK` - argmax/Top-K selection
- `maxpool` / `avgpool` - pooling layers
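A configuration in this shape could be loaded and sanity-checked in Python as follows. The helper below is illustrative only; the project's own reader lives in new_config_data.py (`ConfigReader`):

```python
import json

# Layer types listed in this README; illustrative validation only
KNOWN_TYPES = {"dense", "conv", "thresh", "TopK", "maxpool", "avgpool"}

def load_network_config(path):
    """Load a network configuration and reject unknown layer types."""
    with open(path) as f:
        cfg = json.load(f)
    for i, layer in enumerate(cfg["layers"]):
        if layer["type"] not in KNOWN_TYPES:
            raise ValueError(f"layer {i}: unknown type {layer['type']!r}")
    return cfg
```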
Template system: The generator uses string templates from string_templates.py:

- `PT_TEMPLATE` - parameter table template
- `KS_TEMPLATE` - kernel size template
- Network and ROM templates for VHDL code generation
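To illustrate the template style, here is a hypothetical VHDL entity template filled in with `str.format()`; the real `PT_TEMPLATE` and `KS_TEMPLATE` contents differ:

```python
# Hypothetical template in the style of string_templates.py
ROM_ENTITY_TEMPLATE = """\
entity {name} is
  port (
    addr : in  natural range 0 to {depth} - 1;
    data : out std_logic_vector({width} - 1 downto 0)
  );
end entity {name};
"""

# Placeholder values are examples, not the framework's defaults
vhdl = ROM_ENTITY_TEMPLATE.format(name="wb_rom_l0", depth=490, width=8)
```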
- DO scripts perform the complete compilation (vcom) and simulation invocation (vsim).
- Use the appropriate DO file per component, e.g. simulation/scripts/sim_argmax.do.
- Waveform helpers: the project ships several wave_*.do scripts in simulation/scripts/ that set up helpful `add wave` lines.
- If the waveform is too large, edit the relevant wave_*.do file and remove or comment out the `add wave` entries you do not need.
The framework has been validated using a keyword spotting (KWS) neural network from the QONNX Model Zoo.
Simulation Results:

- Dataset: Google Speech Commands V2 (GSCV2), preprocessed: https://github.com/Xilinx/finn-examples/blob/main/finn_examples/data/all_validation_kws_data_preprocessed_py_speech.zip.link
- Accuracy: 88% (cycle-accurate simulation)
- Tool: ModelSim/Questa

Additional audio samples can be downloaded from the finn-examples repository: https://github.com/Xilinx/finn-examples/blob/main/finn_examples/data/audio_samples.zip.link

Synthesis Results:

- Tool: Xilinx Vivado ML 2025.1
- Target: Zynq Ultrascale+ ZU9EG
- Resource usage (post-synthesis):
  - LUT: 36%
  - LUTRAM: 1%
  - Flip-Flops: 40%
  - BRAM: 5%
  - DSP: 0%
The framework successfully demonstrates bit-accurate equivalence between the QONNX model and synthesized hardware implementation.
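As an illustration, top-1 accuracy could be computed from simulation/data/results.csv along these lines; the `predicted`/`expected` column names are assumptions, and the real CSV layout may differ:

```python
import csv

def accuracy_from_results(path, pred_col="predicted", exp_col="expected"):
    """Compute top-1 accuracy from a results CSV (column names assumed)."""
    total = correct = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            total += 1
            correct += row[pred_col] == row[exp_col]
    return correct / total if total else 0.0
```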
Standard development cycle:

1. Convert the QONNX model:

   ```python
   conv = ConvertQONNX("path/to/model.onnx")
   conv.cleanup_qonnx()
   ```

2. Generate the VHDL network and ROMs:

   ```python
   net_gen = NetworkGenerator("config.json", model_path="network.onnx")
   net_gen.run()
   ```

   Or simply run:

   ```bash
   python usage_example.py
   ```

3. Prepare test stimuli:

   ```python
   stim = StimuliPrep(model_path="network.onnx", config_file="config.json")
   stim.create_data_file()
   ```

4. Run the simulation:

   ```
   cd simulation/scripts
   # In the ModelSim/Questa console:
   do sim_qonnxtest.do
   ```
Inspecting outputs:

- Generated VHDL: source/ directory
- Generated ROMs: source/wb_rom_*.vhd
- Network configuration: simulation/data/*.json
- Simulation results: simulation/data/results.csv
- Waveforms: ModelSim/Questa GUI
QONNX conversion issues:
Problem: Unexpected node order or missing quantization nodes after conversion.
Solution: The converter runs multiple FINN/ONNX transformations. Check the transformation pipeline in ConvertQONNX.cleanup_qonnx and verify your QONNX model is properly quantized.
Precision and signedness errors:
Problem: Incorrect bit-widths or signed/unsigned mismatches in generated VHDL.
Solution: Layer precision flags are extracted from QONNX node attributes in NetworkGenerator.create_config. Verify the UINT*/INT* patterns in your model match expected data types.
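A quick way to sanity-check datatype names of the UINT*/INT* form is a small parser like this (illustrative only, not part of the framework):

```python
import re

def parse_qonnx_dtype(name):
    """Parse a QONNX-style datatype name like 'INT8' or 'UINT4' into
    (signed, bit_width); raise on anything else."""
    m = re.fullmatch(r"(U?)INT(\d+)", name)
    if not m:
        raise ValueError(f"unsupported datatype: {name}")
    return (m.group(1) == "", int(m.group(2)))
```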
Waveform display issues:
Problem: Waveform zoom/cursor settings look incorrect between simulations.
Solution: Some .do files set absolute zoom values (e.g., WaveRestoreZoom {... ps}). Edit the wave script to use relative cursors or remove the zoom command. Example: simulation/scripts/wave_padding_interface.do
Simulation compilation errors:
Problem: VHDL compilation fails in ModelSim/Questa.
Solution:

- Ensure all ROMs are generated: check for source/wb_rom_*.vhd files
- Verify that the correct DO script is used for your network architecture
- Check that all dependencies in source/ are present
Python import errors:
Problem: Cannot import QONNX or FINN modules.
Solution:

- Verify the FINN installation: `python -c "import finn"`
- Check the Python version: `python --version` (should be ~3.12)
- Reinstall dependencies: `pip install -e .`
Contributions are welcome! Here's how to extend the framework:
Before contributing:
- Install dev dependencies: `pip install -e ".[dev]"`
- Format your code before committing: `make fmt`
- Verify formatting: `make check`
Adding new layer types:
- Implement the VHDL entity in source/
- Add the layer logic to `NetworkGenerator`
- Create a testbench in simulation/
- Add a corresponding .do script in simulation/scripts/
- Format your code: `make fmt`
Adding tests:
- Place testbench VHDL files in simulation/
- Create simulation scripts in simulation/scripts/
- Add Python test scripts for stimuli generation
- Follow the code style guidelines (enforced by formatters)
Code Standards:
- Python: Follow PEP 8 (enforced by Ruff)
- VHDL: 2-space indentation, vhdl-beautify compliant
- Line length: 100 characters max
- All files: UTF-8 encoding, LF line endings
- See Code Standards & Formatting for details
Improving automation:
- CLI interface for common workflows
- Automated test runner for regression testing
- Integration with CI/CD pipelines
- Pre-commit hooks for automatic formatting
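A minimal CLI along those lines could be sketched with argparse; the subcommands and flags below are purely illustrative, not an existing interface:

```python
import argparse

def build_parser():
    """Hypothetical CLI skeleton for the workflows in this README."""
    p = argparse.ArgumentParser(prog="synapse")
    sub = p.add_subparsers(dest="command", required=True)

    conv = sub.add_parser("convert", help="clean up a QONNX model")
    conv.add_argument("model", help="path to the .onnx file")

    gen = sub.add_parser("generate", help="generate VHDL network and ROMs")
    gen.add_argument("config", help="path to the JSON configuration")
    gen.add_argument("--model", default="network.onnx",
                     help="cleaned QONNX model to read")

    return p
```

Usage would then look like `synapse generate config.json --model network.onnx`.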
Current status: This framework is provided as-is for research and educational purposes.
Known limitations:
- Not all layer type combinations have been extensively tested
- Some layer configurations may produce unexpected results
- Limited error handling in the generation pipeline
- FINN dependency requires Docker setup
Tested configurations:
- Keyword spotting network (KWS) - fully validated
- Dense + threshold layers - validated
Intended use: This framework provides a starting point for QONNX-to-VHDL conversion and demonstrates a minimal working implementation for neural network acceleration on FPGAs. Further development and testing are expected for production use.
- source/README.md - Detailed VHDL architecture documentation
- simulation/README.md - Simulation directory structure
- usage_example.py - Complete workflow example
- convert_qonnx.py - `ConvertQONNX` class
- QONNX_Network_gen.py - `NetworkGenerator` class
- prepare_stim.py - `StimuliPrep` class
- new_config_data.py - `ConfigReader` class
- helper.py - Utility functions
- string_templates.py - VHDL generation templates
- QONNX - Quantized ONNX framework
- FINN - Framework for quantized neural networks on FPGAs
- QONNX Model Zoo - Pre-trained quantized models
- Google Speech Commands Dataset - Audio dataset for keyword spotting
