BLADEI: Bitstream-Level Abnormality Detection for Embedded Inference

Copyright (c) 2025, Rye Stahle-Smith

📌 Project Overview

This repository contains an embedded deployment pipeline for detecting malicious FPGA bitstreams using a trained machine learning (ML) model. Bitstreams are configuration files that can be weaponized to introduce hardware Trojans, posing serious risks in shared or cloud-hosted reconfigurable systems. This project leverages a lightweight, byte-level classification approach and enables on-device malware detection for PYNQ-supported FPGA boards, without requiring reverse engineering techniques or access to original source code or netlists. Benchmark designs, including AES-128 and RS232 variants, were obtained from Trust-Hub, then synthesized, implemented, and categorized as benign, malicious, or empty .bit files

⚙️ Features

🔍 Byte-frequency analysis of binary .bit files
🧩 Lightweight byte-level + statistical feature extraction
📊 Real-time inference using a trained Random Forest with a custom, dependency-light predictor
⚡ Deployment-ready for ARMv7 (e.g., PYNQ-Z1/Z2, Zynq-7000 SoC) and ARMv8 (e.g., Zynq UltraScale+ MPSoC, RFSoC, Kria) boards
🧪 Verified with state-of-the-art (SOTA) bitstreams derived from Trust-Hub benchmarks

📂 Repository Structure

pynq-maldetect/
├── trusthub_bitstreams/ # Sample .bit files (Benign, Malicious, Empty)
├── model_components/ # Quantized ML model components
├── train_model.py # Model training and export for PYNQ
├── deploy_model.py # Model deployment for on-device inference
├── requirements.txt # Python dependencies
├── LICENSE.md
└── README.md

🚀 Setup Instructions

This project is divided into two parts:

🧠 Model Training and Export
⚙️ On-Device Inference

🧠 `train_model.py` — Model Training and Export

Requirements:

Python 3.8+

Python Packages: scikit-learn, numpy, scipy

⚠️ Note: Training should be performed on a general-purpose machine (laptop, workstation, or server) for both ARMv7 and ARMv8 targets. While some ARMv8 boards may be capable of training, it is not the recommended workflow. Training is heavier, package availability can be inconsistent, and it’s typically slower and less reproducible than running on a PC.

Clone the Repository:

git clone https://github.com/Bread2002/PYNQ_BLADEI.git
cd PYNQ_BLADEI

Install Dependencies:
```
pip install -r requirements.txt
```
Run the Training Script:
```
python train_model.py
```

Features:

Byte-frequency feature extraction from .bit files (256-dimensional normalized histogram)
Statistical augmentation features (e.g., mean, std, skew, kurtosis, entropy, density metrics)
Training and evaluation using k-Fold Cross-Validation
Best-performing model exported as compact artifacts (JSON + NumPy arrays) and bundled into a .tar.gz archive for PYNQ deployment

⚙️ `deploy_model.py` — On-Device Inference

Requirements:

A supported FPGA board with PYNQ v3.1

Quantized model components (via on-board training or exported archive)

Import the Archive to your PYNQ board via Jupyter Notebook or SSH/SFTP

Decompress the Archive:

mkdir PYNQ_BLADEI
tar -xvzf PYNQ_BLADEI.tar.gz -C ./PYNQ_BLADEI
rm PYNQ_BLADEI.tar.gz
cd PYNQ_BLADEI

Run the Deployment Script:
```
python deploy_model.py
```

Features:

Loads .bit files from local storage
Extracts byte-frequency + statistical features
Predicts class (Benign, Malicious, or Empty) using the trained model
Displays prediction result with latency breakdown:
- Load time
- Feature extraction time
- Inference time
Quarantines suspicious bitstreams

📈 Sample Output of Mock Deployment Pipeline

Benign Bitstream

======= BLADEI Vetting: =======
Processing bitstream: AES-T2100_TjFree_20251218_085702.bit

Actual Class: Benign AES (Class 1)
Predicted Class: Benign AES (Class 1) [94.67% Confidence]

ACTION: Bitstream passed vetting. Proceed to deployment.

======= Latency Summary: =======
Load Bitstream: 24.14 ms
Feature Extraction: 6124.15 ms
Prediction: 69.33 ms

Total Latency: 6.22 s

======= System Information: =======
System: Linux
Node Name: pynq
Release: 6.6.10-xilinx-v2024.1-g08e597ec1786
Version: #1 SMP PREEMPT Sat Apr 27 05:22:24 UTC 2024
Machine: armv7l
Processor: armv7l

======= CPU Information: =======
CPU Cores: 2
Logical Processors: 2
CPU Usage per Core: [0.4, 99.9]
Total RAM: 491.6640625 MB

Malicious Bitstream

======= BLADEI Vetting: =======
Processing bitstream: AES-T500_TjIn_20251218_163136.bit

Actual Class: Malicious AES (Class 3)
Predicted Class: Malicious AES (Class 3) [90.00% Confidence]

ACTION: Bitstream quarantined -> ./mock_deployment/Quarantine/AES-T500_TjIn_20251218_163136.bit
ACTION: Deployment blocked.

======= Latency Summary: =======
Load Bitstream: 105.69 ms
Feature Extraction: 6105.57 ms
Prediction: 68.96 ms

Total Latency: 6.28 s

======= System Information: =======
System: Linux
Node Name: pynq
Release: 6.6.10-xilinx-v2024.1-g08e597ec1786
Version: #1 SMP PREEMPT Sat Apr 27 05:22:24 UTC 2024
Machine: armv7l
Processor: armv7l

======= CPU Information: =======
CPU Cores: 2
Logical Processors: 2
CPU Usage per Core: [0.5, 98.8]
Total RAM: 491.6640625 MB

Empty Bitstream

======= BLADEI Vetting: =======
Processing bitstream: empty2_Empty_20251219_013937.bit

Actual Class: Empty (Class 0)
Predicted Class: Empty (Class 0) [70.67% Confidence]

ACTION: Bitstream quarantined -> ./mock_deployment/Quarantine/empty2_Empty_20251219_013937.bit
ACTION: Deployment blocked.

======= Latency Summary: ======= Load Bitstream: 178.80 ms
Feature Extraction: 6140.21 ms
Prediction: 58.70 ms

Total Latency: 6.38 s

======= System Information: =======
System: Linux
Node Name: pynq
Release: 6.6.10-xilinx-v2024.1-g08e597ec1786
Version: #1 SMP PREEMPT Sat Apr 27 05:22:24 UTC 2024
Machine: armv7l
Processor: armv7l

======= CPU Information: =======
CPU Cores: 2
Logical Processors: 2
CPU Usage per Core: [0.7, 98.1]
Total RAM: 491.6640625 MB

🤝 Acknowledgments

The authors were pleased to have this work accepted for presentation at the 37th annual ACM/ IEEE Supercomputing Conference. This work was supported by the McNair Junior Fellowship and Office of Undergraduate Research at the University of South Carolina. OpenAl's ChatGPT assisted with language and grammar correction. While this project utilizes benchmark designs from Trust-Hub, a resource sponsored by the National Science Foundation (NSF), all technical content and analysis were independently developed by the authors. This research also utilized PYNQ, provided by AMD and Xilinx, whose tools and hardware facilitated the synthesis and deployment stages of this study. Access to the FPGA devices was made possible through the AMD University Program.

🛠️ Future Work

Expand the current dataset with more SOTA benchmarks (ISCAS'85, ISCAS'89, ITC'02, and ITC'99)
Add a CNN-based image classification model to authenticate ML predictions
~~Implement a mock cloud-to-edge bitstream deployment pipeline~~
~~Improve detection latency with quantized models~~
~~Expand support for additional FPGA boards~~

🖊️ References

AMD. (2024). PYNQ: Python Productivity for Zynq. Retrieved from https://www.pynq.io

Benz, F., Seffrin, A., & Huss, S. A. (2012). BIL: A Tool-Chain for Bitstream Reverse-Engineering. Proceedings of the IEEE International Conference on Field Programmable Logic and Applications (FPL), 735–738. IEEE.

Chawla, N., Bowyer, K., Hall, L., & Kegelmeyer, W. (2002). SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16, 321–357.

Elnaggar, R., & Chakrabarty, K. (2018). Machine Learning for Hardware Security: Opportunities and Risks. Journal of Electronic Testing, 34(2), 183–201.

Elnaggar, R., Chaudhuri, J., Karri, R., & Chakrabarty, K. (2023). Learning Malicious Circuits in FPGA Bitstreams. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42(3), 726–739. Retrieved from https://ieeexplore.ieee.org/document/9828544/

Hayashi, V. T., & Ruggiero, W. V. (2025). Hardware Trojan Detection in Open-Source Hardware Designs Using Machine Learning. IEEE Access. Retrieved from https://ieeexplore.ieee.org/document/10904479/

Imbalanced-learn Developers. (2024). SMOTE. Retrieved from https://bit.ly/3IXc0l7

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., … Duchesnay, E. (2011). scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825–2830. Retrieved from https://dl.acm.org/doi/10.5555/1953048.2078195

Salmani, H., Tehranipoor, M., & Karri, R. (2013). On Design Vulnerability Analysis and Trust Benchmark Development. Proceedings of the IEEE International Conference on Computer Design (ICCD). IEEE.

Scikit-learn Developers. (2025a). Cross Validation. Retrieved from https://bit.ly/3gct8QG

Scikit-learn Developers. (2025b). Truncated SVD. Retrieved from https://bit.ly/4mmi4BT

Seo, Y., Yoon, J., Jang, J., Cho, M., Kim, H.-K., & Kwon, T. (2018). Poster: Towards Reverse Engineering FPGA Bitstreams for Hardware Trojan Detection. Proceedings of the Network and Distributed System Security Symposium (NDSS), 18–21. Internet Society.

Shakya, B., He, T., Salmani, H., Forte, D., Bhunia, S., & Tehranipoor, M. (2017). Benchmarking of Hardware Trojans and Maliciously Affected Circuits. Journal of Hardware and Systems Security.

Yoon, J., Seo, Y., Jang, J., Cho, M., Kim, J., Kim, H., & Kwon, T. (2018). A Bitstream Reverse Engineering Tool for FPGA Hardware Trojan Detection. Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2318–2320. Presented at the Toronto, Canada. doi:10.1145/3243734.3278487

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

BLADEI: Bitstream-Level Abnormality Detection for Embedded Inference

Copyright (c) 2025, Rye Stahle-Smith

📌 Project Overview

⚙️ Features

📂 Repository Structure

🚀 Setup Instructions

🧠 `train_model.py` — Model Training and Export

Features:

⚙️ `deploy_model.py` — On-Device Inference

Features:

📈 Sample Output of Mock Deployment Pipeline

Benign Bitstream

Malicious Bitstream

Empty Bitstream

🤝 Acknowledgments

🛠️ Future Work

🖊️ References

About

Uh oh!

Releases 4

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 121 Commits
model_components		model_components
trusthub_bitstreams		trusthub_bitstreams
LICENSE.md		LICENSE.md
README.md		README.md
deploy_model.py		deploy_model.py
requirements.txt		requirements.txt
train_model.py		train_model.py

License

Bread2002/PYNQ_BLADEI

Folders and files

Latest commit

History

Repository files navigation

BLADEI: Bitstream-Level Abnormality Detection for Embedded Inference

Copyright (c) 2025, Rye Stahle-Smith

📌 Project Overview

⚙️ Features

📂 Repository Structure

🚀 Setup Instructions

🧠 train_model.py — Model Training and Export

Features:

⚙️ deploy_model.py — On-Device Inference

Features:

📈 Sample Output of Mock Deployment Pipeline

Benign Bitstream

Malicious Bitstream

Empty Bitstream

🤝 Acknowledgments

🛠️ Future Work

🖊️ References

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 4

Packages 0

Languages

🧠 `train_model.py` — Model Training and Export

⚙️ `deploy_model.py` — On-Device Inference

Packages