A fully documented, production‑grade README for an educational neural‑network forward‑pass implementation using NumPy and the nnfs spiral dataset. This document thoroughly describes the architecture, workflow, stability techniques, design decisions, and recommended extensions.
This project implements a clean, minimal, and pedagogically structured neural‑network forward pass. It demonstrates two fully connected layers, ReLU nonlinearity, Softmax classification, and categorical cross‑entropy loss — all built from scratch using NumPy to illustrate core deep‑learning mechanics.
The objective is clarity, readability, and professional documentation suitable for academic submissions, interviews, and portfolio demonstration.
- Dense (fully connected) neural‑network layers
- ReLU activation with efficient vectorized implementation
- Softmax activation with industry‑standard numerical stability corrections
- Categorical cross‑entropy loss with support for both integer and one‑hot encoded labels
- Deterministic behaviour using controlled random seeds
- Fully modular architecture for future extension into a trainable model
| Component | Purpose |
|---|---|
| Python 3.8+ | Core language |
| NumPy | Linear algebra, matrix operations |
| nnfs package | Pre‑built spiral dataset for classification demonstration |
Install dependencies:
```bash
pip install numpy nnfs
```

Suggested layout:

```text
project_root/
│
├── simple_nn.py        # Main neural‑network script
├── README.md           # Documentation (this file)
└── requirements.txt    # Optional dependency list
```
```python
nnfs.init()
np.random.seed(0)
```

- Ensures reproducibility during development and demonstrations.
- Prevents inconsistent results across runs.
```python
self.output = np.dot(inputs, self.weights) + self.biases
```

Purpose: Implements the affine transformation (XW + b), fundamental to neural networks.

Design choices:
- Weights drawn from N(0, 0.1) maintain reasonable starting activation ranges.
- Biases initialized to zero (industry standard for dense layers).
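These design choices can be wrapped in a small layer class. This is a sketch: the class name `Layer_Dense`, the constructor signature, and the 0.10 scale factor are assumptions following the common nnfs-style convention, not necessarily the exact code in `simple_nn.py`.

```python
import numpy as np

class Layer_Dense:
    """Fully connected layer computing output = X·W + b."""

    def __init__(self, n_inputs, n_neurons):
        # Small random weights keep initial activations in a reasonable range
        self.weights = 0.10 * np.random.randn(n_inputs, n_neurons)
        # Zero biases are the standard default for dense layers
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        # Affine transformation: each row of inputs is mapped to n_neurons outputs
        self.output = np.dot(inputs, self.weights) + self.biases
```

Because the bias row broadcasts across the batch dimension, the same layer handles any batch size without changes.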
```python
self.output = np.maximum(0, inputs)
```

- Efficient, vectorized, and resistant to vanishing gradients.
- Introduces nonlinearity, allowing the network to learn non‑linear decision boundaries.
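The activation can likewise be expressed as a small class (the name `Activation_ReLU` is an assumption, chosen to match the layer-class style above):

```python
import numpy as np

class Activation_ReLU:
    """Rectified linear unit, applied element-wise."""

    def forward(self, inputs):
        # Negative values are zeroed; non-negative values pass through unchanged
        self.output = np.maximum(0, inputs)
```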
```python
exp_values = np.exp(inputs - np.max(inputs, axis=1, keepdims=True))
probabilities = exp_values / np.sum(exp_values, axis=1, keepdims=True)
```

Professional‑grade numerical stability:
- Subtracting the row‑wise maximum prevents overflow in the exponential function.
- Produces reliable probability distributions even on high‑magnitude logits.
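The max-subtraction trick can be verified directly: shifting every logit in a row by a constant leaves the softmax output mathematically unchanged, while naive exponentiation of large logits would overflow to `inf`. A quick sketch:

```python
import numpy as np

def softmax(logits):
    # After subtracting the row maximum, exp() never sees a positive
    # argument, so it cannot overflow; the result is identical because
    # the shift cancels in the normalization.
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp_values = np.exp(shifted)
    return exp_values / np.sum(exp_values, axis=1, keepdims=True)

# These logits would overflow np.exp() without the shift
logits = np.array([[1000.0, 1001.0, 1002.0]])
probs = softmax(logits)
```

`probs` is finite, sums to 1, and equals `softmax([[0, 1, 2]])`, since only differences between logits matter.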
Supports both labeling schemes:
- Class indices: `[0, 2, 1, ...]`
- One‑hot vectors: `[[1,0,0], [0,1,0], ...]`
Uses clipping to avoid undefined logarithmic values:

```python
y_pred_clipped = np.clip(y_pred, 1e-7, 1 - 1e-7)
```

Computes the negative log‑likelihood, a standard measure of classification confidence.
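One way to support both labeling schemes in a single loss function is to branch on the dimensionality of `y_true`. This is a sketch using a common NumPy idiom; the function name is illustrative, not necessarily the one used in `simple_nn.py`:

```python
import numpy as np

def categorical_crossentropy(y_pred, y_true):
    samples = len(y_pred)
    # Clip to avoid log(0) and values pushed past 1 by floating-point error
    y_pred_clipped = np.clip(y_pred, 1e-7, 1 - 1e-7)

    if y_true.ndim == 1:
        # Integer class indices: pick the predicted confidence per sample
        correct_confidences = y_pred_clipped[range(samples), y_true]
    else:
        # One-hot rows: the element-wise product keeps only the true class
        correct_confidences = np.sum(y_pred_clipped * y_true, axis=1)

    # Mean negative log-likelihood over the batch
    return np.mean(-np.log(correct_confidences))
```

Both label formats yield the same loss for equivalent inputs, which makes the function convenient for demonstrations.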
- Generate dataset (spiral, non‑linearly separable)
- Pass through Layer 1 (Dense)
- Apply ReLU activation
- Pass through Layer 2 (Dense)
- Apply Softmax activation
- Compute loss against true labels
This sequence mirrors the forward pass performed by major frameworks such as TensorFlow and PyTorch.
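The six steps above can be sketched end to end. To keep this example self-contained it replaces the nnfs spiral dataset with random points (an assumption for illustration; in the actual script, `spiral_data` supplies `X` and `y`), and the variable names `W1`, `b1`, etc. are hypothetical:

```python
import numpy as np

np.random.seed(0)

# Stand-in for the spiral dataset: 100 random 2-D points, 3 classes
X = np.random.randn(100, 2)
y = np.random.randint(0, 3, size=100)

# Step 2: Layer 1 (Dense), 2 inputs -> 3 neurons
W1 = 0.10 * np.random.randn(2, 3)
b1 = np.zeros((1, 3))
z1 = np.dot(X, W1) + b1

# Step 3: ReLU activation
a1 = np.maximum(0, z1)

# Step 4: Layer 2 (Dense), 3 inputs -> 3 class logits
W2 = 0.10 * np.random.randn(3, 3)
b2 = np.zeros((1, 3))
z2 = np.dot(a1, W2) + b2

# Step 5: Softmax with overflow protection
exp_values = np.exp(z2 - np.max(z2, axis=1, keepdims=True))
probs = exp_values / np.sum(exp_values, axis=1, keepdims=True)

# Step 6: Categorical cross-entropy against integer labels
clipped = np.clip(probs, 1e-7, 1 - 1e-7)
loss = np.mean(-np.log(clipped[range(len(X)), y]))
```

With small initial weights the probabilities start near uniform, so the loss lands close to ln(3) regardless of the labels.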
`activation2.output[:5]` produces the first five probability vectors.
Each vector:
- contains 3 values (since there are 3 classes),
- sums to 1,
- reflects the model’s confidence for each class.
The printed loss typically falls within a reasonable range (≈1.0–1.5 for untrained networks).
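That range follows directly from the math: an untrained network with small weights outputs near-uniform probabilities, so the per-sample loss is close to -ln(1/3) for three classes. A quick check:

```python
import numpy as np

n_classes = 3
# A perfectly uniform prediction assigns 1/3 to every class
uniform_pred = np.full((1, n_classes), 1.0 / n_classes)
# Negative log-likelihood of the true class under a uniform prediction
loss = -np.log(uniform_pred[0, 0])
# loss = ln(3) ≈ 1.0986, the expected starting point before training
```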
This implementation includes all industry‑required stability safeguards:
- Softmax overflow prevention
- Clipping before logarithmic operations
- Controlled weight magnitude
- Deterministic random seeds
These practices ensure the model behaves predictably and avoids catastrophic numerical errors.
If evolving this into a full training system:
- Implement derivative propagation (backpropagation) for each layer and activation.
- Compute gradients: `dweights`, `dbiases`, `dinputs`.
- Add an optimizer:
  - SGD (with momentum)
  - RMSProp
  - Adam (industry standard)
- Add regularization:
  - L2 weight decay
  - Dropout
  - Batch normalization
- Add visualizations:
  - Decision boundaries
  - Loss curves
  - Accuracy curves
This project is released under the MIT License — free for education, research, and commercial modification.
~Varun D Soni