
Neural Network Forward Pass — Professional Documentation

A fully documented, professional-grade README for an educational neural-network forward-pass implementation built with NumPy and the nnfs spiral dataset. It describes the architecture, workflow, numerical-stability techniques, design decisions, and recommended extensions.


1. Overview

This project implements a clean, minimal, and pedagogically structured neural‑network forward pass. It demonstrates two fully connected layers, ReLU nonlinearity, Softmax classification, and categorical cross‑entropy loss — all built from scratch using NumPy to illustrate core deep‑learning mechanics.

The objective is clarity, readability, and professional documentation suitable for academic submissions, interviews, and portfolio demonstration.


2. Key Capabilities

  • Dense (fully connected) neural‑network layers
  • ReLU activation with efficient vectorized implementation
  • Softmax activation with industry‑standard numerical stability corrections
  • Categorical cross‑entropy loss with support for both integer and one‑hot encoded labels
  • Deterministic behaviour using controlled random seeds
  • Fully modular architecture for future extension into a trainable model

3. Technology Stack

Component      Purpose
-------------  ----------------------------------------------------------
Python 3.8+    Core language
NumPy          Linear algebra and matrix operations
nnfs package   Pre-built spiral dataset for classification demonstration

Install dependencies:

pip install numpy nnfs

4. Project Structure

Suggested layout:

project_root/
│
├── simple_nn.py        # Main neural‑network script
├── README.md           # Documentation (this file)
└── requirements.txt    # Optional dependency list

5. Step‑by‑Step Conceptual Breakdown

5.1 Initialization and Determinism

nnfs.init()
np.random.seed(0)
  • nnfs.init() sets the NumPy random seed (and a float32 default dtype), making the spiral data and weight draws reproducible.
  • The explicit np.random.seed(0) reinforces determinism, so repeated runs produce identical results.

5.2 Dense Layer — Core Linear Transformation

self.output = np.dot(inputs, self.weights) + self.biases

Purpose: Implements the affine transformation (XW + b), fundamental to neural networks.

Design choices:

  • Weights are drawn from a zero-mean Gaussian with a small spread (on the order of 0.1), keeping starting activations in a reasonable range.
  • Biases are initialized to zero, the standard choice for dense layers.
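
For context, here is a minimal sketch of what the full layer class might look like, consistent with the forward line above. The class and attribute names (Layer_Dense, weights, biases, output) follow common nnfs-style conventions and are assumptions, not necessarily the names used in simple_nn.py:

import numpy as np

class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        # Small zero-mean Gaussian weights keep early activations in range
        self.weights = 0.10 * np.random.randn(n_inputs, n_neurons)
        # Biases start at zero, the standard choice for dense layers
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        # Affine transformation XW + b; inputs are stored for the
        # backward-pass extension sketched in Section 9.1
        self.inputs = inputs
        self.output = np.dot(inputs, self.weights) + self.biases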

5.3 ReLU Activation — Nonlinearity

self.output = np.maximum(0, inputs)
  • Efficient, vectorized, and less prone to vanishing gradients than saturating activations such as sigmoid or tanh.
  • Introduces the nonlinearity that lets the network learn non-linear decision boundaries.
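
Wrapped in the same class style as the dense layer (Activation_ReLU is an assumed name), the activation is a one-liner:

class Activation_ReLU:
    def forward(self, inputs):
        # Element-wise max(0, x): negatives are zeroed, positives pass through
        self.output = np.maximum(0, inputs)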

5.4 Softmax Activation — Probabilistic Output Layer

exp_values = np.exp(inputs - np.max(inputs, axis=1, keepdims=True))
probabilities = exp_values / np.sum(exp_values, axis=1, keepdims=True)

Professional‑grade numerical stability:

  • Subtracting the row‑wise maximum prevents overflow in the exponential function.
  • Produces reliable probability distributions even on high‑magnitude logits.
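
Below is a sketch of the corresponding activation class, plus a quick demonstration of why the max-subtraction matters (Activation_Softmax is an assumed name):

import numpy as np

class Activation_Softmax:
    def forward(self, inputs):
        # Shift so the largest exponent is exp(0) = 1, preventing overflow
        exp_values = np.exp(inputs - np.max(inputs, axis=1, keepdims=True))
        self.output = exp_values / np.sum(exp_values, axis=1, keepdims=True)

# Without the shift, np.exp(1000.0) overflows to inf and yields nan outputs;
# with it, large logits still produce a valid distribution:
softmax = Activation_Softmax()
softmax.forward(np.array([[1000.0, 1001.0, 1002.0]]))
print(softmax.output)   # approx. [[0.0900 0.2447 0.6652]], rows sum to 1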

5.5 Categorical Cross‑Entropy Loss

Supports both labeling schemes:

  • Class indices: [0, 2, 1, ...]
  • One‑hot vectors: [[1,0,0], [0,1,0], ...]

Uses clipping to avoid undefined logarithmic values:

y_pred_clipped = np.clip(y_pred, 1e-7, 1 - 1e-7)

Computes negative log‑likelihood — a standard measure of classification confidence.
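
A minimal sketch of a loss class supporting both label schemes, assuming NumPy arrays as inputs (the name Loss_CategoricalCrossentropy and the method name calculate are assumptions):

import numpy as np

class Loss_CategoricalCrossentropy:
    def calculate(self, y_pred, y_true):
        samples = len(y_pred)
        # Clip to keep log() defined and bounded at both ends
        y_pred_clipped = np.clip(y_pred, 1e-7, 1 - 1e-7)
        if len(y_true.shape) == 1:
            # Integer labels: pick each sample's predicted true-class probability
            correct_confidences = y_pred_clipped[range(samples), y_true]
        else:
            # One-hot labels: mask out all but the true class, then sum per row
            correct_confidences = np.sum(y_pred_clipped * y_true, axis=1)
        # Mean negative log-likelihood over the batch
        return np.mean(-np.log(correct_confidences))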


6. Full Forward‑Pass Workflow

  1. Generate dataset (spiral, non‑linearly separable)
  2. Pass through Layer 1 (Dense)
  3. Apply ReLU activation
  4. Pass through Layer 2 (Dense)
  5. Apply Softmax activation
  6. Compute loss against true labels

This sequence mirrors the forward pass performed by major frameworks such as TensorFlow and PyTorch.
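
Putting the class sketches from Sections 5.2–5.5 together, the whole pipeline might be wired up like this (layer sizes follow the 3-class spiral dataset; the exact names in simple_nn.py may differ):

import nnfs
from nnfs.datasets import spiral_data

nnfs.init()

X, y = spiral_data(samples=100, classes=3)   # 300 points, 2 features, 3 classes

dense1 = Layer_Dense(2, 3)                   # 2 input features -> 3 neurons
activation1 = Activation_ReLU()
dense2 = Layer_Dense(3, 3)                   # 3 neurons -> 3 class scores
activation2 = Activation_Softmax()
loss_function = Loss_CategoricalCrossentropy()

dense1.forward(X)
activation1.forward(dense1.output)
dense2.forward(activation1.output)
activation2.forward(dense2.output)

print(activation2.output[:5])                # first five probability rows
print("loss:", loss_function.calculate(activation2.output, y))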


7. Example Output Interpretation

activation2.output[:5] produces the first five probability vectors.

Each vector:

  • contains 3 values (since there are 3 classes),
  • sums to 1,
  • reflects the model’s confidence for each class.

For an untrained 3-class network whose outputs are close to uniform, the loss should sit near -ln(1/3) ≈ 1.10; printed values in the ≈1.0–1.5 range are typical.
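
As a quick sanity check of that range, plain NumPy suffices:

import numpy as np

# An untrained 3-class model outputs roughly uniform probabilities,
# so the expected cross-entropy is -log(1/3)
print(-np.log(1.0 / 3.0))   # ~1.0986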


8. Professional Notes on Numerical Stability

This implementation includes the standard numerical-stability safeguards:

  • Softmax overflow prevention
  • Clipping before logarithmic operations
  • Controlled weight magnitude
  • Deterministic random seeds

These practices ensure the model behaves predictably and avoids catastrophic numerical errors.


9. Recommended Professional Extensions

If evolving this into a full training system:

9.1 Add Backpropagation:

  • Implement derivative propagation for each layer and activation.
  • Compute gradients: dweights, dbiases, dinputs (see the sketch below).
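
For example, the Layer_Dense sketch from Section 5.2 could gain a backward method along these lines (illustrative only; it relies on forward() having stored self.inputs):

class Layer_Dense:
    # __init__ and forward as sketched in Section 5.2

    def backward(self, dvalues):
        # dvalues: gradient of the loss w.r.t. this layer's output
        self.dweights = np.dot(self.inputs.T, dvalues)          # dL/dW
        self.dbiases = np.sum(dvalues, axis=0, keepdims=True)   # dL/db
        self.dinputs = np.dot(dvalues, self.weights.T)          # gradient for the previous layer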

9.2 Add Optimizers:

  • SGD (with momentum)
  • RMSProp
  • Adam (industry standard)

9.3 Add Regularization:

  • L2 weight decay
  • Dropout
  • Batch normalization

9.4 Visualization:

  • Decision boundaries
  • Loss curves
  • Accuracy curves

10. Licensing

This project is released under the MIT License — free for education, research, and commercial modification.


~Varun D Soni
