🧠 Multilayer Perceptron — Built From First Principles (NumPy Only)

This repository contains a from-scratch implementation of a Multilayer Perceptron (MLP) using only NumPy, built with the goal of demystifying how modern deep learning frameworks work under the hood.

Instead of treating backpropagation, optimization, batch normalization, and dropout as black boxes, this project exposes their explicit mathematical formulation and vectorized implementation, end-to-end.

❌ Not meant to replace PyTorch / TensorFlow
✅ Meant to help you deeply understand them

🎯 Who Is This For?

This project is intended for:

  • Students learning neural networks from first principles
  • Engineers who use PyTorch / TensorFlow and want to understand what happens behind .backward()
  • Learners transitioning from math → implementation
  • Anyone curious about how Adam, BatchNorm, Dropout, and regularization are actually implemented

If your goal is intuition + implementation, this repo is for you.

💡 Motivation

I come from a mechanical engineering background, not software.

Modern AI and large language models sparked my curiosity about the mathematics governing neural networks, which led me to implement an MLP framework from scratch to bridge the gap between theory and code.

This project represents that learning journey.

🚀 Features Implemented

  • Fully vectorized forward and backward propagation
  • Declarative network definition (layer-wise configuration)
  • Automatic loss selection based on output activation
  • Weight initialization auto-selected based on activation
  • Batch Normalization (applied before activation)
  • Dropout regularization
  • L2 regularization
  • Optimizers:
    • Stochastic Gradient Descent (SGD)
    • Adam (Momentum + RMSProp; a minimal update sketch follows this list)
  • Mini-batch training
  • Task-aware evaluation metrics:
    • Regression → MSE
    • Binary classification → Accuracy, Precision, Recall, F1
    • Multiclass classification → Accuracy, Precision, Recall, F1
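
Adam combines Momentum (an exponentially decaying average of gradients) with RMSProp (an exponentially decaying average of squared gradients). Below is a minimal NumPy sketch of a single Adam update with the usual defaults; the function and variable names are illustrative, not the repo's internals.

import numpy as np

def adam_update(W, dW, v, s, t, lr=0.01, beta1=0.9, beta2=0.999, epsilon=1e-8):
    # Momentum: exponentially decaying average of past gradients
    v = beta1 * v + (1 - beta1) * dW
    # RMSProp: exponentially decaying average of past squared gradients
    s = beta2 * s + (1 - beta2) * dW ** 2
    # Bias correction (t is the 1-indexed update step)
    v_hat = v / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    # Scale the step per-parameter by the gradient's running magnitude
    W = W - lr * v_hat / (np.sqrt(s_hat) + epsilon)
    return W, v, s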

⚠️ Important Notes

  • Input shape is expected as (features, number of examples)
    (unlike most modern frameworks, which use (examples, features); a reshape sketch follows this list)
  • Code is intentionally not PEP8-compliant
    (Capital letters are used to visually distinguish matrices)
  • This is a learning-oriented framework, not a production library
  • Learning roadmap followed: AssemblyAI ML Study Guide
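
As a concrete example, here is how flattened MNIST images could be brought into the (features, examples) layout. This is a sketch assuming the TensorFlow loader used by the test notebooks; the notebooks' exact preprocessing may differ.

import numpy as np
from tensorflow.keras.datasets import mnist

(x_train, y_train), _ = mnist.load_data()        # x_train: (60000, 28, 28)
x_train_reshape = x_train.reshape(-1, 784).T     # -> (784, 60000), one example per column
x_train_reshape = x_train_reshape / 255.0        # scale pixel values to [0, 1]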

🧩 Network Definition

The network architecture is defined using a list of tuples, one per layer.

General format

(neurons, batchnorm, activation, keep_prob)

layer_construction = [
    (784, False, 'ReLu',    1),
    (128, True,  'ReLu',    0.8),
    (64,  True,  'ReLu',    0.8),
    (10,  False, 'softmax', 1),
]

Notes

  • Input layer parameters are placeholders for structural consistency.
  • Output layer defaults: batchnorm = False, keep_prob = 1.

🔧 Supported Activations

'ReLu' (case-sensitive 😅)
'sigmoid'
'tanh'
'softmax'
'Linear' (for regression output)
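
A minimal NumPy sketch of these activations is shown below; the signatures are illustrative, and the repo's own versions (with their gradients) live in Utility.py.

import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def tanh(Z):
    return np.tanh(Z)

def softmax(Z):
    # Subtract the per-column max for numerical stability;
    # columns are examples in the (features, examples) layout.
    E = np.exp(Z - Z.max(axis=0, keepdims=True))
    return E / E.sum(axis=0, keepdims=True)

def linear(Z):
    return Z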

Output Layer Constraints

Activation | Loss Function
softmax    | Categorical Cross-Entropy
sigmoid    | Binary Cross-Entropy
Linear     | Mean Squared Error

Loss is auto-selected based on the output activation.
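
One way such auto-selection can be expressed is a simple activation-to-loss mapping that mirrors the table above; the dictionary below is an illustrative sketch, not the repo's code.

loss_for_activation = {
    'softmax': 'categorical cross-entropy',
    'sigmoid': 'binary cross-entropy',
    'Linear':  'mean squared error',
}

output_activation = layer_construction[-1][2]  # activation entry of the last layer tuple
loss_name = loss_for_activation[output_activation]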

🏗 Creating the Model

MLP = Neural_net.Multilayerperceptron(
    epoch=4,
    lr=0.01,
    cost_history=True,
    Regularization=False,
    lamda=0.001,
    layer_construct=layer_construction,
    minibatch_size=64,
    adam=True,
)

Configurable Options

Learning rate
Number of epochs
Mini-batch size
L2 regularization (lamda)
Optimizer selection (Adam / SGD)
Adam hyperparameters: beta1, beta2, epsilon

▶️ Training & Prediction

Train

MLP.fit(x_train_reshape, y_train_reshape)   

Predict probabilities

A_pred = MLP.predict(x_train_reshape)

Predict classes (classification only)

P_pred = MLP.predict_class(x_train_reshape)
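
For a softmax output of shape (classes, examples), class prediction presumably reduces to a column-wise argmax, and for a sigmoid output to thresholding at 0.5. A sketch of the idea, not the repo's exact code:

import numpy as np

multiclass_labels = np.argmax(A_pred, axis=0)    # index of the highest probability per column
binary_labels = (A_pred > 0.5).astype(int)       # threshold sigmoid probabilities at 0.5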

📊 Evaluation

Evaluation metrics are automatically selected based on task type.

train_Accuracy, train_precision, train_recall, train_F1 = MLP.matrix_eval(y_train_reshape, P_pred)
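
For binary classification, these metrics follow directly from confusion-matrix counts. A minimal sketch with illustrative names (edge cases such as zero denominators are ignored here):

import numpy as np

def binary_metrics(y_true, y_pred):
    tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives
    tn = np.sum((y_pred == 0) & (y_true == 0))   # true negatives
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1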

🧪 Datasets Tested

This framework has been tested on:

  • MNIST → Multiclass classification
  • Breast Cancer Dataset → Binary classification
  • California Housing Dataset → Regression

Test notebooks:

  • Test_multiclass_classification.ipynb
  • Test_binary_classification.ipynb
  • Test_Regression.ipynb

Note: Validation splits are intentionally omitted to focus on framework capability rather than model tuning.

🧩 Project Structure & Design

  • Utility.py
    • Parameter initialization
    • Activation functions & gradients
    • Mini-batch logic
    • Single-layer forward & backward building blocks
  • Neural_net.py
    • Multilayer orchestration
    • Forward pass
    • Cost computation
    • Backpropagation
    • Parameter updates
    • Predict
    • Metric evaluation (matrix_eval)

The core challenge is correct gradient computation, especially for BatchNorm, which often requires careful pen-and-paper derivation before implementation.
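
For reference, here is the textbook BatchNorm backward pass in the (features, examples) layout, obtained by differentiating y = gamma * x_hat + beta with x_hat = (x - mu) / sqrt(var + eps). The variable names are illustrative; the repo's implementation may organize this differently.

import numpy as np

def batchnorm_backward(dY, X_hat, gamma, var, eps=1e-8):
    # dY, X_hat: (features, m); gamma, var: (features, 1)
    m = dY.shape[1]
    dgamma = np.sum(dY * X_hat, axis=1, keepdims=True)
    dbeta = np.sum(dY, axis=1, keepdims=True)
    dX_hat = dY * gamma
    # One vectorized expression that folds in the chains through mu and var
    dX = (m * dX_hat
          - np.sum(dX_hat, axis=1, keepdims=True)
          - X_hat * np.sum(dX_hat * X_hat, axis=1, keepdims=True)) / (m * np.sqrt(var + eps))
    return dX, dgamma, dbeta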

📚 Learning Resources

Highly recommended: the AssemblyAI ML Study Guide, which served as the learning roadmap for this project.

🛠 Installation

The core implementation uses NumPy only. The test notebooks use TensorFlow and scikit-learn solely for dataset loading.

  • Python 3.10.10 is recommended for TensorFlow compatibility.

git clone https://github.com/VishalParmar07/Learning_How_Machine_Learns-MLP.git
cd Learning_How_Machine_Learns-MLP
pip install -r requirements.txt

⭐ Final Note

If this repository helped you understand neural networks better, consider starring it. Feedback and discussions are always welcome.