Mask Stabilization System

Author: Чубарова Дарья Алексеевна

A complete system for reducing flickering in frame-by-frame video segmentation using temporal smoothing techniques.

🌐 Project Presentation

Live Demo: https://nickscherbakov.github.io/mask-stabilization/

A comprehensive presentation website (in Russian) showcasing:

  • Problem statement and visual explanations
  • System architecture and technologies
  • Stabilization methods with formulas
  • Experimental results and metrics
  • API documentation
  • Q&A section for homework defense

See docs/GITHUB_PAGES_SETUP.md for GitHub Pages setup instructions.

📋 Overview

This project implements a full pipeline for:

  1. Video Segmentation using DeepLabv3 (PyTorch/torchvision)
  2. Mask Stabilization with multiple temporal smoothing methods
  3. Metrics Calculation to measure stability improvements
  4. REST API for easy integration (FastAPI)
  5. Interactive Analysis with Jupyter notebooks

🏗️ Architecture

┌─────────────┐      ┌──────────────┐      ┌─────────────────┐
│   Upload    │─────▶│ Segmentation │─────▶│  Stabilization  │
│    Video    │      │  (DeepLabv3) │      │   (Temporal)    │
└─────────────┘      └──────────────┘      └─────────────────┘
                            │                        │
                            │                        │
                            ▼                        ▼
                     ┌──────────────┐        ┌─────────────┐
                     │    Masks     │        │  Smoothed   │
                     │   (Before)   │        │   Masks     │
                     └──────────────┘        └─────────────┘
                                                     │
                                                     │
                                                     ▼
                                             ┌─────────────┐
                                             │   Metrics   │
                                             │ Calculation │
                                             └─────────────┘

📁 Project Structure

mask-stabilization/
├── README.md                    # This file
├── requirements.txt             # Python dependencies
├── Dockerfile                   # Docker container setup
├── docker-compose.yml           # Docker composition
│
├── docs/                        # Presentation website (GitHub Pages)
│   ├── index.html               # Main presentation page (Russian)
│   └── GITHUB_PAGES_SETUP.md    # Setup instructions
│
├── src/
│   ├── __init__.py
│   ├── main.py                  # FastAPI server
│   ├── segmentation.py          # DeepLabv3 segmentation
│   ├── stabilization.py         # Temporal smoothing methods
│   ├── metrics.py               # Stability metrics (IoU, etc.)
│   └── utils.py                 # Utility functions
│
├── notebooks/
│   └── analysis.ipynb           # Interactive analysis notebook
│
├── frontend/
│   ├── index.html               # Web frontend (HTML/CSS/JS)
│   └── README.md                # Frontend documentation
│
├── spark_frontend/
│   └── SPARK_PROMPT.md          # GitHub Spark frontend prompt
│
├── examples/
│   └── .gitkeep                 # Place test videos here
│
└── results/
    └── .gitkeep                 # Processing results stored here

🚀 Installation

Option 1: Local Installation

  1. Clone the repository

    git clone https://github.com/NickScherbakov/mask-stabilization.git
    cd mask-stabilization
  2. Create virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt

Option 2: Docker

  1. Build and run with Docker Compose
    docker-compose up --build

The API will be available at http://localhost:8000

📖 Usage

Running the API Server

# From the project root directory
uvicorn src.main:app --host 0.0.0.0 --port 8000 --reload

Access the API documentation at http://localhost:8000/docs

Using the Jupyter Notebook

jupyter notebook notebooks/analysis.ipynb

The notebook provides a step-by-step demonstration of the entire pipeline.

API Endpoints

1. Upload Video

curl -X POST "http://localhost:8000/api/upload" \
  -F "file=@path/to/video.mp4"

Response:

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "uploaded",
  "video_info": {
    "fps": 30.0,
    "frame_count": 150,
    "width": 1920,
    "height": 1080
  }
}

2. Segment Video

curl -X POST "http://localhost:8000/api/segment" \
  -H "Content-Type: application/json" \
  -d '{"job_id": "YOUR_JOB_ID", "target_class": "person"}'

3. Apply Stabilization

curl -X POST "http://localhost:8000/api/stabilize" \
  -H "Content-Type: application/json" \
  -d '{
    "job_id": "YOUR_JOB_ID",
    "method": "moving_average",
    "window_size": 5
  }'

4. Get Status

curl "http://localhost:8000/api/status/YOUR_JOB_ID"

5. Get Metrics

curl "http://localhost:8000/api/metrics/YOUR_JOB_ID"

Response:

{
  "iou_before": {
    "mean": 0.8234,
    "std": 0.0521,
    "min": 0.6543,
    "max": 0.9876
  },
  "iou_after": {
    "mean": 0.9123,
    "std": 0.0234,
    "min": 0.8234,
    "max": 0.9912
  },
  "improvement": {
    "iou_improvement": 0.0889,
    "iou_improvement_percent": 10.8,
    "instability_reduction_percent": 50.3
  }
}

6. Get Frame Image

curl "http://localhost:8000/api/frames/YOUR_JOB_ID/comparison/25" \
  --output frame_25.png

Frame types: mask_before, mask_after, comparison
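
The curl calls above can also be chained from Python. The following is a minimal sketch using only the standard library; the helper names (`build_post`, `call`, `run_pipeline`) are hypothetical and not part of this project. It assumes a `job_id` already obtained from the upload step and that each endpoint returns once its step has finished. If processing turns out to be asynchronous, poll `/api/status/{job_id}` between steps.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the Usage section


def build_post(path: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for one of the /api/* endpoints."""
    return urllib.request.Request(
        url=f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request) -> dict:
    """Send a request (or plain URL) and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_pipeline(job_id: str, target_class: str = "person",
                 method: str = "moving_average", window_size: int = 5) -> dict:
    """Segment, stabilize, and fetch metrics for an already uploaded video."""
    call(build_post("/api/segment", {"job_id": job_id, "target_class": target_class}))
    call(build_post("/api/stabilize", {
        "job_id": job_id,
        "method": method,
        "window_size": window_size,
    }))
    return call(f"{BASE_URL}/api/metrics/{job_id}")
```

With the server running, `run_pipeline("YOUR_JOB_ID")` would return the metrics dictionary shown in step 5.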

Available Segmentation Classes

{
    # Indices follow the Pascal VOC label set of torchvision's pretrained DeepLabv3
    0: 'background',
    4: 'boat',
    6: 'bus',
    7: 'car',
    8: 'cat',
    10: 'cow',
    12: 'dog',
    13: 'horse',
    15: 'person',
    17: 'sheep'
}
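
To illustrate how a class index is used, here is a minimal sketch that turns per-pixel class scores into a binary mask and a probability map for one class. It assumes the model output for a single frame is already available as a NumPy array of shape (C, H, W); the function names are hypothetical and this is not the project's `VideoSegmenter` code.

```python
import numpy as np

PERSON = 15  # index of 'person' in the class table above


def class_mask(scores: np.ndarray, class_id: int) -> np.ndarray:
    """Binary (H, W) mask: 1 where class_id has the highest score."""
    return (scores.argmax(axis=0) == class_id).astype(np.uint8)


def class_probability(scores: np.ndarray, class_id: int) -> np.ndarray:
    """Softmax probability of class_id at each pixel, shape (H, W)."""
    # Subtract the per-pixel maximum for numerical stability.
    shifted = scores - scores.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    return exp[class_id] / exp.sum(axis=0)
```

The probability map is the natural input for moving-average or exponential smoothing, while the binary mask suits the median filter.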

🔬 Stabilization Methods

1. Moving Average

Averages probability maps over a temporal window.

Parameters:

  • window_size: 3, 5, 7, or 9 (must be odd)

Formula:

smoothed[i] = mean(masks[i-w:i+w+1]),  where w = (window_size - 1) / 2

Use case: General-purpose smoothing, good balance between smoothness and responsiveness.
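
A minimal NumPy sketch of this method, assuming the probability maps are stacked into an array of shape (T, H, W). The function name is hypothetical, and clamping the window at the first and last frames is an assumption, since the edge handling is not specified here.

```python
import numpy as np


def moving_average(probs: np.ndarray, window_size: int = 5) -> np.ndarray:
    """Average each frame with its neighbours in a centred temporal window.

    probs has shape (T, H, W) and holds per-pixel foreground probabilities.
    """
    if window_size < 1 or window_size % 2 == 0:
        raise ValueError("window_size must be a positive odd integer")
    w = window_size // 2
    num_frames = probs.shape[0]
    smoothed = np.empty_like(probs, dtype=np.float64)
    for i in range(num_frames):
        # Clamp the window at the clip boundaries (assumed edge handling).
        lo, hi = max(0, i - w), min(num_frames, i + w + 1)
        smoothed[i] = probs[lo:hi].mean(axis=0)
    return smoothed
```

Thresholding the result, for example `smoothed > 0.5`, gives the stabilized binary masks; a single dropped frame inside an otherwise stable region is averaged away.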

2. Median Filter

Computes median across temporal window for each pixel.

Parameters:

  • window_size: 3, 5, 7, or 9 (must be odd)

Formula:

smoothed[i] = median(masks[i-w:i+w+1]),  where w = (window_size - 1) / 2

Use case: Robust to outliers, preserves sharp edges better than moving average.
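
A minimal NumPy sketch of the per-pixel temporal median, assuming masks stacked as (T, H, W). The function name is hypothetical and, as an assumption, the window is clamped at the ends of the clip.

```python
import numpy as np


def median_filter(masks: np.ndarray, window_size: int = 5) -> np.ndarray:
    """Per-pixel median over a centred temporal window; masks is (T, H, W)."""
    if window_size < 1 or window_size % 2 == 0:
        raise ValueError("window_size must be a positive odd integer")
    w = window_size // 2
    num_frames = masks.shape[0]
    # Float output: at the clip edges the clamped window can have an even
    # length, in which case the median of a binary mask may be 0.5.
    smoothed = np.empty(masks.shape, dtype=np.float64)
    for i in range(num_frames):
        lo, hi = max(0, i - w), min(num_frames, i + w + 1)
        smoothed[i] = np.median(masks[lo:hi], axis=0)
    return smoothed
```

Unlike averaging, the median removes an isolated one-frame dropout while leaving a genuine on/off transition exactly where it was.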

3. Exponential Smoothing

Weighted average giving more importance to recent frames.

Parameters:

  • alpha: 0.1 to 0.9 (smoothing factor)
    • Lower α = more smoothing
    • Higher α = more responsive

Formula:

smoothed[t] = α * original[t] + (1-α) * smoothed[t-1]

Use case: Adaptive smoothing, good for varying motion speeds.
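
A minimal NumPy sketch of the recurrence, assuming probability maps stacked as (T, H, W). The function name is hypothetical, and initialising with the first frame is an assumption, since the starting value is not stated here.

```python
import numpy as np


def exponential_smoothing(probs: np.ndarray, alpha: float = 0.3) -> np.ndarray:
    """Apply smoothed[t] = alpha * probs[t] + (1 - alpha) * smoothed[t-1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = np.empty(probs.shape, dtype=np.float64)
    if probs.shape[0] == 0:
        return smoothed
    # Assumed initialisation: the first frame is passed through unchanged.
    smoothed[0] = probs[0]
    for t in range(1, probs.shape[0]):
        smoothed[t] = alpha * probs[t] + (1.0 - alpha) * smoothed[t - 1]
    return smoothed
```

Because each output depends only on the current and earlier frames, this filter can run on a live stream, whereas the windowed methods need future frames.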

📊 Metrics

IoU (Intersection over Union)

Measures the overlap between the masks of consecutive frames, where A and B are the sets of foreground pixels in two adjacent frames:

IoU = |A ∩ B| / |A ∪ B|

Higher IoU = more temporal consistency

Instability Score

Instability = 1 - IoU

Higher instability = more flickering

Metrics Computed

  • Mean IoU: Average consistency across all frame transitions
  • IoU Standard Deviation: Variability in consistency
  • Instability Reduction: Percentage decrease in flickering
  • Min/Max IoU: Range of consistency values
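
The metrics above can be sketched in NumPy as follows, assuming binary masks stacked as (T, H, W). The function names are hypothetical and not taken from `metrics.py`; treating two empty masks as IoU = 1 is an assumption.

```python
import numpy as np


def consecutive_iou(masks: np.ndarray) -> np.ndarray:
    """IoU between each pair of consecutive binary masks; returns shape (T-1,)."""
    a = masks[:-1].astype(bool)
    b = masks[1:].astype(bool)
    inter = np.logical_and(a, b).sum(axis=(1, 2))
    union = np.logical_or(a, b).sum(axis=(1, 2))
    # Two empty masks agree perfectly, so treat that case as IoU = 1 (assumption).
    return np.where(union > 0, inter / np.maximum(union, 1), 1.0)


def summarize(masks: np.ndarray) -> dict:
    """Mean, std, min, and max IoU plus the instability score 1 - mean IoU."""
    iou = consecutive_iou(masks)
    return {
        "mean": float(iou.mean()),
        "std": float(iou.std()),
        "min": float(iou.min()),
        "max": float(iou.max()),
        "instability": float(1.0 - iou.mean()),
    }


def instability_reduction_percent(before: np.ndarray, after: np.ndarray) -> float:
    """Percentage decrease in instability from the raw to the smoothed masks."""
    inst_before = summarize(before)["instability"]
    inst_after = summarize(after)["instability"]
    if inst_before == 0.0:
        return 0.0  # nothing to reduce
    return 100.0 * (inst_before - inst_after) / inst_before
```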

🎨 Frontend

A clean, modern web interface is available in the frontend/ directory.

Quick Start

  1. Start the API server:

    uvicorn src.main:app --host 0.0.0.0 --port 8000
  2. Open the frontend:

    • Simply open frontend/index.html in a web browser, or
    • Serve it with a simple HTTP server and navigate to http://localhost:8080/index.html:
      cd frontend
      python -m http.server 8080

Features

The frontend provides:

  • Drag-and-drop video upload with format validation
  • Real-time processing status with progress tracking
  • Interactive frame viewer with navigation controls
  • Metrics visualization showing IoU improvements
  • Configuration options for object classes and stabilization methods
  • Responsive design that works on all screen sizes

See frontend/README.md for detailed documentation.

For an alternative GitHub Spark interface, see the prompt in spark_frontend/SPARK_PROMPT.md.

🔍 Example Workflow

  1. Upload a video
  2. Segment targeting "person" class
  3. Apply moving average with window_size=5
  4. Visualize results:
    • Frame-by-frame comparison
    • IoU improvement chart
    • Quantitative metrics

📈 Expected Results

Typical improvements with window_size=5:

| Metric      | Before | After | Improvement |
|-------------|--------|-------|-------------|
| Mean IoU    | 0.823  | 0.912 | +10.8%      |
| IoU Std     | 0.052  | 0.023 | -55.8%      |
| Instability | 0.177  | 0.088 | -50.3%      |
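
The Improvement column is the relative change of each metric, and Instability is 1 minus the Mean IoU. A short sketch of that arithmetic, using the values from the table (the function name is hypothetical):

```python
def relative_change_percent(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before


rows = {
    "Mean IoU": (0.823, 0.912),
    "IoU Std": (0.052, 0.023),
    "Instability": (1 - 0.823, 1 - 0.912),  # Instability = 1 - Mean IoU
}
for name, (before, after) in rows.items():
    print(f"{name}: {relative_change_percent(before, after):+.1f}%")
# Mean IoU: +10.8%
# IoU Std: -55.8%
# Instability: -50.3%
```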

🛠️ Development

Running Tests

# Add tests in tests/ directory
pytest tests/

Code Structure

  • segmentation.py: VideoSegmenter class using DeepLabv3
  • stabilization.py: MaskStabilizer with temporal filtering methods
  • metrics.py: Metric calculation functions
  • utils.py: Helper functions for video/image processing
  • main.py: FastAPI application with REST endpoints

📝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request

📄 License

This project is for educational purposes (Homework Assignment 5).

🙏 Acknowledgments

  • DeepLabv3 model from torchvision
  • FastAPI framework
  • OpenCV for video processing

💡 Tips

  1. Video Selection: Start with short videos (5-10 seconds) for faster processing
  2. Class Selection: Choose "person" for best results with human subjects
  3. Window Size: Start with 5, increase for more smoothing
  4. Alpha Value: Try 0.3 for balanced exponential smoothing

🐛 Troubleshooting

Issue: CUDA out of memory

# Solution: Use CPU mode or reduce batch size
segmenter = VideoSegmenter(device='cpu')

Issue: Video won't upload

  • Check file format (.mp4, .avi, .mov supported)
  • Ensure file size < 100MB
  • Verify video codec compatibility
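
The first two checks can be run locally before uploading. This is a minimal sketch with a hypothetical helper name; it reads "100MB" as 100 MiB, which is an assumption, and it does not inspect the codec.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp4", ".avi", ".mov"}  # formats listed above
MAX_SIZE_BYTES = 100 * 1024 * 1024  # "100MB" read as 100 MiB (assumption)


def upload_problems(path: str) -> list[str]:
    """Return a list of reasons the file is likely to be rejected (empty if none)."""
    p = Path(path)
    problems = []
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {p.suffix or '(no extension)'}")
    if not p.is_file():
        problems.append("file does not exist")
    elif p.stat().st_size >= MAX_SIZE_BYTES:
        problems.append("file is 100MB or larger")
    return problems
```

An empty list from `upload_problems("path/to/video.mp4")` means the format and size checks pass; codec compatibility still has to be verified separately.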

Issue: Segmentation is slow

  • Use GPU if available
  • Reduce video resolution
  • Process fewer frames

📞 Support

For issues and questions, please open an issue on GitHub.


Status: ✅ Ready for deployment and testing

Version: 1.0.0

Last Updated: December 2025
