SCIoT Python Client

Split Computing on IoT with Python Clients

Advanced split computing implementation for TinyML on IoT devices with intelligent offloading, variance detection, and resilient client-server communication.


Overview

The SCIoT project provides tools to run Edge Impulse models on ESP32 devices and Python clients using split computing techniques. This repository includes advanced features for adaptive offloading and system resilience.

Key Features

🎯 Intelligent Offloading

  • Dynamic layer-by-layer offloading decisions
  • Exponential Moving Average (EMA) time smoothing (α=0.2; see the sketch after this list)
  • Network-aware split point selection
  • Support for 59-layer TFLite models (FOMO 96x96)
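
A minimal sketch of the EMA update used for time smoothing (hypothetical names; α=0.2 as above):

# Exponential moving average of per-layer inference times.
ALPHA = 0.2

def update_ema(current_ema, new_sample, alpha=ALPHA):
    """Blend a new timing sample into the running average."""
    if current_ema is None:  # first sample seeds the average
        return new_sample
    return alpha * new_sample + (1 - alpha) * current_ema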

📊 Variance Detection System

  • Real-time inference time monitoring
  • Coefficient of Variation (CV) analysis with 15% threshold
  • Sliding window history (10 measurements per layer)
  • Automatic cascade propagation (layer i → layer i+1)
  • Triggers re-evaluation when performance changes

🔄 Local Inference Mode

  • Probabilistic forcing of device-local inference
  • Refreshes device measurements periodically
  • Configurable probability (0.0-1.0)
  • Returns special value -1 for all-device execution
  • Seamless client-server coordination

🛡️ Client Resilience

  • Graceful degradation to local-only mode
  • Connection error handling with 5-second timeouts
  • Automatic reconnection attempts
  • Continues operation when server unavailable
  • No crashes on network failures

🧪 Comprehensive Testing

  • 44 automated tests (39 core + 5 MQTT)
  • Interactive demonstration scripts
  • Unit, integration, and system tests
  • Connection resilience tests
  • 100% test pass rate

Publications

If you use this work, please consider citing:

  • F. Bove, S. Colli and L. Bedogni, "Performance Evaluation of Split Computing with TinyML on IoT Devices," 2024 IEEE 21st Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 2024, pp. 1-6, DOI Link.
  • F. Bove and L. Bedogni, "Smart Split: Leveraging TinyML and Split Computing for Efficient Edge AI," 2024 IEEE/ACM Symposium on Edge Computing (SEC), Rome, Italy, 2024, pp. 456-460, DOI Link.

Project Structure

SCIoT_python_client/
├── src/
│   └── server/
│       ├── edge/                    # Edge server initialization
│       ├── communication/           # HTTP, WebSocket, MQTT servers
│       │   ├── http_server.py      # FastAPI HTTP server
│       │   ├── request_handler.py  # Request processing + variance + local inference
│       │   └── websocket_server.py # WebSocket server
│       ├── models/                  # Model management and inference
│       │   └── model_manager.py    # Edge inference with variance tracking
│       ├── offloading_algo/         # Offloading decision algorithms
│       ├── device/                  # Device simulation
│       ├── statistics/              # Performance statistics
│       ├── variance_detector.py     # Variance detection system
│       ├── delay_simulator.py       # Network/computation delay simulation
│       └── settings.yaml            # Server configuration
│
├── server_client_light/
│   └── client/
│       ├── http_client.py           # Python HTTP client (main)
│       ├── websocket_client.py      # Python WebSocket client
│       ├── http_config.yaml         # HTTP client configuration
│       ├── websocket_config.yaml    # WebSocket client configuration
│       └── delay_simulator.py       # Client-side delay simulation
│
├── tests/
│   ├── test_variance_and_local_inference.py  # Core feature tests (27)
│   ├── test_client_resilience.py             # Connection handling (12)
│   ├── test_mqtt_client/                     # MQTT tests (5)
│   └── test_offloading_algo/                 # Offloading algorithm tests
│
├── test_variance_detection.py      # Interactive demo: variance detection
├── test_variance_cascading.py      # Interactive demo: cascading
│
└── Documentation/
    ├── VARIANCE_DETECTION.md
    ├── VARIANCE_DETECTION_IMPLEMENTATION.md
    ├── LOCAL_INFERENCE_MODE.md
    ├── LOCAL_INFERENCE_IMPLEMENTATION.md
    ├── CLIENT_SERVER_-1_SEMANTICS.md
    ├── DELAY_SIMULATION.md
    └── TEST_SUITE_SUMMARY.md

Installation

Prerequisites

  • Python 3.11+
  • TensorFlow 2.15.0
  • Docker (for MQTT broker)

Setup

Clone the repository:

git clone https://github.com/UBICO/SCIoT_python_client.git
cd SCIoT_python_client

Create virtual environment and install dependencies:

uv sync

Activate the virtual environment:

source .venv/bin/activate  # On macOS/Linux
# or
.venv\Scripts\activate     # On Windows

Model Setup

  • Save your Keras model as test_model.h5 in src/server/models/test/test_model/
  • Save your test image as test_image.png in src/server/models/test/test_model/pred_data/
  • Split the model: python3 src/server/models/model_split.py (see the sketch after this list)
  • Configure paths in src/server/commons.py
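
For orientation, a hypothetical sketch of a layer-wise split (not the repo's model_split.py, and valid only for a strictly linear layer topology):

import tensorflow as tf

model = tf.keras.models.load_model("test_model.h5")
split = 30  # hypothetical split point

# Head sub-model: input up to the split layer (device side).
head = tf.keras.Model(inputs=model.input, outputs=model.layers[split].output)

# Tail sub-model: replay the remaining layers from the split output (edge side).
tail_in = tf.keras.Input(shape=model.layers[split].output.shape[1:])
x = tail_in
for layer in model.layers[split + 1:]:
    x = layer(x)
tail = tf.keras.Model(inputs=tail_in, outputs=x)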

Configuration

Server Configuration (src/server/settings.yaml)

communication:
  http:
    host: 0.0.0.0
    port: 8000
    endpoints:
      registration: /api/registration
      device_input: /api/device_input
      offloading_layer: /api/offloading_layer
      device_inference_result: /api/device_inference_result

delay_simulation:
  computation:
    enabled: false
    type: gaussian
    mean: 0.001
    std_dev: 0.0002
  network:
    enabled: false
    type: gaussian
    mean: 0.020
    std_dev: 0.005

local_inference_mode:
  enabled: true
  probability: 0.1  # 10% of requests force local inference

Client Configuration (server_client_light/client/http_config.yaml)

client:
  device_id: "device_01"

http:
  server_host: "127.0.0.1"  # address of the server ("0.0.0.0" is a bind address, not a connect target)
  server_port: 8000

model:
  last_offloading_layer: 58
  
local_inference_mode:
  enabled: true
  probability: 0.1
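
A minimal sketch of reading this file from the client, assuming PyYAML is available:

import yaml

with open("server_client_light/client/http_config.yaml") as f:
    cfg = yaml.safe_load(f)

server_url = f"http://{cfg['http']['server_host']}:{cfg['http']['server_port']}"
last_layer = cfg["model"]["last_offloading_layer"]  # 58 for the 59-layer model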

Usage

Starting the Server

Activate the virtual environment:

source .venv/bin/activate

Start the MQTT broker (optional):

docker compose up

Run the edge server:

python src/server/edge/run_edge.py

Running the Client

In a separate terminal:

source .venv/bin/activate
python server_client_light/client/http_client.py

Client Behavior:

  • Connects to server and registers device
  • Sends image data
  • Receives offloading decision (or -1 for local-only)
  • Runs inference (split or local)
  • Sends results back to server
  • Continues operating if server becomes unavailable (graceful degradation to local-only mode)

Analytics Dashboard

View real-time statistics:

streamlit run src/server/web/webpage.py

Testing

Run All Tests

pytest tests/test_variance_and_local_inference.py tests/test_client_resilience.py tests/test_mqtt_client/ -v

Run Specific Test Suites

# Core features (variance, local inference, -1 handling)
pytest tests/test_variance_and_local_inference.py -v

# Connection resilience
pytest tests/test_client_resilience.py -v

# MQTT client
pytest tests/test_mqtt_client/ -v

Interactive Demos

# Variance detection demonstration
python test_variance_detection.py

# Cascade propagation demonstration
python test_variance_cascading.py

Advanced Features

Variance Detection

The system monitors inference time stability using Coefficient of Variation (CV):

CV = StdDev / Mean

If CV > 15% → Unstable → Trigger re-test

Cascading: When layer i shows variance, layer i+1 is automatically flagged for re-testing (since layer i's output is layer i+1's input).
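
A hypothetical sketch of the check (window of 10 and 15% threshold as above; names are illustrative, not the repo's variance_detector.py API):

from collections import deque
from statistics import mean, stdev

WINDOW, CV_THRESHOLD = 10, 0.15
history = {}     # layer index -> recent inference times
flagged = set()  # layers awaiting re-test

def record_time(layer, seconds):
    """Track a timing sample; flag the layer (and its successor) if unstable."""
    window = history.setdefault(layer, deque(maxlen=WINDOW))
    window.append(seconds)
    if len(window) < 2:
        return False
    cv = stdev(window) / mean(window)
    if cv > CV_THRESHOLD:
        flagged.update({layer, layer + 1})  # cascade: layer i's output feeds i+1
        return True
    return False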

See VARIANCE_DETECTION.md for details.

Local Inference Mode

Probabilistically forces the device to run all layers locally (see the sketch after this list):

  • Purpose: Refresh device inference times periodically
  • Configuration: enabled (true/false) + probability (0.0-1.0)
  • Mechanism: Server returns -1 instead of calculated offloading layer
  • Client Handling: -1 → converts to layer 58 (run all 59 layers locally)
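
A hypothetical sketch of the two sides of the -1 contract:

import random

# Server side: occasionally override the computed split point.
def choose_offloading_layer(computed_layer, enabled=True, probability=0.1):
    if enabled and random.random() < probability:
        return -1  # force the device to run everything locally
    return computed_layer

# Client side: translate -1 into "run all 59 layers on-device".
def resolve_layer(server_value, last_layer=58):
    return last_layer if server_value == -1 else server_value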

See LOCAL_INFERENCE_MODE.md for details.

Delay Simulation

Simulate network and computation delays for testing:

delay_simulation:
  computation:
    enabled: true
    type: gaussian  # Options: static, gaussian, uniform, exponential
    mean: 0.001     # 1ms average
    std_dev: 0.0002 # 0.2ms variation
  network:
    enabled: true
    type: gaussian
    mean: 0.020     # 20ms average
    std_dev: 0.005  # 5ms variation
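
For illustration, the gaussian case could be sampled like this (a sketch, not the repo's delay_simulator.py):

import random
import time

def simulate_delay(mean=0.020, std_dev=0.005):
    """Sleep for a gaussian-distributed delay, clamped at zero."""
    time.sleep(max(0.0, random.gauss(mean, std_dev)))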

See DELAY_SIMULATION.md for details.

Performance Simulation

Run comprehensive multi-scenario simulations with automated analysis:

# Run all 9 predefined scenarios (duration: ~15 minutes)
python simulation_runner.py

# Results saved to: simulated_results/simulation_YYYYMMDD_HHMMSS/
#   - baseline_inference_results.csv
#   - network_delay_20ms_inference_results.csv
#   - computation_delay_5ms_inference_results.csv
#   - ... (one per scenario)

See SIMULATION_RUNNER_README.md for scenarios and configuration.

Results Analysis

Generate comprehensive graphs and statistics from simulation results:

# Analyze a simulation folder
python analyze_simulation.py simulated_results/simulation_YYYYMMDD_HHMMSS

# Generates in analysis/ subfolder:
#   - Device vs Edge time comparison plots
#   - Total inference time bar charts
#   - Throughput analysis
#   - Timing distribution boxplots
#   - Layer statistics
#   - Comprehensive comparison dashboard
#   - Summary statistics CSV

See ANALYSIS_README.md for detailed output descriptions and interpretation.

Client Resilience

Clients handle server unavailability gracefully (see the sketch after this list):

  1. Connection timeout: 5 seconds on all requests
  2. Fallback behavior: Run all layers locally when server unreachable
  3. No crashes: All network errors caught and handled
  4. Auto-retry: Attempts reconnection on each request
  5. Continues operation: System never stops, even when isolated
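
A hypothetical sketch of the pattern (the endpoint comes from settings.yaml; the JSON field name is an assumption):

import requests

def get_offloading_layer(server_url, payload, last_layer=58):
    """Ask the server for a split point; fall back to all-local on failure."""
    try:
        r = requests.post(f"{server_url}/api/offloading_layer",
                          json=payload, timeout=5)
        r.raise_for_status()
        return r.json()["offloading_layer"]  # assumed response field
    except requests.RequestException:
        # Server unreachable or erroring: run every layer on the device.
        return last_layer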

Example output when server is down:

⚠ Registration failed (server unreachable): Connection refused
  → Continuing with local-only inference
⚠ Cannot reach server: Connection refused
  → Running all layers locally
✓ Inference complete (layers 0-58)

Documentation

Comprehensive documentation is available in the Documentation/ folder (see Project Structure above).

System Architecture

┌─────────────┐         ┌──────────────┐         ┌─────────────┐
│   Device    │ ◄─────► │ Edge Server  │ ◄─────► │  Analytics  │
│   Client    │   HTTP  │  (FastAPI)   │         │  Dashboard  │
└─────────────┘         └──────────────┘         └─────────────┘
      │                        │
      │                        │
   Inference              Offloading
   (0 to N)              Algorithm +
                         Variance +
                         Local Mode
      │                        │
      ▼                        ▼
  Device                   Edge
  Results                Results
  (times)               (prediction)

Request Flow (a condensed sketch follows the list):

  1. Client sends image → Server
  2. Server returns offloading layer (or -1)
  3. Client runs inference up to layer
  4. Client sends results + times → Server
  5. Server tracks variance + updates times
  6. Server runs remaining layers (if needed)
  7. Server returns final prediction
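
A condensed, hypothetical sketch of steps 1-4 from the client's perspective (endpoints from settings.yaml; payload fields and the run_local_layers helper are assumptions):

import requests

SERVER = "http://127.0.0.1:8000"

def run_local_layers(upto):
    """Hypothetical stand-in for on-device split inference."""
    return [0.0], [0.001] * (upto + 1)

# 1-2. Register, send the input, receive the split decision (or -1).
requests.post(f"{SERVER}/api/registration", json={"device_id": "device_01"}, timeout=5)
resp = requests.post(f"{SERVER}/api/device_input", json={"image": "<base64>"}, timeout=5)
layer = resp.json().get("offloading_layer", -1)

# 3-4. Run layers 0..layer locally, then report results and per-layer times.
result, times = run_local_layers(58 if layer == -1 else layer)
requests.post(f"{SERVER}/api/device_inference_result",
              json={"result": result, "times": times}, timeout=5)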

Performance

  • Inference: 59 layers (FOMO 96x96)
  • Device time: ~19µs per layer average
  • Edge time: ~450-540µs per layer average
  • Network: Configurable latency simulation
  • Variance threshold: 15% CV
  • Refresh rate: Configurable (default 10% via local inference mode)
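
As a rough back-of-envelope from the averages above: an all-device pass costs about 59 × 19µs ≈ 1.1ms of layer compute, while running every layer on the edge costs roughly 59 × 450-540µs ≈ 27-32ms before network transfer, which is why the offloading decision weighs per-layer times together with network cost.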

Troubleshooting

Server won't start

  • Check port 8000 is not in use
  • Verify TensorFlow is installed correctly
  • Check model files exist in correct paths

Client can't connect

  • Verify server is running
  • Check server_host and server_port in config
  • Note: Client will continue in local-only mode if server unavailable

Tests failing

  • Ensure virtual environment is activated
  • Run uv sync to update dependencies
  • Check Python version is 3.11+

Inference errors

  • Verify model is split correctly
  • Check layer dimensions match
  • Review logs in logs/ directory

Contributing

This is a research project. For questions or collaboration:

  • Open an issue on GitHub
  • Contact the UBICO research group
  • See publications for research context

License

See LICENSE file for details.


Last Updated: December 31, 2025
Status: ✅ All systems operational (44/44 tests passing)
Environment: Python 3.11.11, TensorFlow 2.15.0
