ABeeC: Artificial Bee Colony Hybrid Suite

ABeeC Logo

Important

If you use ABeeC in your research, please cite:

@mastersthesis{litovchenko2025,
  author = {Litovchenko, Nikita},
  title = {Hybrid Asynchronous Artificial Bee Colony Algorithm for Constrained Optimization in HPC},
  school = {Politecnico di Milano},
  year = {2025},
  note = {Available at \url{https://github.com/litovn/ABeeC}}
}
@inproceedings{nikita2025abc,
  title={An Artificial Bee Colony algorithm with Machine Learning for Constrained Optimization in HPC},
  author={Sala, Roberto and Litovchenko, Nikita and Gadioli, Davide and Palermo, Gianluca and Ardagna, Danilo},
  booktitle={MASCOTS},
  pages={1--9},
  year={2025},
  organization={IEEE}
}

Overview

ABeeC (Artificial Bee Colony - Hybrid Suite) is an optimization framework that extends the classical Artificial Bee Colony (ABC) algorithm to solve constrained, discrete optimization problems in high-performance computing (HPC) environments. This project is the result of a Master's thesis in Computer Science and Engineering at Politecnico di Milano.

The framework addresses critical limitations of traditional optimization approaches in HPC by introducing asynchronous event-driven execution, machine learning-guided search, and hybrid Bayesian optimization integration. Unlike standard ABC implementations designed for continuous unconstrained problems, ABeeC is engineered for real-world HPC scenarios involving expensive black-box functions, discrete configuration spaces, and strict computational constraints.

Key Features

  • Three Advanced Variants: ABC-MLCO, ABC-AMLO, and hybrid BO integration
  • Asynchronous Execution: Event-driven architecture eliminating synchronization barriers
  • Black-Box Constraint Handling: Intelligent constraint management without analytical models
  • Memory-Based Optimization: Local and global memory mechanisms with Bloom Filter caching
  • ML-Guided Search: Surrogate models (Ridge Regression, Random Forest, Neural Networks) for directed exploration
  • Lévy Flight Exploration: Adaptive long-range jumps to escape local optima
  • Bayesian Optimization Integration: Hybrid approach combining global and local search

Algorithm Variants

ABC-MLCO: Machine Learning for Constrained Optimization

Purpose: Enhance the original ABC algorithm for black-box constrained optimization in discrete domains.

Innovations:

  • Feasibility-aware fitness function that steers the search away from infeasible regions (see the sketch after this list)
  • ML surrogate models trained on individual bee histories for exploitation
  • Lévy flight distribution for enhanced exploration with adaptive probability
  • Memory-based duplicate detection to reduce redundant evaluations from ~60% to ~10%
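
As a concrete illustration of the feasibility-aware fitness, here is a minimal Python sketch in the spirit of Deb-style feasibility rules: any feasible candidate outranks any infeasible one, and infeasible candidates are ranked by total violation. The function name, signature, and exact transform are assumptions for illustration, not the coreabc.py implementation.

def feasibility_aware_fitness(objective, violations):
    # objective: value to minimize; violations: per-constraint violation
    # magnitudes, 0 when a constraint is satisfied (hypothetical signature).
    total_violation = sum(violations)
    if total_violation == 0:
        # Feasible: the standard ABC fitness transform (higher is better).
        return 1.0 / (1.0 + objective) if objective >= 0 else 1.0 + abs(objective)
    # Infeasible: always negative, so strictly below every feasible
    # candidate, and decreasing as the violation grows.
    return 1.0 / (1.0 + total_violation) - 1.0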

ABC-AMLO: Asynchronous Machine Learning and Optimization

Purpose: Maximize computational resource utilization through event-driven execution.

Innovations:

  • Eliminates three synchronization barriers inherent in synchronous ABC
  • Event-driven manager using priority queues for continuous task scheduling
  • Dual memory strategy: local memory for focused search + global memory for collaborative knowledge (sketched after this list)
  • Bloom Filter-based cache for efficient duplicate detection in highly parallel settings
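
A minimal sketch of the dual memory idea follows; the class and method names are illustrative, not the memory.py API. Each bee records its own evaluations, and in "global" mode the same records also feed a pool shared across the hive.

class DualMemory:
    def __init__(self, mode="global", h_max=200, shared_pool=None):
        self.mode = mode
        self.h_max = h_max                  # cap mirrors the H_max parameter
        self.local = []                     # this bee's own evaluations
        self.shared = shared_pool if shared_pool is not None else []

    def record(self, config, objective):
        self.local.append((config, objective))
        if len(self.local) > self.h_max:
            self.local.pop(0)               # keep only the newest H_max entries
        if self.mode == "global":
            self.shared.append((config, objective))

    def training_set(self):
        # Data used to fit this bee's surrogate model.
        return self.shared if self.mode == "global" else self.local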

Hybrid BO Integration

Purpose: Combine ABC's global exploration with Bayesian Optimization's local refinement.

Strategy: Probabilistically triggered BO activation, with the activation probability increasing over the course of the optimization.

Synergy: Prevents premature BO convergence while accelerating ABC's late-stage refinement.
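
A sketch of how such a stage-dependent trigger can be implemented; the linear schedule and the default probabilities below are assumptions, not ABeeC's actual values.

import random

def should_trigger_bo(evals_done, eval_budget, p_start=0.05, p_end=0.5):
    # Linear schedule: BO rarely fires early (protecting ABC's global
    # exploration) and fires often late (accelerating local refinement).
    progress = min(evals_done / eval_budget, 1.0)
    return random.random() < p_start + (p_end - p_start) * progress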

Real-World Validation

The framework was validated on a molecular docking application from the EXSCALATE virtual screening platform, addressing a production HPC auto-tuning scenario with:

  • Problem Scale: 8-dimensional discrete configuration space
  • Constraint Types: Execution time limits, minimum accuracy thresholds
  • Evaluation Cost: ~9 minutes per configuration evaluation

Research Contributions

This work addresses several open problems in swarm intelligence and HPC optimization:

  1. First Asynchronous ABC Implementation for HPC: ABC-AMLO replaces the synchronous iteration model with event-driven execution, achieving significant speedups
  2. Memory-Based Discrete Optimization: Adapts ABC for discrete domains with memory-guided duplicate avoidance
  3. Constraint-Aware Surrogate Models: Integrates ML prediction of black-box constraints to steer search toward feasible regions
  4. Hybrid Global-Local Search: Seamlessly combines ABC's robust global exploration with Bayesian Optimization's efficient local refinement
  5. Production HPC Validation: Real-world case study on molecular docking application demonstrates practical impact

Installation

Dependencies

Core packages:

  • numpy - Efficient numerical operations
  • scikit-learn - ML models and utilities
  • scipy - Scientific computing functions
  • pandas - Data manipulation and analysis
  • xgboost - Gradient boosting for surrogate models
  • matplotlib - Visualization and result plotting

See requirements.txt for complete specifications.

Setup

# Clone the repository
git clone https://github.com/litovn/ABeeC.git
cd ABeeC

# Install dependencies
pip install -r requirements.txt
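
Once dependencies are installed, an optimization run is started from the entry point. The exact command-line interface of runabeec.py is an assumption here; consult the script and src/utils/config.py for the real arguments.

# Run the optimizer (the --config argument name is illustrative)
python runabeec.py --config config.json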

Project Structure

ABeeC/
├── README.md
├── requirements.txt
├── config.json                 # Configuration template
├── runabeec.py                 # Main entry point
│
├── src/
│   ├── coreabc.py              # Core ABC implementation (Beehive, Bee classes)
│   ├── simulate.py             # Evaluation and simulation module
│   ├── bomanager.py            # Bayesian Optimization manager
│   │
│   ├── dMALIBOO/               # Bayesian Optimization framework
│   │   ├── d_maliboo.py
│   │   └── search_space.py
│   │
│   ├── strategy/               # Algorithm components
│   │   ├── levy.py             # Lévy flight implementation
│   │   ├── memory.py           # Memory management (local/global)
│   │   ├── surrogate.py        # ML surrogate model training
│   │   └── bo.py               # BO integration logic
│   │
│   └── utils/                  # Utility functions
│       ├── config.py           # Configuration parsing
│       ├── dataset.py          # Data loading and preprocessing
│       ├── cache.py            # Bloom Filter cache implementation
│       ├── logging.py          # Activity logging for analysis
│       └── plotting.py         # Result visualization
│
└── tests/                      # Unit and integration tests
    ├── test_core.py
    └── test_algorithms.py

Architecture and Design

Core Classes

Beehive: Population manager maintaining all bees and orchestrating optimization phases

  • Manages bee population lifecycle
  • Coordinates role transitions (employed → onlooker → scout)
  • Tracks global best solution and convergence metrics

Bee: Individual agent representing a candidate solution (sketched after the list below)

  • Maintains local memory of evaluated configurations
  • Trains surrogate models on exploration history
  • Computes fitness based on objective and constraint functions
  • Tracks trial counter for stagnation detection
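
A minimal sketch of this structure, assuming illustrative field and method names rather than the actual coreabc.py API. The min_history parameter mirrors H_min from the configuration.

from dataclasses import dataclass, field
from sklearn.linear_model import Ridge

@dataclass
class Bee:
    position: tuple                               # current discrete configuration
    memory: list = field(default_factory=list)    # (config, objective) history
    trials: int = 0                               # stagnation counter for scout phase

    def remember(self, config, objective):
        self.memory.append((config, objective))

    def train_surrogate(self, min_history=10):
        # Fit a cheap surrogate on this bee's own exploration history.
        if len(self.memory) < min_history:
            return None
        X = [list(c) for c, _ in self.memory]
        y = [obj for _, obj in self.memory]
        return Ridge().fit(X, y)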

Event-Driven Manager (ABC-AMLO only)

  • Priority queue of bee tasks ordered by completion time (see the sketch after this list)
  • Processes results and schedules next operations
  • Manages transition between optimization phases
  • Coordinates cache and duplicate detection
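
The scheduling core reduces to a priority queue keyed by completion time. The sketch below shows the idea with Python's heapq; the class name and interface are assumptions, not the actual manager code.

import heapq
import itertools

class EventManager:
    def __init__(self):
        self._queue = []                 # (completion_time, seq, bee)
        self._seq = itertools.count()    # tie-breaker for equal times

    def schedule(self, bee, completion_time):
        heapq.heappush(self._queue, (completion_time, next(self._seq), bee))

    def next_event(self):
        # Pop the bee whose evaluation finishes earliest; no other bee
        # waits for it, so there is no synchronization barrier.
        completion_time, _, bee = heapq.heappop(self._queue)
        return completion_time, bee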

Bloom Filter Cache (ABC-AMLO only)

  • Probabilistic data structure for efficient duplicate detection (sketched after this list)
  • Configurable false-positive rate (parameter ε)
  • Reduces memory overhead compared to storing all evaluations
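
For illustration, a textbook Bloom filter sized from the expected number of evaluations n and the target false-positive rate eps (the ε parameter above). This is a generic sketch, not the cache.py implementation.

import hashlib
import math

class BloomFilter:
    def __init__(self, n, eps=0.01):
        # Standard sizing: m = -n ln(eps) / (ln 2)^2 bits, k = (m/n) ln 2 hashes.
        self.m = max(1, int(-n * math.log(eps) / math.log(2) ** 2))
        self.k = max(1, int(self.m / n * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _indexes(self, item):
        # Derive k independent indexes by salting a single hash function.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def probably_contains(self, item):
        # False positives occur at rate ~eps; false negatives never do.
        return all(self.bits[idx // 8] >> (idx % 8) & 1
                   for idx in self._indexes(item))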

Algorithm Flow

Initialization
    ↓
[ABC-AMLO Event-Driven Loop]
    ├─ Event Manager: Select next bee to process
    ├─ Phase Transition: Based on previous role (Scout→Employed→Onlooker→Scout)
    │
    ├─ Employed Phase
    │   ├─ Train surrogate model on bee's memory
    │   ├─ Generate candidate via ML-guided neighborhood search
    │   ├─ Evaluate objective and constraint functions
    │   └─ Update best position and trial counter
    │
    ├─ Onlooker Phase
    │   ├─ Compute roulette wheel selection probabilities
    │   ├─ Probabilistically choose: Lévy flight OR roulette selection
    │   ├─ Generate candidate around selected bee
    │   └─ Update position and trial counter
    │
    ├─ Scout Phase
    │   ├─ Check if trial limit exceeded
    │   ├─ Train constraint prediction model
    │   ├─ Sample new position from feasible region
    │   └─ Reset trial counter
    │
    ├─ Bayesian Optimization (probabilistic trigger)
    │   ├─ Check activation probability based on stage
    │   ├─ Train Gaussian Process surrogate
    │   ├─ Compute acquisition function
    │   └─ Perform local refinement steps
    │
    └─ Cache Management
        ├─ Check Bloom Filter for duplicates
        ├─ Avoid redundant evaluations
        └─ Update global memory if using global strategy

Termination Condition Met
    ↓
Return Best Feasible Solution
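
The onlooker phase above chooses between a Lévy flight and roulette-wheel selection. The classical ABC roulette wheel picks bee i with probability p_i = fit_i / Σ fit_j; a minimal sketch follows (assuming nonnegative fitness values, as produced by the standard transform).

import random

def roulette_select(bees, fitnesses):
    total = sum(fitnesses)
    if total <= 0:
        return random.choice(bees)   # degenerate case: fall back to uniform
    r = random.uniform(0, total)
    acc = 0.0
    for bee, fit in zip(bees, fitnesses):
        acc += fit
        if acc >= r:
            return bee
    return bees[-1]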

Key Parameters and Tuning

Parameter                    Default    Description
N                            50         Number of bees (population size)
T                            50-100     Maximum number of iterations
limit                        10         Trial threshold before the scout phase
s_levy                       0.1        Lévy flight step-size scaling
levy_β                       1.5        Heavy-tailed exponent (1.0-2.0)
H_min                        10         Minimum history size for surrogate training
H_max                        200        Maximum history size per bee
memory_type                  "global"   "local" for focused search, "global" for collaborative search
surrogate_model              "ridge"    "ridge", "random_forest", or "neural_network"
use_bayesian_optimization    true       Enable hybrid BO integration
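
For illustration, a configuration with these parameters could be written out as below. The exact key names are defined in src/utils/config.py; the ones used here (e.g. levy_beta for levy_β) are assumptions.

import json

config = {
    "N": 50,                      # population size
    "T": 100,                     # maximum iterations
    "limit": 10,                  # trials before a bee becomes a scout
    "s_levy": 0.1,                # Lévy step-size scaling
    "levy_beta": 1.5,             # heavy-tailed exponent
    "H_min": 10,                  # minimum history for surrogate training
    "H_max": 200,                 # maximum history size per bee
    "memory_type": "global",
    "surrogate_model": "ridge",
    "use_bayesian_optimization": True,
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)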

Tuning Guidelines

  • Exploration vs. Exploitation: Increase s_levy or decrease limit to favor exploration (see the Lévy step sketch after this list)
  • Convergence Speed: Use "global" memory for faster initial convergence; switch to "local" for fine-tuning
  • Large Search Spaces: Increase N (population size) and enable BO for late-stage refinement
  • Expensive Evaluations: Use "ridge" regression (faster training) on limited data; switch to "random_forest" as data accumulates
  • Tight Constraints: Lower levy_β to favor smaller steps and feasibility preservation
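
To make the s_levy and levy_β parameters concrete, here is Mantegna's algorithm, a common recipe for drawing Lévy-distributed steps; whether ABeeC's levy.py uses exactly this recipe is an assumption.

import math
import numpy as np

def levy_step(dim, beta=1.5, s=0.1):
    # beta corresponds to levy_β, s to s_levy in the parameter table.
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
               ) ** (1 / beta)
    u = np.random.normal(0.0, sigma_u, dim)
    v = np.random.normal(0.0, 1.0, dim)
    # Heavy-tailed ratio: mostly small steps, occasional long jumps.
    return s * u / np.abs(v) ** (1 / beta)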

Performance Comparison

Against Baselines

Algorithm         LiGen MAPR   Feasibility   Convergence Speed
OpenTuner         baseline     28%           baseline
ABC (original)    -2793%       20%           baseline
ABC-MLCO          -2851%       50%           -50% vs ABC
ABC-AMLO          -2882%       54%           3.6× faster

Lower MAPR is better (negative values indicate improvement over the OpenTuner baseline)

Future Work and Extensions

Potential areas for extension:

  1. Multi-Objective Optimization: Extend to Pareto-front discovery for trade-off exploration
  2. Distributed Computing: Scale to multi-node clusters with MPI integration
  3. Adaptive Hyperparameters: Self-tuning of population size, Lévy parameters
  4. GPU Acceleration: Surrogate model training and acquisition function optimization on CUDA
  5. Transfer Learning: Warm-start from similar optimization problems
  6. Real-Time Control: Applications to dynamic systems and online optimization

Limitations and Known Issues

  • Current Limitations:
    • Surrogate models limited to ~500 sample points per bee for memory efficiency
    • Bloom Filter false-positive rate increases with cache size; tunable via ε parameter
    • Bayesian Optimization integration assumes relatively small numbers of BO steps (~1-3)

License

This project is licensed under the GPL-3.0 license - see the LICENSE file for details.
