> [!IMPORTANT]
> If you use ABeeC in your research, please cite:

```bibtex
@mastersthesis{litovchenko2025,
  author = {Nikita Litovchenko},
  title  = {Hybrid Asynchronous Artificial Bee Colony Algorithm for Constrained Optimization in HPC},
  school = {Politecnico di Milano},
  year   = {2025},
  note   = {Available at \url{https://github.com/litovn/ABeeC}}
}

@inproceedings{nikita2025abc,
  title        = {An Artificial Bee Colony algorithm with Machine Learning for Constrained Optimization in HPC},
  author       = {Sala, Roberto and Litovchenko, Nikita and Gadioli, Davide and Palermo, Gianluca and Ardagna, Danilo},
  booktitle    = {MASCOTS},
  pages        = {1--9},
  year         = {2025},
  organization = {IEEE}
}
```

ABeeC (Artificial Bee Colony - Hybrid Suite) is an optimization framework that extends the classical Artificial Bee Colony algorithm for solving constrained, discrete optimization problems in high-performance computing (HPC) environments. This project is the result of a Master's thesis in Computer Science and Engineering at Politecnico di Milano.
The framework addresses critical limitations of traditional optimization approaches in HPC by introducing asynchronous event-driven execution, machine learning-guided search, and hybrid Bayesian optimization integration. Unlike standard ABC implementations designed for continuous unconstrained problems, ABeeC is engineered for real-world HPC scenarios involving expensive black-box functions, discrete configuration spaces, and strict computational constraints.
- ✅ Three Advanced Variants: ABC-MLCO, ABC-AMLO, and hybrid BO integration
- ✅ Asynchronous Execution: Event-driven architecture eliminating synchronization barriers
- ✅ Black-Box Constraint Handling: Intelligent constraint management without analytical models
- ✅ Memory-Based Optimization: Local and global memory mechanisms with Bloom Filter caching
- ✅ ML-Guided Search: Surrogate models (Ridge Regression, Random Forest, Neural Networks) for directed exploration
- ✅ Lévy Flight Exploration: Adaptive long-range jumps to escape local optima
- ✅ Bayesian Optimization Integration: Hybrid approach combining global and local search
**ABC-MLCO**
Purpose: Enhance the original ABC algorithm for black-box constrained optimization in discrete domains.
Innovations:
- Feasibility-aware fitness function steering the search away from infeasible regions (see the sketch after this list)
- ML surrogate models trained on individual bee histories for exploitation
- Lévy flight distribution for enhanced exploration with adaptive probability
- Memory-based duplicate detection to reduce redundant evaluations from ~60% to ~10%
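A minimal sketch of the feasibility-aware fitness idea referenced above, assuming a minimized objective and non-negative per-constraint violation values (the helper name is hypothetical; the actual definition lives in the `Bee` fitness computation):

```python
def feasibility_aware_fitness(objective, violations):
    """Sketch only: map (objective, constraint violations) to a fitness
    where feasible solutions generally outrank infeasible ones."""
    total_violation = sum(violations)          # 0.0 when all constraints hold
    if total_violation == 0.0:
        # Standard ABC fitness transform of a minimized objective
        return 1.0 / (1.0 + objective) if objective >= 0 else 1.0 + abs(objective)
    # Infeasible: fitness shrinks with the amount of violation,
    # steering the roulette wheel back toward the feasible region
    return 1e-3 / (1.0 + total_violation)
```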
**ABC-AMLO**
Purpose: Maximize computational resource utilization through asynchronous, event-driven execution.
Innovations:
- Eliminates three synchronization barriers inherent in synchronous ABC
- Event-driven manager using priority queues for continuous task scheduling
- Dual memory strategy: local memory for focused search + global memory for collaborative knowledge (sketched after this list)
- Bloom Filter-based cache for efficient duplicate detection in highly parallel settings
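A toy illustration of the dual memory strategy (hypothetical API; the real implementation is in `src/strategy/memory.py`): each bee keeps a private history, and in `"global"` mode results are also written to a store shared by the whole hive:

```python
class DualMemory:
    """Sketch: per-bee local memory plus an optional hive-wide store."""

    def __init__(self, shared_store, mode="global"):
        self.mode = mode
        self.local = {}              # this bee's own evaluations
        self.shared = shared_store   # one dict shared by all bees

    def record(self, config, result):
        key = tuple(config)
        self.local[key] = result
        if self.mode == "global":
            self.shared[key] = result     # collaborative knowledge

    def lookup(self, config):
        key = tuple(config)
        if self.mode == "global":
            return self.shared.get(key)   # see everyone's results
        return self.local.get(key)        # focused, per-bee view
```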
**Hybrid BO Integration**
Purpose: Combine ABC's global exploration with Bayesian Optimization's local refinement.
Strategy: Probabilistically triggered BO activation, with the activation probability increasing over optimization stages.
Synergy: Prevents premature BO convergence while accelerating ABC's late-stage refinement.
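The trigger can be pictured as a stage-dependent coin flip; the linear schedule and probability bounds below are illustrative placeholders, not the tuned values from the thesis:

```python
import random

def bo_triggered(stage, n_stages, p_min=0.05, p_max=0.5):
    """Sketch: probability of handing a bee's next step to local BO
    refinement grows linearly with the optimization stage."""
    frac = stage / max(1, n_stages - 1)
    p = p_min + (p_max - p_min) * frac
    return random.random() < p
```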
The framework was validated on a molecular docking application from the EXSCALATE virtual screening platform, addressing a production HPC auto-tuning scenario with:
- Problem Scale: 8-dimensional discrete configuration space
- Constraint Types: Execution time limits, minimum accuracy thresholds
- Evaluation Cost: ~9 minutes per configuration evaluation
This work addresses several open problems in swarm intelligence and HPC optimization:
- First Asynchronous ABC Implementation for HPC: ABC-AMLO replaces the synchronous iteration model with event-driven execution, converging up to 3.6× faster in the case study
- Memory-Based Discrete Optimization: Adapts ABC for discrete domains with memory-guided duplicate avoidance
- Constraint-Aware Surrogate Models: Integrates ML prediction of black-box constraints to steer search toward feasible regions
- Hybrid Global-Local Search: Seamlessly combines ABC's robust global exploration with Bayesian Optimization's efficient local refinement
- Production HPC Validation: Real-world case study on molecular docking application demonstrates practical impact
Core packages:
- `numpy` - Efficient numerical operations
- `scikit-learn` - ML models and utilities
- `scipy` - Scientific computing functions
- `pandas` - Data manipulation and analysis
- `xgboost` - Gradient boosting for surrogate models
- `matplotlib` - Visualization and result plotting
See requirements.txt for complete specifications.
```bash
# Clone the repository
git clone https://github.com/litovn/ABeeC.git
cd ABeeC

# Install dependencies
pip install -r requirements.txt
```

```
ABeeC/
├── README.md
├── requirements.txt
├── config.json # Configuration template
├── runabeec.py # Main entry point
│
├── src/
│ ├── coreabc.py # Core ABC implementation (Beehive, Bee classes)
│ ├── simulate.py # Evaluation and simulation module
│ ├── bomanager.py # Bayesian Optimization manager
│ │
│ ├── dMALIBOO/ # Bayesian Optimization framework
│ │ ├── d_maliboo.py
│ │ └── search_space.py
│ │
│ ├── strategy/ # Algorithm components
│ │ ├── levy.py # Lévy flight implementation
│ │ ├── memory.py # Memory management (local/global)
│ │ ├── surrogate.py # ML surrogate model training
│ │ └── bo.py # BO integration logic
│ │
│ └── utils/ # Utility functions
│ ├── config.py # Configuration parsing
│ ├── dataset.py # Data loading and preprocessing
│ ├── cache.py # Bloom Filter cache implementation
│ ├── logging.py # Activity logging for analysis
│ └── plotting.py # Result visualization
│
└── tests/ # Unit and integration tests
├── test_core.py
└── test_algorithms.py
```
Beehive: Population manager maintaining all bees and orchestrating optimization phases
- Manages bee population lifecycle
- Coordinates role transitions (employed → onlooker → scout)
- Tracks global best solution and convergence metrics
Bee: Individual agent representing a candidate solution
- Maintains local memory of evaluated configurations
- Trains surrogate models on its exploration history (see the sketch after this list)
- Computes fitness based on objective and constraint functions
- Tracks trial counter for stagnation detection
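As a sketch of the ML-guided step (hypothetical helper; the real logic is in `src/strategy/surrogate.py`), a bee can fit a Ridge model to its own history and pick the most promising discrete neighbor:

```python
import numpy as np
from sklearn.linear_model import Ridge

def pick_neighbor(history_X, history_y, candidates, h_min=10):
    """Sketch: rank candidate configurations with a surrogate trained
    on this bee's own evaluation history (minimization assumed)."""
    if len(history_y) < h_min:                 # cf. the H_min parameter
        return candidates[np.random.randint(len(candidates))]
    model = Ridge(alpha=1.0)
    model.fit(np.asarray(history_X), np.asarray(history_y))
    preds = model.predict(np.asarray(candidates))
    return candidates[int(np.argmin(preds))]   # predicted-best neighbor
```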
Event-Driven Manager (ABC-AMLO only)
- Priority queue of bee tasks ordered by completion time
- Processes results and schedules next operations
- Manages transition between optimization phases
- Coordinates cache and duplicate detection
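The scheduling idea can be sketched with Python's `heapq` (hypothetical API, not the class shipped in `src/`): each bee re-enters the queue independently as soon as its evaluation finishes, so no colony-wide barrier ever forms:

```python
import heapq

class EventManager:
    """Sketch: bee tasks ordered by completion time in a min-heap."""

    def __init__(self):
        self._queue = []   # (completion_time, seq, bee)
        self._seq = 0      # tie-breaker so bees are never compared

    def schedule(self, bee, completion_time):
        heapq.heappush(self._queue, (completion_time, self._seq, bee))
        self._seq += 1

    def run(self, process, horizon):
        """`process(bee, t)` performs one employed/onlooker/scout step
        and returns the time at which the bee's next result arrives."""
        while self._queue and self._queue[0][0] <= horizon:
            t, _, bee = heapq.heappop(self._queue)
            self.schedule(bee, process(bee, t))
```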
Bloom Filter Cache (ABC-AMLO only)
- Probabilistic data structure for efficient duplicate detection
- Configurable false-positive rate (parameter ε)
- Reduces memory overhead compared to storing all evaluations
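A compact sketch of such a cache (a stand-in for `src/utils/cache.py`, not its actual API), sized with the standard Bloom filter formulas for a target false-positive rate ε:

```python
import hashlib
import math

class BloomCache:
    """Sketch: probabilistic duplicate detector; false positives occur
    at rate ~eps, false negatives never."""

    def __init__(self, capacity, eps=0.01):
        # Standard sizing: m bits and k hash functions for n = capacity items
        self.m = max(8, int(-capacity * math.log(eps) / math.log(2) ** 2))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray(self.m // 8 + 1)

    def _hashes(self, config):
        digest = hashlib.sha256(repr(tuple(config)).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.k):      # Kirsch-Mitzenmacher double hashing
            yield (h1 + i * h2) % self.m

    def add(self, config):
        for h in self._hashes(config):
            self.bits[h // 8] |= 1 << (h % 8)

    def maybe_seen(self, config):
        return all(self.bits[h // 8] >> (h % 8) & 1
                   for h in self._hashes(config))
```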
The overall ABC-AMLO control flow:

```
Initialization
↓
[ABC-AMLO Event-Driven Loop]
├─ Event Manager: Select next bee to process
├─ Phase Transition: Based on previous role (Scout→Employed→Onlooker→Scout)
│
├─ Employed Phase
│ ├─ Train surrogate model on bee's memory
│ ├─ Generate candidate via ML-guided neighborhood search
│ ├─ Evaluate objective and constraint functions
│ └─ Update best position and trial counter
│
├─ Onlooker Phase
│ ├─ Compute roulette wheel selection probabilities
│ ├─ Probabilistically choose: Lévy flight OR roulette selection
│ ├─ Generate candidate around selected bee
│ └─ Update position and trial counter
│
├─ Scout Phase
│ ├─ Check if trial limit exceeded
│ ├─ Train constraint prediction model
│ ├─ Sample new position from feasible region
│ └─ Reset trial counter
│
├─ Bayesian Optimization (probabilistic trigger)
│ ├─ Check activation probability based on stage
│ ├─ Train Gaussian Process surrogate
│ ├─ Compute acquisition function
│ └─ Perform local refinement steps
│
└─ Cache Management
├─ Check Bloom Filter for duplicates
├─ Avoid redundant evaluations
└─ Update global memory if using global strategy
Termination Condition Met
↓
Return Best Feasible Solution
```
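For reference, the onlooker phase's roulette wheel step amounts to fitness-proportional sampling; a minimal sketch (assuming a hypothetical `bee.fitness` attribute):

```python
import random

def roulette_select(bees):
    """Sketch: pick a bee with probability proportional to its fitness."""
    total = sum(bee.fitness for bee in bees)
    r = random.uniform(0.0, total)
    acc = 0.0
    for bee in bees:
        acc += bee.fitness
        if acc >= r:
            return bee
    return bees[-1]   # guard against floating-point round-off
```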

| Parameter | Default | Description |
|---|---|---|
| `N` | 50 | Number of bees (population size) |
| `T` | 50-100 | Maximum iterations |
| `limit` | 10 | Trial threshold before the scout phase |
| `s_levy` | 0.1 | Lévy flight step-size scaling |
| `levy_β` | 1.5 | Heavy-tailed exponent (1.0-2.0) |
| `H_min` | 10 | Minimum history size for surrogate training |
| `H_max` | 200 | Maximum history size per bee |
| `memory_type` | `"global"` | `"local"` for focused search, `"global"` for collaborative search |
| `surrogate_model` | `"ridge"` | `"ridge"`, `"random_forest"`, or `"neural_network"` |
| `use_bayesian_optimization` | `true` | Enable hybrid BO integration |
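A hypothetical `config.json` fragment using the parameter names above (values are illustrative; consult the shipped `config.json` template for the actual schema):

```json
{
  "N": 50,
  "T": 100,
  "limit": 10,
  "s_levy": 0.1,
  "levy_β": 1.5,
  "H_min": 10,
  "H_max": 200,
  "memory_type": "global",
  "surrogate_model": "ridge",
  "use_bayesian_optimization": true
}
```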
- Exploration vs. Exploitation: Increase `s_levy` or decrease `limit` to favor exploration
- Convergence Speed: Use `"global"` memory for faster initial convergence; switch to `"local"` for fine-tuning
- Large Search Spaces: Increase `N` (population size) and enable BO for late-stage refinement
- Expensive Evaluations: Use `"ridge"` regression (faster training) on limited data; switch to `"random_forest"` as data accumulates
- Tight Constraints: Lower `levy_β` to favor smaller steps and feasibility preservation (a Lévy step sketch follows this list)
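For intuition, a Lévy-distributed step can be drawn with Mantegna's algorithm, a standard construction for Lévy flights (sketch only; cf. the `s_levy` and `levy_β` parameters above):

```python
import math
import numpy as np

def levy_step(beta=1.5, scale=0.1, size=1, rng=None):
    """Sketch: heavy-tailed step sizes via Mantegna's algorithm.
    `beta` is the heavy-tailed exponent (cf. levy_β), `scale` the
    step-size scaling (cf. s_levy)."""
    rng = rng or np.random.default_rng()
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
               ) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size)
    v = rng.normal(0.0, 1.0, size)
    return scale * u / np.abs(v) ** (1 / beta)
```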
| Algorithm | LiGen MAPR | Feasibility | Convergence Speed |
|---|---|---|---|
| OpenTuner | baseline | 28% | baseline |
| ABC (original) | -2793% | 20% | baseline |
| ABC-MLCO | -2851% | 50% | -50% vs ABC |
| ABC-AMLO | -2882% | 54% | 3.6× faster |
*Lower MAPR is better; negative values indicate improvement over the OpenTuner baseline.*
Potential areas for extension:
- Multi-Objective Optimization: Extend to Pareto-front discovery for trade-off exploration
- Distributed Computing: Scale to multi-node clusters with MPI integration
- Adaptive Hyperparameters: Self-tuning of population size, Lévy parameters
- GPU Acceleration: Surrogate model training and acquisition function optimization on CUDA
- Transfer Learning: Warm-start from similar optimization problems
- Real-Time Control: Applications to dynamic systems and online optimization
Current Limitations:
- Surrogate models limited to ~500 sample points per bee for memory efficiency
- Bloom Filter false-positive rate increases with cache size; tunable via ε parameter
- Bayesian Optimization integration assumes relatively small numbers of BO steps (~1-3)
This project is licensed under the GPL-3.0 license - see the LICENSE file for details.