Fast clustering algorithms for Python with C++/Cython implementation and OpenMP parallel support.
- MeanShiftPP: Optimized mean shift clustering algorithm
- LocalShift: Local shift algorithm for 3D point cloud optimization in cryo-EM contexts
- GridShift: Grid-based clustering algorithm
- High Performance: C++ implementations with Cython wrappers and parallel processing
- Scikit-learn Compatible: Familiar API with both class and functional interfaces
# Basic installation
uv sync
# Install with benchmark dependencies
uv sync --all-extras
- Python ≥ 3.10
- NumPy ≥ 2.0
- C++ compiler with C++17 support
- OpenMP support
import numpy as np
from shiftclustering import meanshiftpp, localshift, gridshift
# Generate sample data
X = np.random.randn(1000, 2).astype(np.float32)
# MeanShift++ clustering
labels = meanshiftpp(X, bandwidth=1.0)
# GridShift clustering
labels = gridshift(X, bandwidth=1.0)
# For 3D point cloud optimization (cryo-EM)
atoms = np.random.rand(100, 3) * 50
density_map = np.random.rand(50, 50, 50)
optimized_atoms = localshift(atoms, density_map, fmaxd=5.0, fsiv=0.1)
from shiftclustering import MeanShiftPP, LocalShift, GridShift
# MeanShiftPP
ms = MeanShiftPP(bandwidth=1.0, max_iter=300)
labels = ms.fit_predict(X)
# LocalShift for cryo-EM optimization
ls = LocalShift(fmaxd=5.0, fsiv=0.1, n_steps=100)
optimized_atoms = ls.fit_predict(atoms, density_map)
MeanShiftPP: Fast implementation of mean shift clustering with an optimized binning strategy.
GridShift: Grid-based clustering algorithm for large datasets.
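Both descriptions above rely on binning points into a grid. The sketch below illustrates that general idea in NumPy: points are assigned to cells of side `bandwidth`, and each point moves to the mean of all points in its own and adjacent cells. It is a simplified illustration, not the package's MeanShiftPP or GridShift code; the function name `grid_mean_shift` and the parameters `max_iter` and `tol` are hypothetical.

```python
from itertools import product

import numpy as np


def grid_mean_shift(X, bandwidth, max_iter=50, tol=1e-4):
    """Simplified grid-binned mean shift; illustrative only."""
    points = np.array(X, dtype=np.float64)
    n, d = points.shape
    # Offsets to the 3**d cells that surround (and include) a given cell.
    offsets = list(product((-1, 0, 1), repeat=d))

    for _ in range(max_iter):
        cells = np.floor(points / bandwidth).astype(np.int64).tolist()
        # One pass over the data: per-cell coordinate sums and point counts.
        sums, counts = {}, {}
        for cell, p in zip(map(tuple, cells), points):
            if cell in sums:
                sums[cell] += p
                counts[cell] += 1
            else:
                sums[cell] = p.copy()
                counts[cell] = 1

        # Move each point to the mean of all points in neighbouring cells.
        shifted = np.empty_like(points)
        for i, cell in enumerate(cells):
            total, count = np.zeros(d), 0
            for off in offsets:
                key = tuple(c + o for c, o in zip(cell, off))
                if key in sums:
                    total += sums[key]
                    count += counts[key]
            shifted[i] = total / count  # count >= 1: the point's own cell

        moved = np.abs(shifted - points).max()
        points = shifted
        if moved < tol:
            break

    # Points that ended in the same cell share a label (first-seen order).
    label_of = {}
    final_cells = np.floor(points / bandwidth).astype(np.int64).tolist()
    labels = np.array(
        [label_of.setdefault(tuple(c), len(label_of)) for c in final_cells]
    )
    return labels, points


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.concatenate([rng.normal(0, 0.3, (100, 2)), rng.normal(5, 0.3, (100, 2))])
    labels, _ = grid_mean_shift(X, bandwidth=1.0)
    print("clusters found:", len(set(labels.tolist())))
```

Labelling by final cell is a shortcut chosen for brevity: a mode that sits close to a cell boundary can be split across two labels, so a production implementation would need a more careful merge step.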
LocalShift: 3D point cloud optimization algorithm for cryo-EM structure refinement. It iteratively shifts points toward local maxima of a density map using Gaussian kernel weighting.
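The iteration described above can be sketched in plain NumPy. This is a minimal illustration, not the package's C++ implementation: the function name `shift_to_density_maxima` and the parameters `radius`, `sigma`, and `tol` are hypothetical, coordinates are assumed to be in voxel units, and no mapping to `fmaxd` or `fsiv` is implied.

```python
import numpy as np


def shift_to_density_maxima(points, density, radius=4.0, sigma=1.0,
                            n_steps=50, tol=1e-3):
    """Density-weighted Gaussian mean shift on a 3D grid (illustrative)."""
    points = np.array(points, dtype=np.float64)  # copy; input is untouched
    density = np.asarray(density, dtype=np.float64)
    shape = np.array(density.shape)
    r = int(np.ceil(radius))

    for _ in range(n_steps):
        max_move = 0.0
        for i in range(len(points)):
            p = points[i].copy()
            base = np.floor(p).astype(int)
            # Voxel window around the point, clipped to the map bounds.
            lo = np.clip(base - r, 0, shape)
            hi = np.clip(base + r + 1, 0, shape)
            gx, gy, gz = np.meshgrid(
                np.arange(lo[0], hi[0]),
                np.arange(lo[1], hi[1]),
                np.arange(lo[2], hi[2]),
                indexing="ij",
            )
            coords = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3).astype(np.float64)
            window = density[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]].reshape(-1)

            # Weight = Gaussian kernel of the distance times the map density,
            # restricted to voxels within `radius` of the point.
            d2 = ((coords - p) ** 2).sum(axis=1)
            w = np.exp(-d2 / (2.0 * sigma**2)) * window * (d2 <= radius**2)
            total = w.sum()
            if total <= 0:
                continue  # no density in reach; leave the point in place

            new_p = (w[:, None] * coords).sum(axis=0) / total
            max_move = max(max_move, float(np.linalg.norm(new_p - p)))
            points[i] = new_p

        if max_move < tol:
            break  # every point moved less than `tol` in this step
    return points


if __name__ == "__main__":
    g = np.arange(21)
    gx, gy, gz = np.meshgrid(g, g, g, indexing="ij")
    blob = np.exp(-((gx - 10) ** 2 + (gy - 10) ** 2 + (gz - 10) ** 2) / 8.0)
    print(shift_to_density_maxima([[8.0, 9.0, 11.0]], blob))
```

Each step replaces a point by the weighted centroid of the nearby voxels, so points drift uphill in the density and stop near a local maximum. The per-point Python loop is kept for readability and would be slow on large inputs.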
Citation Required: If you use LocalShift in your research, please cite:
Terashi, Genki, and Daisuke Kihara. "De novo main-chain modeling for EM maps using MAINMAST." Nature Communications 9, no. 1 (2018): 1618.
The package reports the following approximate speedups over the reference implementations listed below:
- LocalShift: ~10x faster than Numba implementation
- MeanShiftPP: ~11x faster than sklearn.cluster.MeanShift
- GridShift: ~80x faster than sklearn.cluster.MeanShift
See benchmark/ for detailed performance comparisons.
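Speedup figures like those above are ratios of wall-clock times. The following is a minimal sketch of how such a ratio can be measured with the standard library; it is not the code in benchmark/, and the helpers `best_time` and `speedup` are hypothetical names. Stand-in workloads are used so the sketch runs without the package installed; in practice the callables would be, for example, `meanshiftpp` and the `fit_predict` method of `sklearn.cluster.MeanShift`.

```python
import time


def best_time(fn, *args, repeats=5, **kwargs):
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def slow_sum(n):
    # Stand-in baseline: an explicit Python loop.
    total = 0
    for i in range(n):
        total += i
    return total


def fast_sum(n):
    # Stand-in candidate: the built-in sum over the same range.
    return sum(range(n))


if __name__ == "__main__":
    n = 500_000
    t_slow = best_time(slow_sum, n)
    t_fast = best_time(fast_sum, n)
    print(f"speedup: {speedup(t_slow, t_fast):.1f}x")
```

Taking the minimum over several repeats reduces noise from other processes; for the compiled routines, a warm-up call before timing may also be worth adding.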
shiftclustering/
├── include/ # C++ header files
├── src/ # Cython implementation files
└── __init__.py # Main module
benchmark/ # Performance benchmarks
# Development installation
uv sync --all-extras
# Reinstall with rebuild
uv sync --all-extras --reinstall
GPL-3.0 License. See LICENSE for details.
This project builds upon:
Please cite the original papers when using these algorithms.