Merged
22 changes: 12 additions & 10 deletions examples/mlp-mpi/README.md
@@ -1,20 +1,22 @@
-# Computing a MLP sigmoid(A@B)@C on multiple ranks using MPI through MLIR
+# Computing an MLP sigmoid(A@B)@C on multiple ranks using MPI through MLIR
 
 This example shows how MLIR's sharding infrastructure can be used to distribute data and computation across multiple nodes with non-shared memory.
 
 Currently, only the lower part of the sharding pipeline is used: `shard-partition`, `convert-shard-to-mpi`, and lowering to LLVM. Therefore, the ingress MLIR is fully annotated.
 
 The example implements a "single MLP", following a 1D/2D weight-stationary partition strategy as described in figures 2a and 2b of https://arxiv.org/pdf/2211.05102.
 
 ## Prerequisites
 
-You need mpi4py in your python env. If you are not using OpenMPI (e.g. not MPICH-like like Intel MPI) you need to modify the first line in mlp_weight_stationary.mlir by replacing `"MPI:Implementation" = "MPICH"` with `"MPI:Implementation" = "OpenMPI"`.
+You need mpi4py in your python env. The default MPI implementation is MPICH.
+
+For OpenMPI, change `"MPI:Implementation" = "MPICH"` to `"MPI:Implementation" = "OpenMPI"` in the first line of mlp_weight_stationary.mlir.
 
 ## Running
 
 ```
-export MLIR_DIR=<path_to_mlir_build_dir>
-export MPI_DIR=<path_to_mpi_install>
-export LH_DIR=<path_to_lighthouse>
-PYTHONPATH=$LH_DIR:$MLIR_DIR/tools/mlir/python_packages/mlir_core \
-mpirun -n <nRanks> \
-python -u mlp-mpi.py \
-  --mpilib $MPI_DIR/libmpi.so \
-  --utils_dir $MLIR_DIR/lib \
-  -s 64 64 64
+uv sync --extra runtime_mpich
+uv run mpirun -n <nRanks> python -u mlp-mpi.py --mpilib $MPI_DIR/lib/libmpi.so
```
 Run with `--help` for more options.
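
As merged, the Running section amounts to the sequence below. This is a sketch under assumptions: `<path_to_mpi_install>` and the rank count are placeholders you must fill in, and the `-s 64 64 64` size flag is carried over from the pre-change invocation, so check `--help` to confirm it still applies.

```shell
# Sketch of a full run under the merged instructions (MPICH default).
# MPI_DIR is still needed: the new command references $MPI_DIR/lib/libmpi.so.
export MPI_DIR=<path_to_mpi_install>   # placeholder: your MPI install prefix

# Install the example's runtime dependencies (including mpi4py) via uv.
uv sync --extra runtime_mpich

# Launch across ranks; 4 is an illustrative rank count, -s sets matrix
# sizes as in the pre-change example (verify with --help).
uv run mpirun -n 4 python -u mlp-mpi.py \
  --mpilib $MPI_DIR/lib/libmpi.so \
  -s 64 64 64
```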