From db22649491c5dfe1f6bab2f016bb6b76d1e6255e Mon Sep 17 00:00:00 2001
From: "Schlimbach, Frank"
Date: Mon, 23 Feb 2026 04:52:14 -0800
Subject: [PATCH 1/2] correcting/simplifying mlp-mpi readme

---
 examples/mlp-mpi/README.md | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/examples/mlp-mpi/README.md b/examples/mlp-mpi/README.md
index 5de6957..84668d8 100644
--- a/examples/mlp-mpi/README.md
+++ b/examples/mlp-mpi/README.md
@@ -2,19 +2,15 @@
 
 ## Prerequisites
 
-You need mpi4py in your python env. If you are not using OpenMPI (e.g. not MPICH-like like Intel MPI) you need to modify the first line in mlp_weight_stationary.mlir by replacing `"MPI:Implementation" = "MPICH"` with `"MPI:Implementation" = "OpenMPI"`.
+You need mpi4py in your python env. The default MPI implementation is MPICH.
+
+For OpenMPI, change `"MPI:Implementation" = "MPICH"` to `"MPI:Implementation" = "OpenMPI"` in the first line of mlp_weight_stationary.mlir.
 
 ## Running
 
 ```
-export MLIR_DIR=
 export MPI_DIR=
-export LH_DIR=
-PYTHONPATH=$LH_DIR:$MLIR_DIR/tools/mlir/python_packages/mlir_core \
-  mpirun -n \
-  python -u mlp-mpi.py \
-  --mpilib $MPI_DIR/libmpi.so \
-  --utils_dir $MLIR_DIR/lib \
-  -s 64 64 64
+uv sync --extra runtime_mpich
+uv run mpirun -n python -u mlp-mpi.py --mpilib $MPI_DIR/lib/libmpi.so
 ```
 Run with `--help` for more options.
From 75cea34d411e751b870e71ec6915182b83cf41ea Mon Sep 17 00:00:00 2001
From: "Schlimbach, Frank"
Date: Mon, 23 Feb 2026 05:01:54 -0800
Subject: [PATCH 2/2] More readme

---
 examples/mlp-mpi/README.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/examples/mlp-mpi/README.md b/examples/mlp-mpi/README.md
index 84668d8..ef2f9d8 100644
--- a/examples/mlp-mpi/README.md
+++ b/examples/mlp-mpi/README.md
@@ -1,4 +1,10 @@
-# Computing a MLP sigmoid(A@B)@C on multiple ranks using MPI through MLIR
+# Computing an MLP sigmoid(A@B)@C on multiple ranks using MPI through MLIR
+
+This example shows how MLIR's sharding infrastructure can be used to distribute data and computation across multiple nodes with non-shared memory.
+
+Currently, only the lower part of the sharding pipeline is used: `shard-partition`, `convert-shard-to-mpi`, and lowering to LLVM. Therefore, the ingress MLIR is fully annotated.
+
+The example implements a "single MLP", following a 1D/2D weight-stationary partition strategy as described in figures 2a and 2b of https://arxiv.org/pdf/2211.05102.
 
 ## Prerequisites
 