[DoNotCommit] Add support for building Codegen example with an existi…#3
Open
nicolasvasilache wants to merge 1 commit intoNodLabs:masterfrom
nicolasvasilache:standalone
…ng MLIR
Prerequisites:
==============
First, `export MLIR_SOURCE_DIR=...`
```
(mkdir -p ${MLIR_SOURCE_DIR}/../build && \
cd ${MLIR_SOURCE_DIR}/../build && \
cmake -G Ninja ../llvm -DLLVM_ENABLE_PROJECTS="mlir" -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=1 -DMLIR_LINK_MLIR_DYLIB=1 -DLLVM_BUILD_EXAMPLES=OFF -DLLVM_TARGETS_TO_BUILD="X86" \
-DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && \
cmake --build . --target MLIR check-mlir)
```
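The Codegen step that follows reads both `${MLIR_SOURCE_DIR}/../build` and `${MLIR_SOURCE_DIR}/../build/lib`, so it can help to confirm those directories exist before configuring. A minimal sketch; the function name `check_mlir_paths` is illustrative and not part of this PR:

```shell
# Hypothetical pre-flight check (not part of the PR): verify that the
# directories referenced by the Codegen cmake invocation exist.
# Usage: check_mlir_paths "${MLIR_SOURCE_DIR}"
check_mlir_paths() {
  src="$1"
  if [ -z "$src" ]; then
    echo "MLIR_SOURCE_DIR is not set" >&2
    return 1
  fi
  # Source tree, build tree, and the lib directory passed as -DMLIR_BUILD.
  for d in "$src" "$src/../build" "$src/../build/lib"; do
    if [ ! -d "$d" ]; then
      echo "missing directory: $d" >&2
      return 1
    fi
  done
  return 0
}
```

Calling `check_mlir_paths "${MLIR_SOURCE_DIR}"` returns non-zero and names the first missing directory on stderr.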
Codegen:
========
```
MLIR_DIR=${MLIR_SOURCE_DIR}/../build cmake -GNinja -DCMAKE_CXX_COMPILER=clang++-11 -DCMAKE_C_COMPILER=clang-11 \
-DMLIR_SOURCE=${MLIR_SOURCE_DIR} -DUSE_MKL=OFF -DMLIR_BUILD=${MLIR_SOURCE_DIR}/../build/lib -B build ./Codegen/matmul && \
cmake --build build
```
Benchmark:
==========
```
rm -f build/matmul_* && cmake --build build --target matmul-compile; \
for f in $(find build/ -maxdepth 1 -executable -type f | sort --version-sort); do $f; done; \
ls *out | sort --version-sort | xargs tail -n 1
```
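Since some tile sizes can produce binaries that segfault (see the notes below), a variant of the loop above that reports failing executables may be useful. This is a sketch under the same assumptions as the original one-liner (GNU `find` and `sort`, paths without whitespace); the function name `run_benchmarks` is illustrative:

```shell
# Hypothetical helper (not part of the PR): run every executable directly
# under DIR in version-sorted order and report those that exit non-zero
# (a segfault typically shows up as status 139).
# Usage: run_benchmarks build/
run_benchmarks() {
  dir="$1"
  failed=0
  for f in $(find "$dir" -maxdepth 1 -executable -type f | sort --version-sort); do
    "$f"
    status=$?
    if [ "$status" -ne 0 ]; then
      echo "FAILED ($status): $f" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every benchmark exited cleanly.
  [ "$failed" -eq 0 ]
}
```

The loop keeps going after a failure, so one crashing configuration does not hide the results of the others.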
Results (on my machine, peak ~96 GFLOPS DP):
============================================
==> matmul_18x32x96_mlir_perf.out <==
32.44 GFLOPS
==> matmul_24x64x96_mlir_perf.out <==
33.86 GFLOPS
==> matmul_24x64x512_mlir_perf.out <==
40.66 GFLOPS
==> matmul_48x64x128_mlir_perf.out <==
42.69 GFLOPS
==> matmul_192x64x128_mlir_perf.out <==
41.60 GFLOPS
==> matmul_192x128x128_mlir_perf.out <==
36.87 GFLOPS
==> matmul_192x256x256_mlir_perf.out <==
34.32 GFLOPS
==> matmul_384x256x256_mlir_perf.out <==
35.13 GFLOPS
==> matmul_480x512x256_mlir_perf.out <==
30.80 GFLOPS
==> matmul_1020x1152x1152_mlir_perf.out <==
12.49 GFLOPS
==> matmul_1024x1024x1024_mlir_perf.out <==
35.26 GFLOPS
==> matmul_2304x2304x2560_mlir_perf.out <==
24.42 GFLOPS
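To read these numbers against the quoted peak, each result can be converted to a percentage of peak. A minimal sketch, assuming every `*_mlir_perf.out` file ends with a line of the form `<number> GFLOPS` as shown above; the function name `percent_of_peak` is illustrative and not part of this PR:

```shell
# Hypothetical helper (not part of the PR): print each result file's final
# GFLOPS figure together with its percentage of a given peak.
# Usage: percent_of_peak PEAK_GFLOPS FILE...
percent_of_peak() {
  peak="$1"
  shift
  for f in "$@"; do
    # The last line looks like "42.69 GFLOPS"; keep the numeric field.
    gflops=$(tail -n 1 "$f" | awk '{print $1}')
    awk -v name="$f" -v g="$gflops" -v p="$peak" \
      'BEGIN { printf "%s %.2f GFLOPS %.1f%%\n", name, g, 100 * g / p }'
  done
}
```

For example, `percent_of_peak 96 $(ls *out | sort --version-sort)` lists the files in the same order as the results above; the 42.69 GFLOPS entry works out to about 44.5% of a 96 GFLOPS peak.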
Notes:
======
1. ODM numbers used F32; good register/tile sizes still need to be explored for F64.
2. Fixed some issues that prevented AVX512 use; a few more changes to compiler flags may still be needed.
3. There seem to be some core MLIR regressions: manually trying different tile sizes can produce code that segfaults.
4. MLIR OSS lacks the hoistings that were used internally; linalg on tensors is a better abstraction for this but is still WIP.
5. MLIR OSS lacks the full/partial splitting and outlining strategies that were used internally.