
Conversation

@ShangkunLi
Collaborator

This PR completes the following tasks:

  1. Defines 4 neura operations to support affine loops (an unfused control-flow sketch follows the lowered IR below):
  • neura.loop_iter: organizes loop iteration; treat it as a fused phi-add-cmp sequence in LLVM
  • neura.loop_control: drives loop iteration and control; treat it as a fused phi-add-cmp-br sequence in LLVM
  • neura.load_indexed: fast indexed array access; treat it as a fused gep-load sequence in LLVM
  • neura.store_indexed: fast indexed array access; treat it as a fused gep-store sequence in LLVM
  2. Adds a pass that lowers simple nested affine loops into the neura dialect. Below is a simple example.
    Source affine MLIR generated by Polygeist from a .cpp file:
module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<!llvm.ptr<272>, dense<64> : vector<4xi32>>, #dlti.dl_entry<!llvm.ptr<271>, dense<32> : vector<4xi32>>, #dlti.dl_entry<i64, dense<64> : vector<2xi32>>, #dlti.dl_entry<!llvm.ptr<270>, dense<32> : vector<4xi32>>, #dlti.dl_entry<f80, dense<128> : vector<2xi32>>, #dlti.dl_entry<!llvm.ptr, dense<64> : vector<4xi32>>, #dlti.dl_entry<i1, dense<8> : vector<2xi32>>, #dlti.dl_entry<i8, dense<8> : vector<2xi32>>, #dlti.dl_entry<i16, dense<16> : vector<2xi32>>, #dlti.dl_entry<i32, dense<32> : vector<2xi32>>, #dlti.dl_entry<f16, dense<16> : vector<2xi32>>, #dlti.dl_entry<f64, dense<64> : vector<2xi32>>, #dlti.dl_entry<f128, dense<128> : vector<2xi32>>, #dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<"dlti.stack_alignment", 128 : i32>>, llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu", "polygeist.target-cpu" = "x86-64", "polygeist.target-features" = "+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87", "polygeist.tune-cpu" = "generic"} {
  memref.global @C : memref<100xf32> = uninitialized
  memref.global @A : memref<100xf32> = uninitialized
  func.func @main() -> i32 attributes {llvm.linkage = #llvm.linkage<external>} {
    %cst = arith.constant 1.000000e+01 : f32
    %c0_i32 = arith.constant 0 : i32
    %0 = memref.get_global @A : memref<100xf32>
    %1 = memref.get_global @C : memref<100xf32>
    affine.for %arg0 = 0 to 100 {
      %2 = affine.load %0[%arg0] : memref<100xf32>
      %3 = arith.mulf %2, %cst : f32
      affine.store %3, %1[%arg0] : memref<100xf32>
    }
    return %c0_i32 : i32
  }
}

Neura MLIR produced by applying this pass:

module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<!llvm.ptr<272>, dense<64> : vector<4xi32>>, #dlti.dl_entry<!llvm.ptr<271>, dense<32> : vector<4xi32>>, #dlti.dl_entry<i64, dense<64> : vector<2xi32>>, #dlti.dl_entry<!llvm.ptr<270>, dense<32> : vector<4xi32>>, #dlti.dl_entry<f80, dense<128> : vector<2xi32>>, #dlti.dl_entry<!llvm.ptr, dense<64> : vector<4xi32>>, #dlti.dl_entry<i1, dense<8> : vector<2xi32>>, #dlti.dl_entry<i8, dense<8> : vector<2xi32>>, #dlti.dl_entry<i16, dense<16> : vector<2xi32>>, #dlti.dl_entry<i32, dense<32> : vector<2xi32>>, #dlti.dl_entry<f16, dense<16> : vector<2xi32>>, #dlti.dl_entry<f64, dense<64> : vector<2xi32>>, #dlti.dl_entry<f128, dense<128> : vector<2xi32>>, #dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<"dlti.stack_alignment", 128 : i32>>, llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu", "polygeist.target-cpu" = "x86-64", "polygeist.target-features" = "+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87", "polygeist.tune-cpu" = "generic"} {
  memref.global @C : memref<100xf32> = uninitialized
  memref.global @A : memref<100xf32> = uninitialized
  func.func @main() -> i32 attributes {llvm.linkage = #llvm.linkage<external>} {
    %cst = arith.constant 1.000000e+01 : f32
    %c0_i32 = arith.constant 0 : i32
    %0 = memref.get_global @A : memref<100xf32>
    %1 = memref.get_global @C : memref<100xf32>
    %2 = neura.constant {value = 0 : index} : index
    %3 = neura.constant {value = 100 : index} : index
    %4 = neura.constant {value = 1 : index} : index
    neura.br %2 : index to ^bb2
  ^bb1:  // pred: ^bb4
    return %c0_i32 : i32
  ^bb2(%5: index):  // 2 preds: ^bb0, ^bb3
    neura.loop_control current_index : %5, step : %4, bound : %3, loop_type : "lt" then ^bb3 else ^bb4
  ^bb3(%6: index):  // pred: ^bb2
    %7 = neura.load_indexed memref<100xf32> %0[%6] : f32
    %8 = arith.mulf %7, %cst : f32
    neura.store_indexed %8 to memref<100xf32> %1[%6] : f32
    neura.br %6 : index to ^bb2
  ^bb4:  // pred: ^bb2
    neura.br :  to ^bb1
  }
}
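
For intuition, here is a minimal hand-written sketch (not produced by the pass; names are illustrative) of the unfused control flow that these ops replace. MLIR block arguments play the role of LLVM phi nodes, and the cmp/br/add and indexed load/store pairs below are roughly what neura.loop_control, neura.load_indexed, and neura.store_indexed fuse:

func.func @unfused(%A: memref<100xf32>, %C: memref<100xf32>) {
  %cst = arith.constant 1.000000e+01 : f32
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c100 = arith.constant 100 : index
  cf.br ^header(%c0 : index)
^header(%i: index):                          // block argument ~ LLVM phi
  %cond = arith.cmpi slt, %i, %c100 : index  // cmp
  cf.cond_br %cond, ^body, ^exit             // br
^body:
  %v = memref.load %A[%i] : memref<100xf32>  // gep + load
  %m = arith.mulf %v, %cst : f32
  memref.store %m, %C[%i] : memref<100xf32>  // gep + store
  %next = arith.addi %i, %c1 : index         // add
  cf.br ^header(%next : index)
^exit:
  return
}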

To Do:

  • A neura compiler for further development
  • A DFG construction pass used in the neura compiler
  • A memory-aware mapping pass that fully utilizes the computation resources

@ShangkunLi ShangkunLi requested review from tancheng and yuqisun June 13, 2025 11:11
Contributor

@tancheng tancheng left a comment


Done with half of the review; will continue later.

// NeuraOps.td - Custom operation definitions.

include "NeuraDialect/NeuraDialect.td"
include "mlir/IR/CommonTypeConstraints.td"
Contributor


What is this used for?

Collaborator Author


It is used for AnySignlessIntegerOrIndex, which was referenced before.

Contributor


So you mean now we can remove this header?

Collaborator Author


Removed in the latest commit.

Comment on lines 52 to 61
//=========================================================//
// Passes for the CGRA Mapping
//=========================================================//
// def GenerateDFG : Pass<"generate-dfg", "ModuleOp"> {
// let summary = "Generates a Data Flow Graph (DFG) for the Neura dialect";
// let description =
// [{This pass generates a DFG from the Neura dialect operations.}];
// let constructor = "neura::createGenerateDFGPass()";
// }

Contributor


What are these for? Or should they be removed?

Collaborator Author


They are for the ongoing neura-compiler. I just commented them out because they are under development.

auto memref = loadOp.getMemref();
AffineMap map = loadOp.getAffineMap();
ValueRange mapOperands = loadOp.getMapOperands();
// Get the indices for the load operation
Contributor


"// Get the indices for the load operation" -> "// Gets the indices for the load operation."

Comment on lines +1 to +15
# # Set include paths for TableGen
# set(MLIR_TABLEGEN_INCLUDES
# "-I${PROJECT_SOURCE_DIR}/include"
# "-I${PROJECT_SOURCE_DIR}/include/NeuraDialect"
# "-I${CMAKE_CURRENT_BINARY_DIR}/include/NeuraDialect")

# Generate TableGen files
set(LLVM_TARGET_DEFINITIONS ${PROJECT_SOURCE_DIR}/include/NeuraDialect/Neura.td)
mlir_tablegen(Neura.h.inc -gen-op-decls ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(Neura.cpp.inc -gen-op-defs ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(NeuraDialect.h.inc -gen-dialect-decls ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(NeuraDialect.cpp.inc -gen-dialect-defs ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(NeuraTypes.h.inc -gen-typedef-decls ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(NeuraTypes.cpp.inc -gen-typedef-defs ${MLIR_TABLEGEN_INCLUDES})
add_public_tablegen_target(MLIRNeuraDialectIncGen)
# # Generate TableGen files
# set(LLVM_TARGET_DEFINITIONS ${PROJECT_SOURCE_DIR}/include/NeuraDialect/Neura.td)
# mlir_tablegen(Neura.h.inc -gen-op-decls ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(Neura.cpp.inc -gen-op-defs ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(NeuraDialect.h.inc -gen-dialect-decls ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(NeuraDialect.cpp.inc -gen-dialect-defs ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(NeuraTypes.h.inc -gen-typedef-decls ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(NeuraTypes.cpp.inc -gen-typedef-defs ${MLIR_TABLEGEN_INCLUDES})
# add_public_tablegen_target(MLIRNeuraDialectIncGen)
Contributor


Thanks for the cleanup; can we remove this commented-out code then?

Comment on lines +2 to +7
# set(MLIR_TABLEGEN_INCLUDES
# ${PROJECT_SOURCE_DIR}/include
# ${PROJECT_SOURCE_DIR}/include/NeuraDialect
# ${CMAKE_CURRENT_BINARY_DIR}/include/NeuraDialect
# ${MLIR_MAIN_INCLUDE_DIR}
# ${MLIR_INCLUDE_DIR})
Contributor


Remove?

@@ -0,0 +1,32 @@
// neura-compiler.cpp
Contributor


Can we implement this compiler in a separate, standalone PR? We prefer tiny PRs. I personally prefer per-commit PRs, though that is not easy in most cases, but it would be much easier for the reviewer.

@@ -0,0 +1,3 @@
/home/lucas/Project/NeuraCompiler/thirdparty/Polygeist/build/bin/cgeist ./simple.cpp -S --raise-scf-to-affine -o ./simple.mlir
Contributor


We don't have lucas in the GitHub Actions environment. Can't we follow https://github.com/coredac/dataflow/blob/main/test/c2llvm2mlir/test.mlir?

And I don't understand what you are trying to verify; it seems you just compile the .cpp and lower the corresponding .mlir, and as long as there is no error, you are fine? We can at least CHECK something in a .mlir test.
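
For reference, a minimal FileCheck test along these lines might look like the sketch below. Note that the neura-opt tool name and the --lower-affine-to-neura flag are hypothetical placeholders for illustration, not names taken from this PR:

// RUN: neura-opt %s --lower-affine-to-neura | FileCheck %s
// NOTE: the tool name and pass flag above are placeholders for the project's actual ones.

func.func @scale(%A: memref<100xf32>, %C: memref<100xf32>) {
  %cst = arith.constant 1.000000e+01 : f32
  affine.for %i = 0 to 100 {
    %v = affine.load %A[%i] : memref<100xf32>
    %m = arith.mulf %v, %cst : f32
    affine.store %m, %C[%i] : memref<100xf32>
  }
  return
}

// CHECK: neura.loop_control
// CHECK: neura.load_indexed
// CHECK: arith.mulf
// CHECK: neura.store_indexed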

Collaborator Author


After updating to the newest commit on the main branch, it seems I cannot trigger the actions automatically.

Contributor


No idea; change the yml and try again?

@tancheng
Contributor

tancheng commented Jun 14, 2025

Thanks @ShangkunLi for the effort! Awesome work. Most of my comments are about coding style.

And I think you also need to:

  • handle neura's control-flow-to-dataflow pass when there is a neura.loop_xxx op
    • this can be done in another PR, but please file an issue
  • add .mlir tests; I don't see any in your PR
    • all the new ops introduced by this PR need to be tested, including loop_iter, by lowering from the affine dialect

Thanks~!

@ShangkunLi ShangkunLi requested a review from tancheng June 17, 2025 07:19
@tancheng
Contributor

Hi @ShangkunLi, is this PR ready for review again? After fixing the comments, please reply briefly in the conversation box about how you fixed it (just a short summary, or simply "Done" or "Fixed"), so I get a notification and can click the "resolved" button. Thanks~!

@ShangkunLi
Collaborator Author

@tancheng

Hi~ Cheng,

I am trying to convert the loop_control into dataflow. But I do have some trouble, maybe we can discuss it in weekly sync and decide whether to keep this operation/pr.

@tancheng
Contributor

Sure~
