
Conversation

@ShangkunLi
Collaborator

This PR completes the following tasks:

  1. Defines 4 neura operations to support affine loops (an unfused control-flow sketch follows the lowered IR below):
  • neura.loop_iter: organizes loop iteration; treat it as a fused phi-add-cmp sequence in LLVM
  • neura.loop_control: drives loop iteration and control; treat it as a fused phi-add-cmp-br sequence in LLVM
  • neura.load_indexed: fast indexed array access; treat it as a fused gep-load sequence in LLVM
  • neura.store_indexed: fast indexed array access; treat it as a fused gep-store sequence in LLVM
  2. Adds a pass that lowers simple nested affine loops into the neura dialect. Below is a simple example.
    Source affine MLIR generated by Polygeist from a .cpp file:
module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<!llvm.ptr<272>, dense<64> : vector<4xi32>>, #dlti.dl_entry<!llvm.ptr<271>, dense<32> : vector<4xi32>>, #dlti.dl_entry<i64, dense<64> : vector<2xi32>>, #dlti.dl_entry<!llvm.ptr<270>, dense<32> : vector<4xi32>>, #dlti.dl_entry<f80, dense<128> : vector<2xi32>>, #dlti.dl_entry<!llvm.ptr, dense<64> : vector<4xi32>>, #dlti.dl_entry<i1, dense<8> : vector<2xi32>>, #dlti.dl_entry<i8, dense<8> : vector<2xi32>>, #dlti.dl_entry<i16, dense<16> : vector<2xi32>>, #dlti.dl_entry<i32, dense<32> : vector<2xi32>>, #dlti.dl_entry<f16, dense<16> : vector<2xi32>>, #dlti.dl_entry<f64, dense<64> : vector<2xi32>>, #dlti.dl_entry<f128, dense<128> : vector<2xi32>>, #dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<"dlti.stack_alignment", 128 : i32>>, llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu", "polygeist.target-cpu" = "x86-64", "polygeist.target-features" = "+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87", "polygeist.tune-cpu" = "generic"} {
  memref.global @C : memref<100xf32> = uninitialized
  memref.global @A : memref<100xf32> = uninitialized
  func.func @main() -> i32 attributes {llvm.linkage = #llvm.linkage<external>} {
    %cst = arith.constant 1.000000e+01 : f32
    %c0_i32 = arith.constant 0 : i32
    %0 = memref.get_global @A : memref<100xf32>
    %1 = memref.get_global @C : memref<100xf32>
    affine.for %arg0 = 0 to 100 {
      %2 = affine.load %0[%arg0] : memref<100xf32>
      %3 = arith.mulf %2, %cst : f32
      affine.store %3, %1[%arg0] : memref<100xf32>
    }
    return %c0_i32 : i32
  }
}

Neura MLIR produced by applying this pass:

module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<!llvm.ptr<272>, dense<64> : vector<4xi32>>, #dlti.dl_entry<!llvm.ptr<271>, dense<32> : vector<4xi32>>, #dlti.dl_entry<i64, dense<64> : vector<2xi32>>, #dlti.dl_entry<!llvm.ptr<270>, dense<32> : vector<4xi32>>, #dlti.dl_entry<f80, dense<128> : vector<2xi32>>, #dlti.dl_entry<!llvm.ptr, dense<64> : vector<4xi32>>, #dlti.dl_entry<i1, dense<8> : vector<2xi32>>, #dlti.dl_entry<i8, dense<8> : vector<2xi32>>, #dlti.dl_entry<i16, dense<16> : vector<2xi32>>, #dlti.dl_entry<i32, dense<32> : vector<2xi32>>, #dlti.dl_entry<f16, dense<16> : vector<2xi32>>, #dlti.dl_entry<f64, dense<64> : vector<2xi32>>, #dlti.dl_entry<f128, dense<128> : vector<2xi32>>, #dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<"dlti.stack_alignment", 128 : i32>>, llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu", "polygeist.target-cpu" = "x86-64", "polygeist.target-features" = "+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87", "polygeist.tune-cpu" = "generic"} {
  memref.global @C : memref<100xf32> = uninitialized
  memref.global @A : memref<100xf32> = uninitialized
  func.func @main() -> i32 attributes {llvm.linkage = #llvm.linkage<external>} {
    %cst = arith.constant 1.000000e+01 : f32
    %c0_i32 = arith.constant 0 : i32
    %0 = memref.get_global @A : memref<100xf32>
    %1 = memref.get_global @C : memref<100xf32>
    %2 = neura.constant {value = 0 : index} : index
    %3 = neura.constant {value = 100 : index} : index
    %4 = neura.constant {value = 1 : index} : index
    neura.br %2 : index to ^bb2
  ^bb1:  // pred: ^bb4
    return %c0_i32 : i32
  ^bb2(%5: index):  // 2 preds: ^bb0, ^bb3
    neura.loop_control current_index : %5, step : %4, bound : %3, loop_type : "lt" then ^bb3 else ^bb4
  ^bb3(%6: index):  // pred: ^bb2
    %7 = neura.load_indexed memref<100xf32> %0[%6] : f32
    %8 = arith.mulf %7, %cst : f32
    neura.store_indexed %8 to memref<100xf32> %1[%6] : f32
    neura.br %6 : index to ^bb2
  ^bb4:  // pred: ^bb2
    neura.br :  to ^bb1
  }
}
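
For intuition, here is a minimal hand-written sketch (not produced by the pass; names are illustrative) of the unfused control flow that these ops replace. MLIR block arguments play the role of LLVM phi nodes, and the cmp/br/add and indexed load/store pairs below are roughly what neura.loop_control, neura.load_indexed, and neura.store_indexed fuse:

func.func @unfused(%A: memref<100xf32>, %C: memref<100xf32>) {
  %cst = arith.constant 1.000000e+01 : f32
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c100 = arith.constant 100 : index
  cf.br ^header(%c0 : index)
^header(%i: index):                          // block argument ~ LLVM phi
  %cond = arith.cmpi slt, %i, %c100 : index  // cmp
  cf.cond_br %cond, ^body, ^exit             // br
^body:
  %v = memref.load %A[%i] : memref<100xf32>  // gep + load
  %m = arith.mulf %v, %cst : f32
  memref.store %m, %C[%i] : memref<100xf32>  // gep + store
  %next = arith.addi %i, %c1 : index         // add
  cf.br ^header(%next : index)
^exit:
  return
}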

To Do:

  • A neura compiler for further development
  • A DFG construction pass used in the neura compiler
  • A memory-aware mapping pass that fully utilizes the computation resources

@ShangkunLi ShangkunLi requested review from tancheng and yuqisun June 13, 2025 11:11
Contributor

@tancheng tancheng left a comment


Done with half of the review; will continue later.

// NeuraOps.td - Custom operation definitions.

include "NeuraDialect/NeuraDialect.td"
include "mlir/IR/CommonTypeConstraints.td"
Contributor


What is this used for?

Collaborator Author


It is used for AnySignlessIntegerOrIndex, which was referenced before.

Contributor


So you mean now we can remove this header?

Collaborator Author


Removed in the latest commit.

Comment on lines 52 to 61
//=========================================================//
// Passes for the CGRA Mapping
//=========================================================//
// def GenerateDFG : Pass<"generate-dfg", "ModuleOp"> {
// let summary = "Generates a Data Flow Graph (DFG) for the Neura dialect";
// let description =
// [{This pass generates a DFG from the Neura dialect operations.}];
// let constructor = "neura::createGenerateDFGPass()";
// }

Contributor


What are these for? Or should they be removed?

Collaborator Author


They are for the ongoing neura-compiler. I just commented them out because they are under development.

auto memref = loadOp.getMemref();
AffineMap map = loadOp.getAffineMap();
ValueRange mapOperands = loadOp.getMapOperands();
// Get the indices for the load operation
Contributor


"// Get the indices for the load operation" -> "// Gets the indices for the load operation."

Comment on lines +1 to +15
# # Set include paths for TableGen
# set(MLIR_TABLEGEN_INCLUDES
# "-I${PROJECT_SOURCE_DIR}/include"
# "-I${PROJECT_SOURCE_DIR}/include/NeuraDialect"
# "-I${CMAKE_CURRENT_BINARY_DIR}/include/NeuraDialect")

# Generate TableGen files
set(LLVM_TARGET_DEFINITIONS ${PROJECT_SOURCE_DIR}/include/NeuraDialect/Neura.td)
mlir_tablegen(Neura.h.inc -gen-op-decls ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(Neura.cpp.inc -gen-op-defs ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(NeuraDialect.h.inc -gen-dialect-decls ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(NeuraDialect.cpp.inc -gen-dialect-defs ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(NeuraTypes.h.inc -gen-typedef-decls ${MLIR_TABLEGEN_INCLUDES})
mlir_tablegen(NeuraTypes.cpp.inc -gen-typedef-defs ${MLIR_TABLEGEN_INCLUDES})
add_public_tablegen_target(MLIRNeuraDialectIncGen)
# # Generate TableGen files
# set(LLVM_TARGET_DEFINITIONS ${PROJECT_SOURCE_DIR}/include/NeuraDialect/Neura.td)
# mlir_tablegen(Neura.h.inc -gen-op-decls ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(Neura.cpp.inc -gen-op-defs ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(NeuraDialect.h.inc -gen-dialect-decls ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(NeuraDialect.cpp.inc -gen-dialect-defs ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(NeuraTypes.h.inc -gen-typedef-decls ${MLIR_TABLEGEN_INCLUDES})
# mlir_tablegen(NeuraTypes.cpp.inc -gen-typedef-defs ${MLIR_TABLEGEN_INCLUDES})
# add_public_tablegen_target(MLIRNeuraDialectIncGen)
Contributor


Thanks for the cleanup; can we remove this commented-out code then?

Comment on lines +2 to +7
# set(MLIR_TABLEGEN_INCLUDES
# ${PROJECT_SOURCE_DIR}/include
# ${PROJECT_SOURCE_DIR}/include/NeuraDialect
# ${CMAKE_CURRENT_BINARY_DIR}/include/NeuraDialect
# ${MLIR_MAIN_INCLUDE_DIR}
# ${MLIR_INCLUDE_DIR})
Contributor


Remove?

@@ -0,0 +1,32 @@
// neura-compiler.cpp
Contributor


Can we implement this compiler in a separate, standalone PR? We prefer tiny PRs. I personally prefer per-commit PRs, though that is not easy in most cases, but it would be much easier for the reviewer.

@@ -0,0 +1,3 @@
/home/lucas/Project/NeuraCompiler/thirdparty/Polygeist/build/bin/cgeist ./simple.cpp -S --raise-scf-to-affine -o ./simple.mlir
Contributor


We don't have lucas in the GitHub Actions environment. Can't we follow https://github.com/coredac/dataflow/blob/main/test/c2llvm2mlir/test.mlir?

And I don't understand what you are trying to verify; it seems you just compile the .cpp and lower the corresponding .mlir, and as long as there is no error, you are fine? We can at least CHECK something in a .mlir test.
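
For reference, a minimal FileCheck test along these lines might look like the sketch below. Note that the neura-opt tool name and the --lower-affine-to-neura flag are hypothetical placeholders for illustration, not names taken from this PR:

// RUN: neura-opt %s --lower-affine-to-neura | FileCheck %s
// NOTE: the tool name and pass flag above are placeholders for the project's actual ones.

func.func @scale(%A: memref<100xf32>, %C: memref<100xf32>) {
  %cst = arith.constant 1.000000e+01 : f32
  affine.for %i = 0 to 100 {
    %v = affine.load %A[%i] : memref<100xf32>
    %m = arith.mulf %v, %cst : f32
    affine.store %m, %C[%i] : memref<100xf32>
  }
  return
}

// CHECK: neura.loop_control
// CHECK: neura.load_indexed
// CHECK: arith.mulf
// CHECK: neura.store_indexed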

Collaborator Author


After updating to the newest commit on the main branch, it seems I cannot trigger the actions automatically.

Contributor


No idea; change the yml and try again?

@tancheng
Contributor

tancheng commented Jun 14, 2025

Thanks @ShangkunLi for the effort! Awesome work. Most of my comments are about coding style.

And I think you also need to:

  • handle neura's control-flow-to-dataflow pass when there is a neura.loop_xxx op
    • this can be done in another PR, but please file an issue
  • add .mlir tests; I don't see any in your PR
    • all the new ops introduced by this PR need to be tested, including loop_iter, by lowering from the affine dialect

Thanks~!

@ShangkunLi ShangkunLi requested a review from tancheng June 17, 2025 07:19
@tancheng
Contributor

Hi @ShangkunLi, is this PR ready for review again? After fixing the comments, please reply briefly in the conversation box about how you fixed it (just a short summary, or simply "Done" or "Fixed"), so I get a notification and can click the "resolved" button. Thanks~!

@ShangkunLi
Collaborator Author

@tancheng

Hi~ Cheng,

I am trying to convert the loop_control into dataflow. But I do have some trouble, maybe we can discuss it in weekly sync and decide whether to keep this operation/pr.

@tancheng
Contributor

Sure~
