Conversation

@ShangkunLi
Collaborator

Due to time constraints, I have only enabled the customizable sorting strategy for steering-based dataflow IR mapping.

We can properly design the mixed sorting later; for now, I just use the topologically sorted ops for evaluation.
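
For reference, the pure topological order is essentially a worklist sweep over the function body. The sketch below is only an illustration of the idea; the helper name and the special-casing of neura.reserve are my assumptions, not the actual pass code.

#include "mlir/IR/Block.h"
#include "mlir/IR/Operation.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SetVector.h"

// Illustration only: emit an op once all of its operand-defining ops have
// been emitted. neura.reserve is treated as a source so the recurrence
// back-edges (closed later by neura.ctrl_mov) do not block the sweep.
static llvm::SetVector<mlir::Operation *>
topologicallySortOps(mlir::Block &body) {
  llvm::SetVector<mlir::Operation *> sorted;
  bool changed = true;
  while (changed) {
    changed = false;
    for (mlir::Operation &op : body) {
      if (sorted.count(&op))
        continue;
      bool ready = llvm::all_of(op.getOperands(), [&](mlir::Value operand) {
        mlir::Operation *def = operand.getDefiningOp();
        return !def || sorted.count(def) ||
               def->getName().getStringRef() == "neura.reserve";
      });
      if (ready) {
        sorted.insert(&op);
        changed = true;
      }
    }
  }
  return sorted;
}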

@ShangkunLi marked this pull request as ready for review on October 8, 2025, 14:45

// Two sorting strategies: pure topological order, or mixed ALAP + topo.
std::vector<std::pair<Operation *, int>> sorted_ops_with_levels;
if (sort_strategy_string_ref == "topological") {

Contributor

Can you please remind me what's wrong with "mixed"? Do you plan to fix "mixed" once time permits?

Collaborator Author

For this case:

module {
  func.func @simple_add_loop() -> i64 attributes {accelerator = "neura", dataflow_mode = "steering"} {
    %0 = neura.reserve : i64
    %1 = neura.reserve : i64
    %2 = neura.reserve : i1
    %3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
    %4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
    %7 = neura.invariant %4, %2 : i64, i1 -> i64
    %8 = neura.invariant %3, %2 : i64, i1 -> i64
    %9 = neura.carry %5, %2, %0 : i64, i1, i64 -> i64
    %10 = neura.carry %6, %2, %1 : i64, i1, i64 -> i64
    %11 = "neura.icmp"(%10, %8) <{cmpType = "slt"}> : (i64, i64) -> i1
    neura.ctrl_mov %11 -> %2 : i1 i1
    %12 = neura.false_steer %9, %11 : i64, i1 -> i64
    %13 = "neura.add"(%9, %9) : (i64, i64) -> i64
    neura.ctrl_mov %13 -> %0 : i64 i64
    %14 = "neura.add"(%10, %7) : (i64, i64) -> i64
    neura.ctrl_mov %14 -> %1 : i64 i64
    "neura.return"(%12) : (i64) -> ()
  }
}

Its topological order is:

[MapToAcceleratorPass] Topologically sorted op: %0 = neura.reserve : i64
[MapToAcceleratorPass] Topologically sorted op: %1 = neura.reserve : i64
[MapToAcceleratorPass] Topologically sorted op: %2 = neura.reserve : i1
[MapToAcceleratorPass] Topologically sorted op: %3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %9 = "neura.data_mov"(%3) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %7 = "neura.data_mov"(%4) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %11 = "neura.data_mov"(%5) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %13 = "neura.data_mov"(%6) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %10 = neura.invariant %9, %2 : i64, i1 -> i64
[MapToAcceleratorPass] Topologically sorted op: %8 = neura.invariant %7, %2 : i64, i1 -> i64
[MapToAcceleratorPass] Topologically sorted op: %12 = neura.carry %11, %2, %0 : i64, i1, i64 -> i64
[MapToAcceleratorPass] Topologically sorted op: %14 = neura.carry %13, %2, %1 : i64, i1, i64 -> i64
[MapToAcceleratorPass] Topologically sorted op: %16 = "neura.data_mov"(%10) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %25 = "neura.data_mov"(%8) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %21 = "neura.data_mov"(%12) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %22 = "neura.data_mov"(%12) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %18 = "neura.data_mov"(%12) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %24 = "neura.data_mov"(%14) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %15 = "neura.data_mov"(%14) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %23 = "neura.add"(%21, %22) : (i64, i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %26 = "neura.add"(%24, %25) : (i64, i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> : (i64, i64) -> i1
[MapToAcceleratorPass] Topologically sorted op: neura.ctrl_mov %23 -> %0 : i64 i64
[MapToAcceleratorPass] Topologically sorted op: neura.ctrl_mov %26 -> %1 : i64 i64
[MapToAcceleratorPass] Topologically sorted op: neura.ctrl_mov %17 -> %2 : i1 i1
[MapToAcceleratorPass] Topologically sorted op: %19 = "neura.data_mov"(%17) : (i1) -> i1
[MapToAcceleratorPass] Topologically sorted op: %20 = neura.false_steer %18, %19 : i64, i1 -> i64
[MapToAcceleratorPass] Topologically sorted op: %27 = "neura.data_mov"(%20) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: "neura.return"(%27) : (i64) -> ()

Its mixed sorted order is:

[MapToAcceleratorPass] ALAP Bucket Level 0: 6 ops
  %1 = neura.reserve : i64
  %2 = neura.reserve : i1
  %3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
  %6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
  %9 = "neura.data_mov"(%3) : (i64) -> i64
  %13 = "neura.data_mov"(%6) : (i64) -> i64
[MapToAcceleratorPass] ALAP Bucket Level 1: 8 ops
  %0 = neura.reserve : i64
  %5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
  %11 = "neura.data_mov"(%5) : (i64) -> i64
  %10 = neura.invariant %9, %2 : i64, i1 -> i64
  %14 = neura.carry %13, %2, %1 : i64, i1, i64 -> i64
  %16 = "neura.data_mov"(%10) : (i64) -> i64
  %24 = "neura.data_mov"(%14) : (i64) -> i64
  %15 = "neura.data_mov"(%14) : (i64) -> i64
[MapToAcceleratorPass] ALAP Bucket Level 2: 9 ops
  %4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
  %7 = "neura.data_mov"(%4) : (i64) -> i64
  %12 = neura.carry %11, %2, %0 : i64, i1, i64 -> i64
  %21 = "neura.data_mov"(%12) : (i64) -> i64
  %22 = "neura.data_mov"(%12) : (i64) -> i64
  %18 = "neura.data_mov"(%12) : (i64) -> i64
  %17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> : (i64, i64) -> i1
  neura.ctrl_mov %17 -> %2 : i1 i1
  %19 = "neura.data_mov"(%17) : (i1) -> i1
[MapToAcceleratorPass] ALAP Bucket Level 3: 6 ops
  %8 = neura.invariant %7, %2 : i64, i1 -> i64
  %25 = "neura.data_mov"(%8) : (i64) -> i64
  %23 = "neura.add"(%21, %22) : (i64, i64) -> i64
  neura.ctrl_mov %23 -> %0 : i64 i64
  %20 = neura.false_steer %18, %19 : i64, i1 -> i64
  %27 = "neura.data_mov"(%20) : (i64) -> i64
[MapToAcceleratorPass] ALAP Bucket Level 4: 3 ops
  %26 = "neura.add"(%24, %25) : (i64, i64) -> i64
  neura.ctrl_mov %26 -> %1 : i64 i64
  "neura.return"(%27) : (i64) -> ()

Here %8 = neura.invariant %7, %2 : i64, i1 -> i64 is a backward user of %17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> : (i64, i64) -> i1 (it consumes %2, which is written by neura.ctrl_mov %17 -> %2), but its ALAP level (3) is later than %17's (2). This makes it unable to satisfy the producer-consumer dependency check in

assert(!user_locs.empty() && "No locations found for backward user");
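
To make the failure concrete: the backward users of %17 are the consumers of the reserve value (%2) that its ctrl_mov writes into. Below is a rough sketch of that relation, with a hypothetical helper name and an assumed operand order on neura.ctrl_mov (not the pass's actual code):

#include "mlir/IR/Operation.h"
#include "llvm/ADT/SmallVector.h"

// Illustration only: follow producer -> neura.ctrl_mov -> neura.reserve and
// collect the reserve's consumers. Assumes the ctrl_mov's second operand is
// the reserve it writes into.
static llvm::SmallVector<mlir::Operation *>
collectBackwardUsers(mlir::Operation *producer) {
  llvm::SmallVector<mlir::Operation *> backward_users;
  for (mlir::Operation *user : producer->getUsers()) {
    if (user->getName().getStringRef() != "neura.ctrl_mov")
      continue;
    mlir::Value reserve = user->getOperand(1);
    for (mlir::Operation *reserve_user : reserve.getUsers())
      backward_users.push_back(reserve_user);
  }
  return backward_users;
}

Under the mixed order, %8 is such a backward user of %17 but has not been placed yet when %17's recurrence is handled, so user_locs comes back empty and the assert fires.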

Contributor

Maybe it is not correctly recognized as a critical op?

// Step 3: Overwrites critical ops with ASAP schedule: shortest path from
// source.
for (Operation *op : sorted_ops) {
  if (!critical_ops.count(op)) {
    continue;
  }

File an issue and resolve it later?
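
For reference, a plain ASAP level (shortest path from a source) could be computed as sketched below; this is my assumption of what Step 3 uses, not the actual implementation.

#include "mlir/IR/Operation.h"
#include "llvm/ADT/DenseMap.h"
#include <algorithm>

// Illustration only: ASAP level = 1 + max level of the operand producers,
// with sources (constants, reserves) at level 0. The recursion terminates
// because neura.reserve has no operands, so the back-edge is never followed.
static int getAsapLevel(mlir::Operation *op,
                        llvm::DenseMap<mlir::Operation *, int> &levels) {
  auto it = levels.find(op);
  if (it != levels.end())
    return it->second;
  int level = 0;
  for (mlir::Value operand : op->getOperands())
    if (mlir::Operation *def = operand.getDefiningOp())
      level = std::max(level, getAsapLevel(def, levels) + 1);
  return levels[op] = level;
}

Under this definition, %8 comes out at level 2 and %17 at level 4, which would put the backward user ahead of the icmp again, as the check expects. So the invariant not being in critical_ops would be consistent with the failure above.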
