Support Customized DFG Sorting Strategy #146
base: main
Conversation
// Two sorting strategies: pure topological order, or mixed ALAP + topo.
std::vector<std::pair<Operation *, int>> sorted_ops_with_levels;
if (sort_strategy_string_ref == "topological") {
Can you please remind me what's wrong with "mixed"? Do you plan to fix "mixed" later, if it cannot be done now due to the time limit?
For this case:
module {
  func.func @simple_add_loop() -> i64 attributes {accelerator = "neura", dataflow_mode = "steering"} {
    %0 = neura.reserve : i64
    %1 = neura.reserve : i64
    %2 = neura.reserve : i1
    %3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
    %4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
    %7 = neura.invariant %4, %2 : i64, i1 -> i64
    %8 = neura.invariant %3, %2 : i64, i1 -> i64
    %9 = neura.carry %5, %2, %0 : i64, i1, i64 -> i64
    %10 = neura.carry %6, %2, %1 : i64, i1, i64 -> i64
    %11 = "neura.icmp"(%10, %8) <{cmpType = "slt"}> : (i64, i64) -> i1
    neura.ctrl_mov %11 -> %2 : i1 i1
    %12 = neura.false_steer %9, %11 : i64, i1 -> i64
    %13 = "neura.add"(%9, %9) : (i64, i64) -> i64
    neura.ctrl_mov %13 -> %0 : i64 i64
    %14 = "neura.add"(%10, %7) : (i64, i64) -> i64
    neura.ctrl_mov %14 -> %1 : i64 i64
    "neura.return"(%12) : (i64) -> ()
  }
}
Its topological order is:
[MapToAcceleratorPass] Topologically sorted op: %0 = neura.reserve : i64
[MapToAcceleratorPass] Topologically sorted op: %1 = neura.reserve : i64
[MapToAcceleratorPass] Topologically sorted op: %2 = neura.reserve : i1
[MapToAcceleratorPass] Topologically sorted op: %3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %9 = "neura.data_mov"(%3) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %7 = "neura.data_mov"(%4) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %11 = "neura.data_mov"(%5) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %13 = "neura.data_mov"(%6) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %10 = neura.invariant %9, %2 : i64, i1 -> i64
[MapToAcceleratorPass] Topologically sorted op: %8 = neura.invariant %7, %2 : i64, i1 -> i64
[MapToAcceleratorPass] Topologically sorted op: %12 = neura.carry %11, %2, %0 : i64, i1, i64 -> i64
[MapToAcceleratorPass] Topologically sorted op: %14 = neura.carry %13, %2, %1 : i64, i1, i64 -> i64
[MapToAcceleratorPass] Topologically sorted op: %16 = "neura.data_mov"(%10) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %25 = "neura.data_mov"(%8) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %21 = "neura.data_mov"(%12) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %22 = "neura.data_mov"(%12) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %18 = "neura.data_mov"(%12) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %24 = "neura.data_mov"(%14) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %15 = "neura.data_mov"(%14) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %23 = "neura.add"(%21, %22) : (i64, i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %26 = "neura.add"(%24, %25) : (i64, i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> : (i64, i64) -> i1
[MapToAcceleratorPass] Topologically sorted op: neura.ctrl_mov %23 -> %0 : i64 i64
[MapToAcceleratorPass] Topologically sorted op: neura.ctrl_mov %26 -> %1 : i64 i64
[MapToAcceleratorPass] Topologically sorted op: neura.ctrl_mov %17 -> %2 : i1 i1
[MapToAcceleratorPass] Topologically sorted op: %19 = "neura.data_mov"(%17) : (i1) -> i1
[MapToAcceleratorPass] Topologically sorted op: %20 = neura.false_steer %18, %19 : i64, i1 -> i64
[MapToAcceleratorPass] Topologically sorted op: %27 = "neura.data_mov"(%20) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: "neura.return"(%27) : (i64) -> ()
Its mixed sorted order is:
[MapToAcceleratorPass] ALAP Bucket Level 0: 6 ops
%1 = neura.reserve : i64
%2 = neura.reserve : i1
%3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
%6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
%9 = "neura.data_mov"(%3) : (i64) -> i64
%13 = "neura.data_mov"(%6) : (i64) -> i64
[MapToAcceleratorPass] ALAP Bucket Level 1: 8 ops
%0 = neura.reserve : i64
%5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
%11 = "neura.data_mov"(%5) : (i64) -> i64
%10 = neura.invariant %9, %2 : i64, i1 -> i64
%14 = neura.carry %13, %2, %1 : i64, i1, i64 -> i64
%16 = "neura.data_mov"(%10) : (i64) -> i64
%24 = "neura.data_mov"(%14) : (i64) -> i64
%15 = "neura.data_mov"(%14) : (i64) -> i64
[MapToAcceleratorPass] ALAP Bucket Level 2: 9 ops
%4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
%7 = "neura.data_mov"(%4) : (i64) -> i64
%12 = neura.carry %11, %2, %0 : i64, i1, i64 -> i64
%21 = "neura.data_mov"(%12) : (i64) -> i64
%22 = "neura.data_mov"(%12) : (i64) -> i64
%18 = "neura.data_mov"(%12) : (i64) -> i64
%17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> : (i64, i64) -> i1
neura.ctrl_mov %17 -> %2 : i1 i1
%19 = "neura.data_mov"(%17) : (i1) -> i1
[MapToAcceleratorPass] ALAP Bucket Level 3: 6 ops
%8 = neura.invariant %7, %2 : i64, i1 -> i64
%25 = "neura.data_mov"(%8) : (i64) -> i64
%23 = "neura.add"(%21, %22) : (i64, i64) -> i64
neura.ctrl_mov %23 -> %0 : i64 i64
%20 = neura.false_steer %18, %19 : i64, i1 -> i64
%27 = "neura.data_mov"(%20) : (i64) -> i64
[MapToAcceleratorPass] ALAP Bucket Level 4: 3 ops
%26 = "neura.add"(%24, %25) : (i64, i64) -> i64
neura.ctrl_mov %26 -> %1 : i64 i64
"neura.return"(%27) : (i64) -> ()
Here %8 = neura.invariant %7, %2 : i64, i1 -> i64 is a backward user of %17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> : (i64, i64) -> i1, but its ALAP level is higher than %17's. This makes it unable to satisfy the producer-consumer dependency check in
assert(!user_locs.empty() && "No locations found for backward user");
Maybe it is not correctly recognized as a critical op?
dataflow/lib/NeuraDialect/Mapping/mapping_util.cpp
Lines 306 to 311 in 81d782e
// Step 3: Overwrites critical ops with ASAP schedule: shortest path from
// source.
for (Operation *op : sorted_ops) {
  if (!critical_ops.count(op)) {
    continue;
  }
File an issue and resolve it later?
Due to the time limit, I just enabled the customized sorting strategy for steering-based dataflow IR mapping.
Maybe we can properly design the mixed sorting later. For now, I just use topologically sorted ops for evaluation.