Skip to content

Conversation

@ShangkunLi
Copy link
Collaborator

@ShangkunLi ShangkunLi commented Aug 13, 2025

Background

  1. In the existing implementation, we calculate the award based on the producers' location.
  2. For operations without any producers, we now select them randomly or solely based on the temporal information.

For example, for %1 = "neura.grant_once"() <{constant_value = 0 : i64}> : () -> !neura.data<i64, i1>, it doesn't have any producers, so the candidates for it are chosen randomly or based solely on the time.

  Loc: tile#0 @t=0, Award: 8
  Loc: tile#7 @t=0, Award: 8
  Loc: tile#6 @t=0, Award: 8
  Loc: tile#8 @t=0, Award: 8
  Loc: tile#5 @t=0, Award: 8
  Loc: tile#9 @t=0, Award: 8
  Loc: tile#10 @t=0, Award: 8
  Loc: tile#4 @t=0, Award: 8
  Loc: tile#11 @t=0, Award: 8
  Loc: tile#3 @t=0, Award: 8
  Loc: tile#12 @t=0, Award: 8
  Loc: tile#13 @t=0, Award: 8
  Loc: tile#2 @t=0, Award: 8
  Loc: tile#14 @t=0, Award: 8
  Loc: tile#1 @t=0, Award: 8
  Loc: tile#15 @t=0, Award: 8
  Loc: tile#12 @t=1, Award: 7
  Loc: tile#11 @t=1, Award: 7
  Loc: tile#10 @t=1, Award: 7
  Loc: tile#13 @t=1, Award: 7
  Loc: tile#9 @t=1, Award: 7
  Loc: tile#14 @t=1, Award: 7
  Loc: tile#8 @t=1, Award: 7
  Loc: tile#7 @t=1, Award: 7
  Loc: tile#15 @t=1, Award: 7
  Loc: tile#0 @t=1, Award: 7
  Loc: tile#2 @t=1, Award: 7
  Loc: tile#6 @t=1, Award: 7
  Loc: tile#5 @t=1, Award: 7
  Loc: tile#3 @t=1, Award: 7
  Loc: tile#1 @t=1, Award: 7
  Loc: tile#4 @t=1, Award: 7

However, these randomly chosen candidate locations may result in sub-optimal mapping results. And if we want to modify the assigned locations for these operations, we need to set a large max_backtrack_depth (it takes ~10 hours and is unrealistic).

New heuristic

  1. Calculate the award for no-producer operations based on a greedy (max_locations_to_try=user_defined_num, equals to 4*4=16 in now implementation, max_backtrack_depth=0 ) mapping result.
  2. Use the number of successfully mapped successor operations as the award.

And the mapping results show an improvement in both spatial-only and spatial-temporal mapping.

p.s. I also tried an A* search-like heuristic and took tile utilization and routing congestion into consideration, that is, evaluate the award based on both producers and consumers for an operation. But the result is even worse (e.g., compiled_ii=9 for branch_for.mlir), this is because we introduce too many weighted factors, and it is hard to choose appropriate values.

@tancheng
Copy link
Contributor

How is the producer being mapped after its consumer? Can you explain this in your description?

@ShangkunLi ShangkunLi marked this pull request as ready for review August 13, 2025 14:38
Comment on lines 227 to 229
// Because we have already thoroughly iterated candidates of no-producer
// operations. If the current operation is after a no-producer operation, we
// need backtrack 2 depths.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? And backtrack 2 depth/layer means a no-producer op is also backtracked?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So then we re-map the backtracked no-producer op? say,
<no_producer_op0, no_producer_op1, no_producer_op2, current_op, next_op>, backtracking 2 steps means we restart the mapping of the second highest award loc for no_producer_op2 or no_producer_op1?

"If the current operation is after a no-producer operation", does "after" mean "immediate after" or just "after"? Let's say "next_op" fail, we can just fall back "current_op", right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the bug here. If current_op fails, now we will backtrack to the mapping state before mapping no_producer_op1; it still is in an infinite loop.

The correct solution is to backtrack until it is not a no_producer_op, because only in this case can we result in an effective backtrack.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ShangkunLi, could you please add an example as comments above the function, like:

<no_producer_op0, no_producer_op1, no_producer_op2, op3, op4>

          |              |                  |        |    |

      | loc_0 |      | loc_3 | ...
      | loc_1 |      | loc_4 | ...
      | loc_2 |      | loc_5 | ...

explain what happen for different cases:

  • op4 failed
  • op3 failed
  • no_producer_op2 failed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit hesitate to check in this feature... Such looking ahead is too complicated for our code structure, and error-prone... I believe simply assigning initial loc for no-producer-op might be able to achieve similar benefit...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the comments in the latest commit

Feel free to take it or not~ Just try to improve the II, I will also try to map the recurrence cycle first and see if it can improve the II.


// 1. If we fail to map op4, we can backtrack to op3 using this
// performBacktrack function.
// 2. If we fail to map op3, we need to backtrack to op0 to avoid getting
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we would have op0 before no_producer_op1? Shouldn't ops be ordered in topological ordering?

@tancheng
Copy link
Contributor

Let's check in the PR of "the critical path mapping first". Thanks~!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants