
[Graph][Optimization] Reduce weights packing/unpacking overhead during the scenario of multi matmul #27

@shanzhou2186

Reduce weights packing/unpacking overhead in multi matmul
Goal
Improve performance by reducing packing/unpacking overhead across consecutive matmul operations.

Problem Description
Some models contain several consecutive matmul operations. Each matmul performs packing and unpacking to improve cache locality, and every pack or unpack has a cost. When matmuls run back to back, packing could instead be done once before the first matmul and unpacking once after the last one. The figure below illustrates the method.

[Figure: Capture-packing-opt]
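
The idea can be sketched in standalone C++. This is a minimal illustration, not DeepRec or oneDNN code: the panel layout, the block width `kBlock`, and every name below (`Pack`, `Unpack`, `MatMulPacked`, `ChainBaseline`, `ChainFused`) are assumptions made for the example. Dimensions are assumed to be multiples of `kBlock`, and the weights are left in row-major order for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical panel width; a real kernel would derive it from cache/SIMD sizes.
constexpr std::size_t kBlock = 4;

// Dense matrix. `data` is row-major or panel-packed depending on context.
struct Matrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<float> data;
};

// Offset of element (i, j) in the packed layout: columns are split into panels
// of kBlock columns, and each panel is stored contiguously in row-major order.
std::size_t PackedIndex(std::size_t rows, std::size_t i, std::size_t j) {
  return (j / kBlock) * rows * kBlock + i * kBlock + (j % kBlock);
}

// Row-major -> packed. Increments *reorders so callers can count layout changes.
Matrix Pack(const Matrix& a, int* reorders) {
  assert(a.cols % kBlock == 0);
  Matrix p{a.rows, a.cols, std::vector<float>(a.data.size())};
  for (std::size_t i = 0; i < a.rows; ++i)
    for (std::size_t j = 0; j < a.cols; ++j)
      p.data[PackedIndex(a.rows, i, j)] = a.data[i * a.cols + j];
  ++*reorders;
  return p;
}

// Packed -> row-major. Also counted as one reorder.
Matrix Unpack(const Matrix& p, int* reorders) {
  assert(p.cols % kBlock == 0);
  Matrix a{p.rows, p.cols, std::vector<float>(p.data.size())};
  for (std::size_t i = 0; i < p.rows; ++i)
    for (std::size_t j = 0; j < p.cols; ++j)
      a.data[i * p.cols + j] = p.data[PackedIndex(p.rows, i, j)];
  ++*reorders;
  return a;
}

// C = A * W, where A and C are packed and W is row-major.
Matrix MatMulPacked(const Matrix& a, const Matrix& w) {
  assert(a.cols == w.rows && w.cols % kBlock == 0);
  Matrix c{a.rows, w.cols, std::vector<float>(a.rows * w.cols, 0.0f)};
  for (std::size_t i = 0; i < a.rows; ++i)
    for (std::size_t k = 0; k < a.cols; ++k) {
      const float av = a.data[PackedIndex(a.rows, i, k)];
      for (std::size_t j = 0; j < w.cols; ++j)
        c.data[PackedIndex(c.rows, i, j)] += av * w.data[k * w.cols + j];
    }
  return c;
}

// Baseline: every matmul packs its input and unpacks its output.
Matrix ChainBaseline(Matrix x, const std::vector<Matrix>& weights, int* reorders) {
  for (const Matrix& w : weights) {
    Matrix packed = Pack(x, reorders);
    x = Unpack(MatMulPacked(packed, w), reorders);
  }
  return x;
}

// Optimized: pack once before the first matmul, unpack once after the last.
Matrix ChainFused(const Matrix& x, const std::vector<Matrix>& weights, int* reorders) {
  Matrix packed = Pack(x, reorders);
  for (const Matrix& w : weights) packed = MatMulPacked(packed, w);
  return Unpack(packed, reorders);
}
```

For a chain of n matmuls, the baseline in this sketch performs 2n layout changes while the fused version performs 2, and both return the same row-major result.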

Requirement Details

  • Write a matmul in C++ to serve as the baseline code.
  • Implement this optimization as a PoC.
  • Provide unit tests to validate the functionality.
  • Integrate it into DeepRec through the Grappler mechanism.
  • Apply the optimization to one model and report the performance data.
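
For the Grappler integration step, the graph pass first has to find which matmuls can share one pack/unpack pair. The following is only a sketch of that search on a toy graph: the `Node` struct and `FindMatMulChains` are hypothetical and do not use Grappler's real `NodeDef` or optimizer API. It assumes `inputs[0]` is the activation input and treats a matmul as chainable only when its sole consumer is the next matmul.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Toy node, NOT Grappler's NodeDef: just enough to show the chain search.
struct Node {
  std::string name;
  std::string op;
  std::vector<std::string> inputs;  // assumption: inputs[0] is the activation
};

// Returns maximal chains (length >= 2) of MatMul nodes in which each MatMul
// feeds only the activation input of the next MatMul.
std::vector<std::vector<std::string>> FindMatMulChains(
    const std::vector<Node>& graph) {
  std::unordered_map<std::string, const Node*> by_name;
  std::unordered_map<std::string, int> fanout;
  for (const Node& n : graph) {
    by_name[n.name] = &n;
    for (const std::string& in : n.inputs) ++fanout[in];
  }

  // next[a] == b when MatMul b is the only consumer of MatMul a.
  std::unordered_map<std::string, std::string> next;
  std::unordered_set<std::string> has_prev;
  for (const Node& n : graph) {
    if (n.op != "MatMul" || n.inputs.empty()) continue;
    auto producer = by_name.find(n.inputs[0]);
    if (producer == by_name.end() || producer->second->op != "MatMul") continue;
    if (fanout[n.inputs[0]] != 1) continue;  // intermediate is used elsewhere
    next[n.inputs[0]] = n.name;
    has_prev.insert(n.name);
  }

  // Walk forward from every MatMul that starts a chain.
  std::vector<std::vector<std::string>> chains;
  for (const Node& n : graph) {
    if (n.op != "MatMul" || has_prev.count(n.name) > 0) continue;
    std::vector<std::string> chain{n.name};
    for (auto it = next.find(chain.back()); it != next.end();
         it = next.find(chain.back())) {
      chain.push_back(it->second);
    }
    if (chain.size() > 1) chains.push_back(chain);
  }
  return chains;
}
```

The single-consumer check matters because an intermediate result that is also read by another op must still be available in its unpacked layout, so such a matmul ends the chain in this sketch.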

Test

  • Use one model from the model zoo to validate the performance gain. The performance data and analysis must be documented and reproducible.

Code Style and commit

  • C++ and Python: keep aligned with the DeepRec code style.

Maintain

  • All issues and bugs related to this op must be addressed going forward.

Definition of Done

  • Runs successfully in DeepRec and delivers better performance.
  • Integrated into DeepRec, with the code committed according to the DeepRec commit standard.

Labels: enhancement (New feature or request)