forked from DeepRec-AI/DeepRec
Open
Labels
enhancement: New feature or request
Description
Reduce weights packing/unpacking overhead in multi matmul
Goal
Improve performance by reducing the packing/unpacking overhead of consecutive matmul operations.
Problem Description
Some models contain several consecutive matmul operations. Each matmul packs its inputs and unpacks its output in order to improve cache locality, and every pack or unpack step has a cost. It should therefore be possible to pack once before the first matmul and unpack once after the last matmul. The picture below illustrates the method.
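As a rough illustration of the idea above, the following C++ sketch compares a chain in which every matmul packs and unpacks with a chain that packs the activation once and unpacks once. Everything in it is an assumption made for illustration: the tiled layout, the tile size `kTile`, the `Matrix` struct, and the function names are hypothetical and do not come from DeepRec; dimensions are assumed to be multiples of `kTile`; and the loops are written for clarity, not speed. The call counters exist only to make the saved work visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge; all dimensions are assumed to be multiples of it.
constexpr std::size_t kTile = 4;

struct Matrix {
  std::size_t rows, cols;
  std::vector<float> data;  // row-major, or tiled when produced by Pack()
};

// Counters used only to show how often layout conversion happens.
static int g_pack_calls = 0;
static int g_unpack_calls = 0;

// Offset of element (r, c) in the tiled layout: tiles are stored one after
// another, and each kTile x kTile tile is contiguous and row-major inside.
inline std::size_t TiledIndex(std::size_t r, std::size_t c, std::size_t cols) {
  const std::size_t tiles_per_row = cols / kTile;
  const std::size_t tile = (r / kTile) * tiles_per_row + (c / kTile);
  return tile * kTile * kTile + (r % kTile) * kTile + (c % kTile);
}

// Row-major -> tiled.
Matrix Pack(const Matrix& m) {
  ++g_pack_calls;
  Matrix out{m.rows, m.cols, std::vector<float>(m.data.size())};
  for (std::size_t r = 0; r < m.rows; ++r)
    for (std::size_t c = 0; c < m.cols; ++c)
      out.data[TiledIndex(r, c, m.cols)] = m.data[r * m.cols + c];
  return out;
}

// Tiled -> row-major.
Matrix Unpack(const Matrix& m) {
  ++g_unpack_calls;
  Matrix out{m.rows, m.cols, std::vector<float>(m.data.size())};
  for (std::size_t r = 0; r < m.rows; ++r)
    for (std::size_t c = 0; c < m.cols; ++c)
      out.data[r * m.cols + c] = m.data[TiledIndex(r, c, m.cols)];
  return out;
}

// Multiplies two tiled matrices; the result is tiled as well, so it can be
// fed to the next matmul without any layout conversion.
Matrix MatMulPacked(const Matrix& a, const Matrix& b) {
  assert(a.cols == b.rows);
  Matrix c{a.rows, b.cols, std::vector<float>(a.rows * b.cols, 0.0f)};
  for (std::size_t i = 0; i < a.rows; ++i)
    for (std::size_t k = 0; k < a.cols; ++k) {
      const float av = a.data[TiledIndex(i, k, a.cols)];
      for (std::size_t j = 0; j < b.cols; ++j)
        c.data[TiledIndex(i, j, c.cols)] += av * b.data[TiledIndex(k, j, b.cols)];
    }
  return c;
}

// Current behaviour as described in the issue: every matmul packs its
// inputs and unpacks its output.
Matrix ChainPerOp(Matrix x, const std::vector<Matrix>& weights) {
  for (const Matrix& w : weights)
    x = Unpack(MatMulPacked(Pack(x), Pack(w)));
  return x;
}

// Proposed behaviour: pack the activation once, keep intermediates tiled,
// unpack once at the end. Weights are assumed to be packed ahead of time.
Matrix ChainPackedOnce(const Matrix& x, const std::vector<Matrix>& packed_weights) {
  Matrix cur = Pack(x);
  for (const Matrix& w : packed_weights) cur = MatMulPacked(cur, w);
  return Unpack(cur);
}
```

For a chain of N matmuls, `ChainPerOp` performs 2N packs and N unpacks, while `ChainPackedOnce` performs one pack and one unpack at run time, provided the weights can be packed ahead of time. Both functions should return the same row-major result, which is a natural first unit test for the PoC.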
Requirement Details
- Write a matmul in C++ to serve as the baseline code.
- Implement this optimization as a PoC.
- Provide unit tests to validate the functionality.
- Integrate it into DeepRec through the Grappler mechanism.
- Apply the optimization to one model and report the performance data.
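The first and third requirements could start from something as small as the sketch below: a naive row-major reference matmul plus a tolerance-based comparison helper that a unit test can use to check an optimized kernel against the baseline. The names `BaselineMatMul` and `AllClose` and the default tolerance are hypothetical choices for illustration, not existing DeepRec APIs.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Reference implementation: C (m x n) = A (m x k) * B (k x n).
// All matrices are dense, row-major float buffers.
std::vector<float> BaselineMatMul(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t m, std::size_t k, std::size_t n) {
  std::vector<float> c(m * n, 0.0f);
  for (std::size_t i = 0; i < m; ++i)
    for (std::size_t p = 0; p < k; ++p) {
      const float av = a[i * k + p];
      for (std::size_t j = 0; j < n; ++j)
        c[i * n + j] += av * b[p * n + j];
    }
  return c;
}

// Returns true when both buffers have the same size and every pair of
// elements differs by at most `tol` (an assumed absolute tolerance).
bool AllClose(const std::vector<float>& x, const std::vector<float>& y,
              float tol = 1e-5f) {
  if (x.size() != y.size()) return false;
  for (std::size_t i = 0; i < x.size(); ++i)
    if (std::fabs(x[i] - y[i]) > tol) return false;
  return true;
}
```

A unit test would then feed the same inputs to `BaselineMatMul` and to the optimized path and assert `AllClose` on the two outputs. An absolute tolerance is used here only for simplicity; a relative tolerance may be more appropriate for large values.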
Test
- Use one model from the model zoo to validate the performance gain. The performance data and analysis results must be documented and reproducible.
Code Style and commit
- C++ and Python: Keep aligned with the DeepRec code style.
Maintain
- All issues and bugs related to this op need to be handled in the future.
Definition of Done
- The optimization runs successfully in DeepRec and delivers better performance.
- The code is integrated into DeepRec and committed following the DeepRec commit standard.