Add uni_gatherdps and uni_scatterdps #145
@dmitry-gorokhov Please take a look at this PR and assign reviewers.
const size_t kDataTypeSize = sizeof(float);
if (is_valid_isa(cpu_isa_t::avx512_core)) {
    assert(reg_mask.isOPMASK());
    vgatherdps(xmm_val, ptr[reg_addr + xmm_index * scale + disp]);
Shouldn't we apply reg_mask here, in addition to xmm_val in vgatherdps()?
No, according to the documentation, reg_mask is applied implicitly; see this section:
https://www.felixcloutier.com/x86/vgatherdps:vgatherdpd#instruction-operand-encoding
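For readers following the mask question, here is a minimal scalar sketch of what the AVX-512 (EVEX) form of the gather does with its opmask, as described in the instruction reference linked above: lanes whose mask bit is set are loaded, lanes whose bit is clear keep their old destination value, and the mask bit of each completed lane is cleared. The name gather_ps_model and its signature are hypothetical and exist only for illustration; this is not Xbyak or oneDNN code, and it does not show how the mask is spelled in the JIT call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference model of a masked 32-bit float gather.
// Hypothetical helper for illustration only; not part of Xbyak or oneDNN.
//
// For each lane i:
//   - if bit i of `mask` is set, dst[i] = base[index[i]] and bit i is cleared;
//   - otherwise dst[i] keeps its previous value.
// Returns the mask after the operation (zero once every selected lane is done).
inline uint16_t gather_ps_model(std::vector<float>& dst,
                                const std::vector<float>& base,
                                const std::vector<int32_t>& index,
                                uint16_t mask) {
    assert(dst.size() == index.size() && dst.size() <= 16);
    for (std::size_t i = 0; i < dst.size(); ++i) {
        if (mask & (1u << i)) {
            // Assumption: indices are element offsets into `base`.
            assert(index[i] >= 0 &&
                   static_cast<std::size_t>(index[i]) < base.size());
            dst[i] = base[index[i]];
            mask = static_cast<uint16_t>(mask & ~(1u << i));
        }
    }
    return mask;
}
```

With a mask of 0x5, only lanes 0 and 2 are loaded; the other lanes are left untouched, which is the behavior a kernel author relies on when gathering a partial tail.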
If I understand correctly, we are going to do a oneDNN reduction, that is, move functions from oneDNN to OpenVINO. @dmitry-gorokhov, would you please give some guidelines on these custom instructions?
I am OK with merging simple uni_* mnemonics here when they serve as pure wrappers that map the register type to the correct instruction. Once a base generator class for all CPU Plugin JIT kernels is introduced, we will be able to move all of these custom mnemonics into the CPU plugin codebase. This case is different. The Gather/Scatter mnemonics are complex and contain complicated logic for emulation on legacy ISAs and for register allocation. My recommendation for such cases is to implement the logic under the jit_emitter hierarchy and put it into the CPU plugin codebase. You can use the Load Emitter as an example, since it is widely used in the plugin.
@avoskoboinyk-lohika @lohika-denis-kotov Do you have more questions about this recommendation?
I am closing this PR.
Description
This PR adds uni_gatherdps and uni_scatterdps to simplify writing kernels.
Consider the following usage of these operations: