Skip to content

HGQ QMHA and QLinformer support for oneAPI #1432

Draft
laurilaatu wants to merge 9 commits intofastmachinelearning:mainfrom
laurilaatu:oneapi_qmha
Draft

HGQ QMHA and QLinformer support for oneAPI #1432
laurilaatu wants to merge 9 commits intofastmachinelearning:mainfrom
laurilaatu:oneapi_qmha

Conversation

@laurilaatu
Copy link
Contributor

Description

This PR adds the required changes to enable QMHA/QLinformer in oneAPI.
Requires parsing softmax tables from HGQ and support for multidim softmax.

Type of change

  • New feature (non-breaking change which adds functionality)

Tests

Test Configuration:

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have installed and run pre-commit on the files I edited or added.
  • I have added tests that prove my fix is effective or that my feature works.

@jmitrevs jmitrevs added the please test Trigger testing by creating local PR branch label Feb 9, 2026
@jmitrevs
Copy link
Contributor

jmitrevs commented Feb 9, 2026

How are the names of tables associated with the layers? Is it now per-instance?

@jmitrevs jmitrevs added please test Trigger testing by creating local PR branch and removed please test Trigger testing by creating local PR branch labels Feb 11, 2026
y[0] = 1;
return y.to_uint();
// Extract the lower 'width' bits of x
return x.template slc<data_T::width>(0).to_uint();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a logic change. Before it took the upper bits. Does it now assume that the table size is always the same as the width of the value? If so, I guess it must be reinforced somewhere. Are we sure we want this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the logic is unchanged for Vivado and libero, so I am hesitant to have this change.

copyfile(srcpath, dstpath)

def __get_table_size(self, model, activation):
def __get_table_size(self, model, activation, table_name='table_size'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

table_name and table_size have very different meanings. Maybe table_name_size or something like that would read better?

@jmitrevs
Copy link
Contributor

I assume this also needs changes in nnet_activation_stream.h, at least to consistently use the types.

@jmitrevs
Copy link
Contributor

Tests should be written (or hgq tests enabled) to show that this works as expected.

@jmitrevs
Copy link
Contributor

Also consider the libero backend for how tables are created. We may want to unify the methods. oneAPI followed the Quartus style of LUTs, which is quite old.

@jmitrevs
Copy link
Contributor

Scratch that about the Libero backend. Apparently it's just cut and paste code that has not been used or tested. So overall, I think this is fine other than potentially the softmax_stable_idx_from_real_val change, which should be checked. And also there is the need to update the streaming side to use the same constants.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

please test Trigger testing by creating local PR branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants