
Conversation

@willayy
Owner

@willayy willayy commented Apr 23, 2025

Description

I have added the ability to change the implementation behind our tensor operations at runtime, as many times as you like. You do, however, need to set it for every type you want to affect. Unfortunately this is necessary, as templates only exist at compile time.

I went for a solution with a static class TensorOperations (a class with only static members) with a static function for add, subtract, multiply, gemm, sliding window, etc. Each method also got its own setter and storage pointer. I chose this approach for the following reasons:

  • Classes give us visibility modifiers, so we can hide the storage pointers from tampering, which namespaces don't offer; using classes also follows our object-oriented design approach.
  • With a simple setter/getter approach we don't have to modify any of our existing code. It's also very simple to work with and doesn't require a lot of extra hours of trial and error to implement. We could have used something more sophisticated here, but we are running out of time and this solution is acceptable imo.
  • I moved the avx and intel functions to new files to signal that these are extensions of the default implementations.
  • I made type aliases for the function types to improve readability.
  • I decided to keep all functions in a single class since it makes the operations easier to use with no apparent downside.

One thing that is kind of annoying is that I have to write an essentially identical setter for every function, which adds a bit of overhead work. If somebody has a good solution to this, please tell me.

Issue

Closes #210

@willayy willayy added the invalid This doesn't seem right label Apr 23, 2025
@willayy willayy requested a review from ehmc123 April 23, 2025 12:05
@willayy willayy self-assigned this Apr 23, 2025
@willayy willayy linked an issue Apr 23, 2025 that may be closed by this pull request
Collaborator

@ehmc123 ehmc123 left a comment


This is not quite what I had pictured. One of my biggest concerns with our codebase right now is maintainability: as the project grows it is becoming harder to read, and therefore harder to debug.

So the point I was trying to make about modularity on Monday was basically about dividing the project by concern. I believe the code for modularity, i.e. the factories/function pointers, should be completely separate from the implementations, whether that's our main code that is not hardware specific, or hardware-specific code like AVX-512.

So what I basically wanted to see, to ensure the maintainability of the project, was a separation like this:

/modularity layer (whatever name we want)
-- Stores the architecture/code for the function pointers and factories. Basically a "layer in between" that isn't concerned with the actual implementation behind the pointers; it just stores them.

/implementations
--/default (not hardware specific, can run on all platforms)
-- Here we have all the node, tensor, gemm, and arithmetic implementations.

--/avx512
-- This may only need to change gemm and arithmetic, so it only has those.

--/cuda
-- Another hardware-specific implementation example.

-- Every implementation should have an install.hpp or something similar that quickly installs all of its pointers into the modularity layer.

/abstract classes (both the modularity layer and the implementations rely on these)

This way we would clearly separate our codebase by concern, increasing readability and improving maintainability; these folders could even be submodules.

If you have a different idea, please comment it, but I believe this is a pretty crucial time to think about maintainability.

@willayy
Owner Author

willayy commented Apr 24, 2025

@ehmc123 I will provide a reply this afternoon!

@willayy
Owner Author

willayy commented Apr 24, 2025

@ehmc123

The problem (not really a problem, I guess) I see with implementing an install function is that for it to be useful you basically need implementations for all the functions in TensorOperations, which is often not the case. For example, AVX only concerns the gemm functions, so if you want to use it you can just use the gemm setter. The only reason I kept this code

  // Pointer to the gemm std::function. Different defaults depend on the compiler
  // flags
#if defined(USE_AVX_GEMM)
  template <TensorConcept::Types T>
  static inline toft::gemm_func<T> gemm_ptr = mml_gemm_avx<T>;
#elif defined(USE_AVX512_GEMM)
  template <TensorConcept::Types T>
  static inline toft::gemm_func<T> gemm_ptr = mml_gemm_avx512<T>;
#elif defined(USE_BLOCKED_GEMM)
  template <TensorConcept::Types T>
  static inline toft::gemm_func<T> gemm_ptr = mml_gemm_blocked<T>;
#else
  template <TensorConcept::Types T>
  static inline toft::gemm_func<T> gemm_ptr = mml_gemm_inner_product<T>;
#endif

is for legacy reasons. But if you want, I think it's a reasonable compromise to remove this part and use the setters where necessary, making the code more readable.

But to finish my point about install: I don't think it's a bad idea necessarily. We could actually use it as a handy shortcut that just calls the existing setters; I just don't think it's that important at this moment.

Also, I'm still not sure what you mean by factories. TensorFactory is an implementation of the Factory pattern for making tensors, but TensorOperations is, by definition, not a factory; I would call it a utility class.

I don't see a viable case for giving TensorOperations an abstract base class, since it is static and doesn't interact with other parts of the code polymorphically. You could always subclass and extend it if you want, but since we already let the user change implementations via the setters, that would pretty much only be useful for adding new functions to it.

> One of my biggest concerns with our codebase now is maintainability, as the project grows it is becoming more unreadable and then also harder to debug.

I agree with you on this one, and there are many reasons why that's an issue! But I don't think TensorOperations and TensorFactory are part of the problem here. TensorOperations/TensorFactory does not depend on any implementations at all. Yes, we set the MML functions as defaults, but that's just for convenience; if we didn't, you would always have to start every test or program with set_xyz_ptr... or Install::default..., which I just think becomes annoying, and many other libraries use defaults in this manner.

I'm also thinking you might mean that the file tree of our project is a bit confusing (which I agree with). I have created a branch called feature/repo-restructuring where I aim to make it more readable. I'm very open to collaborating on this if you are interested and have the time!

I think it's really important that we are united and on the same page about this, so I'm very open to continuing this discussion! If you want, we could also talk on Discord; I'm available all day today!

@ehmc123
Collaborator

ehmc123 commented Apr 24, 2025

@willayy

Yes, mostly what I am asking about is the structure of our file tree; that is a big part of ensuring maintainability! But I don't really understand the reasoning behind treating hardware-specific implementations of gemm (like AVX) separately when it comes to modularity.

If we truly want runtime modularity, I don't see why we should use build-time flags to control which implementation is used. Most use cases of our modularity will probably be hardware-specific gemm or arithmetic, so I think we should just treat them as separate implementations to stay within the design. Mixing runtime modularity with build flags is very confusing.

What I am calling factories is merely a blanket term for the same thing; there are of course better words for it, which is why I called it the modularity layer in the comment above. Although TensorFactory and TensorOperations use different patterns, they fill the same purpose of being an abstract layer between implementations that we can choose from, which is why I am lumping them together.

@willayy
Owner Author

willayy commented Apr 24, 2025

> @willayy
>
> Yes, mostly what I am asking about is the structure of our file tree; that is a big part of ensuring maintainability! But I don't really understand the reasoning behind treating hardware-specific implementations of gemm (like AVX) separately when it comes to modularity.
>
> If we truly want runtime modularity, I don't see why we should use build-time flags to control which implementation is used. Most use cases of our modularity will probably be hardware-specific gemm or arithmetic, so I think we should just treat them as separate implementations to stay within the design. Mixing runtime modularity with build flags is very confusing.
>
> What I am calling factories is merely a blanket term for the same thing; there are of course better words for it, which is why I called it the modularity layer in the comment above. Although TensorFactory and TensorOperations use different patterns, they fill the same purpose of being an abstract layer between implementations that we can choose from, which is why I am lumping them together.

@ehmc123 Sure, I can understand that; it wasn't my idea to set the AVX instructions using macros. I think it's just a solution somebody implemented "just for the moment, we can change it later". I will remove it!

@willayy willayy closed this May 18, 2025


Linked issue: Invalid: Refactor Operations and TensorFactory

4 participants