@moven0831
Collaborator

Initial benchmark results on iOS:

2^14: 319 ms
2^15: 431 ms
2^16: 512 ms
2^17: 654 ms
2^18: 915 ms
2^19: 1335 ms
2^20: 2158 ms
2^21: 3690 ms
2^22: 6855 ms

Compared with our v1 MSM implementation (a553c49; docs here: https://hackmd.io/@FoodChain/SJRjE0Nh6), v2 shows significant improvement in:

  1. better timing (e.g. for the 2^20 test case, v1: 41019 ms; v2: 2158 ms, a ~19x improvement)
  2. better resource management (e.g. memory)
  3. better GPU usage of ALU/f32 units
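
For anyone who wants to sanity-check the numbers, here is a minimal sketch that recomputes the 2^20 speedup and the time ratio for each doubling of input size. The timings are copied from the benchmark above; the variable and function names (`v2_ms`, `v1_ms_2_20`, `doubling_ratios`) are made up for illustration and are not part of the repo.

```python
# Timings in ms, copied from the iOS benchmark in this comment.
# Keys are log2 of the input size (14 means 2^14 points).
v2_ms = {
    14: 319, 15: 431, 16: 512, 17: 654, 18: 915,
    19: 1335, 20: 2158, 21: 3690, 22: 6855,
}
v1_ms_2_20 = 41019  # v1 timing for the 2^20 case, from the same comment


def doubling_ratios(timings):
    """Return time(2^k) / time(2^(k-1)) for each consecutive pair of sizes."""
    ks = sorted(timings)
    return {k: timings[k] / timings[k - 1] for k in ks[1:]}


# Speedup of v2 over v1 at 2^20.
speedup = v1_ms_2_20 / v2_ms[20]
ratios = doubling_ratios(v2_ms)

if __name__ == "__main__":
    print(f"v1 / v2 at 2^20: {speedup:.1f}x")
    for k, r in ratios.items():
        print(f"2^{k - 1} -> 2^{k}: {r:.2f}x time")
```

The 2^20 ratio works out to about 19x. Every doubling ratio stays below 2 and climbs as the input grows, which would be consistent with a fixed per-run overhead that matters less at larger sizes, though the benchmark alone does not confirm the cause.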

@moven0831 moven0831 self-assigned this Jun 5, 2025
@moven0831 moven0831 merged commit e17b128 into main Jun 6, 2025
2 checks passed
@moven0831 moven0831 deleted the feat/example-ios-app branch June 6, 2025 04:09
Development

Successfully merging this pull request may close these issues.

Update gpu-acceleration-app with latest mopro v0.2.0 integration
