sign stack usage: compute z incrementally #825
0ee075f to 6c05449
Alternative to #822 that I hope to be less controversial. Currently the constant time tests for verification rely on the signature being declassified at the end of verification. This is not ideal. This commit moves this declassification to the constant-time test instead. As suggested in #822 (review), there is more work left to clean up the story around declassifications. This PR is a first step towards cleaning up that story to unblock #825 and #821, but there is more work left. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
6c05449 to 3026327
b91f0c7 to 5b0673e
hanno-becker
left a comment
Looks very good overall, thank you @mkannwischer! Some smaller comments
566f25e to 5ecdabc
5ecdabc to af4a6a2
This commit reduces the stack usage of signing by computing z = y + s1*cp incrementally (one polynomial at a time), which eliminates the polyvecl z at the cost of a single poly z. The computation of z is moved into a separate function (compute_pack_z) to vastly speed up the CBMC proofs. De facto, this saves L-1 KB irrespective of MLD_CONFIG_REDUCE_RAM. In practice, the same buffer was also used earlier in the function. Here we instead introduce a new polyvecl buffer tmp, which can be placed in a union together with w1. Unfortunately, with the current struct workaround for diffblue/cbmc#8813, this increases stack usage by L KB. The increase is eliminated when MLD_CONFIG_REDUCE_RAM is set. Hoisted out from #791. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
For some reason the previous (unrelated) commit caused the verify_pre_hash_internal proof to fail due to extra functions in USE_FUNCTION_CONTRACTS. This commit removes the extra functions. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
af4a6a2 to d28fb9f
Mac Mini (M1, 2020) benchmarks (opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 46412 cycles | 46417 cycles | 1.00 |
| ML-DSA-44 sign | 132123 cycles | 131302 cycles | 1.01 |
| ML-DSA-44 verify | 47772 cycles | 47780 cycles | 1.00 |
| ML-DSA-65 keypair | 81327 cycles | 81321 cycles | 1.00 |
| ML-DSA-65 sign | 217249 cycles | 216907 cycles | 1.00 |
| ML-DSA-65 verify | 80053 cycles | 80071 cycles | 1.00 |
| ML-DSA-87 keypair | 132572 cycles | 132544 cycles | 1.00 |
| ML-DSA-87 sign | 278635 cycles | 277775 cycles | 1.00 |
| ML-DSA-87 verify | 130415 cycles | 130361 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Mac Mini (M1, 2020) benchmarks (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 114381 cycles | 114381 cycles | 1 |
| ML-DSA-44 sign | 418716 cycles | 428553 cycles | 0.98 |
| ML-DSA-44 verify | 122380 cycles | 122432 cycles | 1.00 |
| ML-DSA-65 keypair | 195803 cycles | 195935 cycles | 1.00 |
| ML-DSA-65 sign | 683828 cycles | 696798 cycles | 0.98 |
| ML-DSA-65 verify | 197606 cycles | 197557 cycles | 1.00 |
| ML-DSA-87 keypair | 323082 cycles | 322999 cycles | 1.00 |
| ML-DSA-87 sign | 866048 cycles | 879409 cycles | 0.98 |
| ML-DSA-87 verify | 328303 cycles | 328164 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 276165 cycles | 277431 cycles | 1.00 |
| ML-DSA-44 sign | 818209 cycles | 828306 cycles | 0.99 |
| ML-DSA-44 verify | 274469 cycles | 274630 cycles | 1.00 |
| ML-DSA-65 keypair | 472161 cycles | 472268 cycles | 1.00 |
| ML-DSA-65 sign | 1338336 cycles | 1350849 cycles | 0.99 |
| ML-DSA-65 verify | 449216 cycles | 449649 cycles | 1.00 |
| ML-DSA-87 keypair | 804093 cycles | 802074 cycles | 1.00 |
| ML-DSA-87 sign | 1813859 cycles | 1808884 cycles | 1.00 |
| ML-DSA-87 verify | 774374 cycles | 771390 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Intel Xeon 4th gen (c7i)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 34600 cycles | 34698 cycles | 1.00 |
| ML-DSA-44 sign | 121014 cycles | 120452 cycles | 1.00 |
| ML-DSA-44 verify | 38094 cycles | 38255 cycles | 1.00 |
| ML-DSA-65 keypair | 61083 cycles | 61720 cycles | 0.99 |
| ML-DSA-65 sign | 202128 cycles | 200582 cycles | 1.01 |
| ML-DSA-65 verify | 62534 cycles | 62863 cycles | 0.99 |
| ML-DSA-87 keypair | 94058 cycles | 93574 cycles | 1.01 |
| ML-DSA-87 sign | 235645 cycles | 231081 cycles | 1.02 |
| ML-DSA-87 verify | 93724 cycles | 93987 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Intel Xeon 4th gen (c7i) (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 94261 cycles | 96015 cycles | 0.98 |
| ML-DSA-44 sign | 333473 cycles | 344139 cycles | 0.97 |
| ML-DSA-44 verify | 100019 cycles | 100919 cycles | 0.99 |
| ML-DSA-65 keypair | 161116 cycles | 160878 cycles | 1.00 |
| ML-DSA-65 sign | 546754 cycles | 548011 cycles | 1.00 |
| ML-DSA-65 verify | 161418 cycles | 161909 cycles | 1.00 |
| ML-DSA-87 keypair | 268049 cycles | 267516 cycles | 1.00 |
| ML-DSA-87 sign | 715711 cycles | 710718 cycles | 1.01 |
| ML-DSA-87 verify | 269969 cycles | 269917 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
AMD EPYC 3rd gen (c6a)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 69242 cycles | 69243 cycles | 1.00 |
| ML-DSA-44 sign | 188051 cycles | 184050 cycles | 1.02 |
| ML-DSA-44 verify | 69161 cycles | 69175 cycles | 1.00 |
| ML-DSA-65 keypair | 119740 cycles | 119178 cycles | 1.00 |
| ML-DSA-65 sign | 303540 cycles | 294814 cycles | 1.03 |
| ML-DSA-65 verify | 115187 cycles | 115292 cycles | 1.00 |
| ML-DSA-87 keypair | 204195 cycles | 204047 cycles | 1.00 |
| ML-DSA-87 sign | 396708 cycles | 387431 cycles | 1.02 |
| ML-DSA-87 verify | 195232 cycles | 195361 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Intel Xeon 3rd gen (c6i)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 56501 cycles | 56515 cycles | 1.00 |
| ML-DSA-44 sign | 182196 cycles | 179924 cycles | 1.01 |
| ML-DSA-44 verify | 61023 cycles | 61223 cycles | 1.00 |
| ML-DSA-65 keypair | 98891 cycles | 99047 cycles | 1.00 |
| ML-DSA-65 sign | 301535 cycles | 296975 cycles | 1.02 |
| ML-DSA-65 verify | 100427 cycles | 100792 cycles | 1.00 |
| ML-DSA-87 keypair | 152820 cycles | 153381 cycles | 1.00 |
| ML-DSA-87 sign | 357365 cycles | 353572 cycles | 1.01 |
| ML-DSA-87 verify | 153222 cycles | 153278 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton4
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 68812 cycles | 68650 cycles | 1.00 |
| ML-DSA-44 sign | 203587 cycles | 202247 cycles | 1.01 |
| ML-DSA-44 verify | 70760 cycles | 70601 cycles | 1.00 |
| ML-DSA-65 keypair | 121769 cycles | 121601 cycles | 1.00 |
| ML-DSA-65 sign | 334123 cycles | 331430 cycles | 1.01 |
| ML-DSA-65 verify | 118199 cycles | 117927 cycles | 1.00 |
| ML-DSA-87 keypair | 199203 cycles | 198920 cycles | 1.00 |
| ML-DSA-87 sign | 430244 cycles | 428106 cycles | 1.00 |
| ML-DSA-87 verify | 194450 cycles | 195053 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
AMD EPYC 3rd gen (c6a) (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 135410 cycles | 135092 cycles | 1.00 |
| ML-DSA-44 sign | 527859 cycles | 534269 cycles | 0.99 |
| ML-DSA-44 verify | 147490 cycles | 147219 cycles | 1.00 |
| ML-DSA-65 keypair | 227108 cycles | 226958 cycles | 1.00 |
| ML-DSA-65 sign | 862532 cycles | 870870 cycles | 0.99 |
| ML-DSA-65 verify | 235084 cycles | 235062 cycles | 1.00 |
| ML-DSA-87 keypair | 371417 cycles | 371508 cycles | 1.00 |
| ML-DSA-87 sign | 1082333 cycles | 1086997 cycles | 1.00 |
| ML-DSA-87 verify | 382983 cycles | 383200 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Intel Xeon 3rd gen (c6i) (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 157796 cycles | 159017 cycles | 0.99 |
| ML-DSA-44 sign | 550650 cycles | 568055 cycles | 0.97 |
| ML-DSA-44 verify | 169552 cycles | 170439 cycles | 0.99 |
| ML-DSA-65 keypair | 268363 cycles | 268942 cycles | 1.00 |
| ML-DSA-65 sign | 906244 cycles | 916893 cycles | 0.99 |
| ML-DSA-65 verify | 273642 cycles | 274044 cycles | 1.00 |
| ML-DSA-87 keypair | 448947 cycles | 448757 cycles | 1.00 |
| ML-DSA-87 sign | 1161143 cycles | 1171199 cycles | 0.99 |
| ML-DSA-87 verify | 458213 cycles | 457924 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton3
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 72708 cycles | 72739 cycles | 1.00 |
| ML-DSA-44 sign | 214077 cycles | 212398 cycles | 1.01 |
| ML-DSA-44 verify | 75636 cycles | 75721 cycles | 1.00 |
| ML-DSA-65 keypair | 128015 cycles | 128130 cycles | 1.00 |
| ML-DSA-65 sign | 354152 cycles | 350108 cycles | 1.01 |
| ML-DSA-65 verify | 125570 cycles | 125368 cycles | 1.00 |
| ML-DSA-87 keypair | 206752 cycles | 209477 cycles | 0.99 |
| ML-DSA-87 sign | 448406 cycles | 450700 cycles | 0.99 |
| ML-DSA-87 verify | 205849 cycles | 205401 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton4 (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 128947 cycles | 128725 cycles | 1.00 |
| ML-DSA-44 sign | 449000 cycles | 456379 cycles | 0.98 |
| ML-DSA-44 verify | 136818 cycles | 137029 cycles | 1.00 |
| ML-DSA-65 keypair | 220984 cycles | 220722 cycles | 1.00 |
| ML-DSA-65 sign | 729869 cycles | 737657 cycles | 0.99 |
| ML-DSA-65 verify | 221208 cycles | 221346 cycles | 1.00 |
| ML-DSA-87 keypair | 365833 cycles | 365707 cycles | 1.00 |
| ML-DSA-87 sign | 929448 cycles | 939501 cycles | 0.99 |
| ML-DSA-87 verify | 370148 cycles | 370387 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
AMD EPYC 4th gen (c7a)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 41523 cycles | 42364 cycles | 0.98 |
| ML-DSA-44 sign | 134242 cycles | 133559 cycles | 1.01 |
| ML-DSA-44 verify | 44664 cycles | 45481 cycles | 0.98 |
| ML-DSA-65 keypair | 72408 cycles | 72453 cycles | 1.00 |
| ML-DSA-65 sign | 216861 cycles | 211467 cycles | 1.03 |
| ML-DSA-65 verify | 73309 cycles | 73397 cycles | 1.00 |
| ML-DSA-87 keypair | 107598 cycles | 110157 cycles | 0.98 |
| ML-DSA-87 sign | 251916 cycles | 254019 cycles | 0.99 |
| ML-DSA-87 verify | 109156 cycles | 111808 cycles | 0.98 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton2
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 114000 cycles | 113905 cycles | 1.00 |
| ML-DSA-44 sign | 358680 cycles | 360308 cycles | 1.00 |
| ML-DSA-44 verify | 117919 cycles | 117940 cycles | 1.00 |
| ML-DSA-65 keypair | 197620 cycles | 197696 cycles | 1.00 |
| ML-DSA-65 sign | 594164 cycles | 594054 cycles | 1.00 |
| ML-DSA-65 verify | 194968 cycles | 194953 cycles | 1.00 |
| ML-DSA-87 keypair | 324651 cycles | 324147 cycles | 1.00 |
| ML-DSA-87 sign | 758979 cycles | 759328 cycles | 1.00 |
| ML-DSA-87 verify | 320806 cycles | 320484 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 465085 cycles | 464882 cycles | 1.00 |
| ML-DSA-44 sign | 2150114 cycles | 2213472 cycles | 0.97 |
| ML-DSA-44 verify | 547454 cycles | 547002 cycles | 1.00 |
| ML-DSA-65 keypair | 779566 cycles | 780337 cycles | 1.00 |
| ML-DSA-65 sign | 3521130 cycles | 3593147 cycles | 0.98 |
| ML-DSA-65 verify | 848308 cycles | 848702 cycles | 1.00 |
| ML-DSA-87 keypair | 1256100 cycles | 1257888 cycles | 1.00 |
| ML-DSA-87 sign | 4343431 cycles | 4428098 cycles | 0.98 |
| ML-DSA-87 verify | 1363073 cycles | 1361393 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton3 (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 139121 cycles | 138888 cycles | 1.00 |
| ML-DSA-44 sign | 485471 cycles | 492916 cycles | 0.98 |
| ML-DSA-44 verify | 148368 cycles | 148562 cycles | 1.00 |
| ML-DSA-65 keypair | 243026 cycles | 242771 cycles | 1.00 |
| ML-DSA-65 sign | 796203 cycles | 804857 cycles | 0.99 |
| ML-DSA-65 verify | 240904 cycles | 241160 cycles | 1.00 |
| ML-DSA-87 keypair | 396786 cycles | 397265 cycles | 1.00 |
| ML-DSA-87 sign | 1016444 cycles | 1026974 cycles | 0.99 |
| ML-DSA-87 verify | 401593 cycles | 402066 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
AMD EPYC 4th gen (c7a) (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 122581 cycles | 120166 cycles | 1.02 |
| ML-DSA-44 sign | 455680 cycles | 452709 cycles | 1.01 |
| ML-DSA-44 verify | 131381 cycles | 130091 cycles | 1.01 |
| ML-DSA-65 keypair | 204249 cycles | 204195 cycles | 1.00 |
| ML-DSA-65 sign | 728522 cycles | 738873 cycles | 0.99 |
| ML-DSA-65 verify | 210149 cycles | 209224 cycles | 1.00 |
| ML-DSA-87 keypair | 337896 cycles | 337432 cycles | 1.00 |
| ML-DSA-87 sign | 925559 cycles | 934927 cycles | 0.99 |
| ML-DSA-87 verify | 346908 cycles | 346862 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton2 (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 213689 cycles | 213754 cycles | 1.00 |
| ML-DSA-44 sign | 763922 cycles | 779981 cycles | 0.98 |
| ML-DSA-44 verify | 229483 cycles | 229404 cycles | 1.00 |
| ML-DSA-65 keypair | 381937 cycles | 381467 cycles | 1.00 |
| ML-DSA-65 sign | 1256384 cycles | 1282011 cycles | 0.98 |
| ML-DSA-65 verify | 371810 cycles | 371341 cycles | 1.00 |
| ML-DSA-87 keypair | 607103 cycles | 605515 cycles | 1.00 |
| ML-DSA-87 sign | 1601574 cycles | 1618590 cycles | 0.99 |
| ML-DSA-87 verify | 618524 cycles | 617714 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 244241 cycles | 227155 cycles | 1.08 |
| ML-DSA-44 sign | 661391 cycles | 634926 cycles | 1.04 |
| ML-DSA-44 verify | 235482 cycles | 223942 cycles | 1.05 |
| ML-DSA-65 keypair | 395832 cycles | 396222 cycles | 1.00 |
| ML-DSA-65 sign | 1036776 cycles | 1033955 cycles | 1.00 |
| ML-DSA-65 verify | 373209 cycles | 382272 cycles | 0.98 |
| ML-DSA-87 keypair | 692726 cycles | 662437 cycles | 1.05 |
| ML-DSA-87 sign | 1433908 cycles | 1393666 cycles | 1.03 |
| ML-DSA-87 verify | 670754 cycles | 647777 cycles | 1.04 |
This comment was automatically generated by workflow using github-action-benchmark.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 244241 cycles | 227155 cycles | 1.08 |
| ML-DSA-44 sign | 661391 cycles | 634926 cycles | 1.04 |
| ML-DSA-44 verify | 235482 cycles | 223942 cycles | 1.05 |
| ML-DSA-87 keypair | 692726 cycles | 662437 cycles | 1.05 |
| ML-DSA-87 verify | 670754 cycles | 647777 cycles | 1.04 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 315610 cycles | 331789 cycles | 0.95 |
| ML-DSA-44 sign | 1176470 cycles | 1306541 cycles | 0.90 |
| ML-DSA-44 verify | 344598 cycles | 363529 cycles | 0.95 |
| ML-DSA-65 keypair | 568926 cycles | 607535 cycles | 0.94 |
| ML-DSA-65 sign | 1954395 cycles | 2078057 cycles | 0.94 |
| ML-DSA-65 verify | 541671 cycles | 577791 cycles | 0.94 |
| ML-DSA-87 keypair | 857076 cycles | 893687 cycles | 0.96 |
| ML-DSA-87 sign | 2427084 cycles | 2605336 cycles | 0.93 |
| ML-DSA-87 verify | 892479 cycles | 928223 cycles | 0.96 |
This comment was automatically generated by workflow using github-action-benchmark.
SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 831908 cycles | 830625 cycles | 1.00 |
| ML-DSA-44 sign | 3243025 cycles | 3342511 cycles | 0.97 |
| ML-DSA-44 verify | 921924 cycles | 921888 cycles | 1.00 |
| ML-DSA-65 keypair | 1410709 cycles | 1405524 cycles | 1.00 |
| ML-DSA-65 sign | 5330162 cycles | 5453246 cycles | 0.98 |
| ML-DSA-65 verify | 1473728 cycles | 1471261 cycles | 1.00 |
| ML-DSA-87 keypair | 2312729 cycles | 2307928 cycles | 1.00 |
| ML-DSA-87 sign | 6666114 cycles | 6787920 cycles | 0.98 |
| ML-DSA-87 verify | 2410121 cycles | 2408200 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 113886 cycles | 113817 cycles | 1.00 |
| ML-DSA-44 sign | 358367 cycles | 359993 cycles | 1.00 |
| ML-DSA-44 verify | 117782 cycles | 117902 cycles | 1.00 |
| ML-DSA-65 keypair | 197634 cycles | 197711 cycles | 1.00 |
| ML-DSA-65 sign | 593784 cycles | 593720 cycles | 1.00 |
| ML-DSA-65 verify | 194997 cycles | 194920 cycles | 1.00 |
| ML-DSA-87 keypair | 324116 cycles | 323552 cycles | 1.00 |
| ML-DSA-87 sign | 758139 cycles | 758497 cycles | 1.00 |
| ML-DSA-87 verify | 320455 cycles | 320209 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)
Details
| Benchmark suite | Current: d28fb9f | Previous: 75ab215 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 213239 cycles | 213302 cycles | 1.00 |
| ML-DSA-44 sign | 762799 cycles | 783978 cycles | 0.97 |
| ML-DSA-44 verify | 229058 cycles | 229037 cycles | 1.00 |
| ML-DSA-65 keypair | 381608 cycles | 381483 cycles | 1.00 |
| ML-DSA-65 sign | 1256317 cycles | 1272131 cycles | 0.99 |
| ML-DSA-65 verify | 371520 cycles | 371484 cycles | 1.00 |
| ML-DSA-87 keypair | 606449 cycles | 605847 cycles | 1.00 |
| ML-DSA-87 sign | 1603106 cycles | 1617275 cycles | 0.99 |
| ML-DSA-87 verify | 618254 cycles | 618103 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
This commit reduces the stack usage of signing by computing z = y + s1*cp
incrementally (one polynomial at a time), which eliminates the polyvecl
z at the cost of a single poly z.
The computation of z is moved into a separate function (compute_pack_z) to
vastly speed up the CBMC proofs.
De facto, this saves L-1 KB irrespective of MLD_CONFIG_REDUCE_RAM.
In practice, the same buffer was also used earlier in the function. Here we
instead introduce a new polyvecl buffer tmp, which can be placed in a union
together with w1.
Unfortunately, with the current struct workaround for
diffblue/cbmc#8813, this increases stack usage by L KB.
The increase is eliminated when MLD_CONFIG_REDUCE_RAM is set.
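The incremental idea described above can be illustrated with a minimal sketch. It is not the code from this PR: the toy parameters `TOY_N` and `TOY_L`, the helper `pack_poly`, the function name `compute_pack_z_sketch`, the 4-byte little-endian packing, and the coefficient-wise product standing in for `s1*cp` are all assumptions made for illustration. The point is only that each polynomial of z is computed, bounds-checked, and packed before the next one is touched, so a single `poly` suffices where a full `polyvecl` was needed before.

```c
#include <assert.h>
#include <stdint.h>

/* Toy parameters for illustration only (ML-DSA uses 256 coefficients). */
#define TOY_N 4
#define TOY_L 3
#define TOY_POLY_BYTES (4 * TOY_N)

typedef struct { int32_t coeffs[TOY_N]; } poly;
typedef struct { poly vec[TOY_L]; } polyvecl;

/* Hypothetical packing: every coefficient as 4 little-endian bytes. */
static void pack_poly(uint8_t *r, const poly *a)
{
  for (unsigned j = 0; j < TOY_N; j++)
  {
    uint32_t t = (uint32_t)a->coeffs[j];
    r[4 * j + 0] = (uint8_t)(t & 0xFF);
    r[4 * j + 1] = (uint8_t)((t >> 8) & 0xFF);
    r[4 * j + 2] = (uint8_t)((t >> 16) & 0xFF);
    r[4 * j + 3] = (uint8_t)((t >> 24) & 0xFF);
  }
}

/* Computes z[i] = y[i] + s1[i]*cp one polynomial at a time and packs it
 * straight into out. The product is a coefficient-wise stand-in
 * (assumption). Returns 0 on success, or -1 as soon as a coefficient
 * reaches the bound; out may then be partially written. */
int compute_pack_z_sketch(uint8_t *out, const polyvecl *y,
                          const polyvecl *s1, const poly *cp, int32_t bound)
{
  poly z; /* the only z-sized buffer: one poly instead of a polyvecl */
  assert(bound > 0);
  for (unsigned i = 0; i < TOY_L; i++)
  {
    for (unsigned j = 0; j < TOY_N; j++)
    {
      int32_t c = y->vec[i].coeffs[j] + s1->vec[i].coeffs[j] * cp->coeffs[j];
      if ((c < 0 ? -c : c) >= bound)
      {
        return -1;
      }
      z.coeffs[j] = c;
    }
    pack_poly(out + i * TOY_POLY_BYTES, &z);
  }
  return 0;
}
```

With this shape, the working set for z is one polynomial regardless of the vector length, which matches the L-1 KB saving mentioned above under the assumption that one polynomial occupies 1 KB.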
crypto_sign_signature_internal stack usage #791