Optimize FP8 gemm for prefill #84

Draft
blzheng wants to merge 2 commits into mingfeima:cpu_opt_ww11 from blzheng:beilei/opt_prefill

blzheng commented Jun 19, 2025

Motivation

Modifications

Checklist

mingfeima and others added 2 commits May 29, 2025 19:35
before:
```
gemm_bf16(native): 4.772 ms, gemm_fp8(opt): 0.000 ms, gemm_int8(opt): 0.000 ms, gemm_bf16(opt): 15.328 ms
```

after:
```
gemm_bf16(native): 4.847 ms, gemm_fp8(opt): 0.000 ms, gemm_int8(opt): 0.000 ms, gemm_bf16(opt): 3.927 ms
```
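For reviewers who want the ratio rather than the raw timings, here is a minimal sketch that parses the two log lines above and computes a per-kernel speedup. The helper names `parse_timings` and `speedups` are hypothetical and not part of this PR. The `0.000 ms` entries are skipped on the assumption that those kernels were not measured in this run.

```python
import re

# Benchmark lines copied verbatim from the commit message above.
BEFORE = ("gemm_bf16(native): 4.772 ms, gemm_fp8(opt): 0.000 ms, "
          "gemm_int8(opt): 0.000 ms, gemm_bf16(opt): 15.328 ms")
AFTER = ("gemm_bf16(native): 4.847 ms, gemm_fp8(opt): 0.000 ms, "
         "gemm_int8(opt): 0.000 ms, gemm_bf16(opt): 3.927 ms")


def parse_timings(line):
    """Return {kernel_name: milliseconds} for one benchmark line."""
    pairs = re.findall(r"(\w+\(\w+\)):\s*([\d.]+)\s*ms", line)
    return {name: float(ms) for name, ms in pairs}


def speedups(before, after):
    """Speedup (before / after) per kernel.

    Kernels reporting 0.000 ms on either side are skipped; treating them
    as "not measured" is an assumption, not something the log states.
    """
    b, a = parse_timings(before), parse_timings(after)
    return {k: b[k] / a[k] for k in b if k in a and b[k] > 0 and a[k] > 0}


if __name__ == "__main__":
    for name, ratio in speedups(BEFORE, AFTER).items():
        print(f"{name}: {ratio:.2f}x")
```

With the numbers above, `gemm_bf16(opt)` drops from 15.328 ms to 3.927 ms, roughly a 3.9x improvement, while `gemm_bf16(native)` is essentially unchanged.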
CaoE pushed a commit to CaoE/sglang that referenced this pull request Oct 28, 2025