Use TORCH_LIBRARY to register function by yanbing-j · Pull Request #60 · mingfeima/sglang

yanbing-j · 2025-04-16T09:01:09Z

Motivation

Modifications

std::optional<at::Tensor>& parameters are changed to const std::optional<at::Tensor>& to fix the build error: cannot bind non-const lvalue reference of type 'std::optional<at::Tensor>&' to an rvalue of type 'std::optional<at::Tensor>'.
In a PyTorch op schema, float corresponds to C++ double, so the C++ signatures use double.
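
The build error quoted above is the ordinary C++ reference-binding rule, so it can be reproduced without libtorch. The following is a minimal sketch under that assumption: FakeTensor, make_bias, describe_mut and describe are hypothetical names standing in for at::Tensor and the real kernels, and nothing here is taken from the PR's code.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for at::Tensor so the sketch builds without libtorch.
struct FakeTensor {
  std::string name;
};

// Returns a temporary (an rvalue), similar to an argument produced by a caller
// that does not own a named std::optional variable.
std::optional<FakeTensor> make_bias(bool present) {
  if (present) return FakeTensor{"bias"};
  return std::nullopt;
}

// Non-const lvalue reference: only accepts a named, mutable optional.
// describe_mut(make_bias(true)) does not compile and gives the
// "cannot bind non-const lvalue reference ... to an rvalue" error.
std::string describe_mut(std::optional<FakeTensor>& bias) {
  return bias.has_value() ? bias->name : "none";
}

// Const reference: binds to both lvalues and temporaries.
std::string describe(const std::optional<FakeTensor>& bias) {
  return bias.has_value() ? bias->name : "none";
}
```

With this shape, describe(make_bias(true)) and describe(std::nullopt) both compile, while describe_mut only accepts a variable declared by the caller.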

cpu.py now calls the function through torch.ops.sgl_kernel.silu_and_mul_cpu. The wrapper used in the DeepSeek model does not need to change; it still calls sgl_kernel.cpu.silu_and_mul.

shm_allreduce is still a work in progress.

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.

gau-nernst · 2025-05-08T09:27:56Z

I'm experimenting with torch.compile and also need to convert the PyBind11 module to PyTorch custom ops. Initial results have been promising so far. I will share more details when I have something more complete.

shm_allreduce was also a blocker for me because of its ProcessGroup and ReduceOp arguments. I think we can follow PyTorch's c10d_functional approach, which converts ProcessGroup to a group_name string and ReduceOp to an op_name string:

https://github.com/pytorch/pytorch/blob/2f09e7914268ff5c5e4a4a54158fcf93d5f528dd/torch/csrc/distributed/c10d/Functional.cpp#L239-L241
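
To illustrate the string-based idea from the comment above, here is a minimal sketch in plain C++, assuming the op receives only schema-friendly types and looks the real objects up internally. ReduceKind, FakeGroup, register_group, resolve_group, to_reduce_kind and shm_allreduce_sketch are all hypothetical names for illustration; they are not the PyTorch or sgl_kernel API, and the reduction itself is faked.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a reduce-op enum.
enum class ReduceKind { Sum, Max, Min };

// Map an op_name string to the enum; unknown names are rejected.
ReduceKind to_reduce_kind(const std::string& op_name) {
  static const std::map<std::string, ReduceKind> table = {
      {"sum", ReduceKind::Sum}, {"max", ReduceKind::Max}, {"min", ReduceKind::Min}};
  auto it = table.find(op_name);
  if (it == table.end()) throw std::invalid_argument("unknown reduce op: " + op_name);
  return it->second;
}

// Hypothetical stand-in for a process group.
struct FakeGroup {
  int world_size;
};

// Process-wide registry keyed by group name.
std::map<std::string, FakeGroup>& group_registry() {
  static std::map<std::string, FakeGroup> registry;
  return registry;
}

void register_group(const std::string& group_name, FakeGroup group) {
  group_registry().insert_or_assign(group_name, group);
}

const FakeGroup& resolve_group(const std::string& group_name) {
  auto it = group_registry().find(group_name);
  if (it == group_registry().end()) throw std::out_of_range("unknown group: " + group_name);
  return it->second;
}

// Entry point whose arguments are only an int and two strings.
// The "reduction" pretends every rank holds the same local_value.
int shm_allreduce_sketch(int local_value, const std::string& op_name,
                         const std::string& group_name) {
  const FakeGroup& group = resolve_group(group_name);
  switch (to_reduce_kind(op_name)) {
    case ReduceKind::Sum: return local_value * group.world_size;
    case ReduceKind::Max: return local_value;
    case ReduceKind::Min: return local_value;
  }
  return local_value;
}
```

The point of the design is that strings can appear in an op schema, whereas group and reduce-op objects cannot be passed directly, so the lookup moves inside the op.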

yanbing-j added 2 commits April 16, 2025 02:41

Use TORCH_LIBRARY to register function

b58eed1

Use original to register shm_allreduce

5fe0ac5

gau-nernst mentioned this pull request May 9, 2025

RFC: Use torch.compile to reduce Python overhead #73

Draft

CaoE pushed a commit to CaoE/sglang that referenced this pull request Aug 14, 2025

gpt-oss: enable int4 loading (mingfeima#60)

3b2da9b

yanbing-j wants to merge 2 commits into mingfeima:cpu_opt_ww11 from yanbing-j:yanbing/py_bind
