feat: add GPU-native kron support for Diagonal matrices #690
shreyas-omkar wants to merge 1 commit into JuliaGPU:master
|
Nice! One question I have: am I reading the graphs correctly that the |
|
The Dense⊗Diagonal and Diagonal⊗Dense cases use the same amount of memory before and after the fix because the result is always a full dense matrix. The Diagonal⊗Diagonal case is different: the result is now stored as a Diagonal instead of a dense matrix. At n = 128, this means going from a ~1024 MiB dense matrix to just ~0.06 MiB. The Dense cases only look higher now because the Diagonal⊗Diagonal case dropped so much after the fix. |
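The figures quoted above can be reproduced with a quick back-of-the-envelope calculation. This is a CPU-only sketch that assumes Float32 elements (an assumption, chosen because it matches the ~1024 MiB number); it is not the benchmark code from the PR.

```julia
using LinearAlgebra

n = 128
T = Float32            # assumed element type; not stated in the thread
N = n^2                # kron of two n×n matrices is N×N with N = n^2

dense_mib = N^2 * sizeof(T) / 2^20   # full N×N dense result
diag_mib  = N * sizeof(T) / 2^20     # only the N diagonal entries

println("dense: ", dense_mib, " MiB, diagonal: ", diag_mib, " MiB")

# Small CPU check that the diagonal of the product is kron of the diagonals.
a = Diagonal(T[1, 2, 3])
b = Diagonal(T[4, 5])
D = Diagonal(kron(a.diag, b.diag))
@assert Matrix(D) == kron(Matrix(a), Matrix(b))
```

Under that assumption the dense result needs 1024 MiB and the Diagonal result 0.0625 MiB, which lines up with the numbers in the comment.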
|
Oh I see what's going on, the x-axes are not the same. It would be great to show the %/OOM change in time in the future for such plots |
Sure. I'll take a note of it. Thank you. |
|
While we're modifying this logic anyway, would it be possible to provide a lower-level integration with https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#Base.kron! so that users can provide/reuse a pre-allocated output?
Sure I will be happy to do this. Adds to my learning. |
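For context, this is roughly how the in-place API is used on the CPU today. It is a minimal sketch with plain `Array`s, assuming Julia 1.6 or later where `kron!` is available; the GPU methods discussed in this PR are not shown.

```julia
using LinearAlgebra

A = rand(Float32, 2, 3)
B = rand(Float32, 4, 5)

# Pre-allocate the output once; kron(A, B) has size (2*4) × (3*5).
C = Matrix{Float32}(undef, size(A, 1) * size(B, 1), size(A, 2) * size(B, 2))

kron!(C, A, B)          # writes the Kronecker product into C
@assert C ≈ kron(A, B)

# The same buffer can be reused for new inputs of the same shape.
A2 = rand(Float32, 2, 3)
kron!(C, A2, B)
@assert C ≈ kron(A2, B)
```

Reusing `C` avoids allocating a new output on every call, which is the motivation behind the request above.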
7d6ec5f to 15a9a47
|
Your PR requires formatting changes to meet the project's style guidelines. Suggested changes:
diff --git a/src/host/linalg.jl b/src/host/linalg.jl
index 92310d2..1538d2a 100644
--- a/src/host/linalg.jl
+++ b/src/host/linalg.jl
@@ -1004,5 +1004,5 @@ function LinearAlgebra.kron!(C::Diagonal{<:Any, <:AbstractGPUVector}, A::Diagona
 end
 function LinearAlgebra.kron(A::Diagonal{T1, <:AbstractGPUVector}, B::Diagonal{T2, <:AbstractGPUVector}) where {T1, T2}
-    Diagonal(kron(A.diag, B.diag))
+    return Diagonal(kron(A.diag, B.diag))
 end
|
@kshyatt please take a look. |
Feat #668
Diagonal⊗Diagonal: previously required manual densification (diagm), causing O(n⁴) memory blowup (1 GiB at n=128, OOM beyond). Now returns a Diagonal directly with O(n²) memory, since only the n² diagonal entries are stored.
Diagonal⊗Dense / Dense⊗Diagonal: previously crashed with scalar indexing errors. Now handled by dedicated GPU kernels.
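To illustrate the structure the mixed cases exploit, here is a CPU reference sketch of Diagonal⊗Dense: the result is block diagonal, with block (i, i) equal to d[i] * B. The helper name `kron_diag_dense` is hypothetical and this is not the PR's kernel; the `d[i]` access below is exactly the kind of scalar indexing that fails on GPU arrays.

```julia
using LinearAlgebra

# Hypothetical CPU reference for kron(Diagonal(d), B); not the PR's GPU kernel.
function kron_diag_dense(d::AbstractVector, B::AbstractMatrix)
    n = length(d)
    p, q = size(B)
    C = zeros(promote_type(eltype(d), eltype(B)), n * p, n * q)
    for i in 1:n
        # Block (i, i) of the result is d[i] * B; all other blocks stay zero.
        C[(i - 1) * p + 1:i * p, (i - 1) * q + 1:i * q] .= d[i] .* B
    end
    return C
end

d = [1.0, 2.0, 3.0]
B = [1.0 2.0; 3.0 4.0]
C = kron_diag_dense(d, B)
@assert C == kron(Matrix(Diagonal(d)), B)
```

The output is still a full (n·p)×(n·q) dense matrix, which is why memory use for the mixed cases is unchanged by the fix.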