😼
When softmax attention is sus
MTS at Magic and ML researcher.
I force rocks to learn things.
Pinned
- Stable-Diffusion-3-From-Scratch: A repo that attempts to train Stable Diffusion 3 from scratch.
- On-the-Expressiveness-of-Softmax-Attention-A-Recurrent-Neural-Network-Perspective (Jupyter Notebook, 4 stars)
- 2Mamba2Furious: Code for the paper "2Mamba2Furious: Linear in complexity, competitive in accuracy" (Jupyter Notebook, 2 stars)
- Cottention_Transformer: Code for the paper "Cottention: Linear Transformers With Cosine Attention" (Cuda, 20 stars)
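The Cottention entry above pins code for linear attention built on cosine similarity. A minimal sketch of the general idea, not the repo's actual implementation (the function name and shapes here are illustrative): L2-normalize queries and keys so their dot products are cosine similarities, then use matrix associativity to avoid ever forming the n × n attention matrix.

```python
import numpy as np

def cosine_attention(Q, K, V, eps=1e-6):
    """Illustrative linear attention via cosine similarity (not the repo's code)."""
    # L2-normalize queries and keys so dot products become cosine similarities
    Qn = Q / (np.linalg.norm(Q, axis=-1, keepdims=True) + eps)
    Kn = K / (np.linalg.norm(K, axis=-1, keepdims=True) + eps)
    # Associativity: Qn @ (Kn.T @ V) equals (Qn @ Kn.T) @ V, but the
    # right-hand grouping never materializes the n x n score matrix,
    # so the cost is linear in sequence length n
    return Qn @ (Kn.T @ V)

rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
out_linear = cosine_attention(Q, K, V)

# Quadratic-form reference (same normalization) for comparison
Qn = Q / (np.linalg.norm(Q, axis=-1, keepdims=True) + 1e-6)
Kn = K / (np.linalg.norm(K, axis=-1, keepdims=True) + 1e-6)
out_quadratic = (Qn @ Kn.T) @ V
```

The two groupings produce the same output; only the linear one scales to long sequences, which is the trade-off the pinned papers explore against softmax attention.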




