Backend Dev | ML Engineer | Systems & Inference Engineer
Portfolio • LinkedIn • Twitter
;>>> Passionate about building lightning-fast inference engines — I experiment with CUDA, model parallelization, model switching, and engine-level optimization to squeeze out every last drop of performance.
;>>> Deep into the guts of AI systems, where low-level meets high-impact — whether it’s optimizing tensor operations, customizing runtime backends, or engineering smarter model dispatching pipelines.
;>>> Currently exploring how agents learn via self-play in Reinforcement Learning, and how we can scale these systems efficiently on real hardware.
;>>> I love hacking at the intersection of AI and systems — think inference stacks, GPU kernels, memory layouts, and runtime logic.
;>>> Open to collaborating on LLM inference, embedded AI, or any project that involves making models run faster, leaner, or in unexpected places.
;>>> Fun fact: Nothing gets me more excited than debugging performance bottlenecks or crafting custom CUDA kernels to beat baseline benchmarks.
;>>> When I’m not building, I’m reading cutting-edge research and actually implementing the ideas — because theory is only as good as what runs.
;>>> The most exciting projects live here -> HyperKuvidLabs

