vedik2002/README.md

Hi there, I'm Vedik Agarwal 👋

GitHub • LinkedIn • Email • Featured Project


👨🏻‍💻 About Me

🎓 Master's in Computer Science (Machine Learning Track) at Columbia University
🧠 ML & Systems Engineer working on LLM inference, federated learning, and ML infrastructure
⚡ Interested in memory-efficient attention, distributed optimization, and production ML systems
📄 IEEE-published researcher in federated intrusion detection


🧠 Core Focus Areas

  • Long-context & memory-efficient LLM inference
  • Federated learning under non-IID distributions
  • Distributed ML systems & microservices
  • Privacy-preserving ML (Healthcare & IoT)
  • High-performance model serving

🛠️ Tech Stack

Languages

ML / DL

Systems & Infra

Databases / Cloud

🚀 Featured Projects & Publications

🧠 Memory-Efficient LLM Inference with Paged FlexAttention

🔗 https://github.com/vedik2002/PagedFlexAttention

  • Integrated paged KV caching with PyTorch FlexAttention
  • Supports long-context (4K tokens) & high-concurrency decoding
  • <1% GPU memory overhead with accuracy parity on MMLU & TruthfulQA

🔐 Fed-MLDL: Federated Intrusion Detection (IEEE)

🔗 https://ieeexplore.ieee.org/document/10857281

  • Federated MLP with physics-based hyperparameter optimization
  • Robust under non-IID distributions
  • Achieved 98% packet classification accuracy

🏭 Industry Experience

Software Developer Intern β€” Trademarkia

  • Distributed Golang microservices scaling to 50K+ requests/day
  • Low-latency BERT inference service (~200ms)
  • Vector search backend using Qdrant + AWS (98% accuracy)

Software Developer Intern, Hi Rapid Lab

  • AI healthcare platforms for 10K+ rural patients
  • DPDP-compliant secure medical data systems

Machine Learning Intern, Innova Point Infotech

  • Siamese-network-based quality inspection (PyTorch, ResNet)
  • Reduced defect detection time by 45%

📊 GitHub Analytics


📫 Let's Connect

Interested in ML systems, LLM inference, or research collaboration?
Feel free to reach out.

Pinned

  1. GaNDLF (Public)

    Forked from mlcommons/GaNDLF

    A generalizable application framework for segmentation, regression, and classification using PyTorch

    Python

  2. IDS (Public)

    Jupyter Notebook 1 1

  3. MAF (Public)

    Forked from Sckarge/MAF

    Copy of MAF repo for research purposes

    Python

  4. PagedFlexAttention (Public)

    Jupyter Notebook 1

  5. vedik2002 (Public)