SarathL754

Pinned

Reducing-Hallucinations-with-Direct-Preference-Optimization Public

An RLHF-inspired Direct Preference Optimization (DPO) framework that explicitly teaches LLMs when to refuse to answer, with the goal of reducing hallucinations.

Decision-Transformer-from-Scratch-HalfCheetah-Minari-BC-vs-Return-Conditioned-DT Public

A from-scratch implementation of Decision Transformers for offline RL, benchmarking return-conditioned policies against behavior cloning (BC).

Python

VulneraAI-agent Public

An agentic LLM security scanner that analyzes applications against OWASP Top 10 using tool-calling, LangGraph, and AWS Bedrock.

Python

Email-Assistant-langgraph Public

Python

Multi-agent-RL-texas-holdem-aec Public

An engineering-focused multi-agent reinforcement learning system for Texas Hold’em using PettingZoo AEC and a custom PyTorch PPO self-play setup.

Python

Alzheimer-Disease-Stage-Classification-CNNs-vs-Transformers- Public

A comparative study of CNNs vs. Vision Transformers for Alzheimer’s disease stage classification on brain MRI, with detailed error and performance analysis.

Jupyter Notebook