Agentic Enterprise Security Scanner Version 1.0 | December 2025
Dandayudham is an agentic vulnerability and secret scanner that combines the adaptive reasoning of NVIDIA Nemotron-Orchestrator-8B with specialized ML models (CodeBERT, SecretBERT) and industry-standard security tools. Unlike traditional scanners, Dandayudham takes organizational context into account, learns from recurring patterns, and adapts its scanning strategy to each repository's characteristics.
- Agentic Intelligence: Nemotron-Orchestrator-8B orchestrates scans, deciding which tools and models to deploy based on repo analysis
- Specialized ML Models: Fine-tuned CodeBERT catches vulnerability patterns; SecretBERT detects secrets with 98% precision
- Organizational Learning: Builds "stack fingerprints" and learns what's critical for each company
- Tool Ecosystem Integration: Leverages Semgrep, Gitleaks, OSV, Trivy via unified interface
- Context-Aware Ranking: Understands blast radius, historical severity, and org-specific priorities
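The tool-selection idea in the list above can be illustrated with a small rule-based sketch. This is only a stand-in: the README describes an LLM orchestrator, not fixed rules, and the function name `select_tools`, the file-name sets, and the lowercase tool identifiers are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical repository signals; the real orchestrator is described as an
# LLM, so these fixed rules only illustrate the shape of the decision.
DEPENDENCY_FILES = {"package-lock.json", "poetry.lock", "go.sum", "Cargo.lock", "requirements.txt"}
CONTAINER_FILES = {"Dockerfile", "docker-compose.yml"}
SOURCE_SUFFIXES = {".py", ".go", ".js", ".ts", ".java"}


def select_tools(file_paths: list[str]) -> list[str]:
    """Return the scanners to run for a repository, given its file listing."""
    names = {PurePosixPath(p).name for p in file_paths}
    suffixes = {PurePosixPath(p).suffix for p in file_paths}

    tools = ["gitleaks"]  # secrets can appear in any file, so always scan for them
    if suffixes & SOURCE_SUFFIXES:
        tools.append("semgrep")      # static analysis needs source files
    if names & DEPENDENCY_FILES:
        tools.append("osv-scanner")  # dependency manifests allow vulnerability lookups
    if names & CONTAINER_FILES:
        tools.append("trivy")        # container definitions allow image/config scans
    return tools
```

For a listing such as `["app/main.py", "poetry.lock"]` the sketch selects Gitleaks, Semgrep, and OSV Scanner and skips Trivy, since no container files are present.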
Enterprise companies with 100+ repositories on GitHub/GitLab
| Problem | Impact |
|---|---|
| Alert Fatigue | Traditional scanners generate thousands of findings, of which 70-80% are false positives |
| No Context Understanding | Can't differentiate "API key in test fixture" vs "production AWS key in main branch" |
| One-Size-Fits-All | Same rules for fintech companies and e-commerce platforms |
| Siloed Tools | Teams run 5-7 different security tools, manually correlating results |
| No Learning | Scanners don't learn from developer feedback or past incidents |
| Poor Prioritization | Critical vulnerabilities buried under low-severity noise |
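To make the context and prioritization rows concrete, here is a minimal sketch of a scoring function that separates a key in a test fixture from the same key on the main branch. The `Finding` fields, the severity weights, and the multipliers are illustrative assumptions, not the project's actual ranking model.

```python
from dataclasses import dataclass

# Illustrative weights only; a learned or org-specific model would replace these.
SEVERITY_WEIGHT = {"low": 1, "medium": 2, "high": 3, "critical": 4}
TEST_DIRS = {"test", "tests", "fixtures", "testdata"}


@dataclass
class Finding:
    rule: str
    severity: str
    path: str
    branch: str


def priority(finding: Finding, default_branch: str = "main") -> float:
    """Score a finding: base severity, damped in test paths, boosted on the default branch."""
    score = float(SEVERITY_WEIGHT[finding.severity])
    if any(part in TEST_DIRS for part in finding.path.lower().split("/")):
        score *= 0.25  # test fixtures rarely hold live credentials
    if finding.branch == default_branch:
        score *= 2.0   # the default branch is what ships to production
    return score
```

With these assumed weights, a critical key under `tests/fixtures/` on a feature branch scores 1.0, while the same rule firing in `deploy/config.py` on `main` scores 8.0, so the production finding sorts first.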
- <5% False Positive Rate (industry standard: 20-30%)
- Scan Time <10 minutes for 1M LOC repositories
- 95%+ Precision on critical/high severity findings
- Developer Trust Score >4.2/5 within 3 months
- 40% Reduction in time-to-fix for vulnerabilities
┌─────────────────────────────────────────────────────────────────────────┐
│                        DANDAYUDHAM CONTROL PLANE                        │
│                (Nemotron-Orchestrator-8B - Self-Hosted)                 │
│                                                                         │
│  • Repository Analysis & Strategy Planning                              │
│  • Dynamic Tool Selection & Coordination                                │
│  • Multi-turn Agentic Task Management                                   │
│  • Result Synthesis & Ranking                                           │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
              ┌──────────────────────┼──────────────────────┐
              ▼                      ▼                      ▼
     ┌────────────────┐     ┌────────────────┐     ┌────────────────┐
     │ ML Models      │     │ Static Tools   │     │ Worker Agents  │
     │ (Self-Hosted)  │     │ (Open Source)  │     │ (Parallel)     │
     │                │     │                │     │                │
     │ • CodeBERT     │     │ • Semgrep      │     │ Qwen3-8B /     │
     │ • SecretBERT   │     │ • Gitleaks     │     │ Mistral-7B     │
     │ • BGE-Large    │     │ • OSV Scanner  │     │                │
     │                │     │ • Trivy        │     │ • File scan    │
     └────────────────┘     └────────────────┘     │ • Context      │
                                                   │ • Validation   │
                                                   └────────────────┘
                                     │
                                     ▼
                       ┌───────────────────────────┐
                       │      Knowledge Store      │
                       │                           │
                       │ • PostgreSQL (relational) │
                       │ • Qdrant (vectors)        │
                       │ • Redis (cache/queue)     │
                       └───────────────────────────┘
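The fan-out and result-synthesis flow in the diagram can be sketched with `asyncio`. Everything here is hypothetical: `run_scanner` returns canned findings instead of invoking a real tool, and the finding fields (`file`, `line`, `kind`) and the de-duplication key are assumptions chosen for illustration.

```python
import asyncio


async def run_scanner(name: str, findings: list[dict]) -> list[dict]:
    """Stand-in for one tool run; a real worker would shell out or call a service."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    return [{**f, "sources": [name]} for f in findings]


def merge(results: list[list[dict]]) -> list[dict]:
    """Collapse findings sharing file, line and kind, keeping every reporting source."""
    merged: dict[tuple, dict] = {}
    for batch in results:
        for f in batch:
            key = (f["file"], f["line"], f["kind"])
            if key in merged:
                merged[key]["sources"].extend(f["sources"])
            else:
                merged[key] = {**f, "sources": list(f["sources"])}
    return list(merged.values())


async def scan(canned: dict[str, list[dict]]) -> list[dict]:
    """Run all scanners concurrently, then synthesize one de-duplicated list."""
    results = await asyncio.gather(*(run_scanner(n, fs) for n, fs in canned.items()))
    return merge(list(results))
```

Calling `asyncio.run(scan(...))` with two scanners that both report the same secret yields a single finding listing both sources, which is one simple way to express agreement between tools as a ranking signal.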
- Name: Priya, AppSec Lead
- Goals: Prevent secrets in production, reduce vulnerability backlog, prove security posture
- Needs: Accurate findings, clear explanations, integration with Jira/Slack
- Name: Raj, Backend Team Lead
- Goals: Ship features fast, maintain security standards, keep team unblocked
- Needs: Non-blocking scans, learn from team's codebase, minimal disruption
- Name: Aisha, Chief Information Security Officer
- Goals: Demonstrate compliance, reduce breach risk, metrics for board
- Needs: Executive dashboard, trend analysis, compliance reports
| Component | Technology | Rationale |
|---|---|---|
| Orchestrator | Nemotron-Orchestrator-8B | 8B params, outperforms GPT-5 on HLE, 2.5x more efficient |
| Worker LLM | Qwen3-8B / Mistral-7B | Cost-effective for parallel validation tasks |
| Vulnerability Detection | CodeBERT (fine-tuned) | SOTA for code understanding |
| Secret Detection | SecretBERT (custom) | 98% precision on secrets |
| Embeddings | BGE-Large-en-v1.5 | Strong general-purpose English text embeddings for similarity search |
| Frontend | Next.js 16 + React 19 | Modern, server-side rendering, SEO-friendly |
| Backend | Python 3.11+, FastAPI, SQLAlchemy (Async); Go for the scanner worker | Python for ML, Go for performance-critical paths |
| Database | PostgreSQL + Qdrant | Relational + Vector storage |
| Queue | Redis | Simple, reliable, fast |
| Infrastructure | Docker + Kubernetes | Scalable, industry standard |
- Python 3.11+
- Docker & Docker Compose
- NVIDIA GPU (for ML inference)
- Node.js 20+ (for frontend)
# Clone the repository
git clone https://github.com/your-org/dandayudham.git
cd dandayudham
# Copy environment file
cp .env.example .env
# Start all services with Docker
docker-compose -f docker/docker-compose.yml up -d
# Access points
# Dashboard: http://localhost:3000
# API Docs: http://localhost:8000/docs
# API: http://localhost:8000/api/v1

curl -X POST http://localhost:8000/api/v1/scan/trigger \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/example/repo",
    "branch": "main",
    "scan_type": "full"
  }'

dandayudham/
├── apps/
│ ├── api/ # FastAPI main gateway
│ ├── ml-service/ # ML model inference
│ ├── orchestrator/ # Nemotron orchestration
│ ├── worker/ # Go-based scanner worker
│ └── frontend/ # React dashboard
├── integrations/ # GitHub, Jira, Slack
├── tools/ # CLI utilities
├── config/ # Semgrep rules, configs
├── docker/ # Docker configurations
└── k8s/ # Kubernetes manifests
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/scan/trigger | Trigger a new scan |
| GET | /api/v1/scan/{id}/status | Get scan status |
| GET | /api/v1/scan/{id}/results | Get scan results |
| POST | /api/v1/findings/{id}/feedback | Submit FP/TP feedback |
| GET | /api/v1/orgs/{id}/dashboard | Organization dashboard |
| POST | /api/v1/webhooks/github | GitHub webhook handler |
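As a rough Python counterpart to the curl example earlier in this README, the sketch below builds the same `POST /api/v1/scan/trigger` request with the standard library. The helper names are hypothetical, and because the response schema is not documented here, `trigger_scan` simply returns whatever JSON the server sends back.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # local address from the setup steps


def build_trigger_request(repo_url: str, branch: str = "main",
                          scan_type: str = "full") -> urllib.request.Request:
    """Build the scan-trigger request without sending it."""
    body = json.dumps(
        {"repo_url": repo_url, "branch": branch, "scan_type": scan_type}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/scan/trigger",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_scan(repo_url: str) -> dict:
    """Send the request; the response fields are an unknown here and returned as-is."""
    with urllib.request.urlopen(build_trigger_request(repo_url), timeout=30) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running API.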
| Metric | Target |
|---|---|
| False Positive Rate | <5% |
| Precision (Critical) | >95% |
| Scan Time (1M LOC) | <10 min |
| API Latency (p95) | <500ms |
| Developer Trust Score | >4.2/5 |
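The first two targets can be tracked from developer feedback. The sketch below assumes that "false positive rate" means the share of reviewed findings marked as false positives (so it equals one minus precision); the README does not define the term, and the `"tp"`/`"fp"` label format is hypothetical.

```python
def triage_metrics(labels: list[str]) -> dict[str, float]:
    """Compute precision and false-positive share from per-finding verdicts.

    labels: one "tp" or "fp" verdict per reviewed finding (assumed format).
    """
    tp = labels.count("tp")
    fp = labels.count("fp")
    reviewed = tp + fp
    if reviewed == 0:
        raise ValueError("no reviewed findings")
    return {
        "precision": tp / reviewed,
        "false_positive_share": fp / reviewed,
    }
```

Under this definition, 19 true positives and 1 false positive give a precision of 0.95 and a false-positive share of 0.05, which sits exactly on the boundary of both targets rather than inside them.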
- Performance: Full scan of 1M LOC repo in <10 minutes
- Scalability: Handle 10,000 concurrent scans, 100M LOC/day
- Security: SOC 2 Type II compliant, encryption at rest and in transit
- Reliability: 99.9% uptime SLA, automatic retries, graceful degradation
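One common way to implement the automatic retries mentioned above is exponential backoff. The helper below is a generic sketch, not the project's implementation; its name, defaults, and the injectable `sleep` parameter are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(task: Callable[[], T], attempts: int = 3, base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run task, retrying on any exception with delays of base_delay * 2**n."""
    for attempt in range(attempts):
        try:
            return task()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter lets tests record the delays instead of waiting; in production a narrower exception type than `Exception` would usually be preferable so that programming errors are not retried.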
MIT License - See LICENSE for details.