Support pluggable LLM backends per analyzer (Bedrock, SageMaker, custom endpoints) #11

@rbpotter-aws

Description

Description of Need

As a BADGERS user, I want to configure which LLM backend and model each analyzer task uses, including non-Bedrock options such as SageMaker endpoints or self-hosted container models, so that I can use smaller, fit-for-purpose models that match my cost, latency, and capability requirements rather than being constrained to Bedrock Claude models for every task.

Context

Currently, the system is tightly coupled to Amazon Bedrock as the sole LLM provider. Customers run specialized or fine-tuned models on SageMaker endpoints or ECS/EKS containers, or want to use different Bedrock models (e.g., Haiku, Mistral, Llama) depending on the analysis task. A document classification task doesn't need the same model horsepower as a complex infographic analysis.

Example Use Cases

  • Use a fine-tuned SageMaker model for domain-specific document classification
  • Use a lightweight Bedrock model (Haiku, Llama) for simple extraction tasks
  • Use a self-hosted model behind an API endpoint for air-gapped or compliance-restricted environments
  • Keep Sonnet/Opus for complex multi-step analysis where quality matters most

Acceptance Criteria

  • Introduce an LLM provider abstraction that supports multiple backends behind a common interface
  • Support at minimum: Bedrock (existing), SageMaker endpoints, and generic HTTP/REST endpoints
  • Users can configure the provider and model per analyzer in the analyzer config (e.g. analyzer_config.json or equivalent)
  • Each provider handles its own auth, request formatting, and response parsing
  • A sensible default (current Bedrock behavior) is used when no override is provided
  • Existing deployments continue to work with zero configuration changes
  • Response format normalization so downstream processing is provider-agnostic

Technical Considerations

  • The bedrock_client.py in the foundation layer is the current integration point that would need to be abstracted
  • SageMaker endpoints use invoke_endpoint with potentially different payload formats
  • Custom endpoints will need configurable auth (IAM, API key, none) and payload templates
  • Prompt formatting may differ across models (chat vs. completion, system prompt support, etc.)
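For illustration, a SageMaker-backed provider might wrap `invoke_endpoint` roughly like this. This is a sketch under assumptions: `SageMakerProvider` is a hypothetical name, and the payload shape (`inputs` + `parameters`) is one common text-generation container schema, not universal:

```python
import json


class SageMakerProvider:
    """Hypothetical provider wrapping a SageMaker real-time endpoint."""

    def __init__(self, endpoint_name: str, region: str = "us-east-1"):
        self.endpoint_name = endpoint_name
        self.region = region

    @staticmethod
    def build_payload(prompt: str, max_tokens: int = 512) -> str:
        # Payload formats vary per container image; this assumes the common
        # "inputs" + "parameters" text-generation schema.
        return json.dumps({
            "inputs": prompt,
            "parameters": {"max_new_tokens": max_tokens},
        })

    @staticmethod
    def normalize_response(raw_body: str) -> str:
        # Normalize so downstream processing is provider-agnostic: accept
        # either {"generated_text": ...} or [{"generated_text": ...}].
        data = json.loads(raw_body)
        if isinstance(data, list):
            data = data[0]
        return data["generated_text"]

    def invoke(self, prompt: str) -> str:
        import boto3  # deferred so the pure helpers stay testable offline

        client = boto3.client("sagemaker-runtime", region_name=self.region)
        resp = client.invoke_endpoint(
            EndpointName=self.endpoint_name,
            ContentType="application/json",
            Body=self.build_payload(prompt),
        )
        return self.normalize_response(resp["Body"].read().decode("utf-8"))
```

Keeping payload construction and response normalization as pure functions also makes it easy to add payload templates for custom endpoints later without touching the transport code.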

Out of Scope (future consideration)

  • Automatic model routing/selection based on task complexity
  • Model performance benchmarking framework
  • Fallback chains (try model A, fall back to model B)

Labels: enhancement (New feature or request)