Merged
3 changes: 3 additions & 0 deletions .gitignore
@@ -9,3 +9,6 @@ datasets
logs/
data_sets
vault/agent-out

# Snyk Security Extension - AI Rules (auto-generated)
.github/instructions/snyk_rules.instructions.md
196 changes: 196 additions & 0 deletions CONTRIBUTING.md
@@ -178,6 +178,202 @@ git commit -m "added package-name dependency"
#### 5. Open a PR
CI will validate that the lockfile and environment are consistent. If you forgot to update the lockfile, the PR will fail with a clear error.

---

## Type Safety Practices

Python is a dynamically typed language. This flexibility makes Python productive and expressive, but it also increases the risk of subtle bugs caused by incorrect function calls, unexpected `None` values, or inconsistent data structures. To balance flexibility with long-term maintainability, we use [Pyright](https://microsoft.github.io/pyright) for CI-level type checking.

We run Pyright in `standard` mode. This mode provides strong type correctness guarantees without requiring the full strictness and annotation overhead of `strict` mode.

The exact checks enforced in `standard` mode are listed in the `Diagnostic Defaults` section of the [Pyright documentation](https://microsoft.github.io/pyright/#/configuration?id=diagnostic-settings-defaults).

We chose `standard` mode because it supports the following principles:

- **Catch real bugs early** - It prevents incorrect function calls, invalid attribute access, misuse of Optional values, inconsistent overloads, and a wide range of type errors that would otherwise only appear at runtime.

- **Maintain clarity without excessive annotation burden** - Developers are not expected to annotate every variable or build fully typed signatures for every function. Pyright uses inference aggressively, and `standard` mode focuses on correctness where types are known or inferred.

- **Work seamlessly with third-party libraries** - Many Python libraries ship without type stubs. In `standard` mode, missing stubs are not reported as errors, and any types Pyright cannot infer from such imports are treated as `Any`. This lets us use these libraries without blocking type checks while still preserving type safety inside our own code.
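
As a rough illustration of the first principle, the sketch below shows the kind of `Optional` misuse that `standard` mode is expected to report (via Pyright's `reportOptionalMemberAccess` check) and a version that narrows the type first. The lookup table and function names are hypothetical and are not taken from this repository.

```python
from typing import Optional

# Hypothetical lookup table, used only for this illustration.
_USERS: dict[int, str] = {1: "alice", 2: "bob"}


def find_user(user_id: int) -> Optional[str]:
    """Return the user name, or None if the id is unknown."""
    return _USERS.get(user_id)


def shout_name(user_id: int) -> str:
    """Return the upper-cased user name, or "UNKNOWN" for a missing user."""
    name = find_user(user_id)
    # Writing `return name.upper()` here would be reported by Pyright,
    # because `name` may be None at this point.
    if name is None:
        return "UNKNOWN"
    # After the check above, Pyright narrows `name` to `str`.
    return name.upper()
```

The explicit `None` check is what lets the type checker prove that `.upper()` is safe; without it, the error would only show up at runtime for unknown ids.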

### Runtime Type Safety at System Boundaries

While Pyright provides excellent static type checking during development, **system boundaries** require additional runtime validation. These are points where our Python code interfaces with external systems, user input, or network requests, and where data types cannot be guaranteed by static analysis alone.

In this project, we use **Pydantic** for rigorous runtime type checking at these critical handover points:

#### FastAPI Endpoints
All FastAPI route handlers use Pydantic models for request/response validation:
- Request bodies are validated against Pydantic schemas
- Query parameters and path parameters are type-checked at runtime
- Response models ensure consistent API contract enforcement
```python
# Example: API endpoint with Pydantic validation
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class UserRequest(BaseModel):
    name: str
    age: int


@app.post("/users")
async def create_user(user: UserRequest) -> dict[str, int | str]:
    # Pydantic validates that name is a string and age is an integer.
    # Invalid data produces a 422 response before this code runs.
    return {"id": 1, "name": user.name}
```

This dual approach of **static type checking with Pyright** + **runtime validation with Pydantic** ensures both development-time correctness and production-time reliability at system boundaries where type safety cannot be statically guaranteed.

**Note: Type checks are run only on core source code, not on test cases.**

## Linter Rules

Consistent linting is essential for maintaining a reliable and scalable codebase. By adhering to a well-defined linter configuration, we ensure the code remains readable, secure, and predictable even as the project evolves.

The following rule set is enabled in this repository. Linter rules are enforced automatically through the CI pipeline and must pass before merging changes into the `wip`, `dev`, or `main` branches.

Each category is summarized with a description and a link to the Ruff documentation explaining these rules.

### Selected Linter Rule Categories

#### E4, E7, E9 — Pycodestyle Error Rules

These check for fundamental correctness issues such as import placement, error-prone statements, and syntax or I/O errors that would otherwise surface at runtime.

- **E4**: Import rules, such as multiple imports on one line or module-level imports that are not at the top of the file
(https://docs.astral.sh/ruff/rules/#pycodestyle-e4)

- **E7**: Statement rules, such as comparing to `None` with `==`, bare `except:` clauses, and assigning a lambda to a name
(https://docs.astral.sh/ruff/rules/#pycodestyle-e7)

- **E9**: Syntax errors and I/O errors raised while reading a source file
(https://docs.astral.sh/ruff/rules/#pycodestyle-e9)
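
A minimal sketch of code that satisfies three common E7 checks (`E711`, `E722`, `E731`), with the flagged alternative noted in a comment next to each. The function names are hypothetical and only serve the illustration.

```python
import json
from typing import Optional


def read_setting(raw: str, key: str) -> Optional[str]:
    """Return one setting from a JSON object string, or None if unavailable."""
    try:
        value = json.loads(raw).get(key)
    except (json.JSONDecodeError, AttributeError):  # E722 flags a bare `except:`
        return None
    if value is None:  # E711 flags `value == None`
        return None
    return str(value)


def double(x: int) -> int:  # E731 flags `double = lambda x: x * 2`
    return x * 2
```

Naming the exception types and using `is None` keeps the intent explicit, which is the main point of these checks.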

#### F — Pyflakes

Static analysis rules that detect real bug patterns such as unused variables, unused imports, undefined names, duplicate definitions, and logical mistakes that can cause bugs.

(https://docs.astral.sh/ruff/rules/#pyflakes-f)

#### B — Flake8-Bugbear

A set of high-value checks for common Python pitfalls: mutable default arguments, improper exception handling, unsafe patterns, redundant checks, and subtle bugs that impact correctness and security.

(https://docs.astral.sh/ruff/rules/#flake8-bugbear-b)
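
The mutable default argument pitfall (rule `B006`) is probably the best-known of these. The sketch below shows the usual fix; `append_tag` is a hypothetical helper, not a function from this repository.

```python
from typing import Optional


def append_tag(tag: str, tags: Optional[list[str]] = None) -> list[str]:
    """Append a tag to the given list, or to a fresh list if none is passed."""
    # B006 flags `tags: list[str] = []` as a default: that list would be
    # created once and shared by every call that omits the argument.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```

With the `None` sentinel, two calls that omit `tags` each get their own list instead of accumulating values across calls.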

#### T20 — Flake8-Print

Flags any usage of `print()` or `pprint()` in production code to prevent leaking sensitive information, mixing debug output into logs, or introducing uncontrolled console output.

(https://docs.astral.sh/ruff/rules/#flake8-print-t20)
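
One common way to satisfy this rule is to route output through the standard `logging` module. This is a sketch under assumptions: the logger name and the function are hypothetical, and the repository may configure logging differently.

```python
import logging

# Hypothetical logger name, chosen for this illustration only.
logger = logging.getLogger("example.orders")


def process_order(order_id: int) -> str:
    """Return a reference string for the order and log the event."""
    # T201 flags `print(f"processing order {order_id}")`.
    logger.info("processing order %s", order_id)
    return f"order-{order_id}"
```

Log records can be filtered by level and redirected by handlers, which is not possible with bare `print()` calls.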

#### N — PEP8-Naming

Ensures consistent and conventional naming across classes, functions, variables, and modules. This helps maintain readability across the engineering team and reinforces clarity in code reviews.

(https://docs.astral.sh/ruff/rules/#pep8-naming-n)

#### ANN — Flake8-Annotations

Enforces type annotation discipline across functions, methods, and class structures. With Pyright used for type checking, these rules ensure that type information remains explicit and complete.

(https://docs.astral.sh/ruff/rules/#flake8-annotations-ann)
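
As an illustration, a small class that passes the annotation checks might look like the sketch below. `Counter` is a hypothetical example class; the comments name the rule that would fire if the annotation were missing.

```python
class Counter:
    """Hypothetical counter used to illustrate the ANN rules."""

    def __init__(self, start: int = 0) -> None:  # ANN204 flags a missing `-> None`
        self.value = start

    def increment(self, step: int = 1) -> int:  # ANN001/ANN201 flag missing annotations
        self.value += step
        return self.value
```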

#### ERA — Eradicate

Removes or flags commented-out code fragments. Commented code tends to accumulate over time and reduces clarity. The goal is to keep the repository clean and avoid keeping dead code in version control.

(https://docs.astral.sh/ruff/rules/#eradicate-era)

#### PERF — Perflint

Performance-oriented rules that highlight inefficient constructs, slow loops, unnecessary list or dict operations, and patterns that degrade runtime efficiency.

(https://docs.astral.sh/ruff/rules/#perflint-perf)
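
Two representative checks are `PERF401` (a list built with `append` in a loop) and `PERF102` (iterating over `.items()` when only keys or values are used). The sketch below shows code that avoids both; the function names are hypothetical.

```python
def squares_of_even(numbers: list[int]) -> list[int]:
    """Return the squares of the even numbers, in order."""
    # PERF401 flags building this list with `result.append(...)` in a for loop.
    return [n * n for n in numbers if n % 2 == 0]


def total_stock(stock: dict[str, int]) -> int:
    """Return the sum of all stock counts."""
    # PERF102 flags `for _, count in stock.items()` when only values are needed.
    return sum(stock.values())
```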

### Fixing Linting Issues

Linting issues should always be resolved manually.
We **strongly discourage** relying on autofixes using `ruff check --fix` for this repository.

Unlike `ruff format`, which performs safe and predictable code formatting, the linter's autofix mode can alter control flow, refactor logic, or rewrite expressions in ways that introduce unintended bugs.

Every linter error carries a **rule code**, for example `ANN204`.
To see what a rule checks, why it is a problem, and how to fix it, run:
```bash
ruff rule ANN204  # replace ANN204 with the reported rule code
```

Human oversight is essential to ensure that any corrective changes maintain the intended behavior of the application. Contributors should review each reported linting issue, understand why it is flagged, and apply the appropriate fix by hand.

---

## Formatting Rules

This repository uses the **Ruff Formatter** for code formatting. Its behavior is deterministic, safe, and aligned with the [Black Code Style](https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html).

Formatting is enforced automatically through the CI pipeline and must pass before merging changes into the `wip`, `dev`, or `main` branches.

### Selected Formatting Behaviors

#### String Quote Style

All string literals are formatted using **double quotes**.
This preserves consistency across the codebase and avoids unnecessary formatting churn.

(https://docs.astral.sh/ruff/formatter/#quote-style)

#### Indentation Style

Indentation always uses **spaces, not tabs**.
This mirrors the formatting style adopted by Black and avoids ambiguity across editors and environments.

(https://docs.astral.sh/ruff/formatter/#indent-style)

#### Magic Trailing Commas

The formatter respects magic trailing commas, meaning:

- **Adding a trailing comma** in lists, dicts, tuples, or function calls will trigger multi-line formatting.
- **Removing a trailing comma** results in a more compact single-line layout where appropriate.

This produces stable diffs and predictable wrapping behavior.

(https://docs.astral.sh/ruff/formatter/#skip-magic-trailing-comma)
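
The two layouts can be sketched as follows. The list contents are arbitrary; only the presence or absence of the final comma differs.

```python
# No trailing comma: the formatter keeps the list on one line when it fits.
compact = ["alpha", "beta", "gamma"]

# Trailing comma after the last element: the formatter keeps one item per line.
expanded = [
    "alpha",
    "beta",
    "gamma",
]
```

Both forms produce the same list at runtime; the comma only signals the preferred layout to the formatter.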

#### Automatic Line Ending Detection

Ruff automatically detects and preserves the correct line-ending style (LF or CRLF) based on the existing file.
This prevents accidental line-ending changes when multiple developers work on different systems.

(https://docs.astral.sh/ruff/formatter/#line-ending)

#### Docstring Code Blocks

The formatter **does not reformat** code blocks inside docstrings.
This ensures that examples, snippets, API usage patterns, and documentation content remain exactly as written, preventing unintended modifications to teaching material or markdown-style fenced blocks.

(https://docs.astral.sh/ruff/formatter/#docstring-code-format)

### Applying Formatting

Unlike lint autofixes, **formatting changes are safe by design**.
The formatter never changes logical behavior, control flow, or semantics. It only standardizes layout.

You can run formatting locally using:

```bash
uv run ruff format
```

All formatting issues must be resolved before creating a pull request or merging into protected branches.



---

### Important Notes
101 changes: 74 additions & 27 deletions pyproject.toml
@@ -11,7 +11,7 @@ dependencies = [
"openai>=1.106.1",
"numpy>=2.3.2",
"pre-commit>=4.3.0",
-"pyright>=1.1.404",
+"pyright>=1.1.407",
"pytest>=8.4.1",
"pyyaml>=6.0.2",
"ruff>=0.12.12",
@@ -37,41 +37,88 @@ dependencies = [
"langfuse>=3.8.1",
]

[tool.ruff]
# Exclude a variety of commonly ignored directories.
exclude = [
".bzr",
".direnv",
".eggs",
".git",
".git-rewrite",
".hg",
".ipynb_checkpoints",
".mypy_cache",
".nox",
".pants.d",
".pyenv",
".pytest_cache",
".pytype",
".ruff_cache",
".svn",
".tox",
".venv",
".vscode",
"__pypackages__",
"_build",
"buck-out",
"build",
"dist",
"node_modules",
"site-packages",
"venv",
]

# Same as Black Formatter.
line-length = 88
indent-width = 4

# Set Python Version - 3.12
target-version = "py312"

fix = false


[tool.ruff.lint]


select = ["E4", "E7", "E9", "F", "B", "T20", "N", "ANN", "ERA", "PERF"]
ignore = []

# Allow fix for all enabled rules (when `--fix`) is provided.
fixable = ["ALL"]
unfixable = []


[tool.ruff.format]
# Like Black, use double quotes for strings.
quote-style = "double"

# Like Black, indent with spaces, rather than tabs.
indent-style = "space"

# Like Black, respect magic trailing commas.
skip-magic-trailing-comma = false

# Like Black, automatically detect the appropriate line ending.
line-ending = "auto"

docstring-code-format = false
docstring-code-line-length = "dynamic"



[tool.pyright]
# --- Environment & discovery ---
pythonVersion = "3.12.10" # Target Python semantics (pattern matching, typing features, stdlib types).
venvPath = "." # Where virtual envs live relative to repo root.
venv = ".venv" # The specific env name uv manages (uv sync creates .venv).

# --- What to analyze ---
-include = ["src", "tests"] # Top-level packages & tests to check.
+include = ["*"] # Top-level packages & tests to check.
exclude = [
"**/.venv", "**/__pycache__", "build", "dist", ".git",
-".ruff_cache", ".mypy_cache"
+".ruff_cache", ".mypy_cache", "tests/", "**/tests/"
]

# --- Global strictness ---
-typeCheckingMode = "strict" # Enforce full strict mode repo-wide (see notes below).
-useLibraryCodeForTypes = true # If a lib lacks stubs, inspect its code to infer types where possible.
-
-# Make the most common "loose" mistakes fail fast in strict mode.
-# You can tune these individually if you need a temporary carve-out.
-reportMissingTypeStubs = "error" # Untyped third-party libs must have type info (stubs or inline).
-reportUnknownVariableType = "error" # Vars with unknown/implicit Any are not allowed.
-reportUnknownMemberType = "error" # Members on unknowns are not allowed.
-reportUnknownArgumentType = "error" # Call arguments can't be unknown.
-reportUnknownLambdaType = "error" # Lambda params must be typed in strict contexts.
-reportImplicitOptional = "error" # T | None must be explicit; no silent Optional.
-reportMissingTypeArgument = "error" # Generic types must specify their parameters.
-reportIncompatibleVariableOverride = "error" # Subclass fields must type-refine correctly.
-reportInvalidTypeVarUse = "error" # Catch misuse of TypeVar/variance.
-reportUntypedFunctionDecorator = "error" # Decorators must be typed (prevents Any leakage).
-reportUnusedVariable = "error" # Ditto; promote to "error" if you want hard hygiene.
-reportUnusedImport = "warning" # Hygiene: warn, but don’t fail builds.
-
-
-# Tests often deserialize lots of data and patch frameworks; keep them strict,
-# but relax "missing stubs" so untyped test-only libs don’t block you.
-[[tool.pyright.overrides]]
-module = "tests/**"
-reportMissingTypeStubs = "warning"
+typeCheckingMode = "standard" # Standard typechecking mode
15 changes: 10 additions & 5 deletions src/llm_orchestration_service.py
@@ -23,6 +23,7 @@
from prompt_refine_manager.prompt_refiner import PromptRefinerAgent
from src.response_generator.response_generate import ResponseGeneratorAgent
from src.response_generator.response_generate import stream_response_native
+from src.vector_indexer.constants import ResponseGenerationConstants
from src.llm_orchestrator_config.llm_ochestrator_constants import (
OUT_OF_SCOPE_MESSAGE,
TECHNICAL_ISSUE_MESSAGE,
@@ -343,7 +344,7 @@ async def stream_orchestration_response(
].check_scope_quick(
question=refined_output.original_question,
chunks=relevant_chunks,
-max_blocks=10,
+max_blocks=ResponseGenerationConstants.DEFAULT_MAX_BLOCKS,
)
timing_dict["scope_check"] = time.time() - start_time

@@ -382,7 +383,7 @@ async def bot_response_generator() -> AsyncIterator[str]:
agent=components["response_generator"],
question=refined_output.original_question,
chunks=relevant_chunks,
-max_blocks=10,
+max_blocks=ResponseGenerationConstants.DEFAULT_MAX_BLOCKS,
):
yield token

@@ -1619,13 +1620,17 @@ def _format_chunks_for_test_response(
relevant_chunks: List of retrieved chunks with metadata

Returns:
-List of ChunkInfo objects with rank and content, or None if no chunks
+List of ChunkInfo objects with rank and content (limited to top 5), or None if no chunks
"""
if not relevant_chunks:
return None

+# Limit to top-k chunks that are actually used in response generation
+max_blocks = ResponseGenerationConstants.DEFAULT_MAX_BLOCKS
+limited_chunks = relevant_chunks[:max_blocks]
+
formatted_chunks = []
-for rank, chunk in enumerate(relevant_chunks, start=1):
+for rank, chunk in enumerate(limited_chunks, start=1):
# Extract text content - prefer "text" key, fallback to "content"
chunk_text = chunk.get("text", chunk.get("content", ""))
if isinstance(chunk_text, str) and chunk_text.strip():
@@ -1682,7 +1687,7 @@ def _generate_rag_response(
generator_result = response_generator.forward(
question=refined_output.original_question,
chunks=relevant_chunks or [],
-max_blocks=10,
+max_blocks=ResponseGenerationConstants.DEFAULT_MAX_BLOCKS,
)

answer = (generator_result.get("answer") or "").strip()