
Conversation

@yangcao77 yangcao77 commented Sep 19, 2025

Description

This PR implements topic summary feature parity with road-core: it summarizes the initial query of a conversation session.

The topic summary is also returned as part of the GET /v1/conversations endpoint.

The same summary is returned by the GET /v2/conversations endpoint.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

To test the changes:

  1. Send queries to either /query or /streaming-query without a conversation_id.
  2. Call GET /v1/conversations and verify the conversations list contains topic_summary, which summarizes the initial query.
  3. Send queries on a different topic to the conversations created above.
  4. Call GET /v1/conversations again and verify the topic summaries do not change with follow-up questions.
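The invariant these steps exercise (the summary is generated once from the first query and never overwritten by follow-ups) can be sketched with a toy in-memory store; every name here is illustrative, not the service's API:

```python
# Toy model of the expected behavior: a conversation's topic summary is
# derived from its first query only; follow-up queries leave it unchanged.
conversations: dict[str, dict] = {}

def summarize(query: str) -> str:
    # Stand-in for the real LLM-backed summarizer: keep the first five words.
    return " ".join(query.split()[:5])

def handle_query(conversation_id: str, query: str) -> None:
    if conversation_id not in conversations:
        # First turn: create the conversation and persist its summary.
        conversations[conversation_id] = {"topic_summary": summarize(query)}
    # Follow-up turns never touch topic_summary.

handle_query("c1", "How do I deploy microservices on OpenShift clusters")
first = conversations["c1"]["topic_summary"]
handle_query("c1", "Now tell me about PostgreSQL tuning instead")
assert conversations["c1"]["topic_summary"] == first
```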

Summary by CodeRabbit

  • New Features

    • Conversations now include automatic topic summaries and last-message timestamps; conversation lists return conversation objects (id, topic_summary, timestamp). Cache APIs support storing topic summaries.
  • Documentation

    • API schemas, examples, and docs updated across v1/v2 to show topic_summary and the new ConversationData shape.
  • Tests

    • Expanded unit tests for topic summary generation, endpoints (including streaming), cache backends, and utilities.
  • Configuration

    • New defaults for topic summarization and authentication behavior.

Signed-off-by: Stephanie <yangcao@redhat.com>
coderabbitai bot commented Sep 22, 2025

Walkthrough

Adds conversation topic summaries end-to-end: new ConversationData model and OpenAPI updates; cache schema/SQL and cache APIs to store/read topic_summary and last_message_timestamp; endpoints compute/persist topic_summary via temporary agents; DB model and tests updated accordingly.

Changes

Cohort / File(s) Change summary
OpenAPI schema
docs/openapi.json
Added ConversationData and topic_summary; updated ConversationDetails, ConversationsListResponse and ConversationsListResponseV2 examples/descriptions to include topic_summary and timestamps.
Response & cache data models
src/models/responses.py, src/models/cache_entry.py
Added optional topic_summary to ConversationDetails; new ConversationData model (conversation_id, topic_summary, last_message_timestamp); V2 responses now use list[ConversationData].
Endpoint handlers
src/app/endpoints/query.py, src/app/endpoints/streaming_query.py, src/app/endpoints/conversations.py, src/app/endpoints/conversations_v2.py
Thread topic_summary through flows: compute via temp agent when needed, pass topic_summary into cache.store/persist_user_conversation_details, add skip_userid_check handling in v2 handlers and streaming endpoints.
Utilities / temp agent & helpers
src/utils/endpoints.py
Added get_topic_summary_system_prompt, get_temp_agent; extended store_conversation_into_cache to accept _skip_userid_check and topic_summary; adjusted helper returns.
Cache interface & implementations
src/cache/cache.py, src/cache/in_memory_cache.py, src/cache/noop_cache.py, src/cache/postgres_cache.py, src/cache/sqlite_cache.py
list(user_id, skip_userid_check) now returns list[ConversationData]; added set_topic_summary(user_id, conversation_id, topic_summary, skip_userid_check); SQL/schema updated (conversations table, upserts, deletes, ordering by last_message_timestamp).
Database model
src/models/database/conversations.py
Added topic_summary mapped column to UserConversation (default "").
Constants / config
src/constants.py
Added DEFAULT_TOPIC_SUMMARY_SYSTEM_PROMPT plus several auth-related default constants (skip-user-id flags, auth module and JWT claim defaults).
Tests — endpoints, utils, cache
tests/unit/app/endpoints/*, tests/unit/utils/test_endpoints.py, tests/unit/cache/*
Tests added/updated for get_topic_summary, get_temp_agent, topic_summary persistence/retrieval, cache.set_topic_summary behavior, DB session mocks, and ConversationData usage and ordering.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant API as API Server
  participant Query as query_endpoint_handler
  participant Utils as utils.endpoints
  participant TempAgent as Temp AsyncAgent
  participant Cache as Cache/DB

  User->>API: POST /query
  API->>Query: forward request
  alt conversation missing
    Query->>Utils: get_topic_summary_system_prompt(config)
    Query->>Utils: get_temp_agent(client, model_id, system_prompt)
    Utils-->>Query: agent, session_id, conversation_id
    Query->>TempAgent: run agent on question
    TempAgent-->>Query: reply content
    Query->>Query: extract topic_summary
    Query->>Cache: set_topic_summary(user, conversation_id, topic_summary, skip_userid_check)
    Query->>Query: persist_user_conversation_details(..., topic_summary)
  else conversation exists
    Query->>Query: persist_user_conversation_details(..., topic_summary=None)
  end
  Query-->>API: QueryResponse
  API-->>User: 200 OK
sequenceDiagram
  autonumber
  actor User
  participant SSE as API (SSE)
  participant Stream as streaming_query_handler
  participant Utils as utils.endpoints
  participant TempAgent as Temp AsyncAgent
  participant Cache as Cache/DB

  User->>SSE: POST /streaming_query (SSE)
  SSE->>Stream: init stream
  alt conversation missing
    Stream->>Utils: get_topic_summary_system_prompt(config)
    Stream->>Utils: get_temp_agent(client, model_id, system_prompt)
    Utils-->>Stream: agent, session_id, conversation_id
    Stream->>TempAgent: chat(question)
    TempAgent-->>Stream: streamed tokens
    Stream->>Stream: derive topic_summary
    Stream->>Cache: set_topic_summary(user, conversation_id, topic_summary, skip_userid_check)
    Stream->>Stream: persist_user_conversation_details(..., topic_summary)
  else conversation exists
    Stream->>Stream: persist_user_conversation_details(..., topic_summary=None)
  end
  Stream-->>User: streamed SSE events

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • umago
  • tisnik

Poem

A rabbit hopped in with a curious hop,
I summoned a temp agent to fetch a short crop.
I tucked that small tale by each chat's side,
Now every conversation has a tiny guide. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Docstring Coverage: ✅ Passed. No functions found in the changes; docstring coverage check skipped.
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title "[RHDHPAI-978] Topic summary of initial query" concisely and accurately reflects the primary change: adding a topic summary for initial queries.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/models/database/conversations.py (1)

38-39: Make topic_summary nullable to match API Optional[str] and avoid NOT NULL violations.

DB column is non‑nullable with default "", but API models expose Optional[str]. Passing None (or future migrations) will break at commit time.

Apply this diff:

+from typing import Optional
@@
-    topic_summary: Mapped[str] = mapped_column(default="")
+    topic_summary: Mapped[Optional[str]] = mapped_column(nullable=True, default=None)

If you intentionally want empty string semantics, normalize at the API boundary and keep the DB nullable to be safe.

src/constants.py (1)

92-101: Invalid DEFAULT_USER_UID (not a valid UUID format)

Value "00000000-0000-0000-0000-000" is too short and will fail UUID parsers/validators.

-DEFAULT_USER_UID = "00000000-0000-0000-0000-000"
+DEFAULT_USER_UID = "00000000-0000-0000-0000-000000000000"
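The corrected value can be verified with the stdlib uuid parser, which rejects the truncated form:

```python
import uuid

# The truncated default fails to parse; the full 32-hex-digit form succeeds.
try:
    uuid.UUID("00000000-0000-0000-0000-000")
    parsed_short = True
except ValueError:
    parsed_short = False

parsed_full = uuid.UUID("00000000-0000-0000-0000-000000000000")
print(parsed_short, str(parsed_full))
# → False 00000000-0000-0000-0000-000000000000
```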
🧹 Nitpick comments (16)
src/utils/endpoints.py (1)

209-239: Fix docstring to reflect actual return values and order.

Function returns (agent, session_id, conversation_id) but the docstring claims a 2‑tuple and omits ordering. Update the docstring now to avoid misuse.

Apply this diff:

@@
-async def get_temp_agent(
+async def get_temp_agent(
@@
-) -> tuple[AsyncAgent, str, str]:
+) -> tuple[AsyncAgent, str, str]:
@@
-    Returns:
-        tuple[AsyncAgent, str]: A tuple containing the agent and session_id.
+    Returns:
+        tuple[AsyncAgent, str, str]: A tuple containing:
+            - agent (AsyncAgent)
+            - session_id (str)
+            - conversation_id (str)
tests/unit/app/endpoints/test_streaming_query.py (1)

293-298: Deduplicate get_topic_summary patching via an autouse fixture.

Same patch appears in multiple tests; consolidate to reduce noise.

Add once in this module:

import pytest

@pytest.fixture(autouse=True)
def _patch_get_topic_summary(mocker):
    patch = mocker.patch(
        "app.endpoints.streaming_query.get_topic_summary",
        return_value="Test topic summary",
    )
    yield patch

Then remove the repeated per-test patches in the highlighted ranges.

Also applies to: 1324-1329, 1375-1379, 1427-1431

tests/unit/app/endpoints/test_query.py (1)

193-197: DRY: consolidate repeated get_topic_summary patching into a fixture

You patch app.endpoints.query.get_topic_summary in several places. Consider a module-scoped pytest fixture to reduce duplication and ensure consistent teardown.

+@pytest.fixture(autouse=True)
+def _patch_topic_summary(mocker):
+    yield mocker.patch("app.endpoints.query.get_topic_summary", return_value="Test topic summary")

Also applies to: 1224-1227, 1279-1281, 1334-1336

src/constants.py (1)

31-90: Default topic summary prompt added — LGTM

Clear constraints and examples. Minor copy nits optional: “interpunction marks” → “punctuation marks”; “ExampleInput” → “Example Input”.

- - Exclude all punctuation and interpunction marks (e.g., . , : ; ! ? "")
+ - Exclude punctuation marks (e.g., . , : ; ! ? "")
- ExampleInput:
+ Example Input:
src/app/endpoints/query.py (4)

6-6: Prefer modern union syntax consistently

Use str | None over Optional[str] to match the rest of the module’s typing style.

-from typing import Annotated, Any, cast, Optional
+from typing import Annotated, Any, cast

…and update annotations:

-    topic_summary: Optional[str],
+    topic_summary: str | None,

47-47: Logger name should use module-qualified logger

Follow guideline: use logging.getLogger(__name__) for module logging.

-logger = logging.getLogger("app.endpoints.handlers")
+logger = logging.getLogger(__name__)

82-88: persist_user_conversation_details: docstring/typing polish

  • Add Args/Returns to docstring (Google style).
  • Rename model to model_id for clarity.
  • Keep topic_summary immutable on updates — good.
-def persist_user_conversation_details(
-    user_id: str,
-    conversation_id: str,
-    model: str,
-    provider_id: str,
-    topic_summary: str | None,
-) -> None:
+def persist_user_conversation_details(
+    user_id: str,
+    conversation_id: str,
+    model_id: str,
+    provider_id: str,
+    topic_summary: str | None,
+) -> None:
@@
-                last_used_model=model,
+                last_used_model=model_id,
@@
-            existing_conversation.last_used_model = model
+            existing_conversation.last_used_model = model_id

Update callers accordingly.

Also applies to: 95-114


159-189: Optional: enforce output constraints server-side

To guarantee “<= 5 words” even if the model drifts, post-process the result (strip punctuation, cap to 5 words).

-    return (
-        interleaved_content_as_str(response.output_message.content)
+    raw = (
+        interleaved_content_as_str(response.output_message.content)
         if (
             getattr(response, "output_message", None) is not None
             and getattr(response.output_message, "content", None) is not None
         )
         else ""
     )
+    words = [w for w in raw.replace("-", " ").split() if w]
+    return " ".join(words[:5])
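Pulled out of the endpoint, the suggested post-processing is a small pure function (a sketch; in the real code the raw string comes from interleaved_content_as_str first):

```python
def cap_topic_summary(raw: str, max_words: int = 5) -> str:
    """Normalize a model-produced summary: split hyphens, drop empties, cap length."""
    words = [w for w in raw.replace("-", " ").split() if w]
    return " ".join(words[:max_words])

print(cap_topic_summary("OpenShift micro-services deployment strategies explained in depth"))
# → OpenShift micro services deployment strategies
```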
docs/openapi.json (2)

1828-1843: exclusiveMinimum type nit

exclusiveMinimum is specified as 0.0 for an integer property. Prefer 0 (integer) for cleaner JSON Schema.

-                        "exclusiveMinimum": 0.0,
+                        "exclusiveMinimum": 0,

1346-1368: topic_summary field addition is accurate

Schema and examples reflect the new topic_summary surface. Minor copy nit: “Openshift” → “OpenShift”.

-                            "Openshift Microservices Deployment Strategies"
+                            "OpenShift Microservices Deployment Strategies"

Also applies to: 1428-1448

src/models/responses.py (3)

489-490: Clarify “topic_summary” semantics and fix example commas.

  • Doc: make it explicit this summarizes the initial user query and is stable across follow‑ups (per PR objective).
  • Example: missing commas after conversation_id lines make the snippet invalid Python.

Apply:

@@
-        topic_summary: The topic summary for the conversation.
+        topic_summary: Short summary of the initial user query for this conversation (does not change with follow‑up turns).
@@
-        conversation = ConversationDetails(
-            conversation_id="123e4567-e89b-12d3-a456-426614174000"
+        conversation = ConversationDetails(
+            conversation_id="123e4567-e89b-12d3-a456-426614174000",
             created_at="2024-01-01T00:00:00Z",
             last_message_at="2024-01-01T00:05:00Z",
             message_count=5,
             last_used_model="gemini/gemini-2.0-flash",
             last_used_provider="gemini",
             topic_summary="Openshift Microservices Deployment Strategies",
         )

Also applies to: 493-502


541-546: LGTM. Consider length/whitespace constraints (optional).

Field looks correct. If you want to keep list views tidy and avoid accidental whitespace-only values, consider:

-    topic_summary: Optional[str] = Field(
-        None,
-        description="Topic summary for the conversation",
-        examples=["Openshift Microservices Deployment Strategies"],
-    )
+    topic_summary: Optional[str] = Field(
+        None,
+        description="Topic summary for the conversation",
+        examples=["Openshift Microservices Deployment Strategies"],
+        max_length=200,
+    )

(If you add trimming/normalization, keep empty-string behavior as-is to satisfy tests.)


565-575: Examples updated correctly; fix a small typo (missing comma).

The additions are consistent, but there’s a missing comma after conversation_id in the second ConversationDetails example.

-                ConversationDetails(
-                    conversation_id="456e7890-e12b-34d5-a678-901234567890"
+                ConversationDetails(
+                    conversation_id="456e7890-e12b-34d5-a678-901234567890",
                     created_at="2024-01-01T01:00:00Z",

Also applies to: 595-604, 568-569

tests/unit/app/endpoints/test_conversations.py (3)

51-52: Type‑hint the helper for readability and guideline consistency.

Add parameter and return type hints to the test helper.

-def create_mock_conversation(
-    mocker,
-    conversation_id,
-    created_at,
-    last_message_at,
-    message_count,
-    last_used_model,
-    last_used_provider,
-    topic_summary=None,
-):  # pylint: disable=too-many-arguments,too-many-positional-arguments
+def create_mock_conversation(
+    mocker: Any,
+    conversation_id: str,
+    created_at: str,
+    last_message_at: str,
+    message_count: int,
+    last_used_model: str,
+    last_used_provider: str,
+    topic_summary: str | None = None,
+) -> Any:  # pylint: disable=too-many-arguments,too-many-positional-arguments
@@
-    mock_conversation.topic_summary = topic_summary
+    mock_conversation.topic_summary = topic_summary

Also applies to: 63-64


746-809: Consider parametrizing to reduce duplication.

The “mixed”, “empty string”, and “None” cases share setup/verification patterns. You could parametrize inputs/expectations to cut repetition and speed up maintenance. Optional:

+@pytest.mark.asyncio
+@pytest.mark.parametrize(
+    "topic_summary, assert_fn",
+    [
+        ("OpenStack deployment strategies", lambda c: c.topic_summary == "OpenStack deployment strategies"),
+        (None, lambda c: c.topic_summary is None),
+        ("", lambda c: c.topic_summary == ""),
+    ],
+)
+async def test_conversations_list_topic_summary_variants(mocker, setup_configuration, dummy_request, topic_summary, assert_fn):
+    mock_authorization_resolvers(mocker)
+    mocker.patch("app.endpoints.conversations.configuration", setup_configuration)
+    mock_conversations = [
+        create_mock_conversation(
+            mocker,
+            "123e4567-e89b-12d3-a456-426614174000",
+            "2024-01-01T00:00:00Z",
+            "2024-01-01T00:05:00Z",
+            5,
+            "gemini/gemini-2.0-flash",
+            "gemini",
+            topic_summary,
+        ),
+    ]
+    mock_database_session(mocker, mock_conversations)
+    resp = await get_conversations_list_endpoint_handler(auth=MOCK_AUTH, request=dummy_request)
+    conv = resp.conversations[0]
+    assert assert_fn(conv)

Also applies to: 811-843


845-884: Add one more serialization assertion at top level (optional).

You already assert conv.model_dump(). To catch regressions in the envelope response, also assert on ConversationsListResponse.model_dump():

 response = await get_conversations_list_endpoint_handler(
     auth=MOCK_AUTH, request=dummy_request
 )
@@
 conv_dict = conv.model_dump()
 assert "topic_summary" in conv_dict
 assert conv_dict["topic_summary"] == "Test topic summary"
+
+# Also validate top-level serialization
+resp_dump = response.model_dump()
+assert "conversations" in resp_dump
+assert "topic_summary" in resp_dump["conversations"][0]
+assert resp_dump["conversations"][0]["topic_summary"] == "Test topic summary"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ca1832 and fdf269d.

📒 Files selected for processing (12)
  • docs/openapi.json (6 hunks)
  • src/app/endpoints/conversations.py (1 hunks)
  • src/app/endpoints/query.py (7 hunks)
  • src/app/endpoints/streaming_query.py (2 hunks)
  • src/constants.py (1 hunks)
  • src/models/database/conversations.py (1 hunks)
  • src/models/responses.py (5 hunks)
  • src/utils/endpoints.py (2 hunks)
  • tests/unit/app/endpoints/test_conversations.py (6 hunks)
  • tests/unit/app/endpoints/test_query.py (7 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
  • tests/unit/utils/test_endpoints.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/models/database/conversations.py
  • src/app/endpoints/conversations.py
  • src/models/responses.py
  • src/utils/endpoints.py
  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
  • src/constants.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/models/database/conversations.py
  • src/app/endpoints/conversations.py
  • tests/unit/app/endpoints/test_streaming_query.py
  • src/models/responses.py
  • src/utils/endpoints.py
  • tests/unit/utils/test_endpoints.py
  • tests/unit/app/endpoints/test_conversations.py
  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
  • tests/unit/app/endpoints/test_query.py
  • src/constants.py
src/{models/**/*.py,configuration.py}

📄 CodeRabbit inference engine (CLAUDE.md)

src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)

Files:

  • src/models/database/conversations.py
  • src/models/responses.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/database/conversations.py
  • src/models/responses.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/conversations.py
  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/conversations.py
  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/conversations.py
  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
tests/{unit,integration}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard

Files:

  • tests/unit/app/endpoints/test_streaming_query.py
  • tests/unit/utils/test_endpoints.py
  • tests/unit/app/endpoints/test_conversations.py
  • tests/unit/app/endpoints/test_query.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests

Files:

  • tests/unit/app/endpoints/test_streaming_query.py
  • tests/unit/utils/test_endpoints.py
  • tests/unit/app/endpoints/test_conversations.py
  • tests/unit/app/endpoints/test_query.py
src/constants.py

📄 CodeRabbit inference engine (CLAUDE.md)

Keep shared constants in a central src/constants.py with descriptive comments

Files:

  • src/constants.py
🧠 Learnings (3)
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
PR: lightspeed-core/lightspeed-stack#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/client.py : Use Llama Stack client import: from llama_stack_client import AsyncLlamaStackClient

Applied to files:

  • src/app/endpoints/query.py
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
PR: lightspeed-core/lightspeed-stack#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/client.py : Handle Llama Stack APIConnectionError when interacting with the Llama client

Applied to files:

  • src/app/endpoints/query.py
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
PR: lightspeed-core/lightspeed-stack#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/app/**/*.py : Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Applied to files:

  • src/app/endpoints/streaming_query.py
🧬 Code graph analysis (6)
src/utils/endpoints.py (3)
src/models/config.py (2)
  • config (138-144)
  • get_prompts (442-444)
src/configuration.py (1)
  • customization (119-123)
src/utils/suid.py (1)
  • get_suid (6-12)
tests/unit/utils/test_endpoints.py (3)
src/utils/endpoints.py (3)
  • get_agent (153-206)
  • get_temp_agent (209-239)
  • get_topic_summary_system_prompt (115-126)
src/configuration.py (3)
  • AppConfig (36-153)
  • init_from_dict (60-62)
  • customization (119-123)
src/models/config.py (2)
  • CustomProfile (425-444)
  • get_prompts (442-444)
tests/unit/app/endpoints/test_conversations.py (4)
tests/unit/app/endpoints/test_query.py (1)
  • dummy_request (39-48)
tests/unit/utils/auth_helpers.py (1)
  • mock_authorization_resolvers (9-27)
src/app/endpoints/conversations.py (1)
  • get_conversations_list_endpoint_handler (181-238)
src/models/responses.py (1)
  • ConversationsListResponse (548-609)
src/app/endpoints/query.py (3)
src/utils/endpoints.py (2)
  • get_topic_summary_system_prompt (115-126)
  • get_temp_agent (209-239)
src/app/database.py (1)
  • get_session (34-40)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
src/app/endpoints/streaming_query.py (3)
src/app/database.py (1)
  • get_session (34-40)
src/app/endpoints/query.py (2)
  • persist_user_conversation_details (82-114)
  • get_topic_summary (159-189)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
tests/unit/app/endpoints/test_query.py (1)
src/app/endpoints/query.py (1)
  • get_topic_summary (159-189)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (20)
src/utils/endpoints.py (1)

115-127: Topic summary prompt precedence looks good.

Profile-first fallback to default is clear and matches tests. LGTM.

src/app/endpoints/conversations.py (1)

217-217: Confirm desired semantics for empty topic summaries.

You pass through conv.topic_summary verbatim. Do we want "" or null in the API when there’s no summary? Current DB default is "", while the schema allows Optional[str].

If you prefer nulls, normalize here:

-                    topic_summary=conv.topic_summary,
+                    topic_summary=(conv.topic_summary or None),

I can update tests accordingly if you choose this behavior.

tests/unit/app/endpoints/test_streaming_query.py (1)

61-67: Context-managed DB session mock is correct.

The enter/exit setup aligns with with get_session() usage. LGTM.

src/app/endpoints/streaming_query.py (1)

23-35: Imports and wiring look good.

Using standard FastAPI imports and explicit utilities. LGTM.

tests/unit/utils/test_endpoints.py (2)

660-760: Solid coverage for get_temp_agent behavior.

Asserts on persistence flag, init and session creation are on point. LGTM.


784-844: Topic-summary prompt selection tests are comprehensive.

Default, custom profile present/absent paths are exercised. LGTM.

tests/unit/app/endpoints/test_query.py (7)

26-27: Import exposure for get_topic_summary looks good

The tests correctly import the newly exposed get_topic_summary from app.endpoints.query.


66-72: DB session mocking pattern is solid

Context-managing get_session with enter/exit mocks prevents accidental real DB usage during tests.


1616-1670: Coverage for happy path is comprehensive

The successful-response test validates prompt retrieval, temp agent creation, create_turn args, and content extraction. Nice.


1672-1744: Edge-cases covered (empty/None content)

Good checks that empty output_message or None content yield "".


1792-1835: Assert system prompt selection uses configuration

Verifying get_topic_summary_system_prompt is called with the configuration is helpful to guard regressions.


1884-1934: Nice verification of create_turn parameters

Asserting message payload and session usage builds confidence in the integration.


1837-1882: Ignore model-format change for get_topic_summary
get_topic_summary is defined to accept and forward the raw model_id and isn’t invoked by query_endpoint_handler, so the existing tests asserting calls with "custom_model" are correct.

Likely an incorrect or invalid review comment.

src/constants.py (1)

92-115: Auth constants: sensible defaults

NO_USER_TOKEN, module IDs, and claim defaults are fine and centralized.

src/app/endpoints/query.py (3)

274-284: Topic summary generation gated to first turn — good

Only generating when the conversation is new matches the requirement that topic_summary remains stable across follow-ups.


302-308: Propagating topic_summary to persistence — good

The flow passes the computed summary to persistence on creation.


159-189: Use combined llama_stack_model_id when creating the temp agent for topic summary

get_temp_agent is called with model_id in src/app/endpoints/query.py (around lines 170–174); pass the combined llama_stack_model_id (as retrieve_response does) to avoid provider-scoping resolution failures. Update callers that call get_topic_summary to pass llama_stack_model_id instead of model_id, and change the get_topic_summary docstring param text from “validated” to “summarized”.

docs/openapi.json (1)

1158-1161: Conversation cache schemas look consistent

New ConversationCache and InMemoryCacheConfig schemas are well-structured with additionalProperties: false.

Also applies to: 1174-1227

tests/unit/app/endpoints/test_conversations.py (2)

632-643: Coverage for topic_summary in happy path looks good.

Asserts verify propagation and serialization shape via pydantic models; no issues spotted.

Also applies to: 653-673


712-744: None topic_summary case covered well.

Good check for None passthrough; aligns with Optional[str] contract.

Comment on lines 704 to 715
# Get the initial topic summary for the conversation
topic_summary = None
with get_session() as session:
    existing_conversation = (
        session.query(UserConversation).filter_by(id=conversation_id).first()
    )
    if not existing_conversation:
        topic_summary = await get_topic_summary(
            query_request.query, client, model_id
        )


⚠️ Potential issue

🧩 Analysis chain

Harden topic summary computation and ensure correct model id is used.

  • A failure in get_topic_summary will currently fail the request; handle errors and proceed.
  • Verify we should pass llama_stack_model_id (agent model identifier) instead of model_id here.

Apply this diff:

-        topic_summary = None
+        topic_summary = None
         with get_session() as session:
             existing_conversation = (
                 session.query(UserConversation).filter_by(id=conversation_id).first()
             )
             if not existing_conversation:
-                topic_summary = await get_topic_summary(
-                    query_request.query, client, model_id
-                )
+                try:
+                    # Use the actual agent model identifier for temp agent creation
+                    topic_summary = await get_topic_summary(
+                        query_request.query, client, llama_stack_model_id
+                    )
+                except Exception:  # be resilient; summary is best-effort
+                    logger.exception("Failed to compute topic summary for %s", conversation_id)
+                    topic_summary = None

Also applies to: 721-721


🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "PWD: $(pwd)"

# Try to locate streaming_query.py
FILE=$(rg --hidden --glob '!.git' --glob '!**/node_modules/**' --files 2>/dev/null | rg 'streaming_query.py' || true)
if [ -z "$FILE" ]; then
  FILE=$(find . -type f -name 'streaming_query.py' 2>/dev/null | head -n1 || true)
fi

if [ -z "$FILE" ]; then
  echo "streaming_query.py not found in repo root."
  echo "--- Searching repo for get_topic_summary and llama_stack_model_id ---"
  rg --hidden --glob '!.git' --glob '!**/node_modules/**' -n "get_topic_summary" || true
  rg --hidden --glob '!.git' --glob '!**/node_modules/**' -n "llama_stack_model_id" || true
  exit 0
fi

echo "Found: $FILE"
echo "---- printing lines 680-740 (if present) ----"
sed -n '680,740p' "$FILE" || true

echo "---- occurrences of get_topic_summary ----"
rg -n "get_topic_summary" || true

echo "---- show definition of get_topic_summary (if present) ----"
rg -n "def get_topic_summary" || rg -n "async def get_topic_summary" || true

echo "---- occurrences of llama_stack_model_id ----"
rg -n "llama_stack_model_id" || true

echo "---- occurrences of model_id in the file ----"
rg -n "model_id" "$FILE" || true

Length of output: 11848


Harden topic summary computation and use llama_stack_model_id

  • Make topic summary best-effort (don’t fail the request if get_topic_summary errors).
  • Pass llama_stack_model_id (provider/model) to get_topic_summary instead of model_id.
-        topic_summary = None
+        topic_summary = None
         with get_session() as session:
             existing_conversation = (
                 session.query(UserConversation).filter_by(id=conversation_id).first()
             )
             if not existing_conversation:
-                topic_summary = await get_topic_summary(
-                    query_request.query, client, model_id
-                )
+                try:
+                    # Use the actual agent model identifier for temp agent creation
+                    topic_summary = await get_topic_summary(
+                        query_request.query, client, llama_stack_model_id
+                    )
+                except Exception:  # be resilient; summary is best-effort
+                    logger.exception("Failed to compute topic summary for %s", conversation_id)
+                    topic_summary = None
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Get the initial topic summary for the conversation
topic_summary = None
with get_session() as session:
existing_conversation = (
session.query(UserConversation).filter_by(id=conversation_id).first()
)
if not existing_conversation:
topic_summary = await get_topic_summary(
query_request.query, client, model_id
)
# Get the initial topic summary for the conversation
topic_summary = None
with get_session() as session:
existing_conversation = (
session.query(UserConversation).filter_by(id=conversation_id).first()
)
if not existing_conversation:
try:
# Use the actual agent model identifier for temp agent creation
topic_summary = await get_topic_summary(
query_request.query, client, llama_stack_model_id
)
except Exception: # be resilient; summary is best-effort
logger.exception("Failed to compute topic summary for %s", conversation_id)
topic_summary = None
🤖 Prompt for AI Agents
In src/app/endpoints/streaming_query.py around lines 704 to 715, the topic
summary call should be made best-effort and pass the provider/model identifier
named llama_stack_model_id instead of model_id; update the code so that if no
existing_conversation you attempt to set topic_summary by calling
get_topic_summary(query_request.query, client, llama_stack_model_id) inside a
try/except that catches exceptions, logs the error, and leaves topic_summary as
None on failure (do not raise or abort the request), preserving existing flow.

Comment on lines +209 to +239
async def get_temp_agent(
    client: AsyncLlamaStackClient,
    model_id: str,
    system_prompt: str,
) -> tuple[AsyncAgent, str, str]:
    """Create a temporary agent with new agent_id and session_id.

    This function creates a new agent without persistence, shields, or tools.
    Useful for temporary operations or one-off queries, such as validating a
    question or generating a summary.

    Args:
        client: The AsyncLlamaStackClient to use for the request.
        model_id: The ID of the model to use.
        system_prompt: The system prompt/instructions for the agent.

    Returns:
        tuple[AsyncAgent, str]: A tuple containing the agent and session_id.
    """
    logger.debug("Creating temporary agent")
    agent = AsyncAgent(
        client,  # type: ignore[arg-type]
        model=model_id,
        instructions=system_prompt,
        enable_session_persistence=False,  # Temporary agent doesn't need persistence
    )
    await agent.initialize()

    # Generate new IDs for the temporary agent
    conversation_id = agent.agent_id
    session_id = await agent.create_session(get_suid())

    return agent, session_id, conversation_id

🛠️ Refactor suggestion

Unify tuple semantics with get_agent or return a named result to prevent call‑site mistakes.

get_agent returns (agent, conversation_id, session_id) while get_temp_agent returns (agent, session_id, conversation_id). This inversion is easy to misuse.

Consider returning a named tuple to make ordering explicit:

+from typing import NamedTuple
+
+class TempAgentResult(NamedTuple):
+    agent: AsyncAgent
+    session_id: str
+    conversation_id: str
@@
-async def get_temp_agent(...) -> tuple[AsyncAgent, str, str]:
+async def get_temp_agent(...) -> TempAgentResult:
@@
-    return agent, session_id, conversation_id
+    return TempAgentResult(agent, session_id, conversation_id)

Alternatively, swap to (agent, conversation_id, session_id) to match get_agent and adjust call sites/tests accordingly.

📝 Committable suggestion


Suggested change
async def get_temp_agent(
    client: AsyncLlamaStackClient,
    model_id: str,
    system_prompt: str,
) -> tuple[AsyncAgent, str, str]:
    """Create a temporary agent with new agent_id and session_id.

    This function creates a new agent without persistence, shields, or tools.
    Useful for temporary operations or one-off queries, such as validating a
    question or generating a summary.

    Args:
        client: The AsyncLlamaStackClient to use for the request.
        model_id: The ID of the model to use.
        system_prompt: The system prompt/instructions for the agent.

    Returns:
        tuple[AsyncAgent, str]: A tuple containing the agent and session_id.
    """
    logger.debug("Creating temporary agent")
    agent = AsyncAgent(
        client,  # type: ignore[arg-type]
        model=model_id,
        instructions=system_prompt,
        enable_session_persistence=False,  # Temporary agent doesn't need persistence
    )
    await agent.initialize()
    # Generate new IDs for the temporary agent
    conversation_id = agent.agent_id
    session_id = await agent.create_session(get_suid())
    return agent, session_id, conversation_id

from typing import NamedTuple


class TempAgentResult(NamedTuple):
    agent: AsyncAgent
    session_id: str
    conversation_id: str


async def get_temp_agent(
    client: AsyncLlamaStackClient,
    model_id: str,
    system_prompt: str,
) -> TempAgentResult:
    """Create a temporary agent with new agent_id and session_id.

    This function creates a new agent without persistence, shields, or tools.
    Useful for temporary operations or one-off queries, such as validating a
    question or generating a summary.

    Args:
        client: The AsyncLlamaStackClient to use for the request.
        model_id: The ID of the model to use.
        system_prompt: The system prompt/instructions for the agent.

    Returns:
        TempAgentResult: A named tuple of (agent, session_id, conversation_id).
    """
    logger.debug("Creating temporary agent")
    agent = AsyncAgent(
        client,  # type: ignore[arg-type]
        model=model_id,
        instructions=system_prompt,
        enable_session_persistence=False,  # Temporary agent doesn't need persistence
    )
    await agent.initialize()
    # Generate new IDs for the temporary agent
    conversation_id = agent.agent_id
    session_id = await agent.create_session(get_suid())
    return TempAgentResult(agent, session_id, conversation_id)
🤖 Prompt for AI Agents
In src/utils/endpoints.py around lines 209 to 239 the function get_temp_agent
returns a tuple in the order (agent, session_id, conversation_id) which is
inverted relative to get_agent's (agent, conversation_id, session_id) and risks
call-site mistakes; change get_temp_agent to return (agent, conversation_id,
session_id) to match get_agent, update the function signature/docstring to
reflect that order, and adjust any call sites and tests accordingly (or
alternatively define and return a named tuple/dataclass with fields agent,
conversation_id, session_id and update docstrings and call sites to use those
names).
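The safety the named tuple buys can be shown in isolation; the field names follow the review suggestion, while the agent value is a plain stand-in:

```python
from typing import NamedTuple


class TempAgentResult(NamedTuple):
    """Result of creating a temporary agent, with explicitly named fields."""

    agent: object
    session_id: str
    conversation_id: str


result = TempAgentResult(agent="agent-stub", session_id="sess-1", conversation_id="conv-1")

# Named access is order-independent, so a call site cannot silently swap the ids:
print(result.session_id)       # sess-1
print(result.conversation_id)  # conv-1

# Existing call sites that unpack positionally keep working unchanged:
_, session_id, conversation_id = result
```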

Signed-off-by: Stephanie <yangcao@redhat.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/app/endpoints/query.py (1)

292-305: Prevent topic summary generation from breaking initial queries

Two problems here combine into a hard failure path:

  1. select_model_and_provider_id() returns both the provider-scoped llama_stack_model_id and the plain model_id. get_temp_agent() expects the provider-prefixed value (it’s the same one we pass to get_agent), so sending only the bare model label (model_id) causes agent initialization to fail for every model whose identifier includes a provider.
  2. Any exception thrown while we build the topic summary currently bubbles up and aborts the whole /query call, even though we already have the main answer. Topic summarization needs to be best-effort.

Together this means that the very first query on a fresh conversation will 500 as soon as the summary call runs.

Please switch to the provider-scoped identifier and make the summary generation resilient so the core response still goes out even if summarization cannot be produced.

-        # Get the initial topic summary for the conversation
-        topic_summary = None
-        with get_session() as session:
-            existing_conversation = (
-                session.query(UserConversation).filter_by(id=conversation_id).first()
-            )
-            if not existing_conversation:
-                topic_summary = await get_topic_summary(
-                    query_request.query, client, model_id
-                )
+        # Get the initial topic summary for brand-new conversations
+        with get_session() as session:
+            existing_conversation = (
+                session.query(UserConversation).filter_by(id=conversation_id).first()
+            )
+
+        topic_summary = ""
+        if existing_conversation is None:
+            try:
+                topic_summary = await get_topic_summary(
+                    query_request.query,
+                    client,
+                    llama_stack_model_id,
+                )
+            except Exception as exc:  # best-effort: keep the main response flowing
+                logger.warning(
+                    "Topic summary generation failed for conversation %s: %s",
+                    conversation_id,
+                    exc,
+                    exc_info=True,
+                )
tests/unit/app/endpoints/test_query.py (1)

66-79: Fix mocked session to support async with usage.

query_endpoint_handler now enters the session via async with get_session() as session. The mocked session currently only stubs __enter__/__exit__, so the tests will explode with AttributeError: __aenter__ once they hit that branch. Please extend the mock to implement __aenter__/__aexit__ (returning the session mock) so it behaves like an async context manager.

-    mock_session.__enter__ = mocker.Mock(return_value=mock_session)
-    mock_session.__exit__ = mocker.Mock(return_value=None)
+    mock_session.__aenter__ = mocker.AsyncMock(return_value=mock_session)
+    mock_session.__aexit__ = mocker.AsyncMock(return_value=None)
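The fix can be verified outside the suite with plain unittest.mock (pytest-mock's mocker.AsyncMock is the same class); `mock_session` and `handler` here are illustrative:

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

# Wiring __aenter__/__aexit__ as AsyncMocks makes `async with` yield the
# session mock itself, mirroring the suggested test fix.
mock_session = MagicMock()
mock_session.__aenter__ = AsyncMock(return_value=mock_session)
mock_session.__aexit__ = AsyncMock(return_value=None)


async def handler() -> bool:
    async with mock_session as session:
        return session is mock_session


print(asyncio.run(handler()))  # True
```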
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fdf269d and 2d22af0.

📒 Files selected for processing (3)
  • src/app/endpoints/query.py (6 hunks)
  • src/models/responses.py (5 hunks)
  • tests/unit/app/endpoints/test_query.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/models/responses.py
🧰 Additional context used
📓 Path-based instructions (7)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/app/endpoints/query.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/query.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/app/endpoints/query.py
  • tests/unit/app/endpoints/test_query.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/query.py
tests/{unit,integration}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard

Files:

  • tests/unit/app/endpoints/test_query.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests

Files:

  • tests/unit/app/endpoints/test_query.py
🧬 Code graph analysis (2)
src/app/endpoints/query.py (4)
src/utils/endpoints.py (2)
  • get_topic_summary_system_prompt (115-126)
  • get_temp_agent (209-239)
src/configuration.py (1)
  • configuration (65-69)
src/app/database.py (1)
  • get_session (34-40)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
tests/unit/app/endpoints/test_query.py (1)
src/app/endpoints/query.py (1)
  • get_topic_summary (176-206)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-pr

Comment on lines +201 to +204
# Mock get_topic_summary function
mocker.patch(
    "app.endpoints.query.get_topic_summary", return_value="Test topic summary"
)

⚠️ Potential issue

Patch async dependencies with AsyncMock.

get_topic_summary is awaited inside query_endpoint_handler. Patching it with the default MagicMock makes the await blow up (TypeError: object MagicMock can't be used in 'await' expression). Please patch it as an AsyncMock (or let mocker.patch(..., new_callable=mocker.AsyncMock) handle it) and set return_value there.

-    mocker.patch(
-        "app.endpoints.query.get_topic_summary", return_value="Test topic summary"
-    )
+    mocker.patch(
+        "app.endpoints.query.get_topic_summary",
+        new_callable=mocker.AsyncMock,
+        return_value="Test topic summary",
+    )

Apply the same fix to the other occurrences in this module.

📝 Committable suggestion


Suggested change
# Mock get_topic_summary function
mocker.patch(
    "app.endpoints.query.get_topic_summary", return_value="Test topic summary"
)

# Mock get_topic_summary function
mocker.patch(
    "app.endpoints.query.get_topic_summary",
    new_callable=mocker.AsyncMock,
    return_value="Test topic summary",
)
🤖 Prompt for AI Agents
In tests/unit/app/endpoints/test_query.py around lines 201 to 204, the patch for
get_topic_summary uses a regular MagicMock which cannot be awaited; change the
patch to use an AsyncMock (e.g.
mocker.patch("app.endpoints.query.get_topic_summary",
new_callable=mocker.AsyncMock, return_value="Test topic summary") or
mocker.patch("app.endpoints.query.get_topic_summary",
new=mocker.AsyncMock(return_value="Test topic summary")) so the mocked function
is awaitable and returns the expected value; apply the same replacement (use
mocker.AsyncMock via new or new_callable and set return_value) to all other
patched async functions in this test module.
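The failure mode flagged here is easy to reproduce in isolation; `call` is a hypothetical handler that awaits its injected dependency, the way query_endpoint_handler awaits get_topic_summary:

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def call(fn):
    """Await the injected dependency, like the handler awaits get_topic_summary."""
    return await fn("initial question")


# A plain MagicMock call returns its configured value directly, and awaiting
# a str raises TypeError:
try:
    asyncio.run(call(MagicMock(return_value="Test topic summary")))
except TypeError as exc:
    print(f"MagicMock: {exc}")

# An AsyncMock call returns a coroutine that resolves to return_value:
summary = asyncio.run(call(AsyncMock(return_value="Test topic summary")))
print(summary)  # Test topic summary
```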

Comment on lines +1349 to +1352
# Mock get_topic_summary function
mocker.patch(
    "app.endpoints.query.get_topic_summary", return_value="Test topic summary"
)

⚠️ Potential issue

Make all get_topic_summary patches awaitable.

Same problem here—query_endpoint_handler awaits get_topic_summary. Patching with a plain mock causes the await to fail. Wrap these patches with new_callable=mocker.AsyncMock and keep the return_value="Test topic summary" so the await resolves correctly.

🤖 Prompt for AI Agents
In tests/unit/app/endpoints/test_query.py around lines 1349 to 1352, the patch
for app.endpoints.query.get_topic_summary is a regular mock but the production
code awaits this function; change the patch to use new_callable=mocker.AsyncMock
while preserving return_value="Test topic summary" so the awaitable resolves
correctly (i.e., replace the plain patch with one that passes
new_callable=mocker.AsyncMock, return_value="Test topic summary").

Comment on lines +1405 to +1407
mocker.patch(
    "app.endpoints.query.get_topic_summary", return_value="Test topic summary"
)

⚠️ Potential issue

Ensure mocked get_topic_summary is async.

As above, this patch must yield an AsyncMock to match the awaited coroutine usage.

🤖 Prompt for AI Agents
In tests/unit/app/endpoints/test_query.py around lines 1405-1407, the mocked
get_topic_summary is currently a regular Mock returning a string but the
production code awaits it; replace the patch to provide an AsyncMock that
returns "Test topic summary" (e.g., use
mocker.patch("app.endpoints.query.get_topic_summary",
new=AsyncMock(return_value="Test topic summary")) or mocker.patch(...,
new_callable=AsyncMock, return_value="Test topic summary")) so the awaited
coroutine behaves correctly.

Comment on lines +1461 to +1463
mocker.patch(
    "app.endpoints.query.get_topic_summary", return_value="Test topic summary"
)

⚠️ Potential issue

Use AsyncMock when patching awaited call.

Same fix needed here: patch get_topic_summary with an AsyncMock so awaiting it in the handler succeeds.

🤖 Prompt for AI Agents
In tests/unit/app/endpoints/test_query.py around lines 1461 to 1463, the patched
get_topic_summary is a regular mock but the code under test awaits it; replace
the patch to use an AsyncMock (e.g., mocker.patch(..., new_callable=AsyncMock,
return_value="Test topic summary") or patch it with AsyncMock instance) so the
awaited call behaves correctly and returns the expected string.

Signed-off-by: Stephanie <yangcao@redhat.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/app/endpoints/query.py (1)

361-377: Critical: AnyUrl is not callable; this block is unused and can crash the request.

AnyUrl is a pydantic type, not a constructor; calling AnyUrl(...) will raise TypeError. Since referenced_documents already comes from retrieve_response, drop this block to prevent runtime errors and duplicate work.

-        logger.info("Extracting referenced documents...")
-        referenced_docs = []
-        doc_sources = set()
-        for chunk in summary.rag_chunks:
-            if chunk.source and chunk.source not in doc_sources:
-                doc_sources.add(chunk.source)
-                referenced_docs.append(
-                    ReferencedDocument(
-                        doc_url=(
-                            AnyUrl(chunk.source)
-                            if chunk.source.startswith("http")
-                            else None
-                        ),
-                        doc_title=chunk.source,
-                    )
-                )
+        # Referenced documents are parsed in retrieve_response(); reuse that list.

If you prefer to keep chunk-based extraction, pass the URL string and let Pydantic coerce it instead of calling AnyUrl.

🧹 Nitpick comments (5)
src/app/endpoints/streaming_query.py (1)

575-749: Optional: split endpoint handler to reduce complexity

Pylint R0915 (too many statements) is tripped. Consider extracting response_generator and topic-summary persistence into helpers to quiet the linter and improve readability.

src/app/endpoints/query.py (4)

179-210: Make topic summary generation failure-safe (catch and fallback).

Wrap create_turn in try/except to avoid failing the whole request if summarization hiccups; return "" on error and log at warning. Optionally, explicitly close/cleanup the temporary agent if the client exposes it.

-    response = await agent.create_turn(
-        messages=[UserMessage(role="user", content=question)],
-        session_id=session_id,
-        stream=False,
-        toolgroups=None,
-    )
-    response = cast(Turn, response)
+    try:
+        response = await agent.create_turn(
+            messages=[UserMessage(role="user", content=question)],
+            session_id=session_id,
+            stream=False,
+            toolgroups=None,
+        )
+    except Exception as e:  # noqa: BLE001
+        logger.warning("Topic summary generation failed: %s", e)
+        return ""
+    response = cast(Turn, response)

Would you like me to check whether AsyncAgent exposes a close/finalize method and add it here?


294-304: Topic summary computed only for new conversations — good; consider avoiding extra DB hit.

You query for existence here and again inside persist_user_conversation_details. Not urgent, but you could consolidate to a single write-path that defaults to "" and only generates a summary when actually creating the row.


60-60: Use module logger per guidelines.

Prefer module-based logger naming.

-logger = logging.getLogger("app.endpoints.handlers")
+logger = logging.getLogger(__name__)

As per coding guidelines.


526-557: Return type mismatch in parse_metadata_from_text_item.

On non-TextContentItem input you return a list but the signature says Optional[ReferencedDocument]. Return None to match the type.

-def parse_metadata_from_text_item(
-    text_item: TextContentItem,
-) -> Optional[ReferencedDocument]:
+def parse_metadata_from_text_item(
+    text_item: TextContentItem,
+) -> ReferencedDocument | None:
@@
-    docs: list[ReferencedDocument] = []
-    if not isinstance(text_item, TextContentItem):
-        return docs
+    if not isinstance(text_item, TextContentItem):
+        return None
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d22af0 and 9c9a1b4.

📒 Files selected for processing (8)
  • docs/openapi.json (4 hunks)
  • src/app/endpoints/query.py (6 hunks)
  • src/app/endpoints/streaming_query.py (3 hunks)
  • src/constants.py (1 hunks)
  • src/models/responses.py (5 hunks)
  • src/utils/endpoints.py (2 hunks)
  • tests/unit/app/endpoints/test_query.py (7 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/models/responses.py
  • tests/unit/app/endpoints/test_streaming_query.py
  • src/constants.py
🧰 Additional context used
📓 Path-based instructions (7)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • tests/unit/app/endpoints/test_query.py
  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/query.py
  • src/utils/endpoints.py
tests/{unit,integration}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard

Files:

  • tests/unit/app/endpoints/test_query.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests

Files:

  • tests/unit/app/endpoints/test_query.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/query.py
  • src/utils/endpoints.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/query.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/query.py
🧠 Learnings (1)
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
PR: lightspeed-core/lightspeed-stack#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/app/**/*.py : Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Applied to files:

  • src/app/endpoints/streaming_query.py
🧬 Code graph analysis (4)
tests/unit/app/endpoints/test_query.py (1)
src/app/endpoints/query.py (1)
  • get_topic_summary (179-209)
src/app/endpoints/streaming_query.py (4)
src/app/database.py (1)
  • get_session (34-40)
src/app/endpoints/query.py (2)
  • persist_user_conversation_details (102-134)
  • get_topic_summary (179-209)
src/utils/endpoints.py (1)
  • validate_model_provider_override (160-180)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
src/app/endpoints/query.py (4)
src/utils/endpoints.py (2)
  • get_topic_summary_system_prompt (146-157)
  • get_temp_agent (304-334)
src/configuration.py (1)
  • configuration (65-69)
src/app/database.py (1)
  • get_session (34-40)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
src/utils/endpoints.py (3)
src/models/config.py (2)
  • config (138-144)
  • get_prompts (442-444)
src/configuration.py (2)
  • AppConfig (36-153)
  • customization (119-123)
src/utils/suid.py (1)
  • get_suid (6-16)
🪛 GitHub Actions: Black
src/app/endpoints/streaming_query.py

[error] 1-1: Black formatting check failed. 1 file would be reformatted. Command: 'uv tool run black --check .'

🪛 GitHub Actions: Ruff
src/app/endpoints/streaming_query.py

[error] 1-1: 8 Ruff F811 redefinition errors detected: APIRouter, HTTPException, Request, Depends, status, StreamingResponse, evaluate_model_hints, validate_model_provider_override redefined in this file. Run 'ruff --fix' to auto-fix.

🪛 GitHub Actions: Python linter
src/app/endpoints/streaming_query.py

[warning] 23-23: Pylint: Reimport 'APIRouter' (imported line 9) (reimported) [W0404]


[warning] 23-23: Pylint: Reimport 'HTTPException' (imported line 9) (reimported) [W0404]


[warning] 23-23: Pylint: Reimport 'Request' (imported line 9) (reimported) [W0404]


[warning] 23-23: Pylint: Reimport 'Depends' (imported line 9) (reimported) [W0404]


[warning] 23-23: Pylint: Reimport 'status' (imported line 9) (reimported) [W0404]


[warning] 24-24: Pylint: Reimport 'StreamingResponse' (imported line 10) (reimported) [W0404]


[warning] 28-28: Pylint: Reimport 'evaluate_model_hints' (imported line 28) (reimported) [W0404]


[warning] 62-62: Pylint: Reimport 'validate_model_provider_override' (imported line 52) (reimported) [W0404]


[warning] 577-577: Pylint: Too many statements (51/50) (too-many-statements) [R0915]


[warning] 23-23: Pylint: Imports from package fastapi are not grouped (ungrouped-imports) [C0412]


[warning] 28-28: Pylint: Imports from package app are not grouped (ungrouped-imports) [C0412]

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (10)
src/utils/endpoints.py (2)

146-158: Topic-summary prompt precedence looks good

Custom profile "topic_summary" prompt takes precedence with a sane fallback to DEFAULT_TOPIC_SUMMARY_SYSTEM_PROMPT.


304-335: Docstring/return mismatch and tuple-order inconsistency with get_agent

  • The docstring says the function returns a 2-tuple, but it returns three values. Please fix the docstring.
  • Return order is (agent, session_id, conversation_id) whereas get_agent returns (agent, conversation_id, session_id). This inversion is brittle and was flagged earlier; prefer a named return to prevent call-site mistakes.

Minimal docstring fix:

@@
-    Returns:
-        tuple[AsyncAgent, str]: A tuple containing the agent and session_id.
+    Returns:
+        tuple[AsyncAgent, str, str]: (agent, session_id, conversation_id).

Optional: adopt a NamedTuple to make ordering explicit without breaking destructuring:

+from typing import NamedTuple
+
+class TempAgentResult(NamedTuple):
+    agent: AsyncAgent
+    session_id: str
+    conversation_id: str
@@
-async def get_temp_agent(... ) -> tuple[AsyncAgent, str, str]:
+async def get_temp_agent(... ) -> TempAgentResult:
@@
-    return agent, session_id, conversation_id
+    return TempAgentResult(agent, session_id, conversation_id)

This aligns with the earlier recommendation and reduces future call-site bugs. [Based on learnings]

tests/unit/app/endpoints/test_query.py (3)

79-85: Mocking DB session for new-conversation path looks good

Context manager methods are provided; suitable for the session usage in production code.


1813-1866: New get_topic_summary tests are thorough

Good coverage for empty content, None content, interleaved content, and prompt retrieval; the agent/session parameters are asserted correctly.

Also applies to: 1869-1903, 1906-1940, 1943-1986, 1989-2031, 2034-2078, 2081-2130


210-213: Patch async get_topic_summary with AsyncMock

get_topic_summary is awaited; patching it with a regular mock will raise TypeError. Use AsyncMock. This was noted earlier and still applies to all occurrences.

-    mocker.patch(
-        "app.endpoints.query.get_topic_summary", return_value="Test topic summary"
-    )
+    mocker.patch(
+        "app.endpoints.query.get_topic_summary",
+        new_callable=mocker.AsyncMock,
+        return_value="Test topic summary",
+    )

Apply the same change to each patched occurrence in this file. [Based on learnings]

Also applies to: 1411-1413, 1466-1468, 1522-1524
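The failure mode is easy to reproduce outside the test suite. A minimal sketch, using a hypothetical `call_summary` stand-in for the production caller, shows why a plain `Mock` breaks when the patched function is awaited while `AsyncMock` works:

```python
import asyncio
from unittest.mock import AsyncMock, Mock


async def call_summary(get_topic_summary):
    # Mirrors production code: the patched callable is awaited
    return await get_topic_summary("what is a pod?")


# A plain Mock returns a str, which cannot be awaited -> TypeError
regular_mock_fails = False
try:
    asyncio.run(call_summary(Mock(return_value="Test topic summary")))
except TypeError:
    regular_mock_fails = True

# AsyncMock returns an awaitable that resolves to the configured value
result = asyncio.run(call_summary(AsyncMock(return_value="Test topic summary")))
```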

docs/openapi.json (2)

31-31: Root endpoint description improvement

Clearer HTML response description; consistent with handler behavior.


1540-1554: topic_summary surfaced in ConversationDetails and examples

Schema and examples align with the new feature. Ensure backend responses include topic_summary when available.

Also applies to: 1561-1561, 1622-1642

src/app/endpoints/streaming_query.py (1)

725-735: Make topic-summary best-effort and use llama_stack_model_id

The summary call should not fail the request and should use the provider/model identifier used to create the agent (llama_stack_model_id), not the plain model_id. This was flagged previously.

-        # Get the initial topic summary for the conversation
-        topic_summary = None
+        # Get the initial topic summary for the conversation (best-effort)
+        topic_summary = None
         with get_session() as session:
             existing_conversation = (
                 session.query(UserConversation).filter_by(id=conversation_id).first()
             )
             if not existing_conversation:
-                topic_summary = await get_topic_summary(
-                    query_request.query, client, model_id
-                )
+                try:
+                    topic_summary = await get_topic_summary(
+                        query_request.query, client, llama_stack_model_id
+                    )
+                except Exception:
+                    logger.exception(
+                        "Failed to compute topic summary for %s", conversation_id
+                    )
+                    topic_summary = None
src/app/endpoints/query.py (2)

49-51: LGTM: correct imports for topic summary helpers.

The new utilities are imported where used; no issues spotted.


325-332: Call site OK; ensure callee normalizes None topics.

Passing topic_summary=None is fine as long as persist_user_conversation_details coerces it to "" when creating a new row (see suggested fix).

Comment on lines +103 to 108
user_id: str,
conversation_id: str,
model: str,
provider_id: str,
topic_summary: Optional[str],
) -> None:

⚠️ Potential issue | 🟠 Major

Guard against None for non-null DB column topic_summary.

UserConversation.topic_summary is defined as non-null with default="". Passing None here can violate NOT NULL. Normalize to empty string at creation and modernize the type hint.

Apply:

-def persist_user_conversation_details(
-    user_id: str,
-    conversation_id: str,
-    model: str,
-    provider_id: str,
-    topic_summary: Optional[str],
-) -> None:
+def persist_user_conversation_details(
+    user_id: str,
+    conversation_id: str,
+    model: str,
+    provider_id: str,
+    topic_summary: str | None,
+) -> None:
@@
-            conversation = UserConversation(
+            conversation = UserConversation(
                 id=conversation_id,
                 user_id=user_id,
                 last_used_model=model,
                 last_used_provider=provider_id,
-                topic_summary=topic_summary,
+                topic_summary=topic_summary or "",
                 message_count=1,
             )

Also applies to: 116-123

🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 103-108 (and also apply same change
to lines 116-123), the function accepts topic_summary: Optional[str] but
UserConversation.topic_summary is NOT NULL with default "". Normalize the value
before creating the DB object by coalescing None to an empty string (e.g.,
topic_summary = topic_summary or "") and update the type hint to str (or keep
Optional in public API but convert immediately to str internally) so the DB
always receives a non-null string.
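The constraint violation is easy to demonstrate with an in-memory SQLite table shaped like the `topic_summary` column; this is a standalone sketch, not the project's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversations ("
    " id TEXT PRIMARY KEY,"
    " topic_summary TEXT NOT NULL DEFAULT ''"
    ")"
)

# Passing None explicitly bypasses the column default and violates NOT NULL
null_rejected = False
try:
    conn.execute(
        "INSERT INTO conversations (id, topic_summary) VALUES (?, ?)",
        ("c1", None),
    )
except sqlite3.IntegrityError:
    null_rejected = True

# Coalescing at the call site keeps the insert valid
topic_summary = None
conn.execute(
    "INSERT INTO conversations (id, topic_summary) VALUES (?, ?)",
    ("c2", topic_summary or ""),
)
stored = conn.execute(
    "SELECT topic_summary FROM conversations WHERE id = 'c2'"
).fetchone()[0]
```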

Comment on lines 23 to 25
from fastapi import APIRouter, HTTPException, Request, Depends, status
from fastapi.responses import StreamingResponse


⚠️ Potential issue | 🔴 Critical

Fix re-imports causing Ruff F811 and Pylint W0404

Duplicate imports of FastAPI symbols, StreamingResponse, evaluate_model_hints, and validate_model_provider_override break the pipeline. Remove the re-imports.

Apply:

- from fastapi import APIRouter, HTTPException, Request, Depends, status
- from fastapi.responses import StreamingResponse
@@
-    persist_user_conversation_details,
-    evaluate_model_hints,
-    get_topic_summary,
+    persist_user_conversation_details,
+    get_topic_summary,
@@
-from utils.endpoints import validate_model_provider_override

Also applies to: 37-40, 62-62

🧰 Tools
🪛 GitHub Actions: Python linter

[warning] 23-23: Pylint: Reimport 'APIRouter' (imported line 9) (reimported) [W0404]


[warning] 23-23: Pylint: Reimport 'HTTPException' (imported line 9) (reimported) [W0404]


[warning] 23-23: Pylint: Reimport 'Request' (imported line 9) (reimported) [W0404]


[warning] 23-23: Pylint: Reimport 'Depends' (imported line 9) (reimported) [W0404]


[warning] 23-23: Pylint: Reimport 'status' (imported line 9) (reimported) [W0404]


[warning] 24-24: Pylint: Reimport 'StreamingResponse' (imported line 10) (reimported) [W0404]


[warning] 23-23: Pylint: Imports from package fastapi are not grouped (ungrouped-imports) [C0412]

🤖 Prompt for AI Agents
In src/app/endpoints/streaming_query.py around lines 23-25 (and also at 37-40
and line 62) there are duplicate imports of FastAPI symbols, StreamingResponse,
evaluate_model_hints, and validate_model_provider_override causing Ruff F811 /
Pylint W0404; remove the repeated import statements and consolidate all required
imports into a single import block at the top of the file so each symbol is
imported only once, update any local references to use those single imports, and
run linters to verify the duplicate-import errors are resolved.

Signed-off-by: Stephanie <yangcao@redhat.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
src/app/endpoints/conversations_v2.py (1)

1-1: Fix Black formatting.

The pipeline reports this file needs Black formatting. Run black src/app/endpoints/conversations_v2.py to resolve.

src/utils/endpoints.py (1)

1-1: Fix Black formatting.

The pipeline reports this file needs Black formatting. Run black src/utils/endpoints.py to resolve.

src/app/endpoints/streaming_query.py (1)

1-1: Fix Black formatting.

The pipeline reports this file needs Black formatting. Run black src/app/endpoints/streaming_query.py to resolve.

src/app/endpoints/query.py (1)

1-1: Run Black formatter to fix code style.

The pipeline reports that Black formatting would reformat this file. Please run black src/app/endpoints/query.py to fix the formatting issues.

black src/app/endpoints/query.py
🧹 Nitpick comments (3)
src/app/endpoints/conversations_v2.py (1)

105-106: Replace magic tuple indices with destructuring for clarity.

The pattern skip_userid_check = auth[2] is repeated three times and uses a magic index that makes the code fragile to auth tuple ordering changes. Since auth[0] is already unpacked as user_id, consider destructuring all needed values at once.

Apply this pattern in each handler:

-    user_id = auth[0]
-
-    logger.info("Retrieving conversations for user %s", user_id)
-
-    skip_userid_check = auth[2]
+    user_id, _user_name, skip_userid_check, *_ = auth
+
+    logger.info("Retrieving conversations for user %s", user_id)

Alternatively, if AuthTuple is a typed structure, annotate the auth parameter properly and use named access instead of indices.

Also applies to: 137-138, 175-176
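The suggested unpacking can be checked against the shared auth tuple shape; a minimal sketch, assuming the four-element `MOCK_AUTH` layout used in the tests:

```python
# Shape matches the shared MOCK_AUTH constant used in tests
auth = ("mock_user_id", "mock_username", False, "mock_token")

# Unpack once instead of indexing auth[0] / auth[2] in each handler;
# *_ absorbs any trailing fields so future tuple growth doesn't break this line
user_id, _user_name, skip_userid_check, *_ = auth
```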

src/utils/endpoints.py (1)

192-193: Rename parameter to remove leading underscore.

The leading underscore in _skip_userid_check suggests a private/internal parameter, but store_conversation_into_cache is a public utility function. The underscore convention is typically reserved for private members or to avoid shadowing built-ins.

-    _skip_userid_check: bool
+    skip_userid_check: bool

Then update line 206:

-        cache.insert_or_append(user_id, conversation_id, cache_entry, _skip_userid_check)
+        cache.insert_or_append(user_id, conversation_id, cache_entry, skip_userid_check)
src/app/endpoints/query.py (1)

179-209: Consider adding error handling for agent operations.

The function correctly implements topic summary generation with proper defensive checks for response attributes. However, failures in agent.create_turn will propagate uncaught exceptions. While the caller handles APIConnectionError, other exceptions (e.g., timeout, validation errors) may not be handled gracefully.

Consider wrapping the agent operations in a try/except block and returning a fallback (e.g., empty string or a generic summary) on failure:

 async def get_topic_summary(
     question: str, client: AsyncLlamaStackClient, model_id: str
 ) -> str:
     """Get a topic summary for a question.
 
     Args:
         question: The question to be validated.
         client: The AsyncLlamaStackClient to use for the request.
         model_id: The ID of the model to use.
     Returns:
         str: The topic summary for the question.
     """
+    try:
         topic_summary_system_prompt = get_topic_summary_system_prompt(configuration)
         agent, session_id, _ = await get_temp_agent(
             client, model_id, topic_summary_system_prompt
         )
         response = await agent.create_turn(
             messages=[UserMessage(role="user", content=question)],
             session_id=session_id,
             stream=False,
             toolgroups=None,
         )
         response = cast(Turn, response)
         return (
             interleaved_content_as_str(response.output_message.content)
             if (
                 getattr(response, "output_message", None) is not None
                 and getattr(response.output_message, "content", None) is not None
             )
             else ""
         )
+    except Exception as e:
+        logger.warning("Failed to generate topic summary: %s", e)
+        return ""
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9c9a1b4 and ed95e01.

📒 Files selected for processing (4)
  • src/app/endpoints/conversations_v2.py (6 hunks)
  • src/app/endpoints/query.py (8 hunks)
  • src/app/endpoints/streaming_query.py (3 hunks)
  • src/utils/endpoints.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/utils/endpoints.py
  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/conversations_v2.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/utils/endpoints.py
  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/conversations_v2.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/conversations_v2.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/conversations_v2.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/conversations_v2.py
🧠 Learnings (1)
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
PR: lightspeed-core/lightspeed-stack#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/app/**/*.py : Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Applied to files:

  • src/app/endpoints/streaming_query.py
🧬 Code graph analysis (3)
src/utils/endpoints.py (3)
src/models/config.py (2)
  • config (138-144)
  • get_prompts (442-444)
src/configuration.py (1)
  • customization (119-123)
src/utils/suid.py (1)
  • get_suid (6-16)
src/app/endpoints/query.py (4)
src/utils/endpoints.py (2)
  • get_topic_summary_system_prompt (146-157)
  • get_temp_agent (305-335)
src/configuration.py (1)
  • configuration (65-69)
src/app/database.py (1)
  • get_session (34-40)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
src/app/endpoints/streaming_query.py (4)
src/app/database.py (1)
  • get_session (34-40)
src/app/endpoints/query.py (2)
  • persist_user_conversation_details (102-134)
  • get_topic_summary (179-209)
src/utils/endpoints.py (1)
  • validate_model_provider_override (160-180)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
🪛 GitHub Actions: Black
src/utils/endpoints.py

[error] 1-1: Black formatting would reformat this file. Run 'black' to format.

src/app/endpoints/query.py

[error] 1-1: Black formatting would reformat this file. Run 'black' to format.

src/app/endpoints/streaming_query.py

[error] 1-1: Black formatting would reformat this file. Run 'black' to format.

src/app/endpoints/conversations_v2.py

[error] 1-1: Black formatting would reformat this file. Run 'black' to format.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: bandit
  • GitHub Check: Pyright
  • GitHub Check: integration_tests (3.12)
  • GitHub Check: check_dependencies
  • GitHub Check: unit_tests (3.13)
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: integration_tests (3.13)
  • GitHub Check: radon
  • GitHub Check: build-pr
  • GitHub Check: pydocstyle
  • GitHub Check: mypy
  • GitHub Check: Pylinter
  • GitHub Check: e2e_tests
🔇 Additional comments (7)
src/utils/endpoints.py (2)

146-158: LGTM: Topic summary prompt resolution follows precedence pattern.

The function correctly prioritizes custom profile prompts over the default constant, consistent with the existing get_system_prompt precedence logic.


305-335: Tuple ordering mismatch with get_agent risks call-site errors.

get_temp_agent returns (agent, session_id, conversation_id) on line 335, but get_agent returns (agent, conversation_id, session_id) on line 302. This inversion is easy to misuse at call sites.

Based on past review feedback, either:

Option 1 (recommended): Match get_agent ordering:

-    return agent, session_id, conversation_id
+    return agent, conversation_id, session_id

Update docstring line 320:

-        tuple[AsyncAgent, str]: A tuple containing the agent and session_id.
+        tuple[AsyncAgent, str, str]: A tuple of (agent, conversation_id, session_id).

Option 2: Use a named tuple to make ordering explicit:

+from typing import NamedTuple
+
+class TempAgentResult(NamedTuple):
+    agent: AsyncAgent
+    conversation_id: str
+    session_id: str
+
-async def get_temp_agent(...) -> tuple[AsyncAgent, str, str]:
+async def get_temp_agent(...) -> TempAgentResult:
     ...
-    return agent, session_id, conversation_id
+    return TempAgentResult(agent, conversation_id, session_id)

Verify all call sites are updated accordingly.

#!/bin/bash
# Find all call sites of get_temp_agent to verify tuple unpacking order
rg -n -A3 -B1 'get_temp_agent\(' --type=py
src/app/endpoints/streaming_query.py (2)

23-24: Remove duplicate imports breaking the pipeline.

Lines 23-24 re-import APIRouter, HTTPException, Request, Depends, and status already imported on line 9. Lines 37-40 re-import evaluate_model_hints and persist_user_conversation_details. Line 62 re-imports validate_model_provider_override already imported on line 57. These duplicates cause Ruff F811 and Pylint W0404 errors.

-from fastapi import APIRouter, HTTPException, Request, Depends, status
-from fastapi.responses import StreamingResponse
-
 from app.database import get_session
     persist_user_conversation_details,
-    evaluate_model_hints,
     get_topic_summary,
 from utils.mcp_headers import handle_mcp_headers_with_toolgroups, mcp_headers_dependency
 from utils.transcripts import store_transcript
 from utils.types import TurnSummary
-from utils.endpoints import validate_model_provider_override

Also applies to: 37-40, 62-62


726-735: Make topic summary best-effort and use llama_stack_model_id.

The current implementation:

  1. Passes model_id to get_topic_summary instead of llama_stack_model_id (the provider-qualified model identifier).
  2. Will fail the entire request if get_topic_summary raises an exception.

Topic summary generation should be best-effort and use the correct model identifier.

         # Get the initial topic summary for the conversation
         topic_summary = None
         with get_session() as session:
             existing_conversation = (
                 session.query(UserConversation).filter_by(id=conversation_id).first()
             )
             if not existing_conversation:
-                topic_summary = await get_topic_summary(
-                    query_request.query, client, model_id
-                )
+                try:
+                    topic_summary = await get_topic_summary(
+                        query_request.query, client, llama_stack_model_id
+                    )
+                except Exception:  # best-effort; don't fail the request
+                    logger.exception("Failed to compute topic summary for %s", conversation_id)
+                    topic_summary = None
src/app/endpoints/query.py (3)

49-50: LGTM!

The new imports support the topic summary feature and follow the project's import conventions.


102-134: Guard against None for non-null DB column topic_summary.

The function accepts topic_summary: Optional[str] but UserConversation.topic_summary is a NOT NULL column with default="". Passing None at line 121 can violate the database constraint.

Apply this diff to normalize None to empty string and modernize the type hint:

 def persist_user_conversation_details(
     user_id: str,
     conversation_id: str,
     model: str,
     provider_id: str,
-    topic_summary: Optional[str],
+    topic_summary: str | None,
 ) -> None:
     """Associate conversation to user in the database."""
     with get_session() as session:
         existing_conversation = (
             session.query(UserConversation).filter_by(id=conversation_id).first()
         )
 
         if not existing_conversation:
             conversation = UserConversation(
                 id=conversation_id,
                 user_id=user_id,
                 last_used_model=model,
                 last_used_provider=provider_id,
-                topic_summary=topic_summary,
+                topic_summary=topic_summary or "",
                 message_count=1,
             )

243-243: LGTM!

Correctly captures and uses _skip_userid_check from the auth tuple.

Also applies to: 342-342

Comment on lines +294 to +303
# Get the initial topic summary for the conversation
topic_summary = None
with get_session() as session:
existing_conversation = (
session.query(UserConversation).filter_by(id=conversation_id).first()
)
if not existing_conversation:
topic_summary = await get_topic_summary(
query_request.query, client, model_id
)

⚠️ Potential issue | 🟠 Major

Use llama_stack_model_id instead of model_id for topic summary generation.

The code passes model_id (line 302) to get_topic_summary, which then calls get_temp_agent. However, model_id at this point is the short model label (e.g., "llama3"), not the full Llama Stack identifier. get_temp_agent creates an AsyncAgent with the model parameter, which expects the full provider/model format (e.g., "provider/llama3").

Apply this diff:

         # Get the initial topic summary for the conversation
         topic_summary = None
         with get_session() as session:
             existing_conversation = (
                 session.query(UserConversation).filter_by(id=conversation_id).first()
             )
             if not existing_conversation:
                 topic_summary = await get_topic_summary(
-                    query_request.query, client, model_id
+                    query_request.query, client, llama_stack_model_id
                 )

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 294 to 303, the code calls
get_topic_summary with model_id (the short label) which is incorrect because
get_temp_agent expects the full Llama Stack identifier; replace the model_id
argument with llama_stack_model_id when calling get_topic_summary, ensuring
llama_stack_model_id is in scope (or compute/lookup it beforehand) so
get_temp_agent/AsyncAgent receives the full provider/model string.

Signed-off-by: Stephanie <yangcao@redhat.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/utils/endpoints.py (1)

184-213: Update docstring to document new parameters.

The function signature added two new parameters (_skip_userid_check and topic_summary), but the docstring doesn't document them. Per coding guidelines, all function parameters should be documented with complete Args sections.

Apply this diff to update the docstring:

 def store_conversation_into_cache(
     config: AppConfig,
     user_id: str,
     conversation_id: str,
     provider_id: str,
     model_id: str,
     query: str,
     response: str,
     _skip_userid_check: bool,
     topic_summary: str | None,
 ) -> None:
-    """Store one part of conversation into conversation history cache."""
+    """Store one part of conversation into conversation history cache.
+
+    Args:
+        config: Application configuration containing cache settings.
+        user_id: User identification.
+        conversation_id: Conversation ID unique for given user.
+        provider_id: Provider identification.
+        model_id: Model identification.
+        query: User query string.
+        response: LLM response string.
+        _skip_userid_check: Whether to skip user_id validation checks.
+        topic_summary: Optional topic summary to store for the conversation.
+    """

As per coding guidelines.

src/cache/postgres_cache.py (1)

304-334: Fix line length to resolve pipeline failure.

The implementation is correct, but line 314 exceeds the 100-character limit (116 characters).

Apply this diff to fix the linting issue:

         Returns:
-            A list of ConversationData objects containing conversation_id, topic_summary, and last_message_timestamp
+            A list of ConversationData objects containing conversation_id,
+            topic_summary, and last_message_timestamp.
🧹 Nitpick comments (6)
src/app/endpoints/streaming_query.py (2)

24-24: Group imports from the same package.

The import from app.database should be grouped with other app imports (currently at lines 26-49) to satisfy the linter's ungrouped-imports check.

Apply this diff to move the import so it sits with the other app.* imports:

-
-from app.database import get_session
 import metrics
+from app.database import get_session
 from app.endpoints.query import (

571-596: Consider extracting topic summary logic to reduce function complexity.

The streaming_query_endpoint_handler function has 51 statements (threshold is 50). While this is a minor overage, consider extracting the topic summary generation logic (lines 711-722) into a helper function to improve readability and reduce complexity.

src/cache/sqlite_cache.py (1)

307-307: Wrap the docstring line to comply with line length limit.

Line 307 exceeds the 100-character limit (116 characters). Wrap the docstring for better readability.

Apply this diff:

-            A list of ConversationData objects containing conversation_id, topic_summary, and last_message_timestamp
+            A list of ConversationData objects containing conversation_id,
+            topic_summary, and last_message_timestamp
src/models/responses.py (1)

684-691: Breaking change: verify API version handling.

The field type changes from list[str] to list[ConversationData], which is a breaking change to the API contract. The model name includes "V2" suffix, which suggests this is intentional versioning.

Optionally, consider adding model_config with json_schema_extra examples to help API consumers understand the new structure, similar to other response models in this file.
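If schema examples are added, a minimal sketch of the payload shape could look like the following (the field names are taken from the ConversationData model discussed in this review; the UUID, summary text, and timestamp values are illustrative only, and the dict would be attached via model_config = ConfigDict(json_schema_extra={"examples": [CONVERSATIONS_V2_EXAMPLE]})):

```python
# Hypothetical json_schema_extra example for ConversationsListResponseV2.
# Field names mirror ConversationData (conversation_id, topic_summary,
# last_message_timestamp); all values are made up for illustration.
CONVERSATIONS_V2_EXAMPLE = {
    "conversations": [
        {
            "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
            "topic_summary": "Deploying the application on OpenShift",
            "last_message_timestamp": 1726700000.0,
        }
    ]
}

# Sanity-check that the example exposes exactly the three documented fields.
assert set(CONVERSATIONS_V2_EXAMPLE["conversations"][0]) == {
    "conversation_id",
    "topic_summary",
    "last_message_timestamp",
}
```

Keeping the example as a module-level constant also lets tests assert that the documented shape stays in sync with the model.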

src/utils/endpoints.py (1)

146-158: Enhance the docstring per coding guidelines.

The function logic is correct, but the docstring could be more descriptive. Consider documenting the precedence order and adding Args/Returns sections per Google Python docstring style.

Apply this diff to enhance the docstring:

-def get_topic_summary_system_prompt(config: AppConfig) -> str:
-    """Get the topic summary system prompt."""
+def get_topic_summary_system_prompt(config: AppConfig) -> str:
+    """Resolve which system prompt to use for topic summary generation.
+
+    Precedence:
+    1. If the application configuration provides a custom_profile with a
+       'topic_summary' prompt, that value is returned.
+    2. Otherwise the module default `constants.DEFAULT_TOPIC_SUMMARY_SYSTEM_PROMPT`
+       is returned.
+
+    Args:
+        config: Application configuration which may include a custom_profile
+                with prompts.
+
+    Returns:
+        str: The resolved system prompt for topic summary generation.
+    """
src/cache/postgres_cache.py (1)

49-57: Consider adding an index on last_message_timestamp.

The schema is correct, but since last_message_timestamp is used in ORDER BY in the LIST_CONVERSATIONS_STATEMENT (line 89), an index could improve query performance for users with many conversations.

Optionally add an index:

CREATE_CONVERSATIONS_TIMESTAMP_INDEX = """
    CREATE INDEX IF NOT EXISTS conversations_timestamp
        ON conversations (user_id, last_message_timestamp DESC)
    """

Then execute it in initialize_cache() after creating the table.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed95e01 and c9769e8.

📒 Files selected for processing (14)
  • docs/openapi.json (8 hunks)
  • src/app/endpoints/conversations_v2.py (8 hunks)
  • src/app/endpoints/query.py (8 hunks)
  • src/app/endpoints/streaming_query.py (3 hunks)
  • src/cache/cache.py (2 hunks)
  • src/cache/in_memory_cache.py (3 hunks)
  • src/cache/noop_cache.py (3 hunks)
  • src/cache/postgres_cache.py (7 hunks)
  • src/cache/sqlite_cache.py (7 hunks)
  • src/models/cache_entry.py (1 hunks)
  • src/models/responses.py (7 hunks)
  • src/utils/endpoints.py (4 hunks)
  • tests/unit/cache/test_postgres_cache.py (2 hunks)
  • tests/unit/cache/test_sqlite_cache.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/app/endpoints/streaming_query.py
  • src/models/cache_entry.py
  • src/cache/in_memory_cache.py
  • src/cache/postgres_cache.py
  • src/cache/sqlite_cache.py
  • src/app/endpoints/conversations_v2.py
  • src/models/responses.py
  • src/utils/endpoints.py
  • src/app/endpoints/query.py
  • src/cache/noop_cache.py
  • src/cache/cache.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/conversations_v2.py
  • src/app/endpoints/query.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/app/endpoints/streaming_query.py
  • src/models/cache_entry.py
  • src/cache/in_memory_cache.py
  • src/cache/postgres_cache.py
  • src/cache/sqlite_cache.py
  • src/app/endpoints/conversations_v2.py
  • src/models/responses.py
  • src/utils/endpoints.py
  • src/app/endpoints/query.py
  • src/cache/noop_cache.py
  • src/cache/cache.py
  • tests/unit/cache/test_sqlite_cache.py
  • tests/unit/cache/test_postgres_cache.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/conversations_v2.py
  • src/app/endpoints/query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/streaming_query.py
  • src/app/endpoints/conversations_v2.py
  • src/app/endpoints/query.py
src/{models/**/*.py,configuration.py}

📄 CodeRabbit inference engine (CLAUDE.md)

src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)

Files:

  • src/models/cache_entry.py
  • src/models/responses.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/cache_entry.py
  • src/models/responses.py
tests/{unit,integration}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard

Files:

  • tests/unit/cache/test_sqlite_cache.py
  • tests/unit/cache/test_postgres_cache.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests

Files:

  • tests/unit/cache/test_sqlite_cache.py
  • tests/unit/cache/test_postgres_cache.py
🧬 Code graph analysis (12)
src/app/endpoints/streaming_query.py (4)
src/app/database.py (1)
  • get_session (34-40)
src/app/endpoints/query.py (2)
  • persist_user_conversation_details (102-134)
  • get_topic_summary (179-209)
src/utils/endpoints.py (1)
  • validate_conversation_ownership (41-61)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
src/cache/in_memory_cache.py (6)
src/models/cache_entry.py (1)
  • ConversationData (22-33)
src/utils/connection_decorator.py (1)
  • connection (6-22)
src/cache/cache.py (2)
  • set_topic_summary (105-119)
  • construct_key (37-43)
src/cache/noop_cache.py (1)
  • set_topic_summary (103-119)
src/cache/postgres_cache.py (1)
  • set_topic_summary (337-364)
src/cache/sqlite_cache.py (1)
  • set_topic_summary (331-356)
src/cache/postgres_cache.py (6)
src/models/cache_entry.py (2)
  • CacheEntry (6-19)
  • ConversationData (22-33)
src/utils/connection_decorator.py (1)
  • connection (6-22)
src/cache/cache.py (1)
  • set_topic_summary (105-119)
src/cache/in_memory_cache.py (1)
  • set_topic_summary (105-121)
src/cache/noop_cache.py (1)
  • set_topic_summary (103-119)
src/cache/sqlite_cache.py (1)
  • set_topic_summary (331-356)
src/cache/sqlite_cache.py (5)
src/models/cache_entry.py (1)
  • ConversationData (22-33)
src/utils/connection_decorator.py (1)
  • connection (6-22)
src/cache/cache.py (1)
  • set_topic_summary (105-119)
src/cache/in_memory_cache.py (1)
  • set_topic_summary (105-121)
src/cache/postgres_cache.py (1)
  • set_topic_summary (337-364)
src/app/endpoints/conversations_v2.py (5)
src/cache/cache.py (2)
  • get (46-58)
  • delete (78-90)
src/cache/in_memory_cache.py (2)
  • get (31-46)
  • delete (69-85)
src/cache/noop_cache.py (2)
  • get (29-44)
  • delete (67-83)
src/cache/postgres_cache.py (2)
  • get (184-217)
  • delete (266-301)
src/cache/sqlite_cache.py (2)
  • get (179-213)
  • delete (261-294)
src/models/responses.py (1)
src/models/cache_entry.py (1)
  • ConversationData (22-33)
src/utils/endpoints.py (8)
src/models/config.py (2)
  • config (138-144)
  • get_prompts (442-444)
src/configuration.py (1)
  • customization (119-123)
src/cache/cache.py (3)
  • get (46-58)
  • insert_or_append (61-75)
  • set_topic_summary (105-119)
src/cache/in_memory_cache.py (3)
  • get (31-46)
  • insert_or_append (49-66)
  • set_topic_summary (105-121)
src/cache/noop_cache.py (3)
  • get (29-44)
  • insert_or_append (47-64)
  • set_topic_summary (103-119)
src/cache/postgres_cache.py (3)
  • get (184-217)
  • insert_or_append (220-263)
  • set_topic_summary (337-364)
src/cache/sqlite_cache.py (3)
  • get (179-213)
  • insert_or_append (216-258)
  • set_topic_summary (331-356)
src/utils/suid.py (1)
  • get_suid (6-16)
src/app/endpoints/query.py (4)
src/utils/endpoints.py (2)
  • get_topic_summary_system_prompt (146-157)
  • get_temp_agent (312-342)
src/configuration.py (1)
  • configuration (65-69)
src/app/database.py (1)
  • get_session (34-40)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
src/cache/noop_cache.py (4)
src/models/cache_entry.py (1)
  • ConversationData (22-33)
src/utils/connection_decorator.py (1)
  • connection (6-22)
src/cache/cache.py (2)
  • set_topic_summary (105-119)
  • construct_key (37-43)
src/cache/in_memory_cache.py (1)
  • set_topic_summary (105-121)
src/cache/cache.py (5)
src/models/cache_entry.py (1)
  • ConversationData (22-33)
src/cache/in_memory_cache.py (1)
  • set_topic_summary (105-121)
src/cache/noop_cache.py (1)
  • set_topic_summary (103-119)
src/cache/postgres_cache.py (1)
  • set_topic_summary (337-364)
src/cache/sqlite_cache.py (1)
  • set_topic_summary (331-356)
tests/unit/cache/test_sqlite_cache.py (3)
src/models/cache_entry.py (2)
  • CacheEntry (6-19)
  • ConversationData (22-33)
src/cache/sqlite_cache.py (5)
  • insert_or_append (216-258)
  • set_topic_summary (331-356)
  • get (179-213)
  • delete (261-294)
  • connect (120-135)
src/cache/cache_error.py (1)
  • CacheError (4-5)
tests/unit/cache/test_postgres_cache.py (3)
src/models/cache_entry.py (2)
  • CacheEntry (6-19)
  • ConversationData (22-33)
src/cache/postgres_cache.py (5)
  • PostgresCache (15-372)
  • set_topic_summary (337-364)
  • insert_or_append (220-263)
  • delete (266-301)
  • connect (120-144)
src/utils/connection_decorator.py (1)
  • connection (6-22)
🪛 GitHub Actions: Python linter
src/app/endpoints/streaming_query.py

[error] 573-573: pylint: Too many statements (51/50) (R0915)


[warning] 26-26: pylint: Imports from package app are not grouped (C0412) - ungrouped-imports

src/cache/postgres_cache.py

[warning] 314-314: pylint: Line too long (116/100) (C0301) - line-too-long

src/cache/sqlite_cache.py

[warning] 307-307: pylint: Line too long (116/100) (C0301) - line-too-long

src/cache/cache.py

[warning] 101-101: pylint: Line too long (116/100) (C0301) - line-too-long

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (42)
src/models/cache_entry.py (1)

22-33: LGTM! Well-defined data model.

The ConversationData model is properly implemented with:

  • Clear type annotations using modern union syntax
  • Complete docstring following Google style
  • Appropriate field types for the topic summary feature
src/cache/noop_cache.py (1)

102-119: LGTM! Consistent no-op implementation.

The set_topic_summary method correctly follows the NoopCache pattern:

  • Validates input parameters via construct_key
  • Uses @connection decorator consistently
  • Complete docstring with proper Args section
src/app/endpoints/conversations_v2.py (2)

230-231: LGTM! Correct adaptation to ConversationData.

The code correctly extracts conversation_id from the list of ConversationData objects, which is necessary after the cache API update.


107-107: No issues found with auth[2] access.

The code correctly accesses auth[2] to extract skip_userid_check. AuthTuple is defined as tuple[UserID, UserName, SkipUserIdCheck, Token], guaranteeing exactly 4 elements where index 2 is the SkipUserIdCheck boolean. The type system enforces this structure across all authentication implementations, eliminating any IndexError risk.

src/cache/cache.py (1)

104-119: LGTM! Well-defined abstract method.

The set_topic_summary abstract method is properly defined with:

  • Clear parameter types
  • Complete docstring with Args section
  • Consistent with other abstract methods in the class
tests/unit/cache/test_sqlite_cache.py (5)

10-10: LGTM!

The import of ConversationData is necessary to support the new test assertions that verify topic_summary functionality.


191-191: LGTM!

Adding a type assertion for the return value improves test coverage and ensures the contract of the list() method.


261-288: LGTM!

This test properly validates that list() returns ConversationData objects with topic summaries and that results are ordered by last_message_timestamp in descending order.


290-314: LGTM!

This test validates both initial setting and updating of topic summaries, with proper verification via the list() method.


316-349: LGTM!

Both tests properly validate critical behaviors:

  • Topic summaries are deleted when conversations are deleted (cascade behavior).
  • set_topic_summary raises CacheError when the cache is disconnected.
src/cache/in_memory_cache.py (3)

4-4: LGTM!

The ConversationData import is necessary for the updated return type of the list() method.


88-102: LGTM!

The signature change to return list[ConversationData] aligns with the updated cache interface and other implementations.


104-122: LGTM!

The set_topic_summary implementation is appropriate for an in-memory cache (no-op with validation only), consistent with other no-op methods in this class.

tests/unit/cache/test_postgres_cache.py (3)

10-10: LGTM!

The import of ConversationData is necessary for the new test assertions that verify topic_summary functionality.


304-304: LGTM!

Adding a type assertion improves test coverage and ensures the contract of the list() method.


307-367: LGTM!

All three new tests properly validate topic_summary operations for PostgreSQL cache:

  • Setting and retrieving topic summaries via list().
  • Deletion behavior when conversations are deleted.
  • Error handling when the cache is disconnected.

The mocking strategy is appropriate for unit tests.

src/app/endpoints/streaming_query.py (4)

24-24: LGTM!

The import of get_session is necessary to query the database for existing conversations when determining whether to generate a topic summary.


34-36: LGTM!

The imports from app.endpoints.query are properly organized and include the necessary functions for topic summary generation and conversation persistence.


724-742: LGTM!

The topic summary is correctly passed to both store_conversation_into_cache and persist_user_conversation_details, ensuring it's persisted in both the cache and database.


711-722: Use llama_stack_model_id and add error handling for topic summary generation.

The topic summary generation has two issues identified in past reviews:

  1. Line 721 passes model_id but should pass llama_stack_model_id (the provider/model identifier used for agent creation).
  2. No error handling—if get_topic_summary fails, the entire streaming request fails. Topic summaries should be best-effort.

Apply this diff:

-            # Get the initial topic summary for the conversation
             topic_summary = None
             with get_session() as session:
                 existing_conversation = (
                     session.query(UserConversation)
                     .filter_by(id=conversation_id)
                     .first()
                 )
                 if not existing_conversation:
-                    topic_summary = await get_topic_summary(
-                        query_request.query, client, model_id
-                    )
+                    try:
+                        # Use the provider/model identifier for temp agent creation
+                        topic_summary = await get_topic_summary(
+                            query_request.query, client, llama_stack_model_id
+                        )
+                    except Exception:  # best-effort; don't fail the request
+                        logger.exception("Failed to compute topic summary for %s", conversation_id)
+                        topic_summary = None
src/cache/sqlite_cache.py (7)

9-9: LGTM!

The ConversationData import is necessary for the updated return type of the list() method.


53-61: LGTM!

The new conversations table schema properly tracks conversation metadata including topic_summary and last_message_timestamp, with an appropriate primary key on (user_id, conversation_id).


90-109: LGTM!

The SQL statements properly support the new conversations table:

  • LIST_CONVERSATIONS_STATEMENT selects the required fields and orders by last_message_timestamp DESC.
  • INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT and UPSERT_CONVERSATION_STATEMENT use SQLite's INSERT OR REPLACE for idempotent upserts.
  • DELETE_CONVERSATION_STATEMENT enables cleanup.

237-255: LGTM!

The insert_or_append method correctly updates the conversations table with last_message_timestamp after inserting cache entries, ensuring the conversations table stays synchronized.


285-290: LGTM!

The cascading deletion of conversation records when deleting cache entries ensures data consistency.


297-328: LGTM!

The list() method correctly constructs ConversationData objects from the query results, including conversation_id, topic_summary, and last_message_timestamp.


330-356: LGTM!

The set_topic_summary method properly implements upsert logic with timestamp tracking and appropriate error handling when disconnected.

src/models/responses.py (3)

7-7: LGTM!

The import follows the absolute import style and correctly supports the new list[ConversationData] type annotation.


561-617: LGTM!

The topic_summary field is properly added with correct type annotation, clear description, and meaningful example. The docstring documentation is complete and follows the established pattern.


637-680: LGTM!

The examples in both the docstring and json_schema_extra consistently include the new topic_summary field with meaningful sample values.

src/cache/postgres_cache.py (6)

7-7: LGTM!

The import correctly adds ConversationData to support the new return type for the list() method.


86-109: LGTM!

The SQL statements are well-structured and correctly use PostgreSQL features like ON CONFLICT for upserts and EXTRACT(EPOCH FROM ...) for timestamp conversion.


174-175: LGTM!

The conversations table initialization follows the established pattern and is correctly placed in the initialization sequence.


254-260: LGTM!

The upsert correctly updates the conversation's last_message_timestamp without overwriting the topic_summary. The transaction handling ensures consistency.


291-296: LGTM!

The deletion correctly removes both cache entries and conversation metadata within the same transaction, maintaining consistency.


336-364: LGTM!

The set_topic_summary method correctly implements the abstract interface with appropriate error handling and follows the established patterns in this class.

src/utils/endpoints.py (1)

312-342: Fix docstring mismatch and address tuple ordering inconsistency.

The function has two confirmed issues:

  1. Docstring return type mismatch: The Returns section documents tuple[AsyncAgent, str] but the function signature and implementation return tuple[AsyncAgent, str, str] (three elements: agent, session_id, conversation_id).

  2. Tuple ordering inconsistency: get_temp_agent returns (agent, session_id, conversation_id) while get_agent (line 309) returns (agent, conversation_id, session_id). The inverted order creates maintainability risk—developers may expect consistent ordering between related functions, potentially causing future bugs.

Apply this diff to fix the docstring:

     Returns:
-        tuple[AsyncAgent, str]: A tuple containing the agent and session_id.
+        tuple[AsyncAgent, str, str]: A tuple containing (agent, session_id, conversation_id).

Consider addressing the ordering inconsistency by either:

  • Swapping the return order to match get_agent: return agent, conversation_id, session_id
  • Or using a named tuple/dataclass for type safety

Current call site at src/app/endpoints/query.py:192 correctly unpacks as agent, session_id, _ so changing the order would require updating that line accordingly.
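For the named-tuple option, a minimal stdlib sketch of the idea (the TempAgentResult name, the stub helper, and the object placeholder standing in for AsyncAgent are assumptions for illustration, not code from this PR):

```python
from typing import NamedTuple


class TempAgentResult(NamedTuple):
    """Named return value for a get_temp_agent-style helper (hypothetical name)."""

    agent: object  # AsyncAgent in the real code
    session_id: str
    conversation_id: str


def get_temp_agent_stub() -> TempAgentResult:
    # Stand-in for the real helper; only the return shape matters here.
    return TempAgentResult(agent=None, session_id="sess-1", conversation_id="conv-1")


result = get_temp_agent_stub()
# Call sites read fields by name, so the (agent, session_id, conversation_id)
# ordering can no longer be confused with get_agent's inverted ordering.
assert result.session_id == "sess-1"
assert result.conversation_id == "conv-1"
```

Because a NamedTuple is still a tuple, existing positional unpacking such as `agent, session_id, _ = get_temp_agent(...)` keeps working while new call sites can use the safer attribute access.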

src/app/endpoints/query.py (5)

49-50: LGTM!

The new imports support the topic summary feature and are correctly sourced from utils.endpoints.


103-123: Guard against None for non-null DB column topic_summary.

The function accepts topic_summary: Optional[str] but UserConversation.topic_summary is defined as Mapped[str] with default="" (non-null). Passing None directly to the constructor can violate the NOT NULL constraint.

Apply this diff to normalize None to empty string and modernize the type hint:

 def persist_user_conversation_details(
     user_id: str,
     conversation_id: str,
     model: str,
     provider_id: str,
-    topic_summary: Optional[str],
+    topic_summary: str | None,
 ) -> None:
     """Associate conversation to user in the database."""
     with get_session() as session:
         existing_conversation = (
             session.query(UserConversation).filter_by(id=conversation_id).first()
         )
 
         if not existing_conversation:
             conversation = UserConversation(
                 id=conversation_id,
                 user_id=user_id,
                 last_used_model=model,
                 last_used_provider=provider_id,
-                topic_summary=topic_summary,
+                topic_summary=topic_summary or "",
                 message_count=1,
             )

179-209: LGTM!

The get_topic_summary function is well-implemented with proper async handling, defensive checks for missing response fields, and appropriate type annotations.


243-243: LGTM!

The auth tuple unpacking and topic_summary parameter passing to persistence and cache functions are correctly implemented.

Also applies to: 331-331, 343-343


294-303: Use llama_stack_model_id instead of model_id for topic summary generation.

The code passes model_id (line 302) to get_topic_summary, which then calls get_temp_agent. However, model_id at this point is the short model label (e.g., "llama3"), not the full Llama Stack identifier. get_temp_agent creates an AsyncAgent with the model parameter, which expects the full provider/model format (e.g., "provider/llama3").

Apply this diff:

         # Get the initial topic summary for the conversation
         topic_summary = None
         with get_session() as session:
             existing_conversation = (
                 session.query(UserConversation).filter_by(id=conversation_id).first()
             )
             if not existing_conversation:
                 topic_summary = await get_topic_summary(
-                    query_request.query, client, model_id
+                    query_request.query, client, llama_stack_model_id
                 )

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/app/endpoints/streaming_query.py (1)

572-572: Refactor to reduce statement count.

The function exceeds the maximum allowed statements (51/50). Consider extracting the topic summary computation logic (lines 710-721) into a separate helper function to reduce complexity.

Example refactor:

async def _compute_topic_summary_if_new(
    conversation_id: str,
    query: str,
    client: AsyncLlamaStackClient,
    llama_stack_model_id: str,
) -> str | None:
    """Compute topic summary for new conversations."""
    topic_summary = None
    with get_session() as session:
        existing_conversation = (
            session.query(UserConversation)
            .filter_by(id=conversation_id)
            .first()
        )
        if not existing_conversation:
            try:
                topic_summary = await get_topic_summary(
                    query, client, llama_stack_model_id
                )
            except Exception:
                logger.exception("Failed to compute topic summary for %s", conversation_id)
                topic_summary = None
    return topic_summary

Then call it after the stream completes:

topic_summary = await _compute_topic_summary_if_new(
    conversation_id, query_request.query, client, llama_stack_model_id
)
src/cache/cache.py (1)

101-103: Polish the docstring formatting.

The Returns description is split across two lines in a way that reads awkwardly. Consider rephrasing for clarity.

Apply this diff:

         Returns:
-            A list of ConversationData objects containing conversation_id, topic_summary, and
-            last_message_timestamp
+            A list of ConversationData objects containing conversation_id,
+            topic_summary, and last_message_timestamp.
src/cache/sqlite_cache.py (1)

331-357: Consider adding user_id/conversation_id validation.

While the implementation is functionally correct, other cache implementations (postgres_cache.py, in_memory_cache.py, noop_cache.py) call super().construct_key(user_id, conversation_id, skip_user_id_check) to validate that user_id and conversation_id are valid UUIDs. This validation step is missing here.

Add validation at the beginning of the method, after the connection check:

     if self.connection is None:
         logger.error("Cache is disconnected")
         raise CacheError("set_topic_summary: cache is disconnected")
+    
+    # Validate user_id and conversation_id are UUIDs
+    super().construct_key(user_id, conversation_id, skip_user_id_check)

     cursor = self.connection.cursor()

Based on learnings from other cache implementations in the codebase.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9769e8 and 630cbf8.

📒 Files selected for processing (4)
  • src/app/endpoints/streaming_query.py (3 hunks)
  • src/cache/cache.py (2 hunks)
  • src/cache/postgres_cache.py (7 hunks)
  • src/cache/sqlite_cache.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/cache/postgres_cache.py
  • src/cache/cache.py
  • src/app/endpoints/streaming_query.py
  • src/cache/sqlite_cache.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/cache/postgres_cache.py
  • src/cache/cache.py
  • src/app/endpoints/streaming_query.py
  • src/cache/sqlite_cache.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/streaming_query.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/streaming_query.py
🧬 Code graph analysis (4)
src/cache/postgres_cache.py (3)
src/models/cache_entry.py (1)
  • ConversationData (22-33)
src/utils/connection_decorator.py (1)
  • connection (6-22)
src/cache/cache.py (1)
  • set_topic_summary (106-120)
src/cache/cache.py (5)
src/models/cache_entry.py (1)
  • ConversationData (22-33)
src/cache/postgres_cache.py (1)
  • set_topic_summary (338-365)
src/cache/sqlite_cache.py (1)
  • set_topic_summary (332-357)
src/cache/in_memory_cache.py (1)
  • set_topic_summary (105-121)
src/cache/noop_cache.py (1)
  • set_topic_summary (103-119)
src/app/endpoints/streaming_query.py (4)
src/app/database.py (1)
  • get_session (34-40)
src/app/endpoints/query.py (3)
  • persist_user_conversation_details (102-134)
  • evaluate_model_hints (137-176)
  • get_topic_summary (179-209)
src/utils/endpoints.py (1)
  • validate_conversation_ownership (41-61)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
src/cache/sqlite_cache.py (6)
src/models/cache_entry.py (1)
  • ConversationData (22-33)
src/utils/connection_decorator.py (1)
  • connection (6-22)
src/cache/cache.py (1)
  • set_topic_summary (106-120)
src/cache/postgres_cache.py (1)
  • set_topic_summary (338-365)
src/cache/in_memory_cache.py (1)
  • set_topic_summary (105-121)
src/cache/noop_cache.py (1)
  • set_topic_summary (103-119)
🪛 GitHub Actions: Python linter
src/app/endpoints/streaming_query.py

[error] 572-572: R0915: Too many statements (51/50) (too-many-statements)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: e2e_tests
  • GitHub Check: build-pr
🔇 Additional comments (16)
src/cache/cache.py (1)

105-120: LGTM: Well-defined abstract method.

The set_topic_summary abstract method is well-documented with clear parameter descriptions and follows the existing patterns in the base class.

src/cache/postgres_cache.py (7)

49-57: LGTM: Well-designed conversations table.

The conversations table schema is well-structured with:

  • Composite primary key on (user_id, conversation_id)
  • Optional topic_summary field
  • Required last_message_timestamp for ordering

This supports efficient lookups and the new topic summary feature.


86-90: LGTM: Efficient conversation listing query.

The query correctly:

  • Extracts Unix epoch timestamp for consistent serialization
  • Orders by last_message_timestamp DESC (most recent first)
  • Returns the required fields for ConversationData

104-109: LGTM: Correct "touch" semantics for conversation updates.

The UPSERT_CONVERSATION_STATEMENT correctly implements a "touch" operation:

  • Creates new conversation entries with NULL topic_summary
  • Updates only last_message_timestamp on conflict (preserves existing topic_summary)

This ensures message activity updates timestamps without overwriting topic summaries set via set_topic_summary.
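The "touch" semantics can be demonstrated end to end with SQLite, whose `ON CONFLICT ... DO UPDATE` clause behaves like the PostgreSQL one. This is a self-contained sketch under stated assumptions: the table and column names mirror the PR's schema, but the statement text is illustrative, not the merged code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversations("
    "user_id TEXT, conversation_id TEXT, "
    "topic_summary TEXT, last_message_timestamp REAL, "
    "PRIMARY KEY (user_id, conversation_id))"
)

# "Touch" upsert: a new row starts with a NULL summary; on conflict only
# the timestamp is refreshed, so an existing topic_summary survives.
TOUCH = """
    INSERT INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
    VALUES (?, ?, NULL, ?)
    ON CONFLICT (user_id, conversation_id)
    DO UPDATE SET last_message_timestamp = excluded.last_message_timestamp
"""

conn.execute(TOUCH, ("u1", "c1", 100.0))  # first message creates the row
conn.execute(
    "UPDATE conversations SET topic_summary=? WHERE user_id=? AND conversation_id=?",
    ("Kubernetes upgrade help", "u1", "c1"),
)
conn.execute(TOUCH, ("u1", "c1", 200.0))  # later message: timestamp only

row = conn.execute(
    "SELECT topic_summary, last_message_timestamp FROM conversations"
).fetchone()
print(row)  # summary preserved, timestamp advanced
```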


254-259: LGTM: Conversation timestamp maintenance.

The insert_or_append method correctly maintains conversation metadata by upserting a conversation record with each message insertion. Passing None for topic_summary ensures existing summaries are preserved.


291-296: LGTM: Cascading delete for conversation cleanup.

The delete method correctly removes entries from both the cache and conversations tables, maintaining referential consistency.


304-335: LGTM: Correct ConversationData construction.

The list method correctly:

  • Converts PostgreSQL numeric (from EXTRACT(EPOCH)) to Python float
  • Constructs ConversationData objects with all required fields
  • Returns the updated type list[ConversationData]

337-365: LGTM: Robust topic summary storage.

The set_topic_summary method:

  • Properly upserts topic_summary and last_message_timestamp
  • Includes appropriate error handling and logging
  • Raises CacheError for database failures
src/app/endpoints/streaming_query.py (1)

710-721: Use llama_stack_model_id and wrap topic summary computation in try/except.

The topic summary generation has two issues:

  1. Wrong model identifier: Line 720 passes model_id to get_topic_summary, but should pass llama_stack_model_id (the provider/model identifier used for agent creation).
  2. Missing error handling: The call to get_topic_summary is not wrapped in a try/except block. If it fails, the entire streaming request will fail. Topic summaries should be best-effort—log errors and continue.

Apply this diff:

             # Get the initial topic summary for the conversation
             topic_summary = None
             with get_session() as session:
                 existing_conversation = (
                     session.query(UserConversation)
                     .filter_by(id=conversation_id)
                     .first()
                 )
                 if not existing_conversation:
-                    topic_summary = await get_topic_summary(
-                        query_request.query, client, model_id
-                    )
+                    try:
+                        # Use the actual agent model identifier for temp agent creation
+                        topic_summary = await get_topic_summary(
+                            query_request.query, client, llama_stack_model_id
+                        )
+                    except Exception:  # be resilient; summary is best-effort
+                        logger.exception("Failed to compute topic summary for %s", conversation_id)
+                        topic_summary = None
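The best-effort pattern recommended in the diff can be isolated into a small sketch. The `get_topic_summary_stub` below is hypothetical (it simulates a backend failure); the point is only that a failing summary call is logged and degraded to `None` rather than failing the whole request.

```python
from __future__ import annotations

import asyncio
import logging

logger = logging.getLogger(__name__)


async def get_topic_summary_stub(query: str) -> str:
    # Hypothetical stand-in for the real summary call; always fails here
    # to exercise the error path.
    raise RuntimeError("model backend unavailable")


async def best_effort_topic_summary(query: str) -> str | None:
    """Return a topic summary, or None if computing one fails."""
    try:
        return await get_topic_summary_stub(query)
    except Exception:  # summary is best-effort; never break the request
        logger.exception("Failed to compute topic summary")
        return None


result = asyncio.run(best_effort_topic_summary("How do I scale a deployment?"))
print(result)
```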
src/cache/sqlite_cache.py (7)

9-9: LGTM!

The addition of ConversationData to imports is correct and necessary for the new conversation listing functionality.


53-61: LGTM!

The conversations table schema is well-designed with appropriate columns and a composite primary key on (user_id, conversation_id).


89-94: LGTM!

The query correctly selects conversation metadata from the conversations table and orders by most recent first.


169-170: LGTM!

Conversations table creation is properly integrated into the initialization flow with appropriate logging.


237-255: LGTM!

The implementation correctly captures a single timestamp for both the cache entry and conversation update, ensuring consistency. Passing None for topic_summary in the upsert is appropriate since summaries are set via the dedicated set_topic_summary method.


285-290: LGTM!

Correctly maintains referential consistency by deleting both cache entries and the conversation record.


297-329: LGTM!

The method signature, type annotations, docstring, and implementation are all correct. The conversion from raw tuples to ConversationData objects properly matches the query's column order.

Comment on lines 96 to 109
INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT = """
INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
VALUES (?, ?, ?, ?)
"""

DELETE_CONVERSATION_STATEMENT = """
DELETE FROM conversations
WHERE user_id=? AND conversation_id=?
"""

UPSERT_CONVERSATION_STATEMENT = """
INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
VALUES (?, ?, ?, ?)
"""

🛠️ Refactor suggestion | 🟠 Major

Eliminate duplicate SQL statement.

UPSERT_CONVERSATION_STATEMENT (lines 106-109) is identical to INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT (lines 96-99). This violates DRY principles and creates maintenance burden.

Apply this diff to remove the duplication:

 INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT = """
     INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
     VALUES (?, ?, ?, ?)
     """

-DELETE_CONVERSATION_STATEMENT = """
-    DELETE FROM conversations
-     WHERE user_id=? AND conversation_id=?
-    """
-
-UPSERT_CONVERSATION_STATEMENT = """
+UPSERT_CONVERSATION_STATEMENT = INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT
+
+DELETE_CONVERSATION_STATEMENT = """
+    DELETE FROM conversations
+     WHERE user_id=? AND conversation_id=?
+    """
-    INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
-    VALUES (?, ?, ?, ?)
-    """
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT = """
INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
VALUES (?, ?, ?, ?)
"""
DELETE_CONVERSATION_STATEMENT = """
DELETE FROM conversations
WHERE user_id=? AND conversation_id=?
"""
UPSERT_CONVERSATION_STATEMENT = """
INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
VALUES (?, ?, ?, ?)
"""
INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT = """
INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
VALUES (?, ?, ?, ?)
"""
UPSERT_CONVERSATION_STATEMENT = INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT
DELETE_CONVERSATION_STATEMENT = """
DELETE FROM conversations
WHERE user_id=? AND conversation_id=?
"""
🤖 Prompt for AI Agents
In src/cache/sqlite_cache.py around lines 96 to 109,
UPSERT_CONVERSATION_STATEMENT is a duplicate of
INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT; delete the
UPSERT_CONVERSATION_STATEMENT constant and update any code that references it to
use INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT instead (or import/alias the single
constant where needed) so there is a single source of truth for the SQL
statement.

Signed-off-by: Stephanie <yangcao@redhat.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/cache/sqlite_cache.py (1)

53-61: Consider adding an index on last_message_timestamp.

The LIST_CONVERSATIONS_STATEMENT orders by last_message_timestamp DESC, but there's no index on this column. For better query performance as the conversations table grows, consider adding an index.

Add an index creation statement similar to the existing CREATE_INDEX:

CREATE_CONVERSATIONS_INDEX = """
    CREATE INDEX IF NOT EXISTS conversations_timestamps
        ON conversations (user_id, last_message_timestamp DESC)
    """

Then execute it in initialize_cache() after creating the conversations table:

 logger.info("Initializing table for conversations")
 cursor.execute(SQLiteCache.CREATE_CONVERSATIONS_TABLE)

+logger.info("Initializing index for conversations")
+cursor.execute(SQLiteCache.CREATE_CONVERSATIONS_INDEX)
+
 logger.info("Initializing index for cache")
 cursor.execute(SQLiteCache.CREATE_INDEX)
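Whether the suggested composite index is actually picked up by the listing query can be checked with `EXPLAIN QUERY PLAN`. This sketch assumes the PR's table shape and the reviewer's proposed index name (`conversations_timestamps`), neither of which is merged code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversations("
    "user_id TEXT, conversation_id TEXT, "
    "topic_summary TEXT, last_message_timestamp REAL, "
    "PRIMARY KEY (user_id, conversation_id))"
)
conn.execute(
    "CREATE INDEX IF NOT EXISTS conversations_timestamps "
    "ON conversations (user_id, last_message_timestamp DESC)"
)

# The index covers both the WHERE equality and the ORDER BY, so the
# planner can avoid a separate sort step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT conversation_id, topic_summary, last_message_timestamp "
    "  FROM conversations WHERE user_id=? "
    " ORDER BY last_message_timestamp DESC",
    ("u1",),
).fetchall()
for row in plan:
    print(row)
```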
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 630cbf8 and c91471a.

📒 Files selected for processing (4)
  • src/app/endpoints/streaming_query.py (4 hunks)
  • src/cache/cache.py (2 hunks)
  • src/cache/postgres_cache.py (7 hunks)
  • src/cache/sqlite_cache.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/app/endpoints/streaming_query.py
  • src/cache/cache.py
  • src/cache/postgres_cache.py
  • src/cache/sqlite_cache.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/streaming_query.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/app/endpoints/streaming_query.py
  • src/cache/cache.py
  • src/cache/postgres_cache.py
  • src/cache/sqlite_cache.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/streaming_query.py
🧬 Code graph analysis (3)
src/app/endpoints/streaming_query.py (3)
src/app/database.py (1)
  • get_session (34-40)
src/app/endpoints/query.py (2)
  • persist_user_conversation_details (102-134)
  • get_topic_summary (179-209)
src/models/database/conversations.py (1)
  • UserConversation (11-38)
src/cache/postgres_cache.py (3)
src/models/cache_entry.py (2)
  • CacheEntry (6-19)
  • ConversationData (22-33)
src/utils/connection_decorator.py (1)
  • connection (6-22)
src/cache/sqlite_cache.py (1)
  • set_topic_summary (332-357)
src/cache/sqlite_cache.py (5)
src/models/cache_entry.py (2)
  • CacheEntry (6-19)
  • ConversationData (22-33)
src/utils/connection_decorator.py (1)
  • connection (6-22)
src/cache/postgres_cache.py (1)
  • set_topic_summary (338-365)
src/cache/in_memory_cache.py (1)
  • set_topic_summary (105-121)
src/cache/noop_cache.py (1)
  • set_topic_summary (103-119)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (11)
src/cache/cache.py (3)

5-5: LGTM: Import addition is correct.

The ConversationData import is necessary to support the updated list() return type.


93-103: LGTM: Interface change is well-documented.

The list() method signature change from list[str] to list[ConversationData] is a breaking change to the abstract interface, but it's properly documented and aligns with the PR objectives to return topic summaries alongside conversation IDs.


105-120: LGTM: New abstract method is well-defined.

The set_topic_summary method provides a clear interface for storing topic summaries. The signature is consistent with other cache methods and properly documents all parameters.

src/app/endpoints/streaming_query.py (1)

710-721: Harden topic summary computation and use llama_stack_model_id.

The past review comment recommended wrapping get_topic_summary in a try/except block and passing llama_stack_model_id instead of model_id. Currently, line 720 uses model_id and lacks error handling, so a failure will disrupt the request.

Apply this diff:

             # Get the initial topic summary for the conversation
             topic_summary = None
             with get_session() as session:
                 existing_conversation = (
                     session.query(UserConversation)
                     .filter_by(id=conversation_id)
                     .first()
                 )
                 if not existing_conversation:
-                    topic_summary = await get_topic_summary(
-                        query_request.query, client, model_id
-                    )
+                    try:
+                        # Use the actual agent model identifier for temp agent creation
+                        topic_summary = await get_topic_summary(
+                            query_request.query, client, llama_stack_model_id
+                        )
+                    except Exception:  # be resilient; summary is best-effort
+                        logger.exception("Failed to compute topic summary for %s", conversation_id)
+                        topic_summary = None
src/cache/postgres_cache.py (1)

92-109: Remove duplicate SQL statement and fix topic_summary overwrite bug.

UPSERT_CONVERSATION_STATEMENT (lines 104-109) is identical to INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT (lines 92-97). Additionally, line 258 in insert_or_append passes None for topic_summary, which will overwrite any existing topic summary. Instead, insert_or_append should only update last_message_timestamp without touching topic_summary.

Apply this diff to consolidate the statements and fix the upsert logic:

 INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT = """
     INSERT INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
     VALUES (%s, %s, %s, CURRENT_TIMESTAMP)
     ON CONFLICT (user_id, conversation_id)
     DO UPDATE SET topic_summary = EXCLUDED.topic_summary, last_message_timestamp = EXCLUDED.last_message_timestamp
     """

 DELETE_CONVERSATION_STATEMENT = """
     DELETE FROM conversations
      WHERE user_id=%s AND conversation_id=%s
     """

-UPSERT_CONVERSATION_STATEMENT = """
-    INSERT INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
-    VALUES (%s, %s, %s, CURRENT_TIMESTAMP)
-    ON CONFLICT (user_id, conversation_id)
-    DO UPDATE SET last_message_timestamp = EXCLUDED.last_message_timestamp
-    """
+TOUCH_CONVERSATION_STATEMENT = """
+    INSERT INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
+    VALUES (%s, %s, NULL, CURRENT_TIMESTAMP)
+    ON CONFLICT (user_id, conversation_id)
+    DO UPDATE SET last_message_timestamp = EXCLUDED.last_message_timestamp
+    """

Then update line 257 to use TOUCH_CONVERSATION_STATEMENT:

                 # Update or insert conversation record with last_message_timestamp
                 cursor.execute(
-                    PostgresCache.UPSERT_CONVERSATION_STATEMENT,
+                    PostgresCache.TOUCH_CONVERSATION_STATEMENT,
                     (user_id, conversation_id, None),
                 )

Likely an incorrect or invalid review comment.

src/cache/sqlite_cache.py (6)

9-9: LGTM!

The import of ConversationData is necessary for the new topic summary functionality and follows the absolute import coding guideline.


89-94: LGTM!

The SQL statement correctly retrieves conversation metadata ordered by timestamp.


169-170: LGTM!

The conversations table initialization follows the same pattern as the cache table and is correctly placed in the initialization flow.


285-290: LGTM!

The deletion correctly removes records from both the cache and conversations tables, maintaining data consistency.


297-329: LGTM!

The method correctly constructs ConversationData objects with proper column mapping and type annotations.


96-109: Eliminate duplicate SQL statement.

UPSERT_CONVERSATION_STATEMENT (lines 106-109) is identical to INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT (lines 96-99). This violates DRY principles.

Apply this diff to remove the duplication:

 INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT = """
     INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
     VALUES (?, ?, ?, ?)
     """

-DELETE_CONVERSATION_STATEMENT = """
-    DELETE FROM conversations
-     WHERE user_id=? AND conversation_id=?
-    """
-
-UPSERT_CONVERSATION_STATEMENT = """
-    INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
-    VALUES (?, ?, ?, ?)
-    """
+DELETE_CONVERSATION_STATEMENT = """
+    DELETE FROM conversations
+     WHERE user_id=? AND conversation_id=?
+    """
+
+UPSERT_CONVERSATION_STATEMENT = INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT

Comment on lines +331 to +357
@connection
def set_topic_summary(
self,
user_id: str,
conversation_id: str,
topic_summary: str,
skip_user_id_check: bool = False,
) -> None:
"""Set the topic summary for the given conversation.
Args:
user_id: User identification.
conversation_id: Conversation ID unique for given user.
topic_summary: The topic summary to store.
skip_user_id_check: Skip user_id suid check.
"""
if self.connection is None:
logger.error("Cache is disconnected")
raise CacheError("set_topic_summary: cache is disconnected")

cursor = self.connection.cursor()
cursor.execute(
self.INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT,
(user_id, conversation_id, topic_summary, time()),
)
cursor.close()
self.connection.commit()

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the timestamp update behavior in set_topic_summary.

When set_topic_summary is called, it updates last_message_timestamp to the current time (line 354). If this method is called separately from message insertion (e.g., after generating a summary for an existing conversation), it will update the timestamp even though no new message was added.

Consider whether:

  1. The timestamp should be retrieved from the existing conversation record rather than using time(), or
  2. The timestamp should be passed as a parameter to avoid updating it when only setting the topic summary

To verify the actual usage pattern:


🏁 Script executed:

#!/bin/bash
# Description: Check how set_topic_summary is called in the codebase

# Search for all calls to set_topic_summary to understand the calling pattern
rg -n -A 3 -B 3 '\.set_topic_summary\(' --type=py

Length of output: 4764


🏁 Script executed:

#!/bin/bash
# Find the INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT definition
rg -n "INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT\s*=" --type=py -A 5

Length of output: 981


🏁 Script executed:

#!/bin/bash
# Check insert_or_append to see if it updates last_message_timestamp
rg -n "def insert_or_append" --type=py -A 30 src/cache/sqlite_cache.py

Length of output: 1139


🏁 Script executed:

#!/bin/bash
# Check the schema or table creation to understand the conversations table structure
rg -n "CREATE TABLE.*conversations" --type=py -A 10

Length of output: 1436


🏁 Script executed:

#!/bin/bash
# Find INSERT_CONVERSATION_HISTORY_STATEMENT to see how it handles conversations table
rg -n "INSERT_CONVERSATION_HISTORY_STATEMENT\s*=" --type=py -A 15

Length of output: 2060


🏁 Script executed:

#!/bin/bash
# Search for triggers that might update conversations table
rg -n "CREATE TRIGGER" --type=py -A 10

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for any INSERT INTO conversations statements
rg -n "INSERT.*INTO conversations" --type=py -A 3

Length of output: 1383


🏁 Script executed:

#!/bin/bash
# Check the full insert_or_append implementation to see all SQL executions
rg -n "def insert_or_append" --type=py -A 50 src/cache/sqlite_cache.py | head -60

Length of output: 1780


Confirmed: set_topic_summary incorrectly overwrites last_message_timestamp.

In endpoints.py (lines 208-213), insert_or_append is called first, which sets last_message_timestamp to the actual message time. Immediately after, set_topic_summary is called, which uses INSERT OR REPLACE (line 97). This deletes the existing conversation row and inserts a new one with time() as the timestamp, overwriting the correct timestamp that was just set.

Fix: Modify INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT to only update topic_summary, not last_message_timestamp:

INSERT INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
VALUES (?, ?, ?, COALESCE((SELECT last_message_timestamp FROM conversations WHERE user_id=? AND conversation_id=?), ?))
ON CONFLICT (user_id, conversation_id)
DO UPDATE SET topic_summary = EXCLUDED.topic_summary

Or use a simpler UPDATE-only approach when the conversation already exists from insert_or_append.
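The COALESCE-based fix above can be verified in isolation: setting a topic summary on an existing conversation must not move `last_message_timestamp`. The statement below is the reviewer's proposed SQL, exercised against an in-memory SQLite database; names follow the PR's schema but the harness is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversations("
    "user_id TEXT, conversation_id TEXT, "
    "topic_summary TEXT, last_message_timestamp REAL, "
    "PRIMARY KEY (user_id, conversation_id))"
)

# Proposed fix: on conflict, update only topic_summary; the COALESCE keeps
# the existing timestamp when the row already exists.
SET_SUMMARY = """
    INSERT INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
    VALUES (?, ?, ?, COALESCE(
        (SELECT last_message_timestamp FROM conversations
          WHERE user_id=? AND conversation_id=?), ?))
    ON CONFLICT (user_id, conversation_id)
    DO UPDATE SET topic_summary = excluded.topic_summary
"""

# Simulate insert_or_append having already recorded the message time.
conn.execute(
    "INSERT INTO conversations VALUES (?, ?, NULL, ?)", ("u1", "c1", 100.0)
)
# Now set the summary; 999.0 is the "current time" that must NOT win.
conn.execute(SET_SUMMARY, ("u1", "c1", "Pod scheduling", "u1", "c1", 999.0))

summary_row = conn.execute(
    "SELECT topic_summary, last_message_timestamp FROM conversations"
).fetchone()
print(summary_row)  # timestamp still reflects the last message
```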

🤖 Prompt for AI Agents
In src/cache/sqlite_cache.py around lines 331 to 357, set_topic_summary
currently uses an INSERT OR REPLACE that overwrites last_message_timestamp with
time(), erasing the correct timestamp set by insert_or_append; change the
behavior so topic updates do not modify last_message_timestamp — either alter
INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT to preserve the existing
last_message_timestamp (e.g. insert with COALESCE((SELECT last_message_timestamp
...), ?) and ON CONFLICT DO UPDATE SET topic_summary = EXCLUDED.topic_summary)
or replace the INSERT/REPLACE with logic that does an UPDATE-only when the
conversation exists and a separate INSERT that sets the correct timestamp when
it doesn’t; implement the chosen SQL change and remove any assignment of time()
to last_message_timestamp in set_topic_summary so it no longer overwrites the
true message time.

Signed-off-by: Stephanie <yangcao@redhat.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/utils/endpoints.py (2)

146-157: Enhance docstring with Args and Returns sections.

The function correctly retrieves the topic summary prompt with appropriate fallback, but the docstring should follow Google Python style with Args and Returns sections.

Apply this diff:

 def get_topic_summary_system_prompt(config: AppConfig) -> str:
-    """Get the topic summary system prompt."""
+    """Get the topic summary system prompt.
+
+    Args:
+        config: Application configuration which may include a custom_profile with prompts.
+
+    Returns:
+        str: The resolved topic summary system prompt.
+    """
     # profile takes precedence for setting prompt
     if (
         config.customization is not None
         and config.customization.custom_profile is not None
     ):
         prompt = config.customization.custom_profile.get_prompts().get("topic_summary")
         if prompt:
             return prompt
 
     return constants.DEFAULT_TOPIC_SUMMARY_SYSTEM_PROMPT

Based on coding guidelines.


184-213: Consider renaming _skip_userid_check parameter.

The leading underscore in _skip_userid_check is unconventional for a function parameter and typically indicates a private/internal variable. Consider using skip_userid_check without the underscore for consistency with other methods in the codebase (e.g., line 339 in set_topic_summary).

Apply this diff:

 def store_conversation_into_cache(
     config: AppConfig,
     user_id: str,
     conversation_id: str,
     provider_id: str,
     model_id: str,
     query: str,
     response: str,
-    _skip_userid_check: bool,
+    skip_userid_check: bool,
     topic_summary: str | None,
 ) -> None:
     """Store one part of conversation into conversation history cache."""
     if config.conversation_cache_configuration.type is not None:
         cache = config.conversation_cache
         if cache is None:
             logger.warning("Conversation cache configured but not initialized")
             return
         cache_entry = CacheEntry(
             query=query,
             response=response,
             provider=provider_id,
             model=model_id,
         )
         cache.insert_or_append(
-            user_id, conversation_id, cache_entry, _skip_userid_check
+            user_id, conversation_id, cache_entry, skip_userid_check
         )
         if topic_summary and len(topic_summary) > 0:
             cache.set_topic_summary(
-                user_id, conversation_id, topic_summary, _skip_userid_check
+                user_id, conversation_id, topic_summary, skip_userid_check
             )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c91471a and a44bf7b.

📒 Files selected for processing (2)
  • src/cache/sqlite_cache.py (7 hunks)
  • src/utils/endpoints.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/utils/endpoints.py
  • src/cache/sqlite_cache.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/utils/endpoints.py
  • src/cache/sqlite_cache.py
🧬 Code graph analysis (2)
src/utils/endpoints.py (8)
src/models/config.py (2)
  • config (138-144)
  • get_prompts (442-444)
src/configuration.py (1)
  • customization (119-123)
src/cache/sqlite_cache.py (3)
  • get (181-215)
  • insert_or_append (218-260)
  • set_topic_summary (334-359)
src/cache/cache.py (3)
  • get (46-58)
  • insert_or_append (61-75)
  • set_topic_summary (106-120)
src/cache/in_memory_cache.py (3)
  • get (31-46)
  • insert_or_append (49-66)
  • set_topic_summary (105-121)
src/cache/postgres_cache.py (3)
  • get (184-217)
  • insert_or_append (220-263)
  • set_topic_summary (338-365)
src/cache/noop_cache.py (3)
  • get (29-44)
  • insert_or_append (47-64)
  • set_topic_summary (103-119)
src/utils/suid.py (1)
  • get_suid (6-16)
src/cache/sqlite_cache.py (3)
src/models/cache_entry.py (1)
  • ConversationData (22-33)
src/utils/connection_decorator.py (1)
  • connection (6-22)
src/cache/cache.py (1)
  • set_topic_summary (106-120)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (10)
src/cache/sqlite_cache.py (9)

9-9: LGTM!

The ConversationData import is correctly added to support the new return type for the list method.


53-61: LGTM!

The conversations table schema is well-designed with appropriate fields and a composite primary key for efficient lookups by user and conversation.


106-111: LGTM!

The UPSERT_CONVERSATION_STATEMENT correctly uses ON CONFLICT DO UPDATE to update only last_message_timestamp while preserving topic_summary.


171-172: LGTM!

The conversations table is correctly initialized during cache setup.


239-257: LGTM!

The timestamp is correctly captured once and used consistently for both the cache entry and conversation record. The use of UPSERT_CONVERSATION_STATEMENT ensures that topic_summary is preserved when updating last_message_timestamp.


287-292: LGTM!

Correctly deletes the conversation record to maintain consistency when removing conversation history.


299-331: LGTM!

The list method correctly returns ConversationData objects with all three fields populated from the conversations table.


96-99: Fix INSERT OR REPLACE to preserve last_message_timestamp.

INSERT OR REPLACE deletes the existing row and inserts a new one, which will overwrite last_message_timestamp when only updating topic_summary. This is problematic when set_topic_summary is called after insert_or_append has already set the correct timestamp.

Apply this diff to use ON CONFLICT instead:

-INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT = """
-    INSERT OR REPLACE INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
-    VALUES (?, ?, ?, ?)
-    """
+INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT = """
+    INSERT INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
+    VALUES (?, ?, ?, ?)
+    ON CONFLICT (user_id, conversation_id)
+    DO UPDATE SET topic_summary = excluded.topic_summary
+    """

333-359: set_topic_summary should only UPDATE topic_summary, not touch last_message_timestamp.

Even with the fix to INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT to use ON CONFLICT DO UPDATE SET topic_summary = excluded.topic_summary, passing time() at line 356 is a code smell. Since insert_or_append always runs before set_topic_summary (as seen in endpoints.py), the conversation record already exists and only needs its topic_summary field updated.

Apply this diff to use an UPDATE-only approach:

-    INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT = """
-        INSERT INTO conversations(user_id, conversation_id, topic_summary, last_message_timestamp)
-        VALUES (?, ?, ?, ?)
-        ON CONFLICT (user_id, conversation_id)
-        DO UPDATE SET topic_summary = excluded.topic_summary
-        """
+    UPDATE_TOPIC_SUMMARY_STATEMENT = """
+        UPDATE conversations
+        SET topic_summary = ?
+        WHERE user_id = ? AND conversation_id = ?
+        """
     def set_topic_summary(
         self,
         user_id: str,
         conversation_id: str,
         topic_summary: str,
         skip_user_id_check: bool = False,
     ) -> None:
         """Set the topic summary for the given conversation.
 
         Args:
             user_id: User identification.
             conversation_id: Conversation ID unique for given user.
             topic_summary: The topic summary to store.
             skip_user_id_check: Skip user_id suid check.
         """
         if self.connection is None:
             logger.error("Cache is disconnected")
             raise CacheError("set_topic_summary: cache is disconnected")
 
         cursor = self.connection.cursor()
         cursor.execute(
-            self.INSERT_OR_UPDATE_TOPIC_SUMMARY_STATEMENT,
-            (user_id, conversation_id, topic_summary, time()),
+            self.UPDATE_TOPIC_SUMMARY_STATEMENT,
+            (topic_summary, user_id, conversation_id),
         )
         cursor.close()
         self.connection.commit()
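One side benefit of the UPDATE-only approach, sketched below under the same illustrative schema: `cursor.rowcount` exposes the edge case where `set_topic_summary` runs before `insert_or_append` has created the row, which the upsert variants silently paper over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE conversations(
           user_id TEXT,
           conversation_id TEXT,
           topic_summary TEXT,
           PRIMARY KEY (user_id, conversation_id)
       )"""
)

UPDATE_TOPIC_SUMMARY_STATEMENT = (
    "UPDATE conversations SET topic_summary = ? "
    "WHERE user_id = ? AND conversation_id = ?"
)

# No row exists yet: the UPDATE matches nothing and does nothing.
cur = conn.execute(UPDATE_TOPIC_SUMMARY_STATEMENT, ("summary", "u1", "c1"))
print(cur.rowcount)  # -> 0, a condition the caller could log or assert on

# After the row exists (as insert_or_append guarantees), it is updated.
conn.execute("INSERT INTO conversations VALUES ('u1', 'c1', NULL)")
cur = conn.execute(UPDATE_TOPIC_SUMMARY_STATEMENT, ("summary", "u1", "c1"))
print(cur.rowcount)  # -> 1
```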
src/utils/endpoints.py (1)

312-342: Fix return tuple ordering and docstring to match get_agent.

The function returns (agent, session_id, conversation_id) which is inverted relative to get_agent's (agent, conversation_id, session_id). This inconsistency risks call-site mistakes where the returned values are unpacked in the wrong order. Additionally, the docstring incorrectly states the function returns two values when it returns three.

Apply this diff to match get_agent's ordering and fix the docstring:

 async def get_temp_agent(
     client: AsyncLlamaStackClient,
     model_id: str,
     system_prompt: str,
 ) -> tuple[AsyncAgent, str, str]:
     """Create a temporary agent with new agent_id and session_id.
 
     This function creates a new agent without persistence, shields, or tools.
     Useful for temporary operations or one-off queries, such as validating a
     question or generating a summary.
+
     Args:
         client: The AsyncLlamaStackClient to use for the request.
         model_id: The ID of the model to use.
         system_prompt: The system prompt/instructions for the agent.
+
     Returns:
-        tuple[AsyncAgent, str]: A tuple containing the agent and session_id.
+        tuple[AsyncAgent, str, str]: A tuple of (agent, conversation_id, session_id).
     """
     logger.debug("Creating temporary agent")
     agent = AsyncAgent(
         client,  # type: ignore[arg-type]
         model=model_id,
         instructions=system_prompt,
         enable_session_persistence=False,  # Temporary agent doesn't need persistence
     )
     await agent.initialize()
 
     # Generate new IDs for the temporary agent
     conversation_id = agent.agent_id
     session_id = await agent.create_session(get_suid())
 
-    return agent, session_id, conversation_id
+    return agent, conversation_id, session_id

Important: This change will break existing call sites. Ensure all usages of get_temp_agent are updated to match the new ordering.

Run the following script to find all call sites:

#!/bin/bash
# Description: Find all calls to get_temp_agent to verify they handle the new tuple ordering.

rg -n -A 3 -B 1 'get_temp_agent\(' --type=py

Contributor

@tisnik tisnik left a comment
It looks nice, LGTM.
Actually, with the new table storing the summary, the DB operations are much cleaner (no more need to GROUP BY for the list query, etc.)

Signed-off-by: Stephanie <yangcao@redhat.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
docs/openapi.json (1)

1424-1454: Clarify timestamp format and consider making topic_summary optional.

Two observations:

  1. Timestamp format ambiguity: Line 1442 defines last_message_timestamp as type number without specifying whether it represents Unix epoch seconds or milliseconds. Consider adding a description or format hint.

  2. Required nullable field: Lines 1447-1450 mark topic_summary as both required and nullable (anyOf with string or null). This means the field must always be present but can have a null value. If topic_summary is genuinely optional for new conversations (before summarization), consider removing it from the required array and keeping the nullable type.

Consider this adjustment for clarity:

         "last_message_timestamp": {
             "type": "number",
-            "title": "Last Message Timestamp"
+            "title": "Last Message Timestamp",
+            "description": "Unix timestamp in seconds of the last message"
         }
     },
     "type": "object",
     "required": [
         "conversation_id",
-        "topic_summary",
         "last_message_timestamp"
     ],
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a44bf7b and 17c6518.

📒 Files selected for processing (1)
  • docs/openapi.json (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (3)
docs/openapi.json (3)

1572-1594: LGTM!

The topic_summary field addition to ConversationDetails is well-structured:

  • Correctly marked as optional (not in required array)
  • Allows null values with anyOf type definition
  • Includes clear description and example
  • Description updated to reflect the new field

1655-1678: LGTM!

The ConversationsListResponse updates properly integrate topic_summary:

  • Description updated with comprehensive documentation including code example
  • Both example conversations include realistic topic_summary values
  • Consistent with the ConversationDetails schema updates

1680-1696: LGTM!

The ConversationsListResponseV2 schema correctly references the new ConversationData schema for its items. This creates a clean API contract where v2 returns a simplified data structure compared to v1's ConversationDetails.

Comment on lines +747 to +749
"conversation_id": "123e4567-e89b-12d3-a456-426614174000",
"topic_summary": "This is a topic summary",
"last_message_timestamp": "2024-01-01T00:00:00Z"
⚠️ Potential issue | 🟠 Major

Fix type inconsistency in example data.

The last_message_timestamp field is defined as type number in the ConversationData schema (line 1442), but the example shows a string value "2024-01-01T00:00:00Z". This inconsistency could mislead API consumers.

Apply this diff to use a numeric timestamp:

             "conversations": [
                 {
                     "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
                     "topic_summary": "This is a topic summary",
-                    "last_message_timestamp": "2024-01-01T00:00:00Z"
+                    "last_message_timestamp": 1704067200
                 }
             ]

Alternatively, if timestamps should be ISO 8601 strings throughout the API, update the schema definition at line 1442 to use "type": "string" with "format": "date-time".
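For reference, the two representations are interconvertible; a quick stdlib check of the epoch value corresponding to the ISO string used in the example:

```python
from datetime import datetime, timezone

# 2024-01-01T00:00:00Z expressed as Unix epoch seconds.
dt = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(int(dt.timestamp()))  # -> 1704067200
```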

🤖 Prompt for AI Agents
In docs/openapi.json around lines 747 to 749, the example for ConversationData
uses an ISO string for last_message_timestamp while the schema at line 1442
declares it as a number; update the example to use a numeric UNIX timestamp
(e.g., 1704067200) to match the schema, or if the API should use ISO 8601
strings instead, change the ConversationData schema at line 1442 to "type":
"string" and add "format": "date-time" so the example remains valid.

Contributor

@tisnik tisnik left a comment

LGTM

@tisnik tisnik changed the title [RHDHPAI-978]Topic summary of initial query [RHDHPAI-978] Topic summary of initial query Sep 30, 2025
@tisnik tisnik merged commit b5eb048 into lightspeed-core:main Sep 30, 2025
18 of 19 checks passed