Merged
64 changes: 57 additions & 7 deletions docs/openapi.json
@@ -744,7 +744,9 @@
},
"conversations": [
{
"conversation_id": "123e4567-e89b-12d3-a456-426614174000"
"conversation_id": "123e4567-e89b-12d3-a456-426614174000",
"topic_summary": "This is a topic summary",
"last_message_timestamp": "2024-01-01T00:00:00Z"
Comment on lines +747 to +749

⚠️ Potential issue | 🟠 Major

Fix type inconsistency in example data.

The last_message_timestamp field is defined as type number in the ConversationData schema (line 1442), but the example shows a string value "2024-01-01T00:00:00Z". This inconsistency could mislead API consumers.

Apply this diff to use a numeric timestamp:

             "conversations": [
                 {
                     "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
                     "topic_summary": "This is a topic summary",
-                    "last_message_timestamp": "2024-01-01T00:00:00Z"
+                    "last_message_timestamp": 1704067200
                 }
             ]

Alternatively, if timestamps should be ISO 8601 strings throughout the API, update the schema definition at line 1442 to use "type": "string" with "format": "date-time".
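If the string route is chosen, here is a minimal sketch of what the underlying model change could look like, assuming the spec is generated from a Pydantic model; the class mirrors the published ConversationData schema but is an illustration, not the repository's actual code.

```python
from datetime import datetime

from pydantic import BaseModel


class ConversationData(BaseModel):
    conversation_id: str
    topic_summary: str | None
    # A datetime field serializes as "type": "string" with "format": "date-time",
    # which makes the ISO 8601 example value valid; a float field is what
    # currently yields "type": "number".
    last_message_timestamp: datetime


print(ConversationData.model_json_schema()["properties"]["last_message_timestamp"])
# {'format': 'date-time', 'title': 'Last Message Timestamp', 'type': 'string'}
```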

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
                     "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
                     "topic_summary": "This is a topic summary",
-                    "last_message_timestamp": "2024-01-01T00:00:00Z"
+                    "last_message_timestamp": 1704067200
🤖 Prompt for AI Agents
In docs/openapi.json around lines 747 to 749, the example for ConversationData
uses an ISO string for last_message_timestamp while the schema at line 1442
declares it as a number; update the example to use a numeric UNIX timestamp
(e.g., 1704067200) to match the schema, or if the API should use ISO 8601
strings instead, change the ConversationData schema at line 1442 to "type":
"string" and add "format": "date-time" so the example remains valid.

}
]
}
@@ -1419,6 +1421,37 @@
"title": "ConversationCacheConfiguration",
"description": "Conversation cache configuration."
},
"ConversationData": {
"properties": {
"conversation_id": {
"type": "string",
"title": "Conversation Id"
},
"topic_summary": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Topic Summary"
},
"last_message_timestamp": {
"type": "number",
"title": "Last Message Timestamp"
}
},
"type": "object",
"required": [
"conversation_id",
"topic_summary",
"last_message_timestamp"
],
"title": "ConversationData",
"description": "Model representing conversation data returned by cache list operations.\n\nAttributes:\n conversation_id: The conversation ID\n topic_summary: The topic summary for the conversation (can be None)\n last_message_timestamp: The timestamp of the last message in the conversation"
},
"ConversationDeleteResponse": {
"properties": {
"conversation_id": {
@@ -1536,14 +1569,29 @@
"openai",
"gemini"
]
},
"topic_summary": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Topic Summary",
"description": "Topic summary for the conversation",
"examples": [
"Openshift Microservices Deployment Strategies"
]
}
},
"type": "object",
"required": [
"conversation_id"
],
"title": "ConversationDetails",
"description": "Model representing the details of a user conversation.\n\nAttributes:\n conversation_id: The conversation ID (UUID).\n created_at: When the conversation was created.\n last_message_at: When the last message was sent.\n message_count: Number of user messages in the conversation.\n last_used_model: The last model used for the conversation.\n last_used_provider: The provider of the last used model.\n\nExample:\n ```python\n conversation = ConversationDetails(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\"\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n last_used_model=\"gemini/gemini-2.0-flash\",\n last_used_provider=\"gemini\",\n )\n ```"
"description": "Model representing the details of a user conversation.\n\nAttributes:\n conversation_id: The conversation ID (UUID).\n created_at: When the conversation was created.\n last_message_at: When the last message was sent.\n message_count: Number of user messages in the conversation.\n last_used_model: The last model used for the conversation.\n last_used_provider: The provider of the last used model.\n topic_summary: The topic summary for the conversation.\n\nExample:\n ```python\n conversation = ConversationDetails(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\"\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n last_used_model=\"gemini/gemini-2.0-flash\",\n last_used_provider=\"gemini\",\n topic_summary=\"Openshift Microservices Deployment Strategies\",\n )\n ```"
},
"ConversationResponse": {
"properties": {
Expand Down Expand Up @@ -1604,7 +1652,7 @@
"conversations"
],
"title": "ConversationsListResponse",
"description": "Model representing a response for listing conversations of a user.\n\nAttributes:\n conversations: List of conversation details associated with the user.\n\nExample:\n ```python\n conversations_list = ConversationsListResponse(\n conversations=[\n ConversationDetails(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\",\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n last_used_model=\"gemini/gemini-2.0-flash\",\n last_used_provider=\"gemini\",\n ),\n ConversationDetails(\n conversation_id=\"456e7890-e12b-34d5-a678-901234567890\"\n created_at=\"2024-01-01T01:00:00Z\",\n message_count=2,\n last_used_model=\"gemini/gemini-2.0-flash\",\n last_used_provider=\"gemini\",\n )\n ]\n )\n ```",
"description": "Model representing a response for listing conversations of a user.\n\nAttributes:\n conversations: List of conversation details associated with the user.\n\nExample:\n ```python\n conversations_list = ConversationsListResponse(\n conversations=[\n ConversationDetails(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\",\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n last_used_model=\"gemini/gemini-2.0-flash\",\n last_used_provider=\"gemini\",\n topic_summary=\"Openshift Microservices Deployment Strategies\",\n ),\n ConversationDetails(\n conversation_id=\"456e7890-e12b-34d5-a678-901234567890\"\n created_at=\"2024-01-01T01:00:00Z\",\n message_count=2,\n last_used_model=\"gemini/gemini-2.0-flash\",\n last_used_provider=\"gemini\",\n topic_summary=\"RHDH Purpose Summary\",\n )\n ]\n )\n ```",
"examples": [
{
"conversations": [
@@ -1614,14 +1662,16 @@
"last_message_at": "2024-01-01T00:05:00Z",
"last_used_model": "gemini/gemini-2.0-flash",
"last_used_provider": "gemini",
"message_count": 5
"message_count": 5,
"topic_summary": "Openshift Microservices Deployment Strategies"
},
{
"conversation_id": "456e7890-e12b-34d5-a678-901234567890",
"created_at": "2024-01-01T01:00:00Z",
"last_used_model": "gemini/gemini-2.5-flash",
"last_used_provider": "gemini",
"message_count": 2
"message_count": 2,
"topic_summary": "RHDH Purpose Summary"
}
]
}
@@ -1631,7 +1681,7 @@
"properties": {
"conversations": {
"items": {
"type": "string"
"$ref": "#/components/schemas/ConversationData"
},
"type": "array",
"title": "Conversations"
@@ -1642,7 +1692,7 @@
"conversations"
],
"title": "ConversationsListResponseV2",
"description": "Model representing a response for listing conversations of a user.\n\nAttributes:\n conversations: List of conversation IDs associated with the user."
"description": "Model representing a response for listing conversations of a user.\n\nAttributes:\n conversations: List of conversation data associated with the user."
},
"CustomProfile": {
"properties": {
1 change: 1 addition & 0 deletions src/app/endpoints/conversations.py
@@ -214,6 +214,7 @@ async def get_conversations_list_endpoint_handler(
message_count=conv.message_count,
last_used_model=conv.last_used_model,
last_used_provider=conv.last_used_provider,
topic_summary=conv.topic_summary,
)
for conv in user_conversations
]
21 changes: 17 additions & 4 deletions src/app/endpoints/conversations_v2.py
@@ -83,6 +83,8 @@
"conversations": [
{
"conversation_id": "123e4567-e89b-12d3-a456-426614174000",
"topic_summary": "This is a topic summary",
"last_message_timestamp": "2024-01-01T00:00:00Z",
}
]
}
@@ -102,6 +104,8 @@ async def get_conversations_list_endpoint_handler(

logger.info("Retrieving conversations for user %s", user_id)

skip_userid_check = auth[2]

if configuration.conversation_cache is None:
logger.warning("Converastion cache is not configured")
raise HTTPException(
@@ -112,7 +116,7 @@
},
)

conversations = configuration.conversation_cache.list(user_id, False)
conversations = configuration.conversation_cache.list(user_id, skip_userid_check)
logger.info("Conversations for user %s: %s", user_id, len(conversations))

return ConversationsListResponseV2(conversations=conversations)
@@ -132,6 +136,8 @@ async def get_conversation_endpoint_handler(
user_id = auth[0]
logger.info("Retrieving conversation %s for user %s", conversation_id, user_id)

skip_userid_check = auth[2]

if configuration.conversation_cache is None:
logger.warning("Converastion cache is not configured")
raise HTTPException(
@@ -144,7 +150,9 @@

check_conversation_existence(user_id, conversation_id)

conversation = configuration.conversation_cache.get(user_id, conversation_id, False)
conversation = configuration.conversation_cache.get(
user_id, conversation_id, skip_userid_check
)
chat_history = [transform_chat_message(entry) for entry in conversation]

return ConversationResponse(
@@ -168,6 +176,8 @@ async def delete_conversation_endpoint_handler(
user_id = auth[0]
logger.info("Deleting conversation %s for user %s", conversation_id, user_id)

skip_userid_check = auth[2]

if configuration.conversation_cache is None:
logger.warning("Converastion cache is not configured")
raise HTTPException(
@@ -181,7 +191,9 @@
check_conversation_existence(user_id, conversation_id)

logger.info("Deleting conversation %s for user %s", conversation_id, user_id)
deleted = configuration.conversation_cache.delete(user_id, conversation_id, False)
deleted = configuration.conversation_cache.delete(
user_id, conversation_id, skip_userid_check
)

if deleted:
return ConversationDeleteResponse(
@@ -215,7 +227,8 @@ def check_conversation_existence(user_id: str, conversation_id: str) -> None:
if configuration.conversation_cache is None:
return
conversations = configuration.conversation_cache.list(user_id, False)
if conversation_id not in conversations:
conversation_ids = [conv.conversation_id for conv in conversations]
if conversation_id not in conversation_ids:
logger.error("No conversation found for conversation ID %s", conversation_id)
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
59 changes: 56 additions & 3 deletions src/app/endpoints/query.py
@@ -46,6 +46,8 @@
from utils.endpoints import (
check_configuration_loaded,
get_agent,
get_topic_summary_system_prompt,
get_temp_agent,
get_system_prompt,
store_conversation_into_cache,
validate_conversation_ownership,
@@ -98,7 +100,11 @@ def is_transcripts_enabled() -> bool:


def persist_user_conversation_details(
user_id: str, conversation_id: str, model: str, provider_id: str
user_id: str,
conversation_id: str,
model: str,
provider_id: str,
topic_summary: Optional[str],
) -> None:
Comment on lines +103 to 108

⚠️ Potential issue | 🟠 Major

Guard against None for non-null DB column topic_summary.

UserConversation.topic_summary is defined as non-null with default="". Passing None here can violate the NOT NULL constraint. Normalize to an empty string at creation and modernize the type hint.

Apply:

-def persist_user_conversation_details(
-    user_id: str,
-    conversation_id: str,
-    model: str,
-    provider_id: str,
-    topic_summary: Optional[str],
-) -> None:
+def persist_user_conversation_details(
+    user_id: str,
+    conversation_id: str,
+    model: str,
+    provider_id: str,
+    topic_summary: str | None,
+) -> None:
@@
-            conversation = UserConversation(
+            conversation = UserConversation(
                 id=conversation_id,
                 user_id=user_id,
                 last_used_model=model,
                 last_used_provider=provider_id,
-                topic_summary=topic_summary,
+                topic_summary=topic_summary or "",
                 message_count=1,
             )

Also applies to: 116-123

🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 103-108 (and also apply same change
to lines 116-123), the function accepts topic_summary: Optional[str] but
UserConversation.topic_summary is NOT NULL with default "". Normalize the value
before creating the DB object by coalescing None to an empty string (e.g.,
topic_summary = topic_summary or "") and update the type hint to str (or keep
Optional in public API but convert immediately to str internally) so the DB
always receives a non-null string.
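For reference, a minimal sketch of the normalization the comment asks for, assuming UserConversation.topic_summary is a NOT NULL column that defaults to an empty string; the helper name below is hypothetical, not part of the repository.

```python
def normalize_topic_summary(topic_summary: str | None) -> str:
    """Coalesce None to the column's empty-string default before persisting."""
    return topic_summary or ""


assert normalize_topic_summary(None) == ""
assert normalize_topic_summary("Openshift Microservices Deployment Strategies") == (
    "Openshift Microservices Deployment Strategies"
)
```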

"""Associate conversation to user in the database."""
with get_session() as session:
@@ -112,6 +118,7 @@ def persist_user_conversation_details(
user_id=user_id,
last_used_model=model,
last_used_provider=provider_id,
topic_summary=topic_summary,
message_count=1,
)
session.add(conversation)
@@ -169,9 +176,42 @@ def evaluate_model_hints(
return model_id, provider_id


async def get_topic_summary(
question: str, client: AsyncLlamaStackClient, model_id: str
) -> str:
"""Get a topic summary for a question.

Args:
question: The question to be validated.
client: The AsyncLlamaStackClient to use for the request.
model_id: The ID of the model to use.
Returns:
str: The topic summary for the question.
"""
topic_summary_system_prompt = get_topic_summary_system_prompt(configuration)
agent, session_id, _ = await get_temp_agent(
client, model_id, topic_summary_system_prompt
)
response = await agent.create_turn(
messages=[UserMessage(role="user", content=question)],
session_id=session_id,
stream=False,
toolgroups=None,
)
response = cast(Turn, response)
return (
interleaved_content_as_str(response.output_message.content)
if (
getattr(response, "output_message", None) is not None
and getattr(response.output_message, "content", None) is not None
)
else ""
)


@router.post("/query", responses=query_response)
@authorize(Action.QUERY)
async def query_endpoint_handler(
async def query_endpoint_handler( # pylint: disable=R0914
request: Request,
query_request: QueryRequest,
auth: Annotated[AuthTuple, Depends(auth_dependency)],
@@ -200,7 +240,7 @@ async def query_endpoint_handler(
# log Llama Stack configuration
logger.info("Llama stack config: %s", configuration.llama_stack_configuration)

user_id, _, _, token = auth
user_id, _, _skip_userid_check, token = auth

user_conversation: UserConversation | None = None
if query_request.conversation_id:
@@ -251,6 +291,16 @@
# Update metrics for the LLM call
metrics.llm_calls_total.labels(provider_id, model_id).inc()

# Get the initial topic summary for the conversation
topic_summary = None
with get_session() as session:
existing_conversation = (
session.query(UserConversation).filter_by(id=conversation_id).first()
)
if not existing_conversation:
topic_summary = await get_topic_summary(
query_request.query, client, model_id
)
Comment on lines +294 to +303

⚠️ Potential issue | 🟠 Major

Use llama_stack_model_id instead of model_id for topic summary generation.

The code passes model_id (line 302) to get_topic_summary, which then calls get_temp_agent. However, model_id at this point is the short model label (e.g., "llama3"), not the full Llama Stack identifier. get_temp_agent creates an AsyncAgent with the model parameter, which expects the full provider/model format (e.g., "provider/llama3").

Apply this diff:

         # Get the initial topic summary for the conversation
         topic_summary = None
         with get_session() as session:
             existing_conversation = (
                 session.query(UserConversation).filter_by(id=conversation_id).first()
             )
             if not existing_conversation:
                 topic_summary = await get_topic_summary(
-                    query_request.query, client, model_id
+                    query_request.query, client, llama_stack_model_id
                 )

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 294 to 303, the code calls
get_topic_summary with model_id (the short label) which is incorrect because
get_temp_agent expects the full Llama Stack identifier; replace the model_id
argument with llama_stack_model_id when calling get_topic_summary, ensuring
llama_stack_model_id is in scope (or compute/lookup it beforehand) so
get_temp_agent/AsyncAgent receives the full provider/model string.
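To illustrate the distinction the comment draws, here is a hedged sketch of the two identifiers; the provider/model construction below is an assumption about the naming convention, not the repository's actual lookup logic.

```python
provider_id = "gemini"
model_id = "gemini-2.0-flash"                       # short model label
llama_stack_model_id = f"{provider_id}/{model_id}"  # full Llama Stack identifier

# get_temp_agent (and the AsyncAgent it builds) expects the full form, e.g.:
# agent, session_id, _ = await get_temp_agent(client, llama_stack_model_id, prompt)
print(llama_stack_model_id)  # gemini/gemini-2.0-flash
```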

# Convert RAG chunks to dictionary format once for reuse
logger.info("Processing RAG chunks...")
rag_chunks_dict = [chunk.model_dump() for chunk in summary.rag_chunks]
@@ -278,6 +328,7 @@ async def query_endpoint_handler(
conversation_id=conversation_id,
model=model_id,
provider_id=provider_id,
topic_summary=topic_summary,
)

store_conversation_into_cache(
@@ -288,6 +339,8 @@
model_id,
query_request.query,
summary.llm_response,
_skip_userid_check,
topic_summary,
)

# Convert tool calls to response format