
Prevent Redundant API Calls via Query/URL Deduplication #12

Open
thanay-sisir wants to merge 5 commits into Pokee-AI:main from thanay-sisir:URL_deduplication

Conversation

@thanay-sisir

🚀 Query/URL Deduplication Feature

🎯 Executive Summary

I implemented a session-level deduplication mechanism to prevent redundant web searches and URL reads within the BaseDeepResearchAgent. This change significantly improves research performance, reduces unnecessary API costs, and speeds up response times.


⚠️ The Problem (Why)

  • Redundant Tool Calls: In multi-turn research, the agent frequently made the same search or read the same URL multiple times.
  • High Operational Cost: This resulted in wasted processing time, increased latency, and unnecessarily high usage costs for search and web scraping APIs.
  • Poor User Experience: The inefficiency slowed down the overall research loop, leading to longer wait times for the user.

🛠️ The Solution (How)

I introduced session-scoped state tracking using two Python set() objects: _seen_queries and _seen_urls.

  • Tracking: These sets are initialized at the start of a research session and automatically cleaned up afterward.
  • Execution Filter: Before executing any web_search or web_read tool call, I filter the input list (query_list or url_list) to remove any items already present in the respective tracking set.
  • Cache Mimicry: If the input list is entirely composed of duplicates, the tool call is skipped entirely, and a mock response is returned immediately to maintain consistent response format without incurring an API cost or latency penalty.
  • Proactive Update: The sets are updated with the remaining unique items before the tool executes, ensuring that future requests recognize them as already seen.
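A minimal sketch of the filtering logic described above. The set names `_seen_queries` and `_seen_urls` come from this PR description; the class and method names, and the `None` return used to signal "all duplicates, return a mock response", are illustrative assumptions, not the actual diff:

```python
class QueryUrlDeduper:
    """Sketch of session-scoped dedup for web_search / web_read tool calls."""

    def start_session(self) -> None:
        # Fresh tracking sets at the start of each research session.
        self._seen_queries: set = set()
        self._seen_urls: set = set()

    def filter_new(self, items: list, seen: set):
        # Keep only items not yet seen this session (O(1) set membership).
        unique = [x for x in items if x not in seen]
        if not unique:
            # Input is entirely duplicates: caller should skip the API call
            # and return a mock response in the usual format instead.
            return None
        # Proactive update: mark items as seen before the tool actually runs.
        seen.update(unique)
        return unique
```

For example, after filtering `["a", "b"]`, a later call with `["a", "c"]` yields only `["c"]`, and a call with `["a"]` alone returns `None`, signaling the mock-response path.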

✅ Key Benefits

  1. ⚡ Performance & Cost Savings: Eliminates redundant network and API calls, leading to reduced latency and significantly lower operational costs.
  2. 🚀 Faster Responses: The agent focuses on processing new information, making the overall research loops much faster for the end-user.
  3. 🔬 Clean Implementation: The change is non-invasive, adding only about 45 lines of focused code with no new dependencies, ensuring full backward compatibility.
  4. 🔒 Reliability: The session-scoped tracking prevents state leakage between different research sessions and uses efficient $O(1)$ set lookups for optimal performance.
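The session-scoped lifecycle in point 4 might look like the sketch below. The wrapper name `run_session` and the try/finally teardown are assumptions for illustration; the PR only states that the sets are initialized at session start and cleaned up afterward:

```python
class SessionScopedAgent:
    """Illustrates session-scoped state: dedup sets live for one session only."""

    def run_session(self, do_research):
        # Initialize fresh tracking sets for this session.
        self._seen_queries = set()
        self._seen_urls = set()
        try:
            return do_research(self)
        finally:
            # Tear down tracking state so nothing leaks into the next session.
            del self._seen_queries
            del self._seen_urls
```

Deleting the attributes in `finally` guarantees cleanup even if the research loop raises, which is what prevents state leakage between consecutive sessions.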
