feat(tests): improve search evaluation with keyword queries and robust rubrics #303

arabold · 2026-01-12T03:52:34Z

Summary

Switches search evaluation queries from natural language ("How do I...") to keyword-based ("useEffect usage") to better match tool usage patterns.
Refines evaluation rubrics to value "Richness" and documentation relevance over strict conciseness.
Fixes search-provider.ts and run-provider.sh to handle logging noise and output clean JSON for promptfoo.
Adds --no-clean option to the scraper to allow incremental indexing for test setup.
Updates package-lock.json with necessary dependencies.
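
The logging-noise fix for search-provider.ts can be illustrated with a minimal sketch. It is not the code from this PR: the helper name extractLastJson and the sample log lines are hypothetical, and it assumes the provider's JSON payload is printed on a single line, after any log output.

```typescript
// Hypothetical helper, not taken from the PR's search-provider.ts.
// Assumption: the JSON payload occupies one line of the captured output.
function extractLastJson(raw: string): unknown {
  // Scan from the end so the final payload wins over earlier log lines.
  const lines = raw.split(/\r?\n/).reverse();
  for (const rawLine of lines) {
    const line = rawLine.trim();
    // Cheap pre-filter: only objects and arrays are candidate payloads.
    if (!line.startsWith("{") && !line.startsWith("[")) continue;
    try {
      return JSON.parse(line);
    } catch {
      // Looked like JSON (e.g. a "[info] ..." log prefix) but was not; keep scanning.
    }
  }
  throw new Error("no JSON payload found in output");
}

// Usage with made-up sample output: log noise followed by the payload.
const noisy = '[info] indexing started\n{"output":"useEffect usage"}\n';
const parsed = extractLastJson(noisy) as { output: string };
console.log(parsed.output);
```

Scanning from the last line backwards keeps the helper tolerant of startup logs. Throwing when nothing parses turns a noisy run into a loud failure instead of an empty result in the evaluation.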

Verification

Run npm run evaluate:search.
All 5 tests in dataset.yaml pass (5/5).


github-actions · 2026-01-15T02:53:11Z

🎉 This PR is included in version 1.36.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

feat(tests): improve search evaluation with keyword queries and robust rubrics (83cf652)

arabold force-pushed the feat/improve-search-eval branch from f098de4 to 83cf652 on January 12, 2026 04:10

arabold merged commit 052b088 into main Jan 12, 2026
2 of 3 checks passed

arabold deleted the feat/improve-search-eval branch January 12, 2026 04:16

github-actions bot added the released label Jan 15, 2026
