Test Warnings Report - 2026-01-25
Date: 2026-01-25
Tests with warnings: 41
Generated: 2026-01-26 14:36:29
Note: For detailed error messages and stack traces, check the log files in scripts/logs/ with date prefix 20260125.
Add GitHub issue links in the Ticket column for tracking.
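The log files mentioned in the note can be collected with a short helper. This is a minimal sketch that assumes each log file name begins with the date prefix (here `20260125`); the function name `find_logs` is hypothetical and not part of the project's scripts.

```python
from pathlib import Path


def find_logs(log_dir: str, date_prefix: str) -> list[Path]:
    """Return files in log_dir whose names start with date_prefix, sorted by name.

    Assumption: log files are stored flat in log_dir and are named with the
    date prefix first, as the report's note suggests.
    """
    return sorted(p for p in Path(log_dir).glob(f"{date_prefix}*") if p.is_file())


if __name__ == "__main__":
    # Directory and prefix are taken from the report's note.
    for path in find_logs("scripts/logs", "20260125"):
        print(path)
```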
| Test ID | Benchmark | Provider | Model | Severity | Warning Codes | Ticket | Action | Completed |
|---|---|---|---|---|---|---|---|---|
| T0259 | bibliographic_data | openrouter | qwen/qwen3-vl-8b-instruct | 🟠 High | ZERO_COST | |||
| T0260 | business_letters | openrouter | qwen/qwen3-vl-8b-instruct | 🟠 High | ZERO_COST | |||
| T0261 | business_letters | openrouter | qwen/qwen3-vl-8b-instruct | 🟠 High | ZERO_COST | |||
| T0262 | business_letters | openrouter | qwen/qwen3-vl-8b-instruct | 🟠 High | ZERO_COST | |||
| T0263 | fraktur_adverts | openrouter | qwen/qwen3-vl-8b-instruct | 🟠 High | ZERO_COST | |||
| T0265 | bibliographic_data | openrouter | x-ai/grok-4 | 🟠 High | ZERO_COST | |||
| T0266 | business_letters | openrouter | x-ai/grok-4 | 🟠 High | ZERO_COST | |||
| T0267 | business_letters | openrouter | x-ai/grok-4 | 🟠 High | ZERO_COST | |||
| T0268 | business_letters | openrouter | x-ai/grok-4 | 🟠 High | ZERO_COST | |||
| T0269 | fraktur_adverts | openrouter | x-ai/grok-4 | 🟠 High | ZERO_COST | |||
| T0270 | library_cards | openrouter | x-ai/grok-4 | 🔴 Critical | ZERO_COST, ALL_NA | |||
| T0300 | medieval_manuscripts | openrouter | qwen/qwen3-vl-8b-thinking | 🔴 Critical | ZERO_COST, ALL_NA | |||
| T0301 | medieval_manuscripts | openrouter | meta-llama/llama-4-maverick | 🟠 High | ZERO_COST | |||
| T0302 | medieval_manuscripts | openrouter | qwen/qwen3-vl-30b-a3b-instruct | 🟠 High | ZERO_COST | |||
| T0303 | medieval_manuscripts | openrouter | qwen/qwen3-vl-8b-instruct | 🟠 High | ZERO_COST | |||
| T0304 | medieval_manuscripts | openrouter | x-ai/grok-4 | 🟠 High | ZERO_COST | |||
| T0332 | blacklist_cards | openrouter | qwen/qwen3-vl-8b-thinking | 🟠 High | ZERO_COST | |||
| T0333 | blacklist_cards | openrouter | meta-llama/llama-4-maverick | 🟠 High | ZERO_COST | |||
| T0334 | blacklist_cards | openrouter | qwen/qwen3-vl-30b-a3b-instruct | 🟠 High | ZERO_COST | |||
| T0335 | blacklist_cards | openrouter | qwen/qwen3-vl-8b-instruct | 🟠 High | ZERO_COST | |||
| T0393 | company_lists | openrouter | qwen/qwen3-vl-8b-thinking | 🟠 High | ZERO_COST | |||
| T0394 | company_lists | openrouter | qwen/qwen3-vl-8b-thinking | 🟠 High | ZERO_COST | |||
| T0397 | company_lists | openrouter | qwen/qwen3-vl-30b-a3b-instruct | 🟠 High | ZERO_COST | |||
| T0399 | company_lists | openrouter | qwen/qwen3-vl-8b-instruct | 🟠 High | ZERO_COST | |||
| T0401 | company_lists | openrouter | x-ai/grok-4 | 🟠 High | ZERO_COST | |||
| T0402 | company_lists | openrouter | x-ai/grok-4 | 🟠 High | ZERO_COST | |||
| T0416 | business_letters | genai | gemini-3-pro-preview | 🔴 Critical | ZERO_COST, ALL_NA | |||
| T0477 | book_advert_xml | openrouter | qwen/qwen3-vl-8b-thinking | 🟠 High | ZERO_COST | |||
| T0479 | book_advert_xml | openrouter | qwen/qwen3-vl-30b-a3b-instruct | 🟠 High | ZERO_COST | |||
| T0481 | book_advert_xml | openrouter | x-ai/grok-4 | 🟠 High | ZERO_COST | |||
| T0547 | fraktur_adverts | mistral | ministral-14b-2512 | 🟡 Medium | ZERO_SCORE | |||
| T0549 | medieval_manuscripts | mistral | ministral-14b-2512 | 🔴 Critical | ZERO_COST, ALL_NA | |||
| T0556 | business_letters | mistral | ministral-8b-2512 | 🔴 Critical | ZERO_COST, ALL_NA | |||
| T0557 | business_letters | mistral | ministral-8b-2512 | 🔴 Critical | ZERO_COST, ALL_NA | |||
| T0558 | fraktur_adverts | mistral | ministral-8b-2512 | 🟡 Medium | ZERO_SCORE | |||
| T0565 | bibliographic_data | mistral | magistral-small-2509 | 🟡 Medium | ZERO_SCORE | |||
| T0566 | business_letters | mistral | magistral-small-2509 | 🔴 Critical | ZERO_COST, ALL_NA | |||
| T0567 | business_letters | mistral | magistral-small-2509 | 🔴 Critical | ZERO_COST, ALL_NA | |||
| T0568 | business_letters | mistral | magistral-small-2509 | 🔴 Critical | ZERO_COST, ALL_NA | |||
| T0569 | fraktur_adverts | mistral | magistral-small-2509 | 🟡 Medium | ZERO_SCORE | |||
| T0571 | medieval_manuscripts | mistral | magistral-small-2509 | 🔴 Critical | ZERO_COST, ALL_NA | |||
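To see how the warnings cluster by provider, model, or severity, the rows above can be parsed and counted. The following is a rough sketch, not the report generator's own code; `parse_row`, `tally`, and the `COLUMNS` key names are hypothetical. It reads only the first six columns, since Ticket, Action, and Completed are still empty.

```python
from collections import Counter

# Keys for the first six columns of the report table (names chosen for this sketch).
COLUMNS = ("test_id", "benchmark", "provider", "model", "severity", "codes")


def parse_row(line: str) -> dict:
    """Parse one Markdown table row into a dict of the first six columns."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    if len(cells) < len(COLUMNS):
        raise ValueError(f"expected at least {len(COLUMNS)} cells, got {len(cells)}")
    row = dict(zip(COLUMNS, cells))
    # Drop the emoji marker and keep only the severity word, e.g. "High".
    parts = row["severity"].split()
    row["severity"] = parts[-1] if parts else ""
    # Split "ZERO_COST, ALL_NA" into a list of codes.
    row["codes"] = [c.strip() for c in row["codes"].split(",") if c.strip()]
    return row


def tally(rows: list, key: str) -> Counter:
    """Count rows by a single-valued column such as "provider" or "severity"."""
    return Counter(r[key] for r in rows)
```

For example, `tally(rows, "provider")` gives the number of warned tests per provider once `rows` holds the parsed table body.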
Warning Codes Explanation
| Code | Severity | Description |
|---|---|---|
| ZERO_COST | 🟠 High | Total cost is $0 (pricing issue) |
| ALL_NA | 🔴 Critical | All metrics are N/A (scoring failed) |
| ZERO_SCORE | 🟡 Medium | Score is 0 (exceptionally bad performance) |
| ZERO_ITEMS | 🔴 Critical | No items processed |
| ZERO_DURATION | 🟠 High | No timing captured |
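The rules in this table can be written down as a small classifier. This is a sketch under stated assumptions: the `RunResult` fields are hypothetical stand-ins for whatever the test harness records, N/A metrics are represented as `None`, and a test's overall severity is taken to be the highest severity among its codes, which matches the rows in the report.

```python
from dataclasses import dataclass
from typing import Optional

# Severity per warning code, copied from the table above.
CODE_SEVERITY = {
    "ZERO_COST": "High",
    "ALL_NA": "Critical",
    "ZERO_SCORE": "Medium",
    "ZERO_ITEMS": "Critical",
    "ZERO_DURATION": "High",
}
SEVERITY_RANK = {"Medium": 1, "High": 2, "Critical": 3}


@dataclass
class RunResult:
    """Hypothetical summary of one test run; field names are assumptions."""
    total_cost: float
    metrics: dict  # metric name -> value, or None for N/A
    score: Optional[float]
    items_processed: int
    duration_seconds: float


def warning_codes(r: RunResult) -> list:
    """Return the warning codes that apply to a run."""
    codes = []
    if r.total_cost == 0:
        codes.append("ZERO_COST")
    # An empty metrics dict also counts as "all N/A" in this sketch.
    if all(v is None for v in r.metrics.values()):
        codes.append("ALL_NA")
    if r.score is not None and r.score == 0:
        codes.append("ZERO_SCORE")
    if r.items_processed == 0:
        codes.append("ZERO_ITEMS")
    if r.duration_seconds == 0:
        codes.append("ZERO_DURATION")
    return codes


def overall_severity(codes: list) -> Optional[str]:
    """Return the highest severity among the codes, or None if there are none."""
    if not codes:
        return None
    return max((CODE_SEVERITY[c] for c in codes), key=SEVERITY_RANK.__getitem__)
```

A run with zero cost and only N/A metrics yields `["ZERO_COST", "ALL_NA"]` and an overall severity of `"Critical"`, as in the critical rows of the report.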
Action Required
- Review all critical issues
- Investigate high-severity issues
- Check medium-severity issues as time permits
- Re-run failed tests if needed
- Update pricing data if ZERO_COST warnings present
- Fix scoring issues if ALL_NA or ZERO_SCORE warnings present
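The code-specific items in the checklist above can be turned into a lookup for filling the Action column. This is only an illustrative sketch: the `ACTIONS` mapping covers the three codes the checklist names, and the fallback text for other codes is an assumption, since the report lists no specific action for `ZERO_ITEMS` or `ZERO_DURATION`.

```python
# Follow-up action per warning code, taken from the checklist above.
ACTIONS = {
    "ZERO_COST": "Update pricing data",
    "ALL_NA": "Fix scoring issues",
    "ZERO_SCORE": "Fix scoring issues",
}

# Assumed fallback for codes the checklist does not mention.
DEFAULT_ACTION = "Investigate (no specific action listed)"


def suggested_actions(codes: list) -> list:
    """Return the distinct follow-up actions for a test's codes, in first-seen order."""
    actions = []
    for code in codes:
        action = ACTIONS.get(code, DEFAULT_ACTION)
        if action not in actions:
            actions.append(action)
    return actions
```

For a row tagged `ZERO_COST, ALL_NA`, this returns both the pricing and the scoring follow-ups, which could be joined with `"; "` and written into the Action cell.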
Labels
bug (Something isn't working)