Test Warnings Report - 2026-01-26 #87

@MHindermann

Test Warnings Report - 2026-01-26

Date: 2026-01-26
Tests with warnings: 11
Generated: 2026-01-27 09:09:05


Note: For detailed error messages and stack traces, check the log files in scripts/logs/ with date prefix 20260126.
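To pull up the relevant logs quickly, something like the following could work. It is only a sketch: it assumes the log file names in scripts/logs/ literally start with the date prefix (here 20260126), and the helper name `find_logs` is made up for illustration.

```python
from pathlib import Path


def find_logs(log_dir="scripts/logs", date_prefix="20260126"):
    """Return files in log_dir whose names start with date_prefix, sorted by name.

    Assumption: log file names begin with the date prefix; the exact
    naming scheme is not specified in this report.
    """
    root = Path(log_dir)
    if not root.is_dir():
        # No log directory at this location: nothing to report.
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.name.startswith(date_prefix)
    )


if __name__ == "__main__":
    for path in find_logs():
        print(path)
```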

Add GitHub issue links in the Ticket column for tracking.

| Test ID | Benchmark | Provider | Model | Severity | Warning Codes | Ticket | Action | Completed |
|---|---|---|---|---|---|---|---|---|
| T0023 | business_letters | mistral | pixtral-large-2411 | 🔴 Critical | ZERO_COST, ALL_NA | | | |
| T0109 | business_letters | openai | gpt-5 | 🔴 Critical | ZERO_COST, ALL_NA | | rerun & delete | 2026-02-17 |
| T0113 | business_letters | openai | gpt-5-mini | 🔴 Critical | ZERO_COST, ALL_NA | | rerun & delete | 2026-02-17 |
| T0166 | library_cards | openai | gpt-5-mini | 🔴 Critical | ZERO_COST, ALL_NA | | rerun & delete old | 2026-02-17 |
| T0244 | business_letters | openrouter | qwen/qwen3-vl-8b-thinking | 🔴 Critical | ZERO_COST, ALL_NA | | | |
| T0245 | business_letters | openrouter | qwen/qwen3-vl-8b-thinking | 🔴 Critical | ZERO_COST, ALL_NA | | | |
| T0276 | medieval_manuscripts | openai | gpt-4o-mini | 🔴 Critical | ZERO_COST, ALL_NA | | add graceful error handling to llm client | test rerun on 2026-02-17 |
| T0278 | medieval_manuscripts | openai | gpt-4.1-nano | 🔴 Critical | ZERO_COST, ALL_NA | | rerun & delete | 2026-02-17 |
| T0421 | medieval_manuscripts | genai | gemini-3-pro-preview | 🔴 Critical | ZERO_COST, ALL_NA | | | |
| T0555 | business_letters | mistral | ministral-8b-2512 | 🔴 Critical | ZERO_COST, ALL_NA | | | |
| T0560 | medieval_manuscripts | mistral | ministral-8b-2512 | 🔴 Critical | ZERO_COST, ALL_NA | | | |

Warning Codes Explanation

| Code | Severity | Description |
|---|---|---|
| ZERO_COST | 🟠 High | Total cost is $0 (pricing issue) |
| ALL_NA | 🔴 Critical | All metrics are N/A (scoring failed) |
| ZERO_SCORE | 🟡 Medium | Score is 0 (exceptionally bad performance) |
| ZERO_ITEMS | 🔴 Critical | No items processed |
| ZERO_DURATION | 🟠 High | No timing captured |
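For reference, the code definitions above could be checked against a single test result roughly as follows. This is a sketch, not the actual report generator: the field names (`total_cost`, `metrics`, `score`, `items_processed`, `duration_seconds`) and the metric names in the example are hypothetical. Taking the highest severity among a row's codes is also an assumption; it is consistent with the rows above, where ZERO_COST plus ALL_NA is reported as Critical, but the report does not state the rule.

```python
# Severity per warning code, as listed in the explanation table.
CODE_SEVERITY = {
    "ZERO_COST": "High",
    "ALL_NA": "Critical",
    "ZERO_SCORE": "Medium",
    "ZERO_ITEMS": "Critical",
    "ZERO_DURATION": "High",
}
SEVERITY_RANK = {"Medium": 1, "High": 2, "Critical": 3}


def detect_codes(result):
    """Return the warning codes that apply to one result dict.

    All field names are hypothetical; adapt them to the real result schema.
    """
    codes = []
    if result.get("total_cost") == 0:
        codes.append("ZERO_COST")
    metrics = result.get("metrics") or {}
    # Assumption: N/A metrics are stored as None.
    if metrics and all(value is None for value in metrics.values()):
        codes.append("ALL_NA")
    if result.get("score") == 0:
        codes.append("ZERO_SCORE")
    if result.get("items_processed") == 0:
        codes.append("ZERO_ITEMS")
    # A missing or zero duration both count as "no timing captured".
    if not result.get("duration_seconds"):
        codes.append("ZERO_DURATION")
    return codes


def overall_severity(codes):
    """Return the highest severity among codes, or None if there are none."""
    if not codes:
        return None
    return max((CODE_SEVERITY[c] for c in codes), key=SEVERITY_RANK.get)


example = {
    "total_cost": 0.0,
    "metrics": {"f1": None, "cer": None},  # hypothetical metric names
    "score": None,
    "items_processed": 12,
    "duration_seconds": 84.2,
}
codes = detect_codes(example)
print(codes, overall_severity(codes))
```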

Action Required

  • Review all critical issues
  • Investigate high priority issues
  • Check medium priority issues as time permits
  • Re-run failed tests if needed
  • Update pricing data if ZERO_COST warnings present
  • Fix scoring issues if ALL_NA or ZERO_SCORE warnings present
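The checklist could also be turned into a small triage helper, sketched below. The code-to-action mapping is taken from the last two checklist items; the row layout (dicts with `test_id` and `severity` keys) and the function names are assumptions made for illustration.

```python
# Follow-up per warning code, taken from the checklist above.
FOLLOW_UPS = {
    "ZERO_COST": "Update pricing data",
    "ALL_NA": "Fix scoring issues",
    "ZERO_SCORE": "Fix scoring issues",
}
# Review order: critical first, then high, then medium.
PRIORITY = {"Critical": 0, "High": 1, "Medium": 2}


def triage(rows):
    """Sort report rows by severity; rows of equal severity keep their order."""
    return sorted(rows, key=lambda row: PRIORITY[row["severity"]])


def follow_ups(codes):
    """Return the distinct follow-up actions for codes, in first-seen order."""
    actions = []
    for code in codes:
        action = FOLLOW_UPS.get(code)
        if action and action not in actions:
            actions.append(action)
    return actions


print(follow_ups(["ZERO_COST", "ALL_NA"]))
```

Codes without a checklist entry (ZERO_ITEMS, ZERO_DURATION) yield no action here and would still need manual review.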

Labels

bug (Something isn't working)
