testing framework for transformRequest+Response #72

Open
knjiang wants to merge 1 commit into 01-27-request_typescript_and_python_bindings from 01-28-testing_framework_for_transformrequest_response

@knjiang knjiang commented Jan 29, 2026

This PR adds a testing framework for transformRequest and transformResponse. Specifically, it is an offline framework that tests lingua transformations against saved snapshots.

We use the existing cases and the WASM bindings generated in the previous PR (transform_request, transform_response, and validate_*_json) to verify that transforms are valid and that there are no regressions.

At a high level, my mental model is:

  • coverage-report ensures internal consistency between the Universal model and all the providers
  • transforms ensures external compatibility, by running OpenAPI schema validation on every transform and by calling the actual SDK post-transform.

Diagram:

  Phase 1: Capture (one-time, requires API keys)                                                                                
                                                                                                                                
  getCaseForProvider(caseName, source)                                                                                          
             │                                                                                                                  
             ▼                                                                                                                  
  ┌─────────────────────────────────────┐                                                                                       
  │   transformAndValidateRequest()     │                                                                                       
  │   • transform_request() ────────────┼──► validate_*_request()                                                               
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
  ┌─────────────────────────────────────┐                                                                                       
  │   callProvider(target, request)     │                                                                                       
  │   • openai.chat.completions.create()│                                                                                       
  │   • anthropic.messages.create()     │                                                                                       
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
           validate_*_response()                                                                                                
                     │                                                                                                          
                     ▼                                                                                                          
           writeFileSync(path, response)                                                                                        
           transforms/{src}_to_{tgt}/{case}.json                                                                                
                                                                                                                                
  Phase 2: Test (CI, no API calls)                                                                                              
                                                                                                                                
  getCaseForProvider(caseName, source)                                                                                          
             │                                                                                                                  
             ▼                                                                                                                  
  ┌─────────────────────────────────────┐                                                                                       
  │   transformAndValidateRequest()     │                                                                                       
  │   • transform_request() ────────────┼──► validate_*_request()                                                               
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
             toMatchSnapshot("request")                                                                                         
                     │                                                                                                          
                     ▼                                                                                                          
  ┌─────────────────────────────────────┐                                                                                       
  │   loadAndValidateResponse()         │                                                                                       
  │   • readFileSync(path) ─────────────┼──► validate_*_response()                                                              
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
  ┌─────────────────────────────────────┐                                                                                       
  │   transformResponseData()           │                                                                                       
  │   • transform_response() ───────────┼──► validate_*_response()                                                              
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
             toMatchSnapshot("response")                                                                                        
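
The request half of Phase 2 can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `transformRequest`, `validateRequest`, and the in-memory `snapshots` map are hypothetical stand-ins for the WASM bindings (`transform_request`, `validate_*_request`) and for the test runner's `toMatchSnapshot`, and the payload shape is made up.

```typescript
// Hypothetical stand-ins for the WASM bindings; the real signatures may differ.
type Json = Record<string, unknown>;

// Pretend transform: tags the payload with its target format.
function transformRequest(input: Json, target: string): Json {
  return { ...input, target };
}

// Pretend validator: throws when a required field is missing.
function validateRequest(payload: Json): void {
  if (!("messages" in payload)) {
    throw new Error("validation failed: missing `messages`");
  }
}

// Transform, then validate the transformed payload before it is snapshotted.
function transformAndValidateRequest(input: Json, target: string): Json {
  const transformed = transformRequest(input, target);
  validateRequest(transformed);
  return transformed;
}

// Toy replacement for toMatchSnapshot: the first run records, later runs compare.
const snapshots = new Map<string, string>();
function matchSnapshot(key: string, value: Json): boolean {
  const serialized = JSON.stringify(value);
  const saved = snapshots.get(key);
  if (saved === undefined) {
    snapshots.set(key, serialized);
    return true;
  }
  return saved === serialized;
}

const request = transformAndValidateRequest(
  { messages: [{ role: "user", content: "hi" }], max_tokens: 100 },
  "chatcompletions",
);
const firstRun = matchSnapshot("request", request);
const secondRun = matchSnapshot("request", request);
```

The response half would follow the same pattern, with the saved provider response read from disk in place of the input case.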
                                                                                                                                

Contributor Author

knjiang commented Jan 29, 2026

Comment on lines +1 to +4
{
"error": "400 {\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"message\":\"`max_tokens` must be greater than `thinking.budget_tokens`. Please consult our documentation at https://docs.claude.com/en/docs/build-with-claude/extended-thinking#max-tokens-and-context-window-size\"},\"request_id\":\"req_011CXasVyLu26rs4f6bS7DRJ\"}",
"name": "Error"
}
Contributor Author

@knjiang knjiang Jan 29, 2026

Only error I've found so far. It seems valid, since our case defines max_tokens = 100 with high reasoning, and the API requires max_tokens to be greater than thinking.budget_tokens.

Not sure whether we want to throw our own error for this. https://github.com/braintrustdata/lingua/blob/main/payloads/cases/simple.ts#L146
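
If we did want to throw our own error, a pre-flight check could look something like the sketch below. The `ThinkingRequest` shape and the function name are assumptions modeled on the two fields named in the provider's error message; nothing like this exists in the PR.

```typescript
// Hypothetical request shape, limited to the fields named in the error message.
interface ThinkingRequest {
  max_tokens: number;
  thinking?: { budget_tokens: number };
}

// Throws a descriptive error before the provider call if the budget cannot fit.
function assertThinkingBudgetFits(req: ThinkingRequest): void {
  const budget = req.thinking?.budget_tokens;
  if (budget !== undefined && req.max_tokens <= budget) {
    throw new Error(
      `max_tokens (${req.max_tokens}) must be greater than thinking.budget_tokens (${budget})`,
    );
  }
}

// Illustrative values only: the budget of 1024 is made up for this example.
let rejected = false;
try {
  assertThinkingBudgetFits({ max_tokens: 100, thinking: { budget_tokens: 1024 } });
} catch {
  rejected = true;
}
```

Failing locally like this would surface the problem at transform time instead of as a 400 from the provider.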

@knjiang knjiang force-pushed the 01-28-testing_framework_for_transformrequest_response branch from 0cc799d to 6093a42 Compare January 29, 2026 02:50
@knjiang knjiang force-pushed the 01-27-request_typescript_and_python_bindings branch 2 times, most recently from 9e4d458 to e383b62 Compare January 29, 2026 04:07
@knjiang knjiang force-pushed the 01-28-testing_framework_for_transformrequest_response branch from 6093a42 to 5c74eae Compare January 29, 2026 04:07
Comment on lines +327 to +330
map.insert(
    "input_tokens_details".into(),
    serde_json::json!({ "cached_tokens": self.prompt_cached_tokens.unwrap_or(0) }),
);
Contributor Author

@knjiang knjiang Jan 29, 2026

Caught this via the validate_response_json JS binding: responses require input_tokens_details and output_tokens_details.
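
For illustration, the kind of check that surfaces this could be sketched as below. It is a hand-rolled stand-in, not the `validate_response_json` binding, and it only looks at the two detail fields mentioned above; the helper name and the `Usage` type are assumptions.

```typescript
// Assumed loose shape for a usage object; only the two detail fields are inspected.
type Usage = Record<string, unknown>;

// Returns the names of required detail objects that are absent or not objects.
function missingUsageDetails(usage: Usage): string[] {
  const required = ["input_tokens_details", "output_tokens_details"];
  return required.filter(
    (key) => typeof usage[key] !== "object" || usage[key] === null,
  );
}

// A usage object without the detail fields, as emitted before the fix.
const before = missingUsageDetails({ input_tokens: 10, output_tokens: 5 });

// With both detail objects present, nothing is reported.
const after = missingUsageDetails({
  input_tokens: 10,
  input_tokens_details: { cached_tokens: 0 },
  output_tokens: 5,
  output_tokens_details: { reasoning_tokens: 0 },
});
```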

}
/* eslint-enable @typescript-eslint/consistent-type-assertions */

const isParamCase = (name: string) => name.endsWith("Param");
Contributor Author

Temporary. I didn't want to blow up the diff, so I'm doing this here for now.

@knjiang knjiang marked this pull request as ready for review January 29, 2026 05:04
@@ -0,0 +1,35 @@
{
Contributor Author

anthropic_to_chatcompletions means an Anthropic payload sent to a Chat Completions model. We save the actual Chat Completions response payload so that tests don't incur the LLM cost.
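
As a rough sketch of the naming convention, the helpers below build and parse the `transforms/{src}_to_{tgt}/{case}.json` layout from the diagram. Both function names and the case name `simpleRequest` are hypothetical and only serve the example.

```typescript
// Builds transforms/{src}_to_{tgt}/{case}.json for a source/target pair.
function snapshotPath(source: string, target: string, caseName: string): string {
  return `transforms/${source}_to_${target}/${caseName}.json`;
}

// Splits a pair key such as "anthropic_to_chatcompletions" at the first "_to_".
function parsePairKey(pairKey: string): { source: string; target: string } {
  const separator = "_to_";
  const idx = pairKey.indexOf(separator);
  if (idx < 0) {
    throw new Error(`not a pair key: ${pairKey}`);
  }
  return {
    source: pairKey.slice(0, idx),
    target: pairKey.slice(idx + separator.length),
  };
}

const examplePath = snapshotPath("anthropic", "chatcompletions", "simpleRequest");
const examplePair = parsePairKey("anthropic_to_chatcompletions");
```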

@knjiang knjiang force-pushed the 01-27-request_typescript_and_python_bindings branch from 2342f75 to 5133b4d Compare February 3, 2026 21:39
@knjiang knjiang force-pushed the 01-28-testing_framework_for_transformrequest_response branch from d09967b to 13cb50c Compare February 3, 2026 21:39
} catch (e) {
  // Check if this is an expected error (known provider incompatibility)
  const errorReason = transformErrors[pairKey]?.[caseName];
  if (errorReason) {
Contributor

We should check that the errorReason matches the actual error, not just that an entry exists.
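
One way to do that, sketched under assumptions: treat each `transformErrors` entry as a substring that must appear in the thrown error's message. The map contents, the pair key, and the case name below are invented for illustration; only the `transformErrors[pairKey]?.[caseName]` lookup comes from the diff.

```typescript
// Hypothetical shape: pairKey -> caseName -> substring expected in the error message.
type TransformErrors = Record<string, Record<string, string>>;

const transformErrors: TransformErrors = {
  // Invented entry; the real pair and case names are not shown in this thread.
  chatcompletions_to_anthropic: {
    reasoningRequest: "`max_tokens` must be greater than `thinking.budget_tokens`",
  },
};

// True only when the case is listed AND the thrown message contains the expected reason.
function isExpectedError(pairKey: string, caseName: string, e: unknown): boolean {
  const expected = transformErrors[pairKey]?.[caseName];
  if (!expected) {
    return false;
  }
  const message = e instanceof Error ? e.message : String(e);
  return message.includes(expected);
}

const matching = isExpectedError(
  "chatcompletions_to_anthropic",
  "reasoningRequest",
  new Error("400 `max_tokens` must be greater than `thinking.budget_tokens`."),
);
const unrelated = isExpectedError(
  "chatcompletions_to_anthropic",
  "reasoningRequest",
  new Error("500 internal server error"),
);
```

With this check, an unrelated failure in a case that happens to be listed would still fail the test instead of being silently accepted.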

// Explicitly skipped tests (add here only if intentionally not supported)
// Format: "source_to_target_caseName"
const SKIPPED_TESTS = new Set<string>([
  // Add entries here with comments explaining why
Contributor

Do we need this if the list is currently empty?

2 participants