testing framework for transformRequest+Response #72
| { | ||
| "error": "400 {\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"message\":\"`max_tokens` must be greater than `thinking.budget_tokens`. Please consult our documentation at https://docs.claude.com/en/docs/build-with-claude/extended-thinking#max-tokens-and-context-window-size\"},\"request_id\":\"req_011CXasVyLu26rs4f6bS7DRJ\"}", | ||
| "name": "Error" | ||
| } | ||
Only error I've found so far. It looks valid: this case sets max_tokens = 100 with high reasoning, so max_tokens is not greater than thinking.budget_tokens.
Not sure if we want to throw our own error here or something. https://github.com/braintrustdata/lingua/blob/main/payloads/cases/simple.ts#L146
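If we do go with our own error, a pre-flight check could look roughly like the sketch below. `ThinkingRequest` and `checkThinkingBudget` are hypothetical names, not existing lingua code; the only thing taken from the source is the constraint in the provider's error message (`max_tokens` must be greater than `thinking.budget_tokens`).

```typescript
// Hypothetical request shape; field names mirror the provider error message.
interface ThinkingRequest {
  max_tokens: number;
  thinking?: { budget_tokens: number };
}

// Returns a human-readable problem, or null when the constraint holds
// (or when no thinking budget is set at all).
function checkThinkingBudget(req: ThinkingRequest): string | null {
  const budget = req.thinking?.budget_tokens;
  if (budget === undefined) return null;
  if (req.max_tokens <= budget) {
    return `max_tokens (${req.max_tokens}) must be greater than thinking.budget_tokens (${budget})`;
  }
  return null;
}
```

Calling this before the provider request would let the transform fail early with a lingua-level message instead of surfacing the raw 400. Whether that is preferable to passing the provider error through is still the open question.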
| map.insert( | ||
| "input_tokens_details".into(), | ||
| serde_json::json!({ "cached_tokens": self.prompt_cached_tokens.unwrap_or(0) }), | ||
| ); |
Caught this from the validate_response_json JS binding. Responses require input_tokens_details and output_tokens_details.
| } | ||
| /* eslint-enable @typescript-eslint/consistent-type-assertions */ | ||
|
|
||
| const isParamCase = (name: string) => name.endsWith("Param"); |
temporary, i didn't want to explode the diff so doing this here.
| @@ -0,0 +1,35 @@ | |||
| { | |||
anthropic_to_chatcompletions means an Anthropic-format payload sent to a Chat Completions model. We save the actual Chat Completions response payload so the tests don't have to incur the LLM cost.
| } catch (e) { | ||
| // Check if this is an expected error (known provider incompatibility) | ||
| const errorReason = transformErrors[pairKey]?.[caseName]; | ||
| if (errorReason) { |
we should check that the errorReason matches the actual reason.
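One way to do that, as a rough sketch: treat the stored reason as a substring that must appear in the thrown error's message. `ExpectedErrors` and `isExpectedError` are hypothetical names, and the assumption that `transformErrors` maps pairKey to caseName to a plain string is inferred from the snippet above, not confirmed.

```typescript
// Assumed shape: pairKey -> caseName -> expected error substring.
type ExpectedErrors = Record<string, Record<string, string>>;

// True only when an expected reason is registered AND the actual error
// message contains it; an unrelated failure is not silently accepted.
function isExpectedError(
  expected: ExpectedErrors,
  pairKey: string,
  caseName: string,
  e: unknown,
): boolean {
  const reason = expected[pairKey]?.[caseName];
  if (!reason) return false;
  const message = e instanceof Error ? e.message : String(e);
  return message.includes(reason);
}
```

In the catch block, a `false` result would mean rethrowing, so a case that starts failing for a different reason shows up as a test failure rather than being masked by the allowlist.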
| // Explicitly skipped tests (add here only if intentionally not supported) | ||
| // Format: "source_to_target_caseName" | ||
| const SKIPPED_TESTS = new Set<string>([ | ||
| // Add entries here with comments explaining why |
do we need that if the list is currently empty?

This PR adds a testing framework for transformRequest and transformResponse. Specifically, it is an offline testing framework that tests lingua transformations against saved snapshots.
We use the existing cases and the WASM bindings generated in the previous PR (`transform_request`, `transform_response`, and `validate_*_json`) to verify that transforms are valid and that there are no regressions.

High level, my mental model is:
- coverage-report ensures internal consistency between the Universal model and all the providers
- transforms ensures external compatibility, with OpenAPI schema validation during every transform and the actual SDK used post-transform.

Diagram:
Phase 1: Capture (one-time, requires API keys)

     getCaseForProvider(caseName, source)
                       │
                       ▼
    ┌─────────────────────────────────────┐
    │ transformAndValidateRequest()       │
    │   • transform_request() ────────────┼──► validate_*_request()
    └──────────────────┬──────────────────┘
                       │
                       ▼
    ┌─────────────────────────────────────┐
    │ callProvider(target, request)       │
    │   • openai.chat.completions.create()│
    │   • anthropic.messages.create()     │
    └──────────────────┬──────────────────┘
                       │
                       ▼
             validate_*_response()
                       │
                       ▼
         writeFileSync(path, response)
     transforms/{src}_to_{tgt}/{case}.json

Phase 2: Test (CI, no API calls)

     getCaseForProvider(caseName, source)
                       │
                       ▼
    ┌─────────────────────────────────────┐
    │ transformAndValidateRequest()       │
    │   • transform_request() ────────────┼──► validate_*_request()
    └──────────────────┬──────────────────┘
                       │
                       ▼
          toMatchSnapshot("request")
                       │
                       ▼
    ┌─────────────────────────────────────┐
    │ loadAndValidateResponse()           │
    │   • readFileSync(path) ─────────────┼──► validate_*_response()
    └──────────────────┬──────────────────┘
                       │
                       ▼
    ┌─────────────────────────────────────┐
    │ transformResponseData()             │
    │   • transform_response() ───────────┼──► validate_*_response()
    └──────────────────┬──────────────────┘
                       │
                       ▼
          toMatchSnapshot("response")
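For reviewers who prefer code to boxes, Phase 2 can be sketched roughly as below. This is an illustration under assumptions, not the code in this PR: `OfflineDeps`, `snapshotPath`, and `runOfflineCase` are hypothetical names, the binding signatures are guesses, and the dependencies are injected so the sketch runs without the WASM module. Only the step order and the `transforms/{src}_to_{tgt}/{case}.json` layout come from the diagram. Validating the transformed response against the source format is my reading of the last box.

```typescript
// Snapshot path layout as drawn in the diagram.
function snapshotPath(src: string, tgt: string, caseName: string): string {
  return `transforms/${src}_to_${tgt}/${caseName}.json`;
}

// Hypothetical stand-ins for the WASM bindings and file access.
interface OfflineDeps {
  transformRequest(input: unknown, target: string): unknown;
  validateRequest(request: unknown, format: string): void; // throws if invalid
  readSnapshot(path: string): string; // e.g. a readFileSync wrapper
  validateResponse(response: unknown, format: string): void; // throws if invalid
  transformResponse(response: unknown, source: string): unknown;
}

// Runs one case with no network calls; the caller snapshots both results.
function runOfflineCase(
  deps: OfflineDeps,
  src: string,
  tgt: string,
  caseName: string,
  input: unknown,
): { request: unknown; response: unknown } {
  // 1. Transform the source-format request and validate it for the target.
  const request = deps.transformRequest(input, tgt);
  deps.validateRequest(request, tgt);

  // 2. Load the response captured in Phase 1 and validate it as-is.
  const saved: unknown = JSON.parse(deps.readSnapshot(snapshotPath(src, tgt, caseName)));
  deps.validateResponse(saved, tgt);

  // 3. Transform it back to the source format and validate again.
  const response = deps.transformResponse(saved, src);
  deps.validateResponse(response, src);

  return { request, response };
}
```

A test would then call `expect(result.request).toMatchSnapshot("request")` and likewise for the response, as in the diagram.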