fix: deduplication fallback to message ID when request ID is missing #775

jaried · 2025-12-30T12:26:05Z

Summary

Some API proxies may not include requestId in the usage data. Previously, entries without requestId were not deduplicated at all (returned null from createUniqueHash), leading to duplicate counting.

This change makes the deduplication logic fall back to using only messageId when requestId is not available, ensuring proper deduplication for all API proxy types.

Changes

Modified createUniqueHash to return messageId when requestId is missing (instead of null)
Updated corresponding test case

Test Plan

Updated unit test for the new behavior
Verified with real data from API proxies that don't include requestId

Summary by CodeRabbit

Bug Fixes
- Improved usage data processing to handle incomplete data entries more gracefully. The system can now generate identifiers even when certain fields are unavailable, increasing overall robustness and reliability.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Some API proxies may not include requestId in the usage data. Previously, entries without requestId were not deduplicated at all, leading to duplicate counting. This change makes the deduplication logic fall back to using only messageId when requestId is not available, ensuring proper deduplication for all API proxy types.

coderabbitai · 2025-12-30T12:26:16Z

📝 Walkthrough

Walkthrough

The createUniqueHash() function in data-loader.ts was modified to implement a fallback mechanism. When messageId is present but requestId is missing, the function now returns the messageId alone instead of returning null. Tests were updated to align with this revised behavior.

Changes

Cohort / File(s)	Summary
Hash Generation Logic `apps/ccusage/src/data-loader.ts`	Modified `createUniqueHash()` to add fallback: returns `messageId` when `requestId` is absent (previously returned null). Test expectations updated to reflect new behavior.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

fix: implement chronological deduplication for branched conversations #58 — Modifies the same createUniqueHash implementation to adjust how messageId and requestId are combined/handled.

Suggested reviewers

ryoppippi

Poem

🐰 A hash that once stumbled when requests went missing,
Now bounces along with just IDs—no dismissing!
With fallback in place, no more empty returns,
Our graceful little function learns and learns! ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title accurately describes the main change: modifying the deduplication logic to fall back to message ID when request ID is missing.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

apps/ccusage/src/data-loader.ts (1)

4263-4277: Test updated correctly, but consider adding integration test.

The unit test properly validates the new fallback behavior. However, I don't see an integration test that verifies deduplication actually works when entries have the same messageId but no requestId.

Suggested integration test to validate the deduplication fallback

Consider adding a test like this in the deduplication functionality describe block:

it('should deduplicate entries with same message ID when request ID is missing', async () => {
	await using fixture = await createFixture({
		projects: {
			project1: {
				session1: {
					'file1.jsonl': JSON.stringify({
						timestamp: '2025-01-10T10:00:00Z',
						message: {
							id: 'msg_123',
							usage: {
								input_tokens: 100,
								output_tokens: 50,
							},
						},
						// No requestId
						costUSD: 0.001,
					}),
				},
				session2: {
					'file2.jsonl': JSON.stringify({
						timestamp: '2025-01-15T10:00:00Z',
						message: {
							id: 'msg_123',  // Same messageId
							usage: {
								input_tokens: 100,
								output_tokens: 50,
							},
						},
						// No requestId
						costUSD: 0.001,
					}),
				},
			},
		},
	});

	const data = await loadDailyUsageData({
		claudePath: fixture.path,
		mode: 'display',
	});

	// Should only have one entry due to messageId-based deduplication
	expect(data).toHaveLength(1);
	expect(data[0]?.date).toBe('2025-01-10');
	expect(data[0]?.inputTokens).toBe(100);
	expect(data[0]?.totalCost).toBe(0.001);
});

This would validate that the fallback deduplication mechanism works end-to-end for API proxies that omit requestId.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 69fabf2 and 30f53ce.

📒 Files selected for processing (1)

apps/ccusage/src/data-loader.ts

🧰 Additional context used

📓 Path-based instructions (7)

apps/ccusage/src/**/*.ts

📄 CodeRabbit inference engine (apps/ccusage/CLAUDE.md)

apps/ccusage/src/**/*.ts: Write tests in-source using if (import.meta.vitest != null) blocks instead of separate test files
Use Vitest globals (describe, it, expect) without imports in test blocks
In tests, use current Claude 4 models (sonnet-4, opus-4)
Use fs-fixture with createFixture() to simulate Claude data in tests
Only export symbols that are actually used by other modules
Do not use console.log; use the logger utilities from src/logger.ts instead

Files:

apps/ccusage/src/data-loader.ts

apps/ccusage/**/*.ts

📄 CodeRabbit inference engine (apps/ccusage/CLAUDE.md)

apps/ccusage/**/*.ts: NEVER use await import() dynamic imports anywhere (especially in tests)
Prefer @praha/byethrow Result type for error handling instead of try-catch
Use .ts extensions for local imports (e.g., import { foo } from './utils.ts')

Files:

apps/ccusage/src/data-loader.ts

**/*.{ts,tsx,js,jsx}