@AnanthuNarashimman

Description

This PR fixes a critical bug in the custom parseCSVLine logic where double quotes appearing inside a quoted field (escaped as "") were incorrectly interpreted as the end of the field.

The Bug:
Previously, the parser simply toggled the inQuotes state whenever it encountered a quote. This caused inputs like "foo""bar" to be parsed as ['foobar'] instead of ['foo"bar'], resulting in silent data corruption.
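For illustration, the failure mode can be reproduced with a minimal sketch of a toggle-only parser. This is a hypothetical reconstruction based on the description above, not the project's previous code; the name naiveParseCSVLine is an assumption.

```typescript
// Hypothetical reconstruction of the pre-fix behavior (not the project's code).
function naiveParseCSVLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (ch === '"') {
      // Every quote flips the state, so an escaped pair "" flips it twice
      // and both quote characters vanish from the output.
      inQuotes = !inQuotes;
    } else if (ch === "," && !inQuotes) {
      // An unquoted comma ends the current field.
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

console.log(naiveParseCSVLine('"foo""bar"')); // ['foobar'], the literal quote is lost
```

Because no error is raised, the lost quote only shows up as wrong data downstream.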

The Fix:
I have updated the parseCSVLine function to include a look-ahead check. Now, when a quote character is encountered inside a quoted field:

  1. It checks if the next character is also a quote.
  2. If yes, it treats the pair as a single literal escaped quote (") and skips the next character.
  3. If no, it treats the quote as the closing quote of the field and leaves quoted mode.
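The three steps above can be sketched roughly as follows. This is a minimal illustration assuming a comma delimiter and single-line input; the function name parseCSVLineSketch is hypothetical, and the merged implementation may differ in details.

```typescript
// A sketch of the look-ahead approach, assuming "," as the delimiter.
// parseCSVLineSketch is a hypothetical name, not the project's function.
function parseCSVLineSketch(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"') {
        if (line.charAt(i + 1) === '"') {
          // Steps 1 and 2: a doubled quote is one literal quote; skip the second.
          current += '"';
          i++;
        } else {
          // Step 3: a lone quote closes the quoted field.
          inQuotes = false;
        }
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      // Opening quote of a quoted field.
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

console.log(parseCSVLineSketch('"foo""bar"')); // ['foo"bar']
```

Separating the in-quotes branch from the out-of-quotes branch keeps the look-ahead confined to the one place where a doubled quote is meaningful.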

Linked Issue

Fixes #23

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

How Has This Been Tested?

I have added a new test suite in packages/core/src/tools/read-data-file.test.ts to verify the fix and prevent regressions.

Test Coverage:

  • Standard CSV: Verified simple comma-separated values.
  • Escaped Quotes: Verified inputs like "He said ""Hello""" parse correctly as He said "Hello".
  • Edge Cases: Verified empty fields, empty files, and lines with only headers.
  • Integration: Ran npm run test and confirmed all 13 new test cases pass.
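As a rough, framework-free illustration of the kind of cases listed above, the table below pairs inputs with expected fields and reports which cases a given parser fails. The names CsvCase, csvCases, and findFailures are hypothetical, and the actual suite in read-data-file.test.ts may be structured differently. Empty-file behavior is left out because the expected result for it is not stated here.

```typescript
// Hypothetical case table; not the contents of read-data-file.test.ts.
type CsvCase = { name: string; input: string; expected: string[] };

const csvCases: CsvCase[] = [
  { name: "standard", input: "a,b,c", expected: ["a", "b", "c"] },
  { name: "escaped quotes", input: '"He said ""Hello"""', expected: ['He said "Hello"'] },
  { name: "empty field", input: "a,,c", expected: ["a", "", "c"] },
  { name: "header only", input: "name,age", expected: ["name", "age"] },
];

// Returns the names of the cases whose parsed output differs from the expectation.
function findFailures(
  parse: (line: string) => string[],
  cases: CsvCase[],
): string[] {
  return cases
    .filter((c) => JSON.stringify(parse(c.input)) !== JSON.stringify(c.expected))
    .map((c) => c.name);
}
```

Passing the parser under test as an argument means the same table can be run against the old and new logic to confirm that only the escaped-quote case changes.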

Checklist

  • My code follows the code style of this project.
  • I have performed a self-review of my own code.
  • I have added tests that prove my fix is effective.
  • New unit tests pass locally with my changes.

Fixes incorrect parsing of CSV fields containing double quotes. Includes unit tests.

Development

Successfully merging this pull request may close these issues.

bug: CSV Data Corruption