Issues when parsing CSV files that extensively use text qualifiers #60

@mvlakh

Hi,

I am using Camel to process CSV files and, as I understand it, Camel uses Flatpack to parse CSV content. Flatpack appears to have several defects that prevent it from parsing CSV files correctly when they make heavy use of text qualifiers. These are the edge cases I found that the library does not handle properly:

  • If a qualified field contains a line break, as in this example, the library handles it incorrectly:
Bob,Smith,bsmiht@test.com,"This is a long fragment of text
that should be processed as a single field", 1988, 111-222-33,"another field with new line character
 that should be considered as a field of the same data row"

It appears to treat each physical line as a separate CSV row instead of reading the whole record as one row.

  • If a qualified field starts with or contains escaped (doubled) text qualifier characters that are part of the field value, the library splits that single field into several separate cells:
Bob, Smith,"""Test"" , 2, Some string, still string, also part of the string.",11111111
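For reference, here is a minimal sketch of the result I would expect for both samples. It uses Python's standard csv module rather than Flatpack or Camel, purely to illustrate conventional qualifier handling (a line break inside a qualified field stays in the field, and a doubled qualifier is read as one literal quote). The names MULTILINE, ESCAPED and parse are made up for this illustration.

```python
import csv
import io

# Sample records copied from the two cases above.
MULTILINE = (
    'Bob,Smith,bsmiht@test.com,"This is a long fragment of text\n'
    'that should be processed as a single field", 1988, 111-222-33,'
    '"another field with new line character\n'
    ' that should be considered as a field of the same data row"\n'
)
ESCAPED = (
    'Bob, Smith,"""Test"" , 2, Some string, still string, '
    'also part of the string.",11111111\n'
)


def parse(text):
    """Parse CSV text with the default comma delimiter and '"' qualifier."""
    # newline="" leaves embedded line breaks untouched for the csv reader.
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    rows = parse(MULTILINE)
    print(len(rows), len(rows[0]))  # 1 row with 7 fields
    print(repr(rows[0][3]))         # the line break stays inside the field

    rows = parse(ESCAPED)
    print(len(rows), len(rows[0]))  # 1 row with 4 fields
    print(rows[0][2])               # doubled qualifiers become literal quotes
```

With this handling the first sample is one row of seven fields and the second is one row of four fields, which is what I would expect Flatpack to return as well.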
