How to disable quote processing entirely? #29

@chrisinmtown

I parse pipe-separated data using libcsv. The data may contain quote characters, and those should be ignored completely. What is the recommended practice? Would it work to set the quote character to the value 1, which I believe is the code for control-A?

    csv_set_quote(&cp, quote);

I'm asking because we recently hit the following pipe-separated input line. When libcsv processed it, the library yielded an impossibly long output record of 1,864,280,683 bytes:

ABC|jkkdf|1664550195943489|28|0|"wxyz.th|"wxyz.th|::|||17301
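
For reference, here is a minimal sketch of the split I expect when quotes are treated as ordinary characters. It is plain C and does not use libcsv at all; the helper name `split_pipes` is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libcsv): split `line` in place on '|',
 * treating '"' as an ordinary character. Stores at most `max` field
 * pointers in `fields` and returns the total number of fields found.
 */
size_t split_pipes(char *line, char **fields, size_t max)
{
    size_t n = 0;
    char *start = line;

    assert(line != NULL && fields != NULL);

    for (;;) {
        char *bar = strchr(start, '|');
        if (n < max)
            fields[n] = start;   /* record where this field begins */
        n++;
        if (bar == NULL)
            break;               /* last field on the line */
        *bar = '\0';             /* terminate the current field */
        start = bar + 1;         /* next field begins after the delimiter */
    }
    return n;
}
```

Applied to the line above, this sketch gives 11 fields, with the sixth and seventh both equal to `"wxyz.th` (leading quote kept) and the ninth and tenth empty.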

I consider this a bug. To make it even trickier to debug, the behavior depends on the platform. When my program is compiled on OSX with Apple clang, libcsv processes an input file containing this line as I expect, with no oversized output. The bad behavior appears only when my program is compiled on Ubuntu bullseye with gcc.

Maybe you feel I have misused the library? Here's how I am using it. First, I'm using default (not strict) checking of quotes:

    struct csv_parser cp;
    int rc = csv_init(&cp, 0);

Second, I left the parser's config at the default value of 0 for the quote character.

Thanks in advance.

P.S. I revised this issue to ask a question rather than report a bug. I am still struggling to produce a sanitized data file that reproduces the problem. The line shown above is the content that begins the impossibly long record in the output, but when it is processed alone in a one-line file, the library signals an error immediately.
