[Docs] Add caveat about autodetect_column_names #79

@AndyHunt66

https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html#plugins-filters-csv-autodetect_column_names

  • Version: 3.0.10

When using autodetect_column_names, if either

  • Logstash is stopped and restarted in the middle of reading a CSV file, or
  • Logstash finishes reading a file with one column layout and starts reading another file with a different column layout

then events are parsed with the wrong column names.

In the first case, the column names are re-read from the first event processed after the restart, that is, the row where Logstash left off. The data in that row becomes the column names for the rest of the file.

In the second case, column names are not re-read when a new file starts, so the data in the new file is parsed as if it had the column layout of the previous file.

Additionally, I think that in the second case the new file's header line is ingested as data, even though it contains column names.
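
To make both scenarios concrete, here is a toy Python model of the behaviour described above. It is not the plugin's Ruby code: the class `AutodetectCsvFilter` and the sample rows are hypothetical, and the only assumption it encodes is that the filter learns column names from the first event it sees and keeps them in memory until the process exits.

```python
import csv


class AutodetectCsvFilter:
    """Toy stand-in for a CSV filter that autodetects column names once."""

    def __init__(self):
        # Held in memory only, so a restart (a new instance) forgets it.
        self.columns = None

    def filter(self, line):
        values = next(csv.reader([line]))
        if self.columns is None:
            # The first event seen is treated as the header and not emitted.
            self.columns = values
            return None
        # Simplification: assumes the row has as many fields as the header.
        return dict(zip(self.columns, values))


# Case 1: restart in the middle of a file.
before_restart = AutodetectCsvFilter()
before_restart.filter("name,age")            # real header, consumed
before_restart.filter("alice,30")            # parsed correctly
after_restart = AutodetectCsvFilter()        # models the restarted process
after_restart.filter("bob,25")               # data row mistaken for a header
restart_event = after_restart.filter("carol,41")

# Case 2: a second file with a different layout, same process.
same_process = AutodetectCsvFilter()
same_process.filter("name,age")              # header of the first file
same_process.filter("alice,30")
header_as_data = same_process.filter("city,country")   # second file's header
second_file_event = same_process.filter("Paris,France")

print(restart_event)
print(header_as_data)
print(second_file_event)
```

In this model the row after the restart is keyed by `bob` and `25`, and the second file's header and data both come out under `name` and `age`, which matches the two cases described in this issue.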

We should add a caveat in the docs to cover these scenarios.

Something along the lines of a note like:

When autodetect_column_names is set to true, the column names are only parsed once, from the first event processed after Logstash starts. Avoid this setting if there is a chance Logstash will restart in the middle of a file, or if you are ingesting multiple CSV files that each have column names as the first line.
