Reference

Michael Penkov edited this page Oct 29, 2017 · 1 revision

This page contains some reference material.

CSV I/O

Python 2 and 3 handle CSV I/O differently: the csv module in Python 2 reads and writes bytes, whereas in Python 3 it reads and writes strings. You can make Python 2 behave like Python 3 by using backports.csv, but that is very slow. For more details, see this repo.

Since speed is important, CsvInsight always uses the standard library's csv module and applies the following hack:

  • If running under Python 2, expect column values to be bytes
  • Otherwise (Python 3), expect column values to be strings
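
The version check described above could look roughly like the sketch below. This is an illustration only, not CsvInsight's actual code: the names `PY2`, `open_csv_reader`, and `expected_cell_type` are hypothetical, and the UTF-8 default encoding is an assumption.

```python
import csv
import io
import sys

# True when running under Python 2 (hypothetical helper flag).
PY2 = sys.version_info[0] == 2


def open_csv_reader(path, encoding="utf-8"):
    """Return (file_handle, csv.reader) whose cells use the native type.

    The caller is responsible for closing the returned handle.
    """
    if PY2:
        # Python 2: the csv module works on byte strings, so open in binary mode.
        handle = open(path, "rb")
    else:
        # Python 3: the csv module works on text; newline="" lets csv
        # handle line endings itself. The encoding is an assumption.
        handle = io.open(path, "r", encoding=encoding, newline="")
    return handle, csv.reader(handle)


def expected_cell_type():
    """Type that each column value is expected to have on this interpreter."""
    return bytes if PY2 else str
```

Code that consumes the reader can then compare each cell against `expected_cell_type()` instead of hard-coding `bytes` or `str`.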

This is an implementation detail, so regular users never see it. Either way, the end result is the same: the CSV file gets split into columns, one file per column, and all subsequent processing is identical.
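
As a rough illustration of the column split, here is a minimal Python 3 sketch. It is not CsvInsight's implementation: the function name `split_into_columns`, the numbered `.txt` output file names, and the one-value-per-line layout are all assumptions, and values containing embedded newlines are not handled.

```python
import csv
import os


def split_into_columns(csv_path, out_dir):
    """Write each column of csv_path to its own file; return the file paths.

    The first row is treated as a header and only determines the column count.
    """
    with open(csv_path, "r", newline="") as fin:
        reader = csv.reader(fin)
        header = next(reader, None)
        if header is None:
            # Empty input: nothing to split.
            return []
        # Hypothetical naming scheme: 0000.txt, 0001.txt, ...
        paths = [os.path.join(out_dir, "%04d.txt" % i) for i in range(len(header))]
        outputs = [open(p, "w", newline="") for p in paths]
        try:
            for row in reader:
                # zip() silently ignores extra cells in overly long rows.
                for out, value in zip(outputs, row):
                    out.write(value + "\n")
        finally:
            for out in outputs:
                out.close()
    return paths
```

Once the columns are in separate files, each one can be processed independently of the others.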
