Reference
Michael Penkov edited this page Oct 29, 2017 · 1 revision
This page contains some reference material.
Python 2 and Python 3 handle CSV I/O differently: the csv module in Python 2 reads and writes bytes, whereas in Python 3 it reads and writes strings. You can make Python 2 behave like Python 3 by using backports.csv, but that is very slow. For more details, see this repo.
Since speed is important, CsvInsight always uses the standard library's csv module and applies the following hack:
- If running under Python 2, expect column values to be bytes
- Otherwise, expect column values to be strings
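The version check above can be sketched roughly as follows. This is a minimal illustration, not CsvInsight's actual code: the names `PY2`, `open_csv` and `expected_cell_type` are hypothetical, and the UTF-8 encoding used on Python 3 is an assumption.

```python
import csv
import io
import sys

# True when running under Python 2, where the csv module works on bytes.
PY2 = sys.version_info[0] == 2


def open_csv(path, mode="r"):
    """Open a file in the form the standard csv module expects (hypothetical helper)."""
    if PY2:
        # Python 2: the csv module wants binary file objects.
        return open(path, mode + "b")
    # Python 3: the csv module wants text file objects opened with newline="".
    # The encoding is an assumption made for this sketch.
    return io.open(path, mode, newline="", encoding="utf-8")


def expected_cell_type():
    """Return the type that column values will have on this interpreter."""
    return bytes if PY2 else str
```

With helpers like these, the same `csv.reader(open_csv(path))` call works on both interpreters; only the type of each cell differs.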
This is an implementation detail that the regular user never sees. Under either version the end result is the same: the CSV file gets split into columns, one file per column. All subsequent processing is identical.
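To illustrate the splitting step, here is a simplified sketch that writes each column to its own file, using bytes under Python 2 and strings otherwise. The function name `split_columns`, the `<index>.txt` file naming, the one-value-per-line layout and the UTF-8 encoding are all assumptions made for this example; CsvInsight's real implementation may differ. Cells containing embedded newlines and rows of uneven length are not handled specially.

```python
import csv
import io
import os
import sys

PY2 = sys.version_info[0] == 2


def split_columns(csv_path, out_dir):
    """Write column i of csv_path to out_dir/<i>.txt, one value per line.

    Returns the list of output paths, in column order.
    """
    if PY2:
        # Python 2: read and write bytes.
        fin = open(csv_path, "rb")
        eol = b"\n"

        def open_out(path):
            return open(path, "wb")
    else:
        # Python 3: read and write strings (the encoding is an assumption).
        fin = io.open(csv_path, "r", newline="", encoding="utf-8")
        eol = "\n"

        def open_out(path):
            return io.open(path, "w", newline="", encoding="utf-8")

    outs = []
    try:
        for row in csv.reader(fin):
            # Lazily open one output file per column seen so far.
            while len(outs) < len(row):
                outs.append(open_out(os.path.join(out_dir, "%d.txt" % len(outs))))
            for out, cell in zip(outs, row):
                out.write(cell + eol)
    finally:
        fin.close()
        for out in outs:
            out.close()
    return [out.name for out in outs]
```

Calling `split_columns("data.csv", "columns")` on a two-column file would produce `columns/0.txt` and `columns/1.txt`; the output directory must already exist.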