Working with spreadsheet data
Bria Parker edited this page Sep 14, 2015
·
5 revisions
Add your thoughts below. We'll discuss strategies and try to develop solutions at our next meeting.
[Please add your ideas/problems below!]
- differences in line ending style (LF, CR, CRLF)
- encoding issues
- control characters and other tricky data in fields (for example hard returns inside Excel cells)
- joining two CSV tables by matching a particular field
- using regular expressions to clean up data
- splitting multi-value fields into separate columns (for instance number+street+city+state+code)
- detecting wrong formats on fields (for instance, barcodes that are too short or long)
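As a starting point for discussion, here is a minimal Python sketch of how several of the problems above might be approached with only the standard library. It is one possible approach, not a recommended solution. The column names (`id`, `barcode`), the 14-digit barcode rule, the address layout, and the Latin-1 fallback are all assumptions made for illustration; adjust them to your own data.

```python
import csv
import io
import re


def decode(raw):
    """Decode bytes as UTF-8 (dropping a BOM if present), else fall back to Latin-1.

    The Latin-1 fallback is an assumption; it never fails, but it may not be
    the encoding the file was actually written in.
    """
    try:
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def normalize_newlines(text):
    """Convert CRLF and bare CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row.

    The csv module keeps hard returns that sit inside quoted cells,
    so a multi-line Excel cell stays in a single field.
    """
    return list(csv.DictReader(io.StringIO(text)))


def join_on(left, right, key):
    """Inner-join two lists of row dicts on a shared column.

    Assumes the key is unique in `right`; later duplicates overwrite earlier ones.
    """
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Assumption: a valid barcode is exactly 14 digits.
BARCODE = re.compile(r"\d{14}")


def bad_barcodes(rows, field="barcode"):
    """Return the barcode values that do not match the expected format."""
    return [row[field] for row in rows if not BARCODE.fullmatch(row[field])]


# Assumption: addresses look like "123 Main St, Ann Arbor, MI 48104".
ADDRESS = re.compile(
    r"(?P<number>\d+)\s+(?P<street>[^,]+),\s*(?P<city>[^,]+),"
    r"\s*(?P<state>[A-Z]{2})\s+(?P<code>\d{5})"
)


def split_address(value):
    """Split a one-field address into named parts, or return None if it does not fit."""
    match = ADDRESS.fullmatch(value.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    raw = b'id,barcode,note\r\n1,39015012345678,"line one\rline two"\r\n2,3901,short\r\n'
    items = read_rows(normalize_newlines(decode(raw)))
    owners = read_rows("id,owner\n1,Main Library\n")
    print(bad_barcodes(items))
    print(join_on(items, owners, "id"))
    print(split_address("123 Main St, Ann Arbor, MI 48104"))
```

Normalizing line endings before parsing also rewrites hard returns inside cells to LF, which is usually harmless but worth knowing if the original bytes matter. For large files or fuzzy matching, the tools listed below are likely a better fit than hand-written code.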
[Please add your favorite tool below!]
- csvkit (http://csvkit.readthedocs.org)
- Regex 101 (https://regex101.com)
- OpenRefine (http://openrefine.org/)