- A working definition of "text analysis"
- An idea of how to approach text analysis (i.e., where to start and which methods and tools are needed)
- Key terminology:
  - Data / Capta
  - "scraping" and "cleaning" data
  - Metadata
  - schemas, ontologies, (un)controlled vocabularies
- Resources for additional support (ex. WFU DISC, Collective Notes)

Methods:
- Creating and structuring data for analysis (ex. scraping, OCR with Tesseract, and .txt files)
- Tidying Data
- Strategies for "cleaning" data (OpenRefine). Examples include:
  - removing incorrect characters
  - removing running headers
  - standardizing metadata
- Data visualization (Flourish)
- Word Frequencies (Voyant)
- Keywords (Voyant)
- TF/IDF (Voyant)
- Topic Modeling (TPM Tool)
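
The tools named above handle these steps interactively, but it can help to see a few of them written out. The following is a minimal Python sketch of three items from the list: cleaning (removing stray characters and a running header), word frequencies, and TF-IDF. The sample pages, the header text, and the function names `clean`, `tokenize`, and `tf_idf` are all hypothetical, and the weighting shown (term frequency times log(N / document frequency)) is one common variant; Voyant's exact formula may differ.

```python
import math
import re
from collections import Counter

# Hypothetical three-page corpus. The running header and the stray
# characters (~ and |) are invented to imitate OCR output.
RAW_PAGES = [
    "THE DAILY GAZETTE\nThe whale~ swam near the ship.",
    "THE DAILY GAZETTE\nThe ship sailed at dawn.",
    "THE DAILY GAZETTE\nA storm hit the ship| at dawn.",
]


def clean(page, header="THE DAILY GAZETTE"):
    """Drop the running header line, lowercase, and keep only letters and spaces."""
    lines = [ln for ln in page.splitlines() if ln.strip() != header]
    text = " ".join(lines).lower()
    return re.sub(r"[^a-z\s]", "", text)


def tokenize(text):
    """Split cleaned text into words on whitespace."""
    return text.split()


def tf_idf(docs):
    """Return one {term: score} dict per document, using tf * log(N / df)."""
    n = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return scores


docs = [tokenize(clean(page)) for page in RAW_PAGES]

# Word frequencies across the whole corpus.
frequencies = Counter(word for doc in docs for word in doc)
print(frequencies.most_common(2))

# TF-IDF scores for the first page, highest first.
scores = tf_idf(docs)
print(sorted(scores[0].items(), key=lambda kv: kv[1], reverse=True))
```

In this toy corpus "the" is the most frequent word, yet its TF-IDF score is zero because it appears on every page. That contrast is the reason to look at TF-IDF alongside raw word frequencies.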