This project uses demographic data from the Census Bureau and collection data from the Louisville Free Public Library to evaluate whether each branch's collection matches the demographics of its neighborhood. It is the capstone project for Code:Louisville and uses Python and Tableau.
In general, the teen book collection makes up a much smaller percentage of each branch's collection than other books. This is true even in the western and southern areas with much larger teen demographics (cf. Portland, Iroquois, Shively, Southwest, South Central). The children's collections show the opposite pattern: at every branch except the Main branch, the percentage of children's books in the collection is significantly higher than the percentage of the neighborhood population that is children. Though teens may not be the most engaged audience for library books, they are still developing reading skills and should have more books available to them.
Portland, Shawnee, and Fairdale have some of the lowest percentages of college graduates (and of education levels in general) but fairly high percentages of books published in the past five years (the same holds for books published between 2010 and 2019). So the neighborhoods with higher education levels do not necessarily receive all the newest books (cf. Portland and St. Matthews/Eline or Northeast).
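The comparisons above come down to one piece of arithmetic: a group's share of the collection minus its share of the neighborhood population. The sketch below illustrates that calculation with made-up numbers. The branch names, the counts, and the `share_gap` helper are all hypothetical and are not taken from the project's notebooks or results.

```python
# Hypothetical figures for illustration only; NOT results from this project.
branches = {
    "Branch A": {"teen_books": 500, "total_books": 10_000,
                 "teen_pop": 1_200, "total_pop": 8_000},
    "Branch B": {"teen_books": 900, "total_books": 12_000,
                 "teen_pop": 600, "total_pop": 10_000},
}


def share_gap(group_items, total_items, group_pop, total_pop):
    """Return collection share minus population share, in percentage points."""
    collection_pct = 100 * group_items / total_items
    population_pct = 100 * group_pop / total_pop
    return collection_pct - population_pct


# A negative gap means the group holds a smaller share of the collection
# than of the population; a positive gap means a larger share.
gaps = {
    name: share_gap(b["teen_books"], b["total_books"],
                    b["teen_pop"], b["total_pop"])
    for name, b in branches.items()
}

for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f} percentage points")
```

A gap near zero would suggest the collection roughly mirrors the neighborhood for that group. How large a gap counts as meaningful is a judgment call that this sketch does not settle.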
- The Louisville Public Library Collection downloaded from Louisville Open Data 12/18/2024
- Age Demographics by Zipcode downloaded from Census.gov 12/18/2024
- Library Spatial Data downloaded from Louisville Open Data 12/18/2024
- Zipcode Spatial Data downloaded from Louisville Open Data 12/18/2024
- Education Demographics by Zipcode downloaded from Census.gov 2/26/2025
- Data Exploration: Jupyter notebooks in VS Code.
- Analysis: Python with the pandas and NumPy packages for data cleaning.
- Visualizations: Initial visualizations using Matplotlib in Jupyter notebooks.
- Dashboard: Created Tableau Dashboard.
To run this project, follow these steps:
- Clone the repository: `git clone https://github.com/NicholasJCampbell/library_comparisons.git`
- Install the necessary dependencies: `pip install -r requirements.txt`
- Explore the Jupyter notebooks or scripts in the respective folders.
- After you have cloned the repo to your machine, navigate to the project folder in GitBash/Terminal.
- Create a virtual environment in the project folder.
- Activate the virtual environment.
- Install the required packages.
- When you are done working on your repo, deactivate the virtual environment.
Virtual Environment Commands
| Command | Linux/Mac | GitBash |
|---|---|---|
| Create | python3 -m venv venv | python -m venv venv |
| Activate | source venv/bin/activate | source venv/Scripts/activate |
| Install | pip install -r requirements.txt | pip install -r requirements.txt |
| Deactivate | deactivate | deactivate |
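
To confirm that activation worked before installing packages, one option is a quick check from Python. This is a minimal sketch, assuming the environment was created with the standard `venv` module as in the table above. The function name `in_virtual_environment` is hypothetical and is not part of this repository.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtual_environment())
print("Interpreter prefix:", sys.prefix)
```

If the first line prints `False`, the activate command for your shell probably has not been run in the current terminal session.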
| Feature | Description |
|---|---|
| Read TWO data files | Used 1 CSV file from Louisville Open Data, multiple (78 total) CSV files from the Census Bureau, and 2 GeoJSON files from Louisville Open Data. |
| Clean your data and perform a pandas merge with your two data sets, then calculate some new values based on the new data set. | Cleaned and merged data (two different ways) with Pandas and in Tableau. Calculated population percentages in both datasets. |
| Make 3 Matplotlib visualizations | Made several bar plots for initial data interpretation. |
| Make a Tableau dashboard | Made two Tableau dashboards. |
| Utilize a virtual environment | Made a virtual environment and included instructions in the ReadMe. |
| Annotate your code with markdown cells in Jupyter Notebook | Included notes describing each code block and section descriptions in Markdown cells. |
| Build a custom data dictionary | Created data dictionary Markdown files (one for the final dataset of each comparison). |
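
The merge-and-calculate feature listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebooks' actual code: the column names (`zipcode`, `teen_books`, `total_books`, `teen_pop`, `total_pop`), the placeholder zip codes, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical data: column names and values are illustrative assumptions.
# Zip codes are kept as strings so leading zeros are not lost.
collection = pd.DataFrame({
    "zipcode": ["00001", "00002", "00003"],
    "teen_books": [300, 150, 400],
    "total_books": [6000, 3000, 5000],
})
population = pd.DataFrame({
    "zipcode": ["00001", "00002", "00003"],
    "teen_pop": [900, 1100, 2400],
    "total_pop": [18000, 10000, 16000],
})

# Join the two datasets on zip code; validate raises an error if either
# side has duplicate zip codes, which would silently inflate the counts.
merged = collection.merge(
    population, on="zipcode", how="inner", validate="one_to_one"
)

# New values calculated from the merged dataset.
merged["teen_books_pct"] = 100 * merged["teen_books"] / merged["total_books"]
merged["teen_pop_pct"] = 100 * merged["teen_pop"] / merged["total_pop"]

print(merged[["zipcode", "teen_books_pct", "teen_pop_pct"]])
```

An inner join drops zip codes that appear in only one dataset. If unmatched rows need to be inspected, `how="outer"` with `indicator=True` is one way to surface them.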