diff --git a/01-intro/deprecated-index.qmd b/01-intro/_deprecated-index.qmd similarity index 100% rename from 01-intro/deprecated-index.qmd rename to 01-intro/_deprecated-index.qmd diff --git a/01-intro/jupyter-notebook.qmd b/01-intro/jupyter-notebook.qmd index c83a5f4..f50c679 100644 --- a/01-intro/jupyter-notebook.qmd +++ b/01-intro/jupyter-notebook.qmd @@ -25,7 +25,7 @@ In order to run computer programs, we need a way to execute code written in a pr The environment we will use is **Jupyter Notebook**, which allows us to write and run code within a single `.ipynb` document (i.e., **notebook**). They also allow us to embedded text and code. :::{style="text-align: center"} -![An example of a Jupyter Notebook.](images/jupyter-oski.png){#fig-inflation fig-align=center width=90%} +![An example of a Jupyter Notebook.](images/jupyter-oski.png){#fig-inflation fig-align=center width=90% fig-alt=""} ::: There's a lot going on in the above Jupyter Notebook screenshot: there is code, there is output from running code, there are pictures, and there is (non-code) text. We'll get to understanding all of these components in due time. @@ -78,7 +78,7 @@ Jupyter Notebooks are made up of **cells**. There are two main types of cells: When run, Python code cells are evaluated as a Python code snippet, one line at a time. The cell output displayed is the value of the _last_ evaluated expression: :::{style="text-align: center"} -![Both expressions are evaluated, but the result of the last expression's evaluation is considered the output of the code cell.](images/jupyter-code-cell.png){#fig-inflation fig-align=center width=70%} +![Both expressions are evaluated, but the result of the last expression's evaluation is considered the output of the code cell.](images/jupyter-code-cell.png){#fig-inflation fig-align=center width=70% fig-alt=""} ::: We will discuss this output/display phenomenon more in future notes. 
@@ -88,7 +88,7 @@ To run a code cell, you can either hit the "Run" button in the Toolbar, or you c **Markdown cells.** This is where you write text and images that aren’t Python code. Markdown is a language used for formatting text. A Markdown cell will always display its formatting when it is not in edit mode. :::{style="text-align: center"} -![Left screenshot shows un-evaluated code cell and raw Markdown cell; right screenshot shows evaluated code cell and formatted text. To render formatted text for a selected markdown cell, exit editing mode for that cell. This screenshot starts with the code cell selected, then runs both that code cell and "runs" the markdown cell below.](images/jupyter-md-cell.png){#fig-inflation fig-align=center width=100%} +![Left screenshot shows un-evaluated code cell and raw Markdown cell; right screenshot shows evaluated code cell and formatted text. To render formatted text for a selected markdown cell, exit editing mode for that cell. This screenshot starts with the code cell selected, then runs both that code cell and "runs" the markdown cell below.](images/jupyter-md-cell.png){#fig-inflation fig-align=center width=100% fig-alt=""} ::: Here is a [guide to Markdown formatting](https://www.markdownguide.org/cheat-sheet/). You’ll explore Markdown more in lab. diff --git a/05-variables/index.qmd b/05-variables/index.qmd index bc33d26..d805ed6 100644 --- a/05-variables/index.qmd +++ b/05-variables/index.qmd @@ -30,7 +30,7 @@ It is challenging to use another person's data! The concepts have already been o For now, we focus on variables as they exist in tabular data. In most of the tabular datasets we will examine, variables correspond to **columns** of features. Each row is a **record** of a datapoint, with different values of variables measured for that datapoint. 
:::{style="text-align: center"} -![Variables as columns.](images/variable.png){#fig-inflation fig-align=center width=60%} +![Variables as columns.](images/variable.png){#fig-inflation fig-align=center width=60% fig-alt=""} ::: @@ -49,7 +49,7 @@ Figure 2 has examples of each variable type. :::{style="text-align: center"} -![Variable Types.](images/variable_types.png){#fig-inflation fig-align=center width=90%} +![Variable Types.](images/variable_types.png){#fig-inflation fig-align=center width=90% fig-alt=""} ::: _What do we mean by "meaningful" arithmetic?_ From [Stat 20](https://www.stat20.org/1-questions-and-data/02-taxonomy-of-data/notes): diff --git a/05-variables/units-of-analysis.qmd b/05-variables/units-of-analysis.qmd index a5ade5c..84cddfb 100644 --- a/05-variables/units-of-analysis.qmd +++ b/05-variables/units-of-analysis.qmd @@ -53,13 +53,13 @@ Let's return to our American Community Survey (ACS) 2020 data. It shows educatio From the [ACS webpage](https://www.census.gov/programs-surveys/acs/methodology/design-and-methodology.html), the American Community Survey (ACS) is an ongoing monthly survey that collects detailed housing and socioeconomic data. :::{style="text-align: center"} -![ACS Household survey, which collects data on individual households.](images/acs_screenshot.png){#fig-inflation fig-align=center width=90%} +![ACS Household survey, which collects data on individual households.](images/acs_screenshot.png){#fig-inflation fig-align=center width=90% fig-alt=""} ::: There are (at least) two datasets collected by the ACS: A private dataset of survey responses by household (Figure 1), and a public-facing dataset of responses by geographic region. 
The variables for the geographic region, a larger unit of analysis, are constructed via aggregation and estimation (Figure 2): :::{style="text-align: center"} -![ACS reported public data, which reports aggregated data of households across a geographic region.](images/acs_aggregate.png){#fig-inflation fig-align=center width=90%} +![ACS reported public data, which reports aggregated data of households across a geographic region.](images/acs_aggregate.png){#fig-inflation fig-align=center width=90% fig-alt=""} ::: Simple forms of aggregation are straightforward and involve counting and averaging---methods that are very possible using our limited Data Science toolkit thus far. However, disaggregation cannot be done without individual datapoints! There are various methods of estimating individuals from averages using statistics and distributions; we discuss this briefly in a few weeks, but you can take a statistics course for more information. diff --git a/06-variables-ii/deprecated-eda.qmd b/06-variables-ii/_deprecated-eda.qmd similarity index 100% rename from 06-variables-ii/deprecated-eda.qmd rename to 06-variables-ii/_deprecated-eda.qmd diff --git a/06-variables-ii/deprecated-index.qmd b/06-variables-ii/_deprecated-index.qmd similarity index 100% rename from 06-variables-ii/deprecated-index.qmd rename to 06-variables-ii/_deprecated-index.qmd diff --git a/06-variables-ii/sample-population.qmd b/06-variables-ii/sample-population.qmd index 79e180d..563013c 100644 --- a/06-variables-ii/sample-population.qmd +++ b/06-variables-ii/sample-population.qmd @@ -16,7 +16,7 @@ The set of individuals we actually draw our sample from is the **sampling frame* ## Examples :::{style="text-align: center"} -![A sampling frame may include individuals not in our population.](images/sampling-frame.png){#fig-inflation fig-align=center width=80%} +![A sampling frame may include individuals not in our population.](images/sampling-frame.png){#fig-inflation fig-align=center width=80% 
fig-alt=""} ::: | Target Population | Collected sample | diff --git a/07-visualizations/encoding.qmd b/07-visualizations/encoding.qmd index 651b718..3e53069 100644 --- a/07-visualizations/encoding.qmd +++ b/07-visualizations/encoding.qmd @@ -16,7 +16,7 @@ Think of encoding as the bridge between your data and what people see on the scr In bar charts, **length** can visually encode a numerical variable. :::{style="text-align: center"} -![Bar Chart Example](images/barchart.png){#fig-barchart fig-align=center width=70%} +![Bar Chart Example](images/barchart.png){#fig-barchart fig-align=center width=70% fig-alt=""} ::: This creates an intuitive mapping where the visual property (bar length) directly corresponds to the data value (average age). @@ -27,7 +27,7 @@ This creates an intuitive mapping where the visual property (bar length) directl Other visualizations can include multiple variables encoded simultaneously. :::{style="text-align: center"} -![Multiple Encodings in a Scatter Plot](images/scatter.png){#fig-scatter fig-align=center width=80%} +![Multiple Encodings in a Scatter Plot](images/scatter.png){#fig-scatter fig-align=center width=80% fig-alt=""} ::: ### Quick Check: How Many Variables? @@ -48,7 +48,7 @@ Look at the scatter plot above. How many different variables are being encoded? As we learned when studying variables, different variable types (numerical vs. categorical, discrete vs. continuous, ordinal vs. nominal) have different properties. When creating visualizations, we need to match our encoding choices to these variable types. :::{style="text-align: center"} -![Recall: Variable Types](images/variable_types.png){#fig-variable-types fig-align=center width=90%} +![Recall: Variable Types](images/variable_types.png){#fig-variable-types fig-align=center width=90% fig-alt=""} ::: ::: {.callout-important title="Key Principle"} @@ -69,7 +69,7 @@ The table below summarizes which visual encodings work best for different types ### What's Wrong with This? 
:::{style="text-align: center"} -![Problematic Car Manufacturer Chart](images/cars-graph.png){#fig-cars-graph fig-align=center width=70%} +![Problematic Car Manufacturer Chart](images/cars-graph.png){#fig-cars-graph fig-align=center width=70% fig-alt=""} ::: **Problem**: This graph implies that Swedish cars are "greater" than cars from other countries in some sense, when they're not. If the variable is just "country of origin" (nominal categorical), using length encoding suggests an ordering that doesn't exist. diff --git a/07-visualizations/index.qmd b/07-visualizations/index.qmd index 34c4a12..d82fb28 100644 --- a/07-visualizations/index.qmd +++ b/07-visualizations/index.qmd @@ -25,7 +25,7 @@ To better understand these principles in action, let's examine how humans have u What do you see when you look at this ancient artifact? :::{style="text-align: center"} -![The World's First Map](images/world-map.jpg){#fig-ancient-map fig-align=center width=60%} +![The World's First Map](images/world-map.jpg){#fig-ancient-map fig-align=center width=60% fig-alt=""} ::: This is a map depicting the town of Konya, Turkey - supposedly the world's first map, dating back to approximately 6200 BC. Even in prehistoric times, humans recognized the power of visual representation to communicate spatial relationships and important information. @@ -47,7 +47,7 @@ One of the most famous examples of data visualization directly saving human live **The Solution**: Dr. John Snow was skeptical of the miasma theory and suspected contaminated water. He created a revolutionary approach that became standard in epidemiology: **he drew a map**. 
:::{style="text-align: center"} -![John Snow's Cholera Map](images/john-snow-cholera-map.png){#fig-cholera-map fig-align=center width=70%} +![John Snow's Cholera Map](images/john-snow-cholera-map.png){#fig-cholera-map fig-align=center width=70% fig-alt=""} ::: **What the map revealed**: @@ -64,7 +64,7 @@ One of the most famous examples of data visualization directly saving human live Florence Nightingale wasn't just a pioneering nurse, she was also an innovative data visualizer. During the Crimean War, she created what's now called a "rose diagram" or "coxcomb chart" to visualize the causes of death among British soldiers. :::{style="text-align: center"} -![Florence Nightingale's Rose Diagram](images/florence-nightingale-rose.png){#fig-rose-diagram fig-align=center width=60%} +![Florence Nightingale's Rose Diagram](images/florence-nightingale-rose.png){#fig-rose-diagram fig-align=center width=60% fig-alt=""} ::: Her visualization revealed a shocking truth: more soldiers were dying from preventable diseases than from battle wounds. This wasn't just a pretty chart, it was a powerful argument that drove major reforms in military medical care. Nightingale understood that abstract statistics about mortality rates couldn't compete with the visual impact of her rose petals, where the size of each segment made the disparity impossible to ignore. @@ -76,7 +76,7 @@ Her visualization revealed a shocking truth: more soldiers were dying from preve Not all data visualization involves charts and graphs. Maya Lin's Vietnam War Memorial in Washington DC proves that data can be deeply emotional and memorial, not just analytical. :::{style="text-align: center"} -![Vietnam War Memorial](images/veitnam-war-memorial.png){#fig-vietnam-memorial fig-align=center width=70%} +![Vietnam War Memorial](images/veitnam-war-memorial.png){#fig-vietnam-memorial fig-align=center width=70% fig-alt=""} ::: Each of the 58,000+ names etched into the black granite represents one life lost. 
The chronological arrangement tells the story of the war's progression through time, while the reflective surface creates an intimate connection between viewers and the data you literally see yourself reflected among the names. This memorial demonstrates that the most powerful visualizations don't just inform us; they transform how we feel about the information. @@ -86,7 +86,7 @@ Each of the 58,000+ names etched into the black granite represents one life lost During the COVID-19 pandemic, data visualization became part of daily life. Suddenly, everyone from epidemiologists to elementary school students was reading line charts showing case trends and interpreting what those curves meant for their communities. :::{style="text-align: center"} -![COVID-19 Case Tracking](images/coivd.png){#fig-covid-dashboard fig-align=center width=80%} +![COVID-19 Case Tracking](images/coivd.png){#fig-covid-dashboard fig-align=center width=80% fig-alt=""} ::: Google's COVID tracking dashboard exemplified how modern visualization must be both accessible and updateable in real-time. The time series charts showed trends over months with clear visual indicators of peaks and valleys, but more importantly, they translated complex epidemiological data into something any concerned citizen could understand. diff --git a/08-histograms/exercises.qmd b/08-histograms/exercises.qmd index dc45432..cad3216 100644 --- a/08-histograms/exercises.qmd +++ b/08-histograms/exercises.qmd @@ -121,12 +121,14 @@ studio_distribution.show(6) ``` ```{python} +#| fig-alt: "Distribution of studios responsible for the highest grossing movies as of 2017" studio_distribution.barh('Studio') ``` Let's revisualize this barchart to display just the top five studios. 
In the below code, note how `.take` is used with `np.arange`: ```{python} +#| fig-alt: "Distribution of studios responsible for the top five highest grossing movies as of 2017" studio_distribution.sort('count', descending=True).take(np.arange(5)).barh('Studio') print("Five studios are largely responsible for the highest grossing movies") ``` @@ -157,12 +159,15 @@ min(ages), max(ages) If you want to make equally sized bins, `np.arange()` is a great tool to help you. ```{python} +#| fig-alt: "Histogram of the age of the top grossing movies as of 2017 with equally sized bins and count on the y-axis" top_movies.hist('Age', bins = np.arange(0, 110, 10), unit = 'Year', density=False) ``` ## Histograms: Density ```{python} +#| fig-alt: "Histogram of the age of the top grossing movies as of 2017 with equally sized bins and 'Percent per Year' on the y-axis" + # default is density=True top_movies.hist('Age', bins = np.arange(0, 110, 10), unit = 'Year') ``` @@ -196,6 +201,7 @@ binned_data ### Now, plot the histogram! ```{python} +#| fig-alt: "Histogram of the age of the top grossing movies as of 2017 using custom bins" top_movies.hist('Age', bins = my_bins, unit = 'Year') ``` @@ -276,6 +282,7 @@ To check our work one last time, let's see if the numbers in the last column mat ```{python} +#| fig-alt: "Histogram of the age of the top grossing movies as of 2017 using custom bins" top_movies.hist('Age', bins = my_bins, unit = 'Year') ``` @@ -296,6 +303,7 @@ flavor_table ```{python} +#| fig-alt: "Distribution of ice cream flavors" flavor_table.barh('Flavor') ``` @@ -309,6 +317,7 @@ cone_average_price_table ```{python} +#| fig-alt: "Plot with one categorical attribute and one numerical attribute." cone_average_price_table.barh('Flavor') ``` @@ -324,5 +333,6 @@ cones_pivot_table ```{python} +#| fig-alt: "Plot with two categorical attributes." 
cones_pivot_table.barh('Color') ``` diff --git a/17-dictionaries/file-formats.qmd b/17-dictionaries/file-formats.qmd index e92e5be..37fbb1f 100644 --- a/17-dictionaries/file-formats.qmd +++ b/17-dictionaries/file-formats.qmd @@ -16,7 +16,7 @@ We can use files to generate tables, or other useful data structures Files are often stored in **folders**. :::{style="text-align: center"} -![Within files are folders. A file can be loaded into Python.](images/directory_structure.png){#fig-inflation fig-align=center width=90%} +![Within files are folders. A file can be loaded into Python.](images/directory_structure.png){#fig-inflation fig-align=center width=90% fig-alt=""} ::: We can categorize data as being in one of two broad categories: @@ -71,7 +71,7 @@ In our example `pups` case, the `pups.csv` file is located in the `data` directo What kinds of data can’t be stored in a tabular format? Lots of things: music, videos, maps, etc. Graph data and hierarchical data, like family trees, might also be non-tabular. :::{style="text-align: center"} -![A family tree graph structure. At the root is Grandma, who has children Dad and Aunt. Dad has children Brother and Me, and Aunt has children Cousin 1 and Cousin 2. Cousin 2 has a one child, Cousin 2 Jr.](images/trees.png){#fig-inflation fig-align=center width=75%} +![A family tree graph structure. At the root is Grandma, who has children Dad and Aunt. Dad has children Brother and Me, and Aunt has children Cousin 1 and Cousin 2. Cousin 2 has a one child, Cousin 2 Jr.](images/trees.png){#fig-inflation fig-align=center width=75% fig-alt=""} ::: ### JSON diff --git a/17-dictionaries/index.qmd b/17-dictionaries/index.qmd index 270257d..4c6548c 100644 --- a/17-dictionaries/index.qmd +++ b/17-dictionaries/index.qmd @@ -116,7 +116,7 @@ dog dog.pop(4, None) ``` -Why? Find out the answer in the official [Python documentation on `pop`](https://docs.python.org/3/library/stdtypes.html#dict.pop)! +Why? 
Find out the answer in the official [Python documentation](https://docs.python.org/3/library/stdtypes.html#dict.pop) on `pop`! ## Dictionary Properties diff --git a/18-html/genius.qmd b/18-html/genius.qmd index 10b10e9..f2accca 100644 --- a/18-html/genius.qmd +++ b/18-html/genius.qmd @@ -29,7 +29,7 @@ To use the [Genius Lyrics](http://genius.com/) API, you need a special API key, :::{style="text-align: center"} -![Genius API webpage.](images/Genius-API.png){#fig-inflation fig-align=center width=90%} +![Genius API webpage.](images/Genius-API.png){#fig-inflation fig-align=center width=90% fig-alt=""} ::: You'll be prompted to sign up for [a Genius account](https://genius.com/signup_or_login), which is required to gain API access. Signing up for a Genius account is free and easy. You just need a Genius nickname (which must be one word), an email address, and a password. @@ -38,7 +38,7 @@ Once you're signed in, you should be taken to [https://genius.com/api-clients](h :::{style="text-align: center"} -![New API Client button.](images/Genius-New-API.png){#fig-inflation fig-align=center width=90%} +![New API Client button.](images/Genius-New-API.png){#fig-inflation fig-align=center width=90% fig-alt=""} ::: After clicking "New API Client," you'll be prompted to fill out a short form about the "App" that you need the Genius API for. You only need to fill out "App Name" and "App Website URL." diff --git a/18-html/index.qmd b/18-html/index.qmd index 9ccc55b..072c61c 100644 --- a/18-html/index.qmd +++ b/18-html/index.qmd @@ -28,7 +28,7 @@ Your screen should look (something) like this: :::{style="text-align: center"} -![Kittens Dev Tools.](http://static.decontextualize.com/snaps/kittens-dev-tools.png){#fig-inflation fig-align=center width=100%} +![Kittens Dev Tools.](http://static.decontextualize.com/snaps/kittens-dev-tools.png){#fig-inflation fig-align=center width=100% fig-alt=""} ::: In the upper panel, you see the web page you're inspecting. 
In the lower panel, you see a version of the HTML source code, with little arrows next to some of the lines. (The little arrows allow you to collapse parts of the HTML source that are hierarchically related.) As you move your mouse over the elements in the top panel, different parts of the source code will be highlighted. Chrome is showing you which parts of the source code are causing which parts of the page to show up. Pretty spiffy! diff --git a/21-genai/gemini.qmd b/21-genai/gemini.qmd index 73f00c1..e44dc20 100644 --- a/21-genai/gemini.qmd +++ b/21-genai/gemini.qmd @@ -69,7 +69,7 @@ Consider the chat prompt shown in the screenshot, as well as (the start of) the :::{style="text-align: center"} -![A screenshot of a Google Gemini chat conversation. Prompt is 'Explain how AI works in a few words.'. Response from Gemini chat is long but gets at the idea..](images/prompt-chat.png){#fig-inflation fig-align=center width=90%} +![A screenshot of a Google Gemini chat conversation. Prompt is 'Explain how AI works in a few words.'. 
Response from Gemini chat is long but gets at the idea..](images/prompt-chat.png){#fig-inflation fig-align=center width=90% fig-alt=""} ::: We define three pieces of terminology to describe what is happening in the above screenshot: diff --git a/_quarto.yml b/_quarto.yml index bba6bd3..4dba3c2 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -173,10 +173,14 @@ website: format: html: - theme: lumen + theme: cosmo fontsize: 1em - css: assets/styles.css + css: + - assets/styles.css + - assets/custom-error-colors.css toc: true include-in-header: file: siteimprove.html + include-after-body: + - assets/a11y-fixes.html diff --git a/assets/a11y-fixes.html b/assets/a11y-fixes.html new file mode 100644 index 0000000..b9aac5e --- /dev/null +++ b/assets/a11y-fixes.html @@ -0,0 +1,56 @@ + \ No newline at end of file diff --git a/assets/custom-error-colors.css b/assets/custom-error-colors.css new file mode 100644 index 0000000..59305d6 --- /dev/null +++ b/assets/custom-error-colors.css @@ -0,0 +1,42 @@ +/* + * Fix WCAG color contrast issues for ANSI colors in Quarto cell error outputs. + * These colors are high-contrast replacements for default ANSI light-colors. 
+ */ + +.cell-output-error .ansi-red-fg, +.cell-output-error .ansi-bright-red-fg { + /* Replace light red/bright red with a darker, accessible red */ + color: #B30000; +} + +.cell-output-error .ansi-green-fg, +.cell-output-error .ansi-bright-green-fg { + /* Replace light green/bright green with a darker, accessible green */ + color: #006600; +} + +.cell-output-error .ansi-cyan-fg, +.cell-output-error .ansi-bright-cyan-fg { + /* Replace light cyan/bright cyan with a darker, accessible cyan/teal */ + color: #006161; +} + +.cell-output-error .ansi-yellow-fg, +.cell-output-error .ansi-bright-yellow-fg { + color: #886A00; +} + +/* Ensure the general error text is also high contrast (often dark red/maroon) */ +.cell-output-error { + /* This targets the error text that isn't specifically ANSI colored */ + color: #990000; /* Darker red for general error text */ +} + +/* + * Part of the output is highlighted with a yellow background. Force all text + * on yellow backgrounds to be black for maximum contrast. + */ +.cell-output-error .ansi-yellow-bg, +.cell-output-error .ansi-bright-yellow-bg { + color: #000000 !important; +} \ No newline at end of file
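
Review note: the foreground colors introduced in `assets/custom-error-colors.css` can be sanity-checked against the WCAG 2.x contrast formula. The sketch below is not part of the change set. The helper names (`relative_luminance`, `contrast_ratio`) are hypothetical, and the white background is an assumption, since the diff does not state what background the error cells render on. If the rendered background differs, substitute it for `ASSUMED_BACKGROUND`.

```python
# Sketch: check the replacement error colors against the WCAG AA threshold
# for normal-size text (contrast ratio of at least 4.5:1).

def relative_luminance(hex_color: str) -> float:
    """Return the WCAG 2.x relative luminance of an sRGB color like '#RRGGBB'."""
    digits = hex_color.lstrip("#")
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearize each sRGB channel as described in the WCAG definition.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(foreground: str, background: str) -> float:
    """Return the WCAG contrast ratio between two colors (between 1 and 21)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


# Foreground colors copied from the new stylesheet in this diff.
ERROR_COLORS = {
    "ansi red": "#B30000",
    "ansi green": "#006600",
    "ansi cyan": "#006161",
    "ansi yellow": "#886A00",
    "general error text": "#990000",
}

# Assumption: error output is rendered on a white background.
ASSUMED_BACKGROUND = "#FFFFFF"

for name, color in ERROR_COLORS.items():
    ratio = contrast_ratio(color, ASSUMED_BACKGROUND)
    verdict = "pass" if ratio >= 4.5 else "fail"
    print(f"{name}: {color} -> {ratio:.2f}:1 (AA normal text: {verdict})")
```

The `.ansi-yellow-bg` rule is left out of the check because the diff does not define the yellow background color itself; it only forces the text on it to black.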