Data Literacy

Kelly McConville

Coffee and Treats with L&IT | Fall 2024

Data literacy: The ability to read, evaluate, and construct arguments with data.

Bombardment of Data Arguments

NYTimes “What’s Going on in this Graph? Oct 23, 2024”

NYTimes “What’s Going on in this Graph? Attempted Crossings at the U.S. Southern Border”

NYTimes “Some Colleges Have More Students From the Top 1 Percent Than the Bottom 60. Find Yours.”

Kate Petrova on X, Nov 27, 2020

Example: COVID Prevalence

Example: Visualizing COVID Prevalence

In May of 2020, the Georgia Department of Public Health posted the following graph:

  • At a quick first glance, what story does the Georgia Department of Public Health graph appear to be telling?

  • What is misleading about the Georgia Department of Public Health graph? How could we fix this issue?

Example: Visualizing COVID Prevalence

After public outcry, the Georgia Department of Public Health said it had made a mistake and posted the following updated graph:

  • How do your conclusions about COVID-19 cases in Georgia change when you interpret this updated graph?

Example: Visualizing COVID Prevalence

Alberto Cairo, a journalist and designer, created the second graph of the Georgia COVID-19 data:

  • A key principle of data visualization is to “help the viewer make meaningful comparisons”.

  • What comparisons are made easy by the left-hand graph? What about by the right-hand graph?

  • From these graphs, can we get an accurate estimate of the COVID prevalence in these Georgia counties over this two-week period?

Example: Visualizing COVID Prevalence

  • What are the pros of using wastewater over nasal swabs to assess COVID prevalence? What are the cons?

  • The graph also incorporates uncertainty measures. Quantifying uncertainty is a key component of data literacy.

Data Literacy In Action

  • Understanding the importance of context.

Context explains the Monday jumps in the COVID counts.

  • How we encode information in a graph should be driven by our research question.

You have a lot of design choices, and these choices can help or hinder the storytelling.

  • How the data are collected impacts the conclusions we can draw.

Voluntary COVID test results likely don’t provide good estimates of COVID prevalence.

  • Often we use a sample of data to say something about a larger group. When we do, we should measure how certain our estimates are!

The wastewater was sampled, and the analysis produced a range of plausible values for the RNA copies each day.
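One way to see where a "range of plausible values" can come from is a small sketch like the one below. It is an illustration only: the measurements are made-up numbers, the function name bootstrap_interval is hypothetical, and the percentile bootstrap used here is a stand-in technique, not necessarily how the wastewater graph's uncertainty bands were computed.

```python
import random
import statistics


def bootstrap_interval(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Resample the data with replacement and record the resampled mean.
        resample = [rng.choice(values) for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Cut off the lowest and highest tails of the resampled means.
    alpha = (1 - level) / 2
    lower = means[int(alpha * n_resamples)]
    upper = means[int((1 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical replicate measurements for a single day (invented, arbitrary units).
replicates = [410.0, 455.0, 390.0, 520.0, 475.0, 430.0]

estimate = statistics.fmean(replicates)
low, high = bootstrap_interval(replicates)
print(f"estimate: {estimate:.1f}, plausible range: {low:.1f} to {high:.1f}")
```

The single estimate alone hides how much it could move if the sample had come out differently; reporting the interval alongside it makes that uncertainty visible. With more replicates the range would typically narrow.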

Data Analysis Process

  • Need to understand how “raw” data are processed into insights.

  • What choices were made at each step?

  • How do those choices impact the conclusions?

Data Literacy Training

  • About developing reasoning

    • Not just learning definitions and formulae
    • Not just memorizing arbitrary rules (p-value < 0.05, sample size > 30)
  • Requires judgment that takes time to develop

    • Lots of great classes at Bucknell where students practice!

Dominguez Center for Data Science

Discussion Questions

  • How can the Center support data literacy at Bucknell?

  • What is data literacy to you?

  • How do you see generative AI changing or impacting data literacy?