A plot of good that did

R
visualization
data sharing
Published November 3, 2023

Plot of data shared by (R. Silberzahn et al. 2023) and described in (Gilmore 2023)

This week in my senior seminar on the reproducibility crisis in science (Gilmore 2023), we discussed the “many analysts” paper (R. Silberzahn et al. 2018). The main take-home message is that different data analysts can take widely varied approaches to answering the same question from the same dataset. It’s a fun and fascinating read.

But after asking the students whether or not the paper had persuaded them that players with darker skin tones get more red cards, I realized that nowhere in the paper was a graph that addressed this particular question. So, I got busy trying to work through just how one might explore that question graphically.

The result is a whole page of graphs. Probably the least surprising are the ones showing that the numbers of yellow and red cards are related to one another, whether viewed from a player perspective

Plot of data shared by (R. Silberzahn et al. 2023) and described in (Gilmore 2023)

or from a referee perspective

Plot of data shared by (R. Silberzahn et al. 2023) and described in (Gilmore 2023)
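For readers who want to try the two perspectives themselves, here is a minimal sketch of the aggregation step in base R. It assumes the shared data have one row per player-referee pair, with columns named `playerShort`, `refNum`, `yellowCards`, and `redCards`. Those names are assumptions, the values below are invented toy data, and the helpers `sum_cards()` and `plot_cards()` are hypothetical, not the code behind the plots above.

```r
# Toy player-referee data, one row per pairing.
# Column names are assumed; the values are made up for illustration only.
dyads <- data.frame(
  playerShort = c("a", "a", "b", "b", "c", "c"),
  refNum      = c(1, 2, 1, 3, 2, 3),
  yellowCards = c(1, 0, 2, 1, 0, 0),
  redCards    = c(0, 0, 1, 0, 0, 0)
)

# Sum yellow and red cards over all rows that share a value of `by`.
sum_cards <- function(data, by) {
  aggregate(data[c("yellowCards", "redCards")], by = data[by], FUN = sum)
}

# Player perspective: total cards each player received.
by_player <- sum_cards(dyads, "playerShort")

# Referee perspective: total cards each referee gave out.
by_referee <- sum_cards(dyads, "refNum")

# Scatterplot of red cards against yellow cards for either perspective.
plot_cards <- function(totals, unit) {
  plot(totals$yellowCards, totals$redCards,
       xlab = paste("Yellow cards per", unit),
       ylab = paste("Red cards per", unit))
}
```

Calling `plot_cards(by_player, "player")` or `plot_cards(by_referee, "referee")` then draws one scatterplot per perspective. With the real data, many points would overlap, so jittering or transparency would probably be needed.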

But I confess to trying many times, and failing, to find a way to show that skin tone has anything to do with the number of red or yellow cards given out.

Not only does this remind me again how important plotting data can be (see here), but it also makes me wonder what we actually know about how scientists think about generating these sorts of graphs, how the graphs contribute to scientific reasoning, and how others understand them. Those sound like interesting research questions, don’t they?

Go, open science!

Kudos to the authors for sharing their data on the Open Science Framework (R. Silberzahn et al. 2023), and especially Brian Nosek, Jeff Spies, and their colleagues for creating a tool that makes data sharing so easy. The osfr package made it really easy for me to create a reproducible workflow that downloads the data. And Quarto made it easy for me to make those workflows transparent to others.
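A download step of this kind might look roughly like the sketch below. The `osfr` functions are real, but the wrapper `download_osf_files()` is a hypothetical helper. The node ID is taken from the reference list, and it is an assumption that the files of interest sit at the top level of that node rather than in a sub-component or folder.

```r
# Sketch only: download files attached to an OSF project using osfr.
# Assumptions: node ID "gvm2z" (from the reference list) holds the files
# at its top level; "data" is an arbitrary local folder name.
download_osf_files <- function(node_id = "gvm2z", dest = "data") {
  if (!requireNamespace("osfr", quietly = TRUE)) {
    stop("Install the osfr package first: install.packages('osfr')")
  }
  # Create the destination folder if it does not exist yet.
  dir.create(dest, showWarnings = FALSE, recursive = TRUE)

  node  <- osfr::osf_retrieve_node(node_id)  # look up the OSF project
  files <- osfr::osf_ls_files(node)          # list its top-level files
  # Skip files that are already present so reruns do not fail.
  osfr::osf_download(files, path = dest, conflicts = "skip")
}
```

Running `download_osf_files()` inside a Quarto code chunk would keep the download alongside the analysis, so readers can see where the data came from.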

References

Gilmore, Rick. 2023. “The Reproducibility Crisis in Science.” https://psu-psychology.github.io/psych-490-reproducibility-2023-fall/.
Silberzahn, R., Eric L Uhlmann, Daniel P Martin, Pasquale Anselmi, Frederik Aust, Eli C Awtrey, Štěpán Bahník, et al. 2023. “Many Analysts, One Dataset: Making Transparent How Variations in Analytical Choices Affect Results.” OSF. https://osf.io/gvm2z.
Silberzahn, R, E L Uhlmann, D P Martin, P Anselmi, F Aust, E Awtrey, Š Bahník, et al. 2018. “Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results.” Advances in Methods and Practices in Psychological Science 1 (3): 337–56. https://doi.org/10.1177/2515245917747646.