Guidelines and Best Practices to Share Deidentified Data and Code


Monday, August 26th, 2024, 4:00 pm – 4:45 pm ET

Presented by: Nicholas Horton (Amherst College) and Sara Stoudt (Bucknell University)


Abstract

In 2022, the Journal of Statistics and Data Science Education (JSDSE) instituted augmented requirements for authors to post deidentified data and code underlying their papers. These changes were prompted by an increased focus on reproducibility and open science. A recent review of data availability practices noted that "such policies help increase the reproducibility of the published literature, as well as make a larger body of data available for reuse and re-analysis" (PLOS ONE, 2024). In this talk, Nicholas Horton and Sara Stoudt present their recent editorial for JSDSE, which discusses the motivation and process for sharing deidentified data and code. Because institutions, environments, and students differ across readers of the journal, it is especially important to facilitate the transfer of a journal article's findings to new contexts. This transfer may require digging into more of the details, including the deidentified data and code. The presenters will review why the requirements for code and data sharing were instituted, summarize ongoing trends and developments in open science, discuss options for data and code sharing, and share advice for authors.