Many people use the term “data” to mean anything from statistics to general digital information. In this guide, however, we focus on what we call “research data”—the information you collect, generate, or analyze specifically in the course of academic research.
In short, research data typically consists of raw or minimally processed material that requires further analysis. In the United States, the Office of Management and Budget’s Circular A-110 gives one official definition:
“Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues.”
Research data can be either an input (e.g., datasets from agencies, companies, or other researchers) or an output you create via experiments, observations, or analyses. It can take many forms, including:
Journal articles, summaries, and other polished outputs are usually not classified as “research data,” since they’re intended for direct comprehension rather than further analysis. However, even these can become “data” when used in a meta-analysis or systematic review: an output in one context can serve as an input in another, which is part of the research data lifecycle.
While many academic outputs (e.g., articles, conference papers, dissertations) derive from research data, they are not our focus here. We have other guides covering those. Instead, this resource aims to help you:
By following these practices, you’ll help ensure your data remain valuable to both you and the wider research community long after your projects end.