Many people use the term “data” to mean anything from statistics to general digital information. In this guide, however, we focus on what we call “research data”—the information you collect, generate, or analyze specifically in the course of academic research.
What Is Research Data?
In short, research data typically consists of raw or minimally processed material that requires further analysis. In the United States, one official definition, from the Office of Management and Budget’s Circular A-110, states:
“Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues.”
Research data can be either an input (e.g., datasets from agencies, companies, or other researchers) or an output you create via experiments, observations, or analyses. It can take many forms, including:
- Sound recordings of interviews (e.g., with a sample of a population)
- Text documents (e.g., poems for a word-pattern analysis)
- Images of celestial bodies (e.g., for astronomy research)
- Spreadsheets or databases containing survey or experimental data
What Isn’t Research Data?
Journal articles, summaries, and other polished outputs are usually not classified as “research data,” because they are intended to be read and understood directly rather than analyzed further. However, even these can become data when they are used in a meta-analysis or systematic review. Something that counts as an output in one context can be an input in another, which is part of the research data lifecycle.
Scope & Purpose of This Guide
While many academic outputs (e.g., articles, conference papers, dissertations) derive from research data, they are not our focus here. We have other guides covering those. Instead, this resource aims to help you:
- Make your research data discoverable and reusable
- Publish or share it effectively
- Protect it when needed (e.g., for confidentiality or privacy)
- Preserve it for future use and validation
By following these practices, you’ll help ensure your data remains valuable to both you and the wider research community long after your projects end.
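As one small illustration of the preservation and validation goals listed above, the following Python sketch records a checksum for every file in a dataset folder and later checks whether any file has changed or gone missing. It is a minimal sketch, not a recommended standard: the function names (`sha256_of`, `build_manifest`, `write_manifest`, `verify_manifest`) and the manifest fields (`title`, `creator`, `files`) are hypothetical choices made for this example, and a real repository or funder may require a specific metadata schema instead.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path, title: str, creator: str) -> dict:
    """Describe every file under data_dir with its size and checksum.

    The field names used here are illustrative assumptions, not a standard.
    """
    files = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        files.append({
            "path": path.relative_to(data_dir).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return {"title": title, "creator": creator, "files": files}


def write_manifest(manifest: dict, out_path: Path) -> None:
    """Save the manifest as human-readable JSON."""
    out_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")


def verify_manifest(data_dir: Path, manifest: dict) -> list:
    """Return the relative paths that are missing or whose checksum changed."""
    changed = []
    for entry in manifest["files"]:
        path = data_dir / entry["path"]
        if not path.is_file() or sha256_of(path) != entry["sha256"]:
            changed.append(entry["path"])
    return changed
```

In this sketch you would call `build_manifest` once when the dataset is finalized, save the result with `write_manifest` to a location outside the data folder (so the manifest does not list itself), and run `verify_manifest` whenever you or a later reader needs to confirm that the files are still the ones originally deposited. An empty list means every recorded file is present and unchanged.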