Digital Research

This guide provides an overview of tips, support, and resources available to complete digital research projects at American University.

Text Mining and Social Media Research Basics

Social media research treats user-generated social media content and interactions as the subject of research. The tutorials below cover the methods and important considerations of social media research, as well as a tool that supports this kind of research.

Text mining uses software to process and analyze large sets of unstructured texts to identify patterns and connections. These resources outline the basics of what text mining is, common approaches, and resources that you can use to conduct this type of analysis.

Text Mining and Social Media Research Tools

Google Ngram Viewer is a free, beginner-friendly text searching tool that lets you visualize and graph occurrences of words in texts located in Google Books. It's easy to use and can be a great place to start for refining your research questions and getting a brief preview of the possibilities for large-scale text analysis.

Voyant Tools is a web-based reading and analysis tool for digital texts. Voyant takes your texts, creates a corpus, and can calculate word frequencies, correlations between sets of words, commonly repeated phrases, topic clusters, and other analyses of interest to researchers. You can type in multiple URLs, paste in full text, or upload your own files for analysis.
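To make two of these analyses concrete, here is a minimal Python sketch of counting word frequencies and commonly repeated phrases across a small corpus. It is an illustration only and does not reflect how Voyant itself is implemented; the function names (tokenize, word_frequencies, repeated_phrases) and the two sample sentences are hypothetical, and the tokenizer is deliberately simple.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(texts):
    """Count how often each word appears across all texts in the corpus."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def repeated_phrases(texts, n=2):
    """Count n-word phrases and keep only those that occur more than once."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return Counter({phrase: c for phrase, c in counts.items() if c > 1})


# Hypothetical two-document corpus, used only for illustration.
corpus = [
    "Text mining finds patterns in text.",
    "Text mining needs a corpus of text.",
]

freqs = word_frequencies(corpus)
phrases = repeated_phrases(corpus)

print(freqs.most_common(3))
print(phrases.most_common(3))
```

A real project would usually also remove common stop words and handle punctuation and non-English characters more carefully, which tools like Voyant do for you.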

HathiTrust Research Center Analytics is a tool that can perform large-scale text analysis on materials in the HathiTrust Digital Library, which is home to millions of digitized books and publications that you can use to gather a set of texts for your text mining research. The resources below provide information about how to use the HathiTrust Research Center, which you have access to through AU.

NVivo is a research tool for coding data from a variety of sources, including text-based documents, interviews, surveys, maps, audio/video files, and social media data, and then automating a variety of qualitative and quantitative research analyses for those sources. NVivo can be accessed in the Anderson Computing Complex on campus.

R is an open-source software program for statistical analysis that can also perform some text mining analysis. The resource below showcases how to use R for this purpose.