In Fall 2015, DH Fellows Rochelle Terman, PhD Candidate in Political Science, and Laura Stoker, Associate Professor in Political Science, introduced a new interdisciplinary course, “Political Science 239T: Introduction to Computational Tools and Techniques for Social Research,” with support from a Digital Humanities at Berkeley new course component grant. In this graduate seminar, open to all departments, students were introduced to a few basic skills, literacies, and best practices for gathering and analyzing data. With this foundation, students can go on to learn independently, participate in workshops and working groups at the Berkeley Institute for Data Science or the D-Lab, and take intermediate computational methods courses in other departments.

The syllabus drew upon Terman’s experience as an instructor in computational methods. At the D-Lab, Terman taught workshops on Drupal, computational text analysis, programming fundamentals, and web development. She also co-taught the intensive “Computational Text Analysis” class at the inaugural Digital Humanities at Berkeley Summer Institute. Stoker, as instructor of a traditional graduate methods course in the Political Science department, provided a valuable perspective on research design.

The PS239T syllabus is open-source, with lecture materials and exercises available on GitHub. Though many of the examples are drawn from Terman’s own research, which involved scraping and analyzing data from Amnesty International's Urgent Actions, the syllabus is designed to be discipline-neutral and useful for a variety of researchers.

Course materials were divided into three modules: skills, applications, and community engagement. In the skills portion of the course, students learned the basics of using the command line, received an overview of key programming concepts in Python and R, and learned how to use Git and GitHub. In the applications portion of the course, students used their new skills in Python and R for tasks like web scraping, accessing data via APIs, text analysis, and geospatial analysis. In the community engagement section of the course, students looked beyond their own projects to the wider programming community and examined best practices for reproducible research.

This discipline-neutral approach attracted students from disciplines such as political science, jurisprudence, sociology, geography, and comparative literature. For the last three weeks of the course, students worked on final projects such as extracting similes from modernist works of literature and scraping and transforming data from WikiLeaks. Students created interactive web apps, mapped data using R’s Leaflet package, and explored natural language processing in Python. During the final week of the course, students presented their work in brief lightning talks, sharing their goals, the challenges they encountered, their results, and their next steps. With new data sets and tools for exploratory analysis in hand, students expressed excitement about pursuing more ambitious projects in their future coursework and research. See the full list of student projects.

(1) Cory Merrill, PhD student in Comparative Literature, discusses her R script for extracting similes from texts. She demonstrates the script on T.S. Eliot’s The Love Song of J. Alfred Prufrock.
(2) Liz McKenna, PhD student in Sociology, displays a map of protest events in Brazil, generated with the R package ggmap.