Students looking at laptops intently

Digital Humanities at Berkeley hosted its inaugural Summer Institute (DHBSI) from August 17th to 21st. Instructors came from a variety of campus partner organizations, such as the D-Lab, the Library, and Research IT. The Summer Institute’s sixty students comprised undergraduates, graduate students, faculty, and staff from departments and units such as Linguistics, Education, Art History, Music, Geography, History, the Library, and the Center for New Music & Audio Technologies. During project development time, students worked on endeavors such as parsing a 19th-century US agricultural census, metrical analysis of medieval German, planning data structures for a network of Vietnamese intellectuals, and reconceptualizing a large database of linguistic annotations.

In “Computational Text Analysis”, students worked with R and learned about some of the statistical foundations of text analysis. In “Data Workflows and Network Analysis”, students used OpenRefine to clean a variety of complex and inconsistent data sets, then visualized the data in Gephi and received a rudimentary introduction to network theory. In the “Geospatial Analysis” class, students learned about working with and creating spatial data in ArcGIS, as well as considerations for undertaking web mapping projects. In “Database Development Using Drupal”, participants learned the basics of an open source content management system and began developing data models for their projects.

Data, Corpora, and Stewardship

Alan Liu at presenter's podium with topic model text project behind him

In an opening keynote panel, DH at Berkeley was joined by Alan Liu, Professor of English at the University of California, Santa Barbara, and Clifford Lynch, Director of the Coalition for Networked Information and Adjunct Professor at the School of Information. In “N + 1: A Plea for Cross-Domain Data in the Digital Humanities”, Liu encouraged members of the field to proactively work with messy, cross-domain, difficult data. Approaching DH from the information studies perspective, Lynch discussed how recent developments in technology have shifted the scale, significance, and interconnectedness of DH research.

Critical Approaches in the Digital Humanities

Critical Approaches panel: Amy Earhart, Michael Dumas, Francesco Spagnolo, Abigail de Kosnik

Facilitated by Francesco Spagnolo, Curator of the Magnes Collection for Jewish Art & Life, the Critical Approaches panel shifted away from tool-oriented approaches to DH and focused on intersectional discussions of digital work and digital representation.

Amy Earhart, Associate Professor of English at Texas A&M University, presented a history of digital humanities’ precursors, shedding new light on its roots in social justice and activist work in the early days of the internet. Michael Dumas, Assistant Professor at the University of California, Berkeley in the Graduate School of Education and the African American Studies Department, discussed current and past representations of black suffering and how the capture, archiving, and narrative of these events contrast with black experience. Abigail de Kosnik, Assistant Professor at the University of California, Berkeley with a joint appointment in the Berkeley Center for New Media and in the Department of Theater, Dance & Performance Studies, presented recent work from the Fan Data & Net Differences project, where students scraped and analyzed several sites powered by fan-provided content. Identifying these sites as a key safe space for queer communities, De Kosnik pointed to fan data sites as a place to study queer communities and creators en masse.

DH Pedagogy

Throughout the week, presenters joined the Institute to discuss pedagogy. Greg Niemeyer and DH Intern MacKenzie Alessi presented ongoing work with visualizing and analyzing class engagement data. Richard Freishtat, Director for the Center for Teaching and Learning (CTL), led a workshop on course design. Noah Wittman of Educational Technology and Services presented on instructional technology services available on campus. Rita-Marie Conrad, Senior Consultant at CTL, discussed adapting teaching to digital classrooms.

Natural Language Processing for the Long Tail

David Bamman presenting graph showing newswire, twitter, and commercial text dominates NLP research

In the closing event for the Summer Institute, David Bamman, incoming Assistant Professor at the School of Information, discussed the limitations of current natural language processing research and tools. Though many NLP tools in the field perform well with modern English-language newspapers, these tools underperform or are underdeveloped in areas such as literary English, non-English languages, and historical languages. Noting that many humanist scholars already do the painstaking work of annotating corpora by hand, Bamman proposed the creation of a repository of annotated texts that might form the basis of future NLP research.

Join us next year

DH at Berkeley looks forward to planning the 2016 Summer Institute. We would love to hear your thoughts on instructors, course offerings, and speakers throughout the year. Contact us at


Explore archived information from the Summer Institute. Photos from the events are available on DH at Berkeley’s Flickr page. BCNM has also curated tweets and photos of several DHBSI events on Storify.

Images (CC:BY DH at Berkeley, Quinn Dombrowski): (1) Kevin Block, Janet Torres, and DH Fellows Laurie Pearce and Eduardo Escobar at work in "Data Workflows and Network Analysis"; (2) Alan Liu presents the WhatEvery1Says project at a keynote presentation; (3) Amy Earhart, Michael Dumas, Francesco Spagnolo, and Abigail de Kosnik in discussion at the "Critical Approaches in the Digital Humanities" panel; (4) David Bamman discusses the inadequacies of current NLP research at "NLP for the Long Tail".