Event date: Tuesday, September 12, 2017
Event time: 1:00pm to 2:00pm
Location: Barrows 356: Convening Room

Getting research materials into a digital form that you can search and analyze computationally can be a time-consuming first step in the research process. While Adobe Acrobat can do basic optical character recognition (OCR, the conversion of an image of text into editable text), it performs poorly on documents with complex layouts or non-English text.
This workshop will cover how to use ABBYY FineReader, professional-level OCR software, via the OCR virtual research desktop provided by Research IT or in the D-Lab. It will also briefly cover the pros and cons of FineReader compared to the open-source OCR package Tesseract, and how you can use Tesseract on the Savio high-performance compute cluster for large-scale OCR jobs.
Prior knowledge: No prior knowledge is required for this workshop. Register if you have any interest in learning more about OCR tools and resources.
Technology requirement: None. This workshop will demonstrate realistic applications of OCR software.