Event date
Tuesday, September 19, 2017
Event time
1:00pm to 4:00pm
Barrows 356: Convening Room

Students will learn the basics of cleaning, transforming, and formatting text data. They will extract specific elements from text strings and compute simple metrics from text data, such as word counts, syntax quantification via part-of-speech (POS) tagging, and sentiment polarity. Students will also be introduced to topic modeling and word2vec methods. The libraries used are NLTK, TextBlob, and gensim. This is an interactive, hands-on workshop in which students will complete challenges related to each text analysis task.
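As a rough illustration of the simplest metric mentioned above, word counts, here is a minimal sketch that uses only the Python standard library rather than the workshop's libraries. The function name word_counts, the sample sentence, and the regular expression used for tokenizing are assumptions made for this example; the workshop itself may clean and tokenize text differently (for instance with NLTK).

```python
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    # Assumption: a "word" is a run of letters or apostrophes.
    # A real tokenizer (e.g. NLTK's) handles many more edge cases.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


sample = "The cat sat on the mat. The mat was flat."
counts = word_counts(sample)

# Show the two most frequent tokens.
print(counts.most_common(2))  # [('the', 3), ('mat', 2)]
```

Lowercasing before counting is a simple form of the cleaning step described above: without it, "The" and "the" would be counted as different words.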
Prior knowledge: Completion of D-Lab's Python for Everything Series.
Technology Requirements: Laptop required; please install the Anaconda distribution of Python 3 or its equivalent. The workshop will use Jupyter Notebook, but an IDE is also acceptable.
Please install the Python packages gensim, nltk, and textblob:
pip install gensim
pip install nltk
pip install textblob
Or, if you have Anaconda:
conda install gensim
conda install nltk
conda install textblob
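To confirm before the workshop that the installation worked, a short check along the following lines may help. This is only a sketch: the helper name missing_packages is made up for this example, and the check only tests whether each package can be found by Python, not whether any extra data (such as NLTK corpora) has been downloaded.

```python
import importlib.util

# The three packages named in the installation instructions above.
REQUIRED = ["gensim", "nltk", "textblob"]


def missing_packages(names):
    """Return the subset of package names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Run it in the same environment (for example, the same Jupyter kernel) that you plan to use during the workshop, since packages installed in one environment are not visible from another.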
Links:
Install the Anaconda distribution of Python 3
Jupyter Notebook