Twitter Analysis and Text Research (TATR)

Kynan Ly

There are many ways to study Twitter data. Unlike more traditional text (e.g. books), discourse on Twitter can encompass a wide range of perspectives, ideas, thoughts, and contexts. Analyzing this discourse requires its own cleaning, refinement, and categorization methods.

This tutorial covers how to manipulate Twitter data that has already been processed into a CSV file. The included links and recipes cover only a subset of the methods available for manipulating Twitter data. These methods were chosen because they are easy to extend, change, or modify to suit one's needs, and because they offer the user the most flexibility.
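To make "manipulating Twitter data in a CSV" a little more concrete, here is a minimal sketch that reads tweets from a CSV and counts their tokens using only the Python standard library. It is not taken from the accompanying recipes: the column names (`id`, `text`), the sample tweets, and the helper names `tokenize` and `count_tokens` are all assumptions made for illustration.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical sample standing in for a real CSV file of tweets.
# The column names "id" and "text" are assumptions; adjust to your data.
SAMPLE_CSV = """id,text
1,"Loving the new #Python release! https://example.com"
2,"@friend have you tried #python for text analysis?"
3,"Text analysis is fun"
"""

# Match whole URLs first so they can be discarded as a unit,
# then words that may start with "@" (mention) or "#" (hashtag).
TOKEN_PATTERN = re.compile(r"https?://\S+|[@#]?\w+")


def tokenize(text):
    """Lowercase a tweet and split it into words, hashtags, and mentions."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    # Drop URLs, which rarely carry meaning for word-level analysis.
    return [t for t in tokens if not t.startswith(("http://", "https://"))]


def count_tokens(csv_file, text_column="text"):
    """Count token frequencies across every row of an open CSV file."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts.update(tokenize(row[text_column]))
    return counts


counts = count_tokens(io.StringIO(SAMPLE_CSV))
print(counts.most_common(3))
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by something like `open("tweets.csv", newline="", encoding="utf-8")`, where the file name is again a placeholder.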

Some of the included topics are tokenization and graphing. The aim of showcasing different topics is to demonstrate that many different analytical methods can be applied to Twitter data. The included Jupyter (IPython) notebooks contain a dummy CSV in case you do not have any data, but feel free to use your own. If you do not have your own Twitter data, or have already collected it as JSON objects, some additional steps need to be taken. See the following links as they apply:

For those who do not have any Twitter data and are just starting out with Twitter scraping or experimenting with Twitter analysis, see the following useful link:

For those who already have their Twitter data in JSON format (the default format in which Twitter returns tweets), this link demonstrates a method that converts the JSON into a CSV:
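Independently of the linked recipe, a JSON-to-CSV conversion of this kind can be sketched with the standard library alone. The sketch below assumes one tweet object per line and a handful of field names (`id_str`, `created_at`, `text`, `user.screen_name`) as found in classic Twitter API payloads; your own export may be shaped differently, and the function name `tweets_to_csv` is hypothetical.

```python
import csv
import io
import json

# Hypothetical sample: one tweet object per line. The field names are
# assumptions based on classic Twitter API payloads; real data may differ.
SAMPLE_JSONL = "\n".join([
    json.dumps({"id_str": "1", "created_at": "Mon Jan 01 00:00:00 +0000 2018",
                "text": "Hello, world", "user": {"screen_name": "alice"}}),
    json.dumps({"id_str": "2", "created_at": "Mon Jan 01 00:05:00 +0000 2018",
                "text": "Second tweet\nwith a line break",
                "user": {"screen_name": "bob"}}),
])

FIELDNAMES = ["id", "created_at", "user", "text"]


def tweets_to_csv(json_lines, out_file):
    """Write one CSV row per tweet, keeping only a few flat columns."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES)
    writer.writeheader()
    rows = 0
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        writer.writerow({
            "id": tweet.get("id_str", ""),
            "created_at": tweet.get("created_at", ""),
            # The author is nested one level down in the tweet object.
            "user": tweet.get("user", {}).get("screen_name", ""),
            "text": tweet.get("text", ""),
        })
        rows += 1
    return rows


buffer = io.StringIO()
n_rows = tweets_to_csv(SAMPLE_JSONL.splitlines(), buffer)
print(buffer.getvalue())
```

When writing to a real file instead of an in-memory buffer, open it with `newline=""`, as the `csv` module documentation recommends, so that line breaks inside tweet text are quoted and preserved correctly.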


Follow the recipes in the order they are listed under "Recipes Used". Treat the two recipes about Pandas as a single two-part recipe.


This tutorial and its accompanying recipes give an overview of what text analysis on Twitter data looks like. The tutorial does not go into depth on each method, but it serves as a good introduction to Twitter text analysis.