This tutorial covers how to manipulate Twitter data that has already been processed into a CSV file. It touches on topics such as tokenization and graphing, with the aim of showcasing a wide variety of analytical methods. Although the included Jupyter (IPython) notebooks contain some dummy CSV data (in case you do not have any), feel free to use your own. If you do not have your own Twitter data, or have it as a JSON object, see the following:
This lesson is a brief introduction to string manipulation techniques in Python.
In this lesson, we will make the list we created in the ‘From HTML to a List of Words’ lesson easier to analyze by normalizing this data.
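As a rough illustration of what normalizing a word list can involve, the sketch below lowercases each word and drops non-alphanumeric characters. The function name `normalize` and the sample list are hypothetical and are not taken from the lesson itself.

```python
def normalize(words):
    """Lowercase each word and drop non-alphanumeric characters.

    Words that end up empty (for example, pure punctuation) are skipped.
    """
    cleaned = []
    for word in words:
        word = "".join(ch for ch in word.lower() if ch.isalnum())
        if word:
            cleaned.append(word)
    return cleaned


# Hypothetical sample list, standing in for the lesson's list of words.
sample = ["Hello", "world,", "AGAIN.", "--"]
print(normalize(sample))  # ['hello', 'world', 'again']
```

Normalizing first means that "Hello" and "hello" are counted as the same word in any later frequency analysis.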
In this lesson, you will learn the Python commands needed to implement the second part of the algorithm begun in the lesson ‘From HTML to a List of Words (part 1)’.
In this two-part lesson, we will build on what you’ve learned about Downloading Web Pages with Python, learning how to remove the HTML markup from the webpage of Benjamin Bowsey’s 1780 criminal trial transcript. We will achieve this by using a variety of string operators, string methods, and close reading skills. We introduce looping and branching so that programs can repeat tasks and test for certain conditions, making it possible to separate the content from the HTML tags. Finally, we convert content from a long string to a list of words that can later be sorted, indexed, and counted.
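The looping-and-branching idea described above can be sketched roughly as follows. This is a minimal illustration, not the lesson's own code: the function name `strip_tags` and the sample HTML string are assumptions, and the approach presumes that `<` and `>` appear only as tag delimiters.

```python
def strip_tags(page_contents):
    """Return the input with everything between '<' and '>' removed."""
    inside_tag = False
    text = ""
    for char in page_contents:      # loop over every character
        if char == "<":             # branch: a tag is opening
            inside_tag = True
        elif char == ">":           # branch: a tag is closing
            inside_tag = False
        elif not inside_tag:        # keep only characters outside tags
            text += char
    return text


# Hypothetical sample markup, not the actual trial transcript.
html = "<p>Hello <b>world</b>, again.</p>"
words = strip_tags(html).split()    # long string -> list of words
print(words)  # ['Hello', 'world,', 'again.']
```

For real-world pages, a dedicated HTML parser is usually more robust than character-by-character scanning, but the loop makes the underlying logic visible.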
This is a tutorial about how to use Voyant Tools for text analysis. It starts with the basic functions of Voyant Tools and then moves on to functions useful for deeper text analysis, such as using Knots to visualize the text.
This tutorial teaches how to use string methods and regular expressions for pre-processing, including cleaning up the list, converting to lowercase, and finding specific information.
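A small sketch of how these steps might look with Python's standard `re` module is given below. The helper names `clean_tokens` and `find_years`, the year pattern, and the sample text are all assumptions made for illustration; they are not drawn from the tutorial.

```python
import re


def clean_tokens(tokens):
    """Lowercase each token and remove anything that is not a-z or 0-9."""
    cleaned = [re.sub(r"[^a-z0-9]", "", token.lower()) for token in tokens]
    return [token for token in cleaned if token]  # drop empty results


def find_years(text):
    """Return four-digit numbers starting with 1 or 2 that stand alone."""
    return re.findall(r"\b[12]\d{3}\b", text)


# Made-up sample text, used only to exercise the two helpers.
text = "Built in 1780, rebuilt in 1999, room 42."
print(clean_tokens(text.split()))
# ['built', 'in', '1780', 'rebuilt', 'in', '1999', 'room', '42']
print(find_years(text))  # ['1780', '1999']
```

The year pattern is deliberately simple; a different kind of "specific information" would need its own pattern.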
This tutorial is divided into 6 parts; they are:
- Metamorphosis by Franz Kafka
- Text Cleaning is Task Specific
- Manual Tokenization
- Tokenization and Cleaning with NLTK
- Additional Text Cleaning Considerations
- Tips for Cleaning Text for Word Embedding
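To give a sense of what the manual tokenization part might involve, here is a minimal sketch using only the standard library: split on whitespace, strip ASCII punctuation, keep alphabetic tokens, and lowercase them. The function name `tokenize` and the sample sentence are hypothetical, and the choice to discard numeric tokens is an assumption that depends on the task.

```python
import string


def tokenize(text):
    """Split on whitespace, strip ASCII punctuation, keep alphabetic tokens."""
    tokens = text.split()
    # Translation table that deletes every character in string.punctuation.
    table = str.maketrans("", "", string.punctuation)
    stripped = [token.translate(table) for token in tokens]
    # Drop tokens that are empty or contain digits, then lowercase the rest.
    return [token.lower() for token in stripped if token.isalpha()]


# Made-up sample sentence, not a quotation from the novel.
sample = "The cat's toy rolled away; 2 mice watched, unmoved."
print(tokenize(sample))
# ['the', 'cats', 'toy', 'rolled', 'away', 'mice', 'watched', 'unmoved']
```

Note that stripping punctuation turns "cat's" into "cats", which illustrates why text cleaning is task specific: some applications would want contractions and possessives handled differently.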
This tutorial covers how to chart time series data with line plots and categorical quantities with bar charts, how to summarize data distributions with histograms and box plots, and how to summarize the relationship between variables with scatter plots.
The goal of text classification is to automatically classify text documents into one or more defined categories. In this tutorial, the author explains text classification and the step-by-step process for implementing it in Python.