There are many ways to study Twitter data. Unlike more traditional text (e.g. books), discourse on Twitter can encompass a wide range of perspectives, ideas, thoughts, and contexts. Analyzing this discourse requires different methods of cleaning, refinement, and categorization.
This lesson is a brief introduction to string manipulation techniques in Python.
In this lesson, we will make the list we created in the ‘From HTML to a List of Words’ lesson easier to analyze by normalizing this data.
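As a rough illustration of what normalizing a word list can involve, the sketch below lowercases each word and strips punctuation. The function name `normalize` and the sample list are assumptions made for this example and are not taken from the lesson itself.

```python
import re

def normalize(words):
    """Lowercase each word and drop characters that are not letters, digits, or underscores."""
    cleaned = []
    for word in words:
        word = re.sub(r"[^\w]", "", word.lower())
        if word:  # skip entries that were punctuation only
            cleaned.append(word)
    return cleaned

# Hypothetical sample list, standing in for the list built in the earlier lesson.
sample = ["The", "Trial", "of", "BENJAMIN", "Bowsey,", "1780."]
normalized = normalize(sample)
print(normalized)
```

Lowercasing first means that "The" and "the" are later counted as the same word, which is usually what a frequency analysis wants.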
In this lesson, you will learn the Python commands needed to implement the second part of the algorithm begun in the lesson ‘From HTML to a List of Words (part 1)’.
In this two-part lesson, we will build on what you’ve learned about Downloading Web Pages with Python, learning how to remove the HTML markup from the webpage of Benjamin Bowsey’s 1780 criminal trial transcript. We will achieve this by using a variety of string operators, string methods, and close reading skills. We introduce looping and branching so that programs can repeat tasks and test for certain conditions, making it possible to separate the content from the HTML tags. Finally, we convert content from a long string to a list of words that can later be sorted, indexed, and counted.
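The idea of looping over a page and branching on tag delimiters might be sketched as follows. This is a minimal version under the assumption that every `<` opens a tag and every `>` closes one; the function name `strip_tags` and the sample HTML string are illustrative and not the lesson's own code or the actual transcript page.

```python
def strip_tags(page_contents):
    """Return the text of page_contents with anything between '<' and '>' removed."""
    inside_tag = False
    text = ""
    for char in page_contents:      # loop: visit every character in turn
        if char == "<":             # branch: a tag is opening
            inside_tag = True
        elif char == ">":           # branch: a tag is closing
            inside_tag = False
        elif not inside_tag:        # branch: ordinary content, keep it
            text += char
    return text

# Hypothetical markup used only to exercise the function.
html = "<html><body><p>A <b>short</b> sample page.</p></body></html>"
words = strip_tags(html).split()    # convert the long string to a list of words
print(words)
```

A character-by-character scan like this is easy to follow but does not handle comments, scripts, or `<` characters inside attribute values; a real HTML parser is the safer choice for messy pages.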
This is a tutorial about how to use Voyant Tools for text analysis. It starts with the most basic functions of Voyant Tools and then goes deeper into functions useful for text analysis, such as using Knots to visualize the text.
This tutorial is divided into 6 parts; they are:
- Metamorphosis by Franz Kafka
- Text Cleaning is Task Specific
- Manual Tokenization
- Tokenization and Cleaning with NLTK
- Additional Text Cleaning Considerations
- Tips for Cleaning Text for Word Embedding
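The manual tokenization step listed above can be sketched with the standard library alone: split on whitespace, remove punctuation, and lowercase. The sample sentence is invented for this example rather than quoted from the novel, and the variable names are assumptions.

```python
import string

# Hypothetical sample text, not a quotation from Metamorphosis.
text = "The door opened slowly; nobody spoke, and the room stayed quiet."

# Split on whitespace to get raw tokens.
tokens = text.split()

# Build a translation table that deletes every ASCII punctuation character.
table = str.maketrans("", "", string.punctuation)

# Strip punctuation, lowercase, and discard tokens that end up empty.
clean_tokens = [t.translate(table).lower() for t in tokens]
clean_tokens = [t for t in clean_tokens if t]
print(clean_tokens)
```

Note that `string.punctuation` covers ASCII punctuation only, so typographic quotes and dashes would need to be handled separately.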
This tutorial covers how to chart time series data with line plots and categorical quantities with bar charts, how to summarize data distributions with histograms and box plots, and how to summarize the relationship between variables with scatter plots.
The goal of text classification is to automatically classify text documents into one or more defined categories. This tutorial explains text classification and the step-by-step process of implementing it in Python.
This tutorial discusses different feature extraction methods, starting with basic techniques and leading into advanced Natural Language Processing techniques. It also covers pre-processing text data so that better features can be extracted from clean data.
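One of the most basic feature extraction techniques is a bag-of-words count, sketched below in plain Python. This is an illustration of the general idea, not necessarily the method the tutorial uses; the function name `bag_of_words` and the two sample documents are assumptions.

```python
from collections import Counter

def bag_of_words(documents):
    """Return a sorted vocabulary and one term-count vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)  # missing terms count as 0
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors

# Hypothetical documents used only for illustration.
docs = ["the cat sat", "the dog sat on the mat"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
print(vectors)
```

Each document becomes a fixed-length numeric vector, which is the form most classifiers expect; cleaner input text yields a smaller and more meaningful vocabulary.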