This lesson is a brief introduction to string manipulation techniques in Python.
In this lesson, you will learn the Python commands needed to implement the second part of the algorithm begun in the lesson ‘From HTML to a List of Words (part 1)’.
In this two-part lesson, we will build on what you’ve learned about Downloading Web Pages with Python, learning how to remove the HTML markup from the webpage of Benjamin Bowsey’s 1780 criminal trial transcript. We will achieve this by using a variety of string operators, string methods, and close reading skills. We introduce looping and branching so that programs can repeat tasks and test for certain conditions, making it possible to separate the content from the HTML tags. Finally, we convert content from a long string to a list of words that can later be sorted, indexed, and counted.
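The loop-and-branch approach described above can be sketched roughly as follows. This is a minimal illustration, not the lesson's own code: the function name `strip_tags` and the sample string are hypothetical, and the sketch assumes that every `<` opens a tag and every `>` closes one.

```python
def strip_tags(page_contents):
    """Return page_contents with everything between '<' and '>' removed."""
    inside_tag = False
    kept = []
    for char in page_contents:      # loop: visit every character in turn
        if char == "<":             # branch: a tag starts here
            inside_tag = True
        elif char == ">":           # branch: the tag ends here
            inside_tag = False
        elif not inside_tag:        # branch: ordinary content, keep it
            kept.append(char)
    return "".join(kept)


# Hypothetical sample markup, not the actual trial transcript.
sample = "<html><body><p>A <b>short</b> sample page</p></body></html>"
text = strip_tags(sample)

# Convert the long string into a list of words.
words = text.split()
```

Calling `split()` with no arguments breaks the string on any run of whitespace, which is what turns the cleaned text into a list that can later be sorted, indexed, and counted.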
This tutorial teaches you how to count the frequency of specific words in a list, a count that can provide illustrative data about a text.
This is a tutorial about how to use Voyant Tools for text analysis. It starts with the most basic functions of Voyant Tools and then goes deeper into functions that are useful for text analysis, such as using the Knots tool to visualize the text.
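As a rough illustration of counting word frequencies, the sketch below uses `collections.Counter` from the standard library. The tutorial itself may build the counts differently; the function name `word_frequencies` and the sample word list are assumptions made for this example.

```python
from collections import Counter


def word_frequencies(words):
    """Map each lowercased word to the number of times it occurs."""
    return Counter(word.lower() for word in words)


# Hypothetical word list for illustration only.
words = ["the", "trial", "of", "the", "prisoner", "The"]
freqs = word_frequencies(words)

# The single most frequent word, as a list of (word, count) pairs.
top = freqs.most_common(1)
```

Lowercasing before counting treats "The" and "the" as the same word, which is usually what a frequency table of a text should do.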
This tutorial is divided into 6 parts; they are:
- Metamorphosis by Franz Kafka
- Text Cleaning is Task Specific
- Manual Tokenization
- Tokenization and Cleaning with NLTK
- Additional Text Cleaning Considerations
- Tips for Cleaning Text for Word Embedding
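To give a sense of what the manual tokenization step in the outline above might involve, here is a minimal sketch using only the standard library. It assumes a simple pipeline (split on whitespace, strip punctuation, drop non-alphabetic tokens, lowercase); the function name `manual_tokenize` and the sample sentence are hypothetical and not taken from the tutorial.

```python
import string


def manual_tokenize(text):
    """Split on whitespace, strip punctuation, keep alphabetic tokens, lowercase."""
    # Translation table that deletes every ASCII punctuation character.
    table = str.maketrans("", "", string.punctuation)
    tokens = text.split()
    stripped = [token.translate(table) for token in tokens]
    # isalpha() is False for empty strings and for tokens containing digits.
    return [token.lower() for token in stripped if token.isalpha()]


# Hypothetical sample sentence.
tokens = manual_tokenize("Hello, World! It's 2 o'clock.")
```

Note that stripping punctuation also removes apostrophes, so "It's" becomes "its". Whether that is acceptable depends on the task, which is the point of the "Text Cleaning is Task Specific" item.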
This tutorial covers how to chart time series data with line plots and categorical quantities with bar charts, how to summarize data distributions with histograms and box plots, and how to summarize the relationship between variables with scatter plots.
The goal of text classification is to automatically classify text documents into one or more defined categories. In this tutorial, the author explains text classification and the step-by-step process of implementing it in Python.
This tutorial will discuss different feature extraction methods, starting with some basic techniques which will lead into advanced Natural Language Processing techniques. It will also teach pre-processing of the text data in order to extract better features from clean data.
This tutorial covers the basics of Python and working with text files to compute something interesting. The first set of tutorials teaches the basic programming knowledge needed to analyze text data with Python. In the second set, the author computes the proportion of positive words in a tweet after cleaning up the data a bit. The third set expands the code written in the previous two to explore the positive and negative sentiment of any set of texts.
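The proportion-of-positive-words computation mentioned above could look something like the following. This is only a sketch under stated assumptions: the function name `positive_proportion`, the tiny `POSITIVE` word set, and the sample tweet are all hypothetical, and the cleaning step is limited to lowercasing and trimming surrounding punctuation.

```python
def positive_proportion(text, positive_words):
    """Return the share of words in text that appear in positive_words."""
    # Light cleaning: trim surrounding punctuation and lowercase each word.
    words = [word.strip(".,!?;:\"'").lower() for word in text.split()]
    words = [word for word in words if word]  # drop tokens that were only punctuation
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in positive_words)
    return hits / len(words)


# Hypothetical positive-word list and tweet, for illustration only.
POSITIVE = {"good", "great", "happy", "love"}
tweet = "What a great day, I love it!"
share = positive_proportion(tweet, POSITIVE)
```

Here two of the seven words are in the positive set, so `share` is 2/7. The same function could be reused with a negative-word list to compare the two sentiments.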