TATR: Pandas DataFrame Manipulation

Introduction 

This recipe is part of the Text Analysis for Twitter Research (TATR) series. It describes pandas DataFrame manipulation, in particular the techniques used for some of the more advanced Twitter analysis found in the TATR library.

Ingredients 
Steps 
  • Open a new Jupyter Notebook and import the following libraries:
    • pandas
    • NumPy
  • Import Twitter data
  • Create 8 entries and assign the following 3 values to each:
    • Date
    • Hashtag_count
    • Mention_count
  • Replace the index column with the Tweet date
  • Combine all of the values of the same date together
  • Add the counts of each entry to the dataframe
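The steps above can be sketched roughly as follows. This is a minimal illustration, not the TATR notebook's own code: the eight sample rows are made-up stand-ins for imported Twitter data, the column name Tweet_count is hypothetical, and "counts of each entry" is assumed here to mean the number of tweets per date. NumPy is listed in the steps but is not needed for this small sketch, so only pandas is imported.

```python
import pandas as pd

# Hypothetical sample data standing in for imported Twitter data:
# 8 entries, each with a Date, a Hashtag_count, and a Mention_count.
tweets = pd.DataFrame({
    "Date": ["2018-05-26", "2018-05-26", "2018-05-27", "2018-05-27",
             "2018-05-27", "2018-05-28", "2018-05-29", "2018-05-29"],
    "Hashtag_count": [2, 0, 1, 3, 1, 0, 4, 2],
    "Mention_count": [1, 1, 0, 2, 0, 3, 1, 0],
})

# Parse the date strings so they behave as real timestamps.
tweets["Date"] = pd.to_datetime(tweets["Date"])

# Replace the default integer index with the tweet date.
by_date = tweets.set_index("Date")

# Combine all rows that share a date by summing their counts.
daily = by_date.groupby(level="Date").sum()

# Assumption: "counts of each entry" means tweets per date.
daily["Tweet_count"] = by_date.groupby(level="Date").size()

print(daily)
```

The result has one row per date, with the summed hashtag and mention counts alongside the number of tweets that contributed to each row. Grouping on the index level rather than on a column keeps the date as the index throughout, which is convenient for later time-based slicing or resampling.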
Discussion 

The TATR library was presented as an academic poster at Congress 2018, held in Regina, SK. For a PDF version of the full poster, please visit:

Next steps / further information 

Certain aspects of this recipe draw upon code from the companion TATR notebooks and recipes. In particular, please see:

TATR: Tokenization and Extraction

This recipe describes components that are fundamental for some of the more advanced TATR notebooks.

Status