Study Vocabulary Change over Time in an Existing Corpus

Introduction 

This recipe uses Voyant Tools to explore vocabulary change over time in a corpus of seven Canadian Throne Speeches (from 2006 to 2013).

Ingredients 
  • A corpus of seven Canadian Throne Speeches spanning seven years (the speeches were compiled from a Canadian parliamentary site; only the filenames were modified to better indicate the year and session of each speech)
  • A general-purpose text analysis suite like Voyant Tools

Questions

  1. What terms occur consistently across the seven years of the corpus?
  2. Which document is the most unusual?
  3. What terms define the most recent document?

Steps

  1. Access the corpus (with the English stopword list enabled) in the default Voyant Tools skin.
  2. In the lower left-hand corner, open the "Words in the Entire Corpus" panel.
  3. Examine the "Trends" column, which shows sparklines of relative term frequency in each document of the corpus.
  4. Select individual terms in "Words in the Entire Corpus" by clicking on the row, or select multiple terms by clicking on the checkboxes in the leftmost column.
  5. In the "Words in the Entire Corpus" panel, hover over the column label next to "Count" and click on the down arrow that appears in the rightmost part of the column header. This will produce a drop-down list. Hover over "Columns" on the drop-down list and select "Std. Dev." This adds the standard deviation values to the table. (A lower standard deviation indicates less variability in a term's relative frequency across the documents.) Hover over the column header to see the description of "Std. Dev."
  6. In the "Summary" panel (the middle panel on the left), open all the "Distinctive Words." To open all instances, click on "Next 2 of 2 remaining."
  7. Scan the list visually. Do any of the documents stand out as being different?
  8. Click on the "more" link for the last document, entitled "2013.10.1".
  9. In the "Words in Documents" panel (the bottom right-hand corner), use the technique described in step 5 to add the "Relative Difference" column to the table. Hover over the column header to see the description of the values.
  10. Click on the "Relative Difference" header to sort the values in descending order (the first click may sort them in ascending order).
  11. Click on the word "natural" in the "Words in Documents" panel. In the "Keywords in Context" panel (just above), examine each occurrence of the word. How is the word "natural" used?
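
For readers who want to see what lies behind steps 3, 5 and 11, the following Python sketch computes a term's relative frequency in each document, the standard deviation of those values across the corpus, and a simple keywords-in-context listing. It is only an illustration, not Voyant Tools' own implementation: the three-sentence corpus is a made-up stand-in for the Throne Speeches, the function names are hypothetical, the tokenizer is deliberately crude, and the use of the population standard deviation is an assumption (Voyant's exact calculation may differ).

```python
import re
import statistics
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(documents, term):
    """Return the term's share of all tokens in each document."""
    freqs = []
    for text in documents:
        tokens = tokenize(text)
        counts = Counter(tokens)
        freqs.append(counts[term] / len(tokens) if tokens else 0.0)
    return freqs


def keywords_in_context(text, term, width=3):
    """Return (left, term, right) tuples with `width` tokens of context."""
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == term:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


# Made-up toy corpus standing in for the seven speeches (assumption).
documents = [
    "our natural resources will create jobs",
    "jobs and growth remain the priority",
    "natural gas and natural resources support jobs",
]

for term in ("jobs", "natural"):
    freqs = relative_frequencies(documents, term)
    # Population standard deviation; Voyant's choice is not confirmed here.
    spread = statistics.pstdev(freqs)
    print(term, [round(f, 3) for f in freqs], round(spread, 3))

for left, token, right in keywords_in_context(documents[2], "natural"):
    print(f"{left} [{token}] {right}")
```

In this toy corpus, "jobs" appears at a similar rate in every document and so has a lower standard deviation than "natural", which is the pattern to look for when answering the first question above.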