This recipe shows how to examine the changes made to a Wikipedia article over time and how to generate a corpus of its successive edited versions.

  • A Wikipedia article and its URL
  • A program that can step through the article's revision history and aggregate it, such as WIScker
  1. Enter the URL of the Wikipedia article into WIScker. Select the date range and the frequency that you want WIScker to search through.
  2. Choose the output format. "XML" gives you the page with XML tags, while "Human Readable" gives you a more easily readable format. "Plain text" returns the text without any of the markup that Wikipedia uses, while "Wikipedia Formatting" keeps all of that markup. Which option is right depends on the further analysis you want to do.
  3. Choose "Select Text", then copy the text and paste it into the file or document where you want to save it.
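
For readers who would rather script the steps above, here is a minimal Python sketch of the same idea. It does not show how WIScker works internally; it assumes the public MediaWiki API (`action=query` with `prop=revisions`) as a stand-in. The function names, the example article URL, and the User-Agent string are illustrative assumptions, and "frequency" is interpreted here as "keep at most one revision per time window".

```python
import json
from datetime import datetime, timedelta
from urllib.parse import urlencode, urlparse, unquote
from urllib.request import Request, urlopen

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # timestamp format used by the MediaWiki API


def title_from_url(url):
    """Extract the article title from a /wiki/<title> URL (hypothetical helper)."""
    path = urlparse(url).path
    if not path.startswith("/wiki/"):
        raise ValueError("expected a URL of the form https://<host>/wiki/<title>")
    return unquote(path[len("/wiki/"):]).replace("_", " ")


def revision_query(url, start, end, limit=50):
    """Build the API endpoint and parameters for revisions between two dates."""
    parsed = urlparse(url)
    endpoint = f"{parsed.scheme}://{parsed.netloc}/w/api.php"
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": title_from_url(url),
        "rvprop": "ids|timestamp|content",
        "rvslots": "main",
        "rvdir": "newer",  # oldest first, so rvstart is the earlier date
        "rvstart": start.strftime(TS_FORMAT),
        "rvend": end.strftime(TS_FORMAT),
        "rvlimit": limit,
    }
    return endpoint, params


def fetch_revisions(endpoint, params):
    """Fetch one batch of revisions (needs network access; no paging handled)."""
    request = Request(
        endpoint + "?" + urlencode(params),
        headers={"User-Agent": "revision-corpus-sketch/0.1 (placeholder contact)"},
    )
    with urlopen(request) as response:
        data = json.load(response)
    for page in data["query"]["pages"].values():
        return page.get("revisions", [])
    return []


def sample_by_frequency(revisions, every):
    """Keep at most one revision per window of length `every` (oldest first)."""
    kept, next_allowed = [], None
    for rev in revisions:
        ts = datetime.strptime(rev["timestamp"], TS_FORMAT)
        if next_allowed is None or ts >= next_allowed:
            kept.append(rev)
            next_allowed = ts + every
    return kept
```

A typical use would be to call `revision_query` with the article URL and two `datetime` values, pass the result to `fetch_revisions`, thin the list with `sample_by_frequency(revisions, timedelta(days=30))`, and write each kept revision to its own file. The revision text comes back as wiki markup, which corresponds roughly to the "Wikipedia Formatting" option; producing plain text would need a separate markup-stripping step.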