XTRACT was a lexical collocation tool developed by Frank Smadja in the early 1990s that used statistical techniques to retrieve and identify collocations in large textual corpora (Smadja, "XTRACT" 399).
URICA! II was a collation tool for microcomputers and an updated version of URICA! (User Response Interactive Collation Assistant) (Hilton 139). This update was released in 1992 by Michael Hilton and his team (139).
COCOA was a program for creating KWIC (keyword-in-context) concordances, word frequency reports, and word counts for texts (Day and Marriott 56).
COCOA first became available in the early to mid-1960s. D. B. Russell's 1965 review describes the earliest version as "a system which allows users to generate word-counts and concordances from literary (or other) texts. It was written originally for Atlas after consultation with various British Universities, and is currently being implemented for System 4-75 at Edinburgh" (Russell).
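For readers unfamiliar with the kind of output described above, the following is a minimal modern sketch in Python of a word-frequency count and a KWIC concordance. It is an illustration only and is not based on COCOA's actual implementation or input format; the function names (tokenize, word_frequencies, kwic), the tokenization rule, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def kwic(tokens, keyword, width=3):
    """Return one (left context, keyword, right context) tuple per occurrence."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample text, used only to demonstrate the functions.
sample = "The cat sat on the mat. The dog sat on the log."
tokens = tokenize(sample)

if __name__ == "__main__":
    print(len(tokens), "tokens")
    print(word_frequencies(tokens).most_common(3))
    for left, key, right in kwic(tokens, "sat", width=2):
        # Right-align the left context so the keyword forms a column.
        print(f"{left:>12}  [{key}]  {right}")
```

Each concordance line shows one occurrence of the keyword with a fixed number of words on either side, which is the defining feature of the KWIC layout.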
This document describes the logic HyperPo follows when handling the character encodings of documents, which can be submitted as a string in a query field, retrieved from a remote location, or uploaded as a file. I'll avoid getting into any HyperPo-specific implementation details so that this may be of use to developers of similar tools.