Az Eszterházy Károly Tanárképző Főiskola Tudományos Közleményei. 1996. Vol. 1. Eger Journal of English Studies.(Acta Academiae Paedagogicae Agriensis : Nova series ; Tom. 24)
Ramesh Krishnamurthy: Change and continuity at COBUILD (1986-1996)
The top 8 items are exactly the same, except that 'to' and 'and' have changed position. Similarly, the replacement of 'i' and 'was' by 'is' and 'for' is not very significant, and merely represents minor changes of position. However, even here, a word of caution is required. As we shall see below, even frequent words must be constantly reviewed, or subtle changes in usage will be missed.

Most wordforms lower down the frequency order, for which the 20 million word corpus gave insufficient information, are usually much better attested in the 211 million word corpus, and more precise and detailed information can be obtained from the larger corpus. Very low frequency items are of little use in any corpus, as they do not provide enough evidence on which to base any lexicographic statements (for example, about half of the wordforms in a corpus occur only once: i.e. over 125,000 wordforms in the 20 million word corpus, and over 250,000 in the 211 million word corpus).

3 Corpus Access And Retrieval

3.1 History

This area has changed most radically in the past decade. Until 1985, we looked at paper printouts of concordances from the 7.3 million word corpus. This meant that only one person could inspect a particular set of concordances at any given time, and that concordances could be easily misplaced, lost, or damaged. Each page of printout contained 56 concordance lines with a context of 100 characters that were sorted to the right of the keyword, with text references at the left of each line.¹ This gave us reasonable information about the immediate right-hand collocations (nouns modified by an adjective keyword, prepositions governed by verb keywords, etc.), but non-adjacent collocations were difficult to spot, and left-hand collocations were even more difficult to identify.
If a particular concordance line contained insufficient context, the relevant context could only be obtained by tracking down concordances for other words in the line, a time-consuming affair

¹ Looking Up. Ed. by J. M. Sinclair, HarperCollins Publishers, London, 1987. p. 36.