Corpora
Concentrating on the rare words was not my only childhood
mistake when I attempted to create my own dictionary. A more
serious one was that I arrived at my definitions by looking at
the definitions of other dictionaries. I reworded them, of
course (even at the tender age of 12, I was intuitively aware
of the dangers of plagiarism), but I saw my role as one of
collating the wisdom of previous lexicographers. Of course,
that way there is room for little new wisdom.
Perhaps shockingly, until 20 years ago my practice would
not have been out of place in many dictionary teams.
Lexicographers would draw on a mixture of previous practice,
intuitions, and half-remembered examples, supported by chance
encounters with the word in print. With the advent of large
corpora and the development of powerful computer software
capable of exploring those corpora, dictionary-making has
changed beyond all recognition. The lexicographers who worked
on the Macmillan English Dictionary had the opportunity
of examining hundreds, and in some cases thousands, of instances
of a word in use. From these instances they could work out
what a word really meant in contemporary English, rather than
what it was supposed to mean.
Take the example I gave above of the use of "conventional"
in the phrase "conventional oven". It may seem obvious
that a conventional oven is one that is not a microwave
oven, but it only seems obvious once it has been pointed out.
If the lexicographers who worked on this dictionary had relied
on intuition, they might easily have forgotten this use of the
word, and of course if they had relied on previous
dictionaries they could easily have missed it because many of
those dictionaries were prepared before the microwave oven
came into popular use.
