
Basic Color Terms

The notion of basic color terms originates in the anthropological and linguistic work of Brent Berlin and Paul Kay in the late sixties [Berlin \& Kay 1969]. They studied the color naming behavior of native speakers of a variety of languages, as well as the existing literature on the subject, and recorded two main findings:

  1. There are substantial universal constraints on the shape of basic color lexicons.

  2. Basic color lexicons change over time by adding basic color terms in a highly constrained, though not mechanically predictable, manner.
The research was carried out against a background of extreme linguistic relativism, also known as the Sapir-Whorf hypothesis [Kay \& Kempton 1984], which holds that each language codes experience in a unique and arbitrary manner, and that in principle there are no semantic universals (principles of meaning that hold across all languages).

Basic color terms are defined as having the following characteristics:

  1. They are ``monolexemic''; i.e., their meaning is not predictable from the meaning of their parts (for English, the following do not qualify: bluish, lemon-colored, but blue and yellow do).

  2. Their ``signification'' is not included in that of any other color term (for English, crimson or scarlet do not qualify, as they are both included in red, but red does).

  3. Their application is not restricted to a narrow class of objects (for English, blond does not qualify, but brown does).

  4. They are psychologically salient for informants, as shown by their tendency to occur at the beginning of elicited lists of color terms, their stability of reference across informants and occasions of use, and their occurrence in the idiolects of all informants (for English, red, green, blue, yellow are good candidates, but chartreuse is not).
Some additional criteria are provided for borderline cases.

A set of 329 color chips mounted on a piece of cardboard was used as stimulus material. The color chips were selected using the Munsell color system (Section ), and represented 40 equally spaced hues at 8 degrees of brightness, plus 9 neutral hues ranging from white through grey to black (Figure ).
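As a quick check on the chip count, the stimulus array can be enumerated in a few lines of Python. This is only an illustrative sketch: it assumes each of the 40 hues appears at each of the 8 brightness levels, and the integer indices and the names `chromatic`, `neutral`, and `chips` are placeholders rather than Munsell notation.

```python
# Sketch of the stimulus array layout (indices are placeholders, not Munsell codes).
N_HUES = 40        # equally spaced hues
N_BRIGHTNESS = 8   # degrees of brightness per hue
N_NEUTRAL = 9      # neutral chips from white through grey to black

# One chromatic chip per (hue, brightness) combination.
chromatic = [(hue, level) for hue in range(N_HUES) for level in range(N_BRIGHTNESS)]

# Neutral chips carry no hue, so they are tagged separately.
neutral = [("neutral", step) for step in range(N_NEUTRAL)]

chips = chromatic + neutral
print(len(chromatic), len(neutral), len(chips))  # prints: 320 9 329
```

Under that reading, the 320 chromatic chips and the 9 neutral chips together account for the 329 chips mentioned in the text.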

A constant light source was used to illuminate the chips. After the basic color terms of the language in question had been elicited, subjects were instructed to indicate the focal point (best example) of each basic color category and its outer boundary in the set of color chips.

Berlin and Kay found that the foci of basic color terms are similar across totally unrelated languages, and that they cluster into discrete, contiguous areas in the color space. The boundaries between color categories, on the other hand, were found to be variable across languages and even for repeated trials with the same informants. They speculate that unless the effect is due to the experimental procedure,

it is possible that the brain's primary storage procedure for the physical reference of color categories is concerned with points (or very small volumes) of the color solid rather than extended volumes. Secondary processes, of lower salience and inter-subjective homogeneity, would then account for the extensions of reference to points in the color solid not equivalent to (or included in) the focus. (p. 13)

This suggestion is taken up almost 20 years later by [Lakoff 1987], who considers color to be an example of a ``generative cognitive category'', one that consists of a focus (or foci) combined with a ``complex cognitive mechanism'' that generates other members of the category from the focus in a consistent but not a priori predictable manner.

Berlin and Kay also found that although different languages encode different numbers of basic color categories in their vocabularies, a total universal inventory of exactly eleven basic color categories exists, from which the eleven or fewer basic color terms of any given language are always drawn. The English terms corresponding to the eleven categories are

white, black, red, green, yellow, blue, brown, purple, pink, orange, grey.
In addition, they found that if a language encodes fewer than eleven basic color categories, then there exist strict limitations on which categories it will encode. The distributional restrictions across languages can be summarized as a sequence of ``evolutionary stages'':
  1. All languages have terms for white, black (or more accurately, for light-warm and dark-cool colors).

  2. If a language encodes 3 categories, it contains a term for red, in addition to the terms encoded in the first stage.

  3. If a language encodes 4 categories, it contains a term for green or a term for yellow, in addition to the terms encoded in the previous stage.

  4. If a language encodes 5 categories, it contains terms for green and yellow, in addition to the terms encoded in the previous stage.

  5. If a language encodes 6 categories, it contains a term for blue, in addition to the terms encoded in the previous stage.

  6. If a language encodes 7 categories, it contains a term for brown, in addition to the terms encoded in the previous stage.

  7. If a language encodes 8 or more categories, it contains terms for purple, pink, orange, grey, or some combination of these, in addition to the terms encoded in the previous stage.
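This sequence of stages defines a partial order on the set of basic color categories with six equivalence classes, which may be represented as follows:

\[ [\text{white}, \text{black}] < [\text{red}] < [\text{green}, \text{yellow}] < [\text{blue}] < [\text{brown}] < [\text{purple}, \text{pink}, \text{orange}, \text{grey}] \]

where $a < b$ means that equivalence class $a$ is present in every language in which any element of equivalence class $b$ is present.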
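The ordering constraint described above can be sketched as a small consistency check. The Python below is a minimal illustration, not Berlin and Kay's own procedure. It assumes that a color vocabulary is given as a set of English category labels, that labels outside the eleven categories are simply ignored, and that the function names `missing_terms` and `is_consistent` are hypothetical.

```python
# Equivalence classes in the order given by the evolutionary stages.
CLASSES = [
    {"white", "black"},
    {"red"},
    {"green", "yellow"},
    {"blue"},
    {"brown"},
    {"purple", "pink", "orange", "grey"},
]


def missing_terms(terms):
    """Return the categories the ordering requires but `terms` lacks."""
    terms = set(terms)
    # Stage 1: every language is taken to encode the whole first class.
    required = set(CLASSES[0])
    for index, later_class in enumerate(CLASSES):
        # Any member of a later class requires all earlier classes in full.
        if later_class & terms:
            for earlier_class in CLASSES[:index]:
                required |= earlier_class
    return sorted(required - terms)


def is_consistent(terms):
    """True if `terms` respects the ordering of the six classes."""
    return not missing_terms(terms)


print(is_consistent({"white", "black", "red", "green"}))  # prints: True
print(missing_terms({"white", "black", "red", "blue"}))   # prints: ['green', 'yellow']
```

Note that the check places no constraint within a class, so a four-term vocabulary with green but not yellow passes, as stage 3 allows.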

Berlin and Kay interpret this sequence as reflecting both distributional facts and a chronological order of lexical encoding of basic color terms in each language. This chronological order is in turn interpreted as a sequence of evolutionary stages. More recent work has revised the evolutionary sequence somewhat, e.g. [Kay et al. 1991]. Although Berlin and Kay speculate that there is a correlation between the general cultural and technological complexity of societies and the complexity of the color vocabulary (Figure ), they also admit that

... the particular order in which color foci universally became encoded in individual lexicons ... is a difficult problem which is only vaguely understood at this time. (p. 17)

One may wonder why English, of all languages, turns out to have the most (11) basic color categories, and thus sits at the top of the complexity hierarchy with respect to color vocabulary. Other than reflecting the ``general cultural and technological complexity of society'', there might be an unintended bias in the experiment or the interpretation of the data. For example, it might be difficult to recognize a basic color term in another language for which there is no corresponding basic color term in English. Nothing in the data seems to indicate that the 11 terms of English constitute a theoretical maximum. Indeed, Cairo [Cairo 1977] predicts some additional potential basic color terms which have not been lexicalized in English (yet?), e.g. something akin to ``khaki''.

Berlin and Kay also seem to be concerned with issues of symbol grounding and embodiment, although they do not use those terms (their work considerably predates the work on symbol grounding and embodiment):

The study of the biological foundations of the most peculiarly and exclusively human set of behavioral abilities - language - is just beginning ..., but sufficient evidence has already accumulated to show that such connections must exist for the linguistic realms of syntax and phonology. The findings reported here concerning the universality and evolution of basic color lexicon [sic] suggest that such connections are also to be found in the realm of semantics. (p. 110)
The research presented in this dissertation can be seen as a formal investigation into such connections.

Berlin and Kay's findings have been corroborated in numerous other studies since (see the bibliography in [Berlin \& Kay 1969]). Boynton, for example, reports that other research has shown that basic color terms are listed first and used more reliably, with greater consensus and shorter response times than any other color terms [Boynton 1990, p. 240]. They also translate easily between languages and are commonly learned by the age of five, at which time very few non-basic color terms are used. None of these special attributes apply to non-basic color terms. Although Berlin and Kay discount the possibility that the boundary effect is due to their experimental procedure, Kay and McDaniel [Kay \& McDaniel 1978] seem to be less convinced, as I will discuss below. Cairo [Cairo 1977] also has some doubts about their experimental procedure.

lammens@cs.buffalo.edu