
The Normalized Gaussian Category Model

[Shepard 1987] provides an elegant argument for his claim that the probability of generalization of an existing (known) category to a new (unknown) stimulus is a monotonic function of the normalized distance in psychological space of the unknown stimulus to known stimuli belonging to the category. He further specifies that this function can be approximated by a simple exponential decay or, under certain circumstances, a Gaussian function. The distance metric is either Euclidean, resulting in circular (or spheroid) contours of equal generalization, or a slight variant that results in elliptic (or ellipsoid) contours of equal generalization.

Following Shepard's suggestion, I have used a variant of the Gaussian normal distribution as a category model, which I will refer to as the normalized Gaussian model. The usual normal curve in one variable (as used in probability theory) is given by

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]

where $\sigma$ is the standard deviation (determining the ``width'' of the curve), and $\mu$ is the mean or expected value (determining the location of the maximum) (Fig. ).

Since the normal curve is used as a probability density function, it has the special property that $\int_{-\infty}^{\infty} f(x)\,dx = 1$, where $f(x)$ denotes the normal curve. The term $(x - \mu)$ in the normal curve equation is, up to sign, the Euclidean distance of the one-dimensional point $x$ to the mean $\mu$. To derive the normalized Gaussian function, we drop the factor $\frac{1}{\sigma\sqrt{2\pi}}$, since we don't need the interpretation as a probability density function, and we substitute the general N-dimensional Euclidean distance function for the distance term, which gives us

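\[ G_N(\vec{x}; \vec{\mu}, \sigma) = \exp\left( -\frac{1}{2} \left( \frac{\sqrt{\sum_{i=1}^{N} (x_i - \mu_i)^2}}{\sigma} \right)^{2} \right) \]

with symbols as in the normal curve equation: $\sigma$ sets the width, and $\vec{\mu}$ is now the N-dimensional location of the maximum, with $\vec{x}$ the N-dimensional point being evaluated. An example of a two-dimensional version of this function is shown in Figure .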
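The definition can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the function name `normalized_gaussian` and the example coordinates are assumptions made for this sketch.

```python
import math

def normalized_gaussian(x, mu, sigma):
    """Normalized Gaussian of an N-dimensional point x around the mean mu.

    x and mu are equal-length sequences of coordinates; sigma > 0 sets the width.
    """
    if len(x) != len(mu):
        raise ValueError("x and mu must have the same number of dimensions")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Squared Euclidean distance between x and mu.
    squared_distance = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    # No 1/(sigma*sqrt(2*pi)) factor, so the value at x == mu is exactly 1.
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

# Hypothetical two-dimensional example: value at the mean and one sigma away.
peak = normalized_gaussian((0.0, 0.0), (0.0, 0.0), 1.5)
one_sigma = normalized_gaussian((1.5, 0.0), (0.0, 0.0), 1.5)
```

At a distance of exactly one standard deviation from the mean the value is $e^{-1/2}$, about 0.61, whatever the number of dimensions.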

This function has a number of interesting properties for use as a basic color category model:

The maximum value occurs at $\vec{x} = \vec{\mu}$ and is unity, regardless of the value of $\sigma$.
The ``width'' of the curve is a function of the parameter $\sigma$ only.
The value decreases monotonically (but not linearly) as a function of the distance to $\vec{\mu}$.
It is strictly positive (non-zero) at every finite distance from $\vec{\mu}$, approaching zero only in the limit of infinite distance.
It has a simple mathematical definition, using only two parameters, $\vec{\mu}$ and $\sigma$.
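The first four properties can be checked numerically. The sketch below is only an illustration under assumed values: the helper name `value_at_distance`, the sample distances, and the two widths are hypothetical. Because the function depends on the stimulus only through its distance to the mean, the helper takes that distance directly.

```python
import math

def value_at_distance(distance, sigma):
    """Normalized Gaussian value at a given Euclidean distance from the mean."""
    return math.exp(-distance ** 2 / (2.0 * sigma ** 2))

# Assumed sample distances and two assumed widths.
distances = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
narrow = [value_at_distance(d, 1.0) for d in distances]
wide = [value_at_distance(d, 3.0) for d in distances]

# Maximum is unity at distance zero, whatever the width.
peak_is_unity = narrow[0] == 1.0 and wide[0] == 1.0
# Values fall strictly as the distance grows.
strictly_decreasing = all(a > b for a, b in zip(narrow, narrow[1:]))
# Values stay positive at every finite distance sampled here.
all_positive = all(v > 0.0 for v in narrow + wide)
# A larger sigma gives a larger value at every non-zero distance.
wider_is_larger = all(w > n for w, n in zip(wide[1:], narrow[1:]))
```

One practical caveat: in double-precision arithmetic the computed value underflows to 0.0 beyond roughly 39 standard deviations, even though the mathematical function remains strictly positive there.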

These properties allow us to interpret the normalized Gaussian function as a category model, with $\vec{\mu}$ interpreted as the location of the center or focus of the category, and the function value interpreted as the goodness value (or alternatively, the fuzzy membership value) of a stimulus represented by its color space coordinates $\vec{x}$. By modulating the value of $\sigma$ we can affect the size of the volume of the color space that is included in the category, relative to some threshold function value. The maximum goodness (or membership) value is unity, as is common in fuzzy set models, but we can get a non-zero goodness value for any point in the color space, no matter how far removed from the focus. This is important for our purpose, as I will show below. The non-linear decrease of goodness with distance from the focus reflects the general shape of psychological categories as discussed by [Shepard 1987], and it can result in category extensions different from those obtained with a simple nearest-neighbor criterion, since neighboring categories may have different values of $\sigma$. The representation is also economical, since a category can be represented by only two parameters, $\vec{\mu}$ and $\sigma$.
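To make the contrast with a nearest-neighbor criterion concrete, here is a small sketch with two invented categories. The category names, foci, widths, and stimulus are all hypothetical values chosen for illustration, not data from this work, and `goodness` and `threshold_radius` are names assumed for the sketch. It needs Python 3.8 or later for `math.dist`.

```python
import math

def goodness(stimulus, focus, sigma):
    """Normalized Gaussian goodness of a stimulus for a category (focus, sigma)."""
    squared_distance = sum((s - f) ** 2 for s, f in zip(stimulus, focus))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def threshold_radius(sigma, threshold):
    """Distance from the focus at which goodness equals threshold (0 < threshold <= 1)."""
    return sigma * math.sqrt(-2.0 * math.log(threshold))

# Hypothetical categories in a three-dimensional space: name -> (focus, sigma).
categories = {
    "narrow": ((0.0, 0.0, 0.0), 1.0),
    "broad": ((10.0, 0.0, 0.0), 4.0),
}
stimulus = (4.0, 0.0, 0.0)

# Nearest-neighbor criterion: the category whose focus is closest.
nearest = min(categories, key=lambda name: math.dist(stimulus, categories[name][0]))
# Goodness criterion: the category with the highest normalized Gaussian value.
best = max(categories, key=lambda name: goodness(stimulus, *categories[name]))
```

Here the stimulus lies 4 units from the "narrow" focus and 6 units from the "broad" one, so the nearest-neighbor criterion picks "narrow", while the goodness criterion picks "broad" (about 0.32 against about 0.0003). For a fixed threshold, `threshold_radius` grows linearly with $\sigma$, which is the sense in which $\sigma$ controls the volume of color space included in a category.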

lammens@cs.buffalo.edu