
Introduction and Overview

In the elephant paper [Brooks 1990], which appeared in the proceedings of the predecessor of the current workshop, Brooks criticizes the ungroundedness of traditional symbolic AI systems and proposes physically grounded systems, particularly the subsumption architecture, as an alternative. Subsumption has been highly successful in generating a variety of interesting and seemingly intelligent behaviors in mobile robots, and it has established itself as an influential approach to generating complex physical behavior in autonomous agents. In the current paper we explore the possibilities for integrating the old with the new, in an autonomous agent architecture that ranges from physical behavior generation inspired by subsumption to classical knowledge representation and reasoning, with a newly proposed level in between the two. Although we are still struggling with many of the issues involved, we believe we can contribute to a solution for some of the problems that [Brooks 1990] identifies for both classical systems and physically grounded systems, in particular:

We agree that physically implemented systems are the true test of any autonomous agent architecture, and to this end we are working on several implementations. We will present our general multi-level architecture for intelligent autonomous agents with integrated sensory and motor capabilities, GLAIR, as well as a physical implementation and two simulation studies of GLAIR agents.

By an architecture we mean an organization of the components of a system: what is integral to the system, and how the various components interact. Which components go into an architecture for an autonomous agent has traditionally depended to a large extent on whether we are building a physical system, understanding/modeling behaviors of an anthropomorphic agent, or integrating a select number of behaviors. The organization of an architecture may also be influenced by whether or not one adopts the modularity assumption of Fodor [Fodor 1983], a connectionist point of view, e.g., [McClelland et al. 1986], or an anti-modularity assumption as in Brooks's subsumption architecture [Brooks 1985]. The modularity assumption supports (among other things) a division of the mind into a central system, i.e., cognitive processes such as learning, planning, and reasoning, and a peripheral system, i.e., sensory and motor processing [Chapman 1990]. Our architecture is characterized by a three-level organization into a Knowledge level (KL), a Perceptuo-Motor level (PML), and a Sensory-Actuator level (SAL). This organization is neither modular, anti-modular, hierarchical, anti-hierarchical, nor connectionist in the conventional sense. It integrates a traditional symbol system with a physically grounded system, i.e., a behavior-based architecture. The most important difference from a behavior-based architecture like Brooks's subsumption is the presence of three distinct levels, each with its own representations and implementation mechanisms, and in particular the presence of an explicit Knowledge level. Representation, reasoning (including planning), perception, and generation of behavior are distributed across all three levels. Our architecture is best described using a resolution pyramid metaphor as used in computer vision work [Ballard & Brown 1982], rather than a central vs. peripheral metaphor.
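To make the three-level organization more concrete, the following Python sketch shows one way such a layered agent could be skeletonized. It is an illustration only, not part of GLAIR as specified in this paper: all class names, method names, and data structures are invented for the example.

```python
# Hypothetical sketch of a three-level agent organization; all names are
# invented for illustration and are not part of the GLAIR specification.

class KnowledgeLevel:
    """KL: explicit symbolic representation and reasoning."""
    def __init__(self):
        self.beliefs = set()

    def reason(self, percept_symbols):
        # Ground incoming percept symbols as beliefs; real inference
        # (planning, deduction) over the belief store would go here.
        self.beliefs |= set(percept_symbols)
        return self.beliefs


class PerceptuoMotorLevel:
    """PML: translates between raw sensor data and KL symbols, and between
    KL-level acts and low-level motor routines."""
    def interpret(self, raw_readings):
        return [f"object-at-{r}" for r in raw_readings]

    def decompose(self, act):
        return [(act, step) for step in ("orient", "move")]


class SensoryActuatorLevel:
    """SAL: direct interface to physical sensors and actuators."""
    def sense(self):
        return [3, 7]              # stand-in for real sensor readings

    def actuate(self, motor_command):
        print("executing", motor_command)


# One pass through the levels: sense -> interpret -> reason -> decompose -> act.
sal, pml, kl = SensoryActuatorLevel(), PerceptuoMotorLevel(), KnowledgeLevel()
beliefs = kl.reason(pml.interpret(sal.sense()))
for command in pml.decompose("explore"):
    sal.actuate(command)
```

The point of the sketch is simply that each level maintains its own kind of representation, with information flowing both upward (sensing and interpretation) and downward (act decomposition and actuation), rather than through a single central module.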

Architectures for building physical systems, e.g., robotic architectures [Albus et al. 1981], tend to address the relationship between a physical entity (e.g., a robot), its sensors and effectors, and the tasks to be accomplished. Since these physical systems are performance-centered, they often lack general knowledge representation and reasoning techniques. These architectures tend to be primarily concerned with the body, that is, with how to get the physical system to exhibit intelligent behavior through its physical activity. We say these systems are not concerned with consciousness. These architectures address what John Pollock calls Quick and Inflexible (Q&I) processes [Pollock 1989]. We define consciousness for a robotic agent operationally as being aware of one's environment, as evidenced by (1) having some internal states or representations that are causally connected to the environment through perception, (2) being able to reason explicitly about the environment, and (3) being able to communicate with an external agent about the environment.
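As a rough illustration of this operational definition (the three criteria come from the paper; the code does not), an agent satisfying the criteria might be sketched in Python as follows, with hypothetical class and method names.

```python
# Hypothetical illustration of the three operational criteria; the class and
# method names are invented and not taken from the paper.

class OperationallyAwareAgent:
    def __init__(self):
        self.world_model = {}          # (1) percept-grounded internal state

    def perceive(self, sensor_readings):
        # (1) internal states causally connected to the environment via perception
        self.world_model.update(sensor_readings)

    def reason_about(self, item):
        # (2) explicit reasoning about the (represented) environment
        return self.world_model.get(item, "unknown")

    def communicate(self, item):
        # (3) reporting about the environment to an external agent
        return f"the {item} is {self.reason_about(item)}"


agent = OperationallyAwareAgent()
agent.perceive({"door": "open"})
print(agent.communicate("door"))       # -> "the door is open"
```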

Architectures for understanding/modeling behaviors of an anthropomorphic agent, e.g., cognitive architectures [Langley et al. 1991][Pollock 1989][Anderson 1983], tend to address the relationships among the structure of memory, reasoning abilities, intelligent behavior, and mental states and experiences. These architectures often do not take the body into account; instead they focus primarily on the mind and consciousness. Our architecture, by contrast, ranges from general knowledge representation and reasoning down to body-dependent physical behavior, and back again.

We are interested in autonomous agents that are embedded in a dynamic environment. Such an agent needs to interact with and react to its environment continually, and to exhibit intelligent behavior through its physical activity. To be successful, the agent needs to reason about events and actions in the abstract as well as in concrete terms. This means combining situated activity with acts based on reasoning about goal accomplishment, i.e., deliberative acting or planning, as sketched below. In the latter part of this paper, we will present a family of agents based on our architecture. These agents are designed with a robot in mind, but their structure is also akin to that of anthropomorphic agents. Figure schematically presents our architecture.
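The following Python sketch illustrates, with invented names and trivial stand-in behaviors, how situated (reactive) activity might be combined with deliberative planning in a single agent step. It is a hedged example of the general idea, not the control regime actually used by GLAIR.

```python
# Hypothetical sketch of one agent step combining situated (reactive) activity
# with deliberative planning; all names and behaviors are invented.

def reactive_response(percept):
    """Fast, Q&I-style reaction requiring no explicit reasoning."""
    return "turn-away" if percept == "obstacle" else None

def plan_for(goal):
    """Deliberative path: reason explicitly about accomplishing the goal."""
    # A real planner would search over possible actions; this is a stub plan.
    return ["locate-" + goal, "approach-" + goal]

def agent_step(percept, goal):
    # Prefer an immediate situated reaction when one applies ...
    reaction = reactive_response(percept)
    if reaction is not None:
        return [reaction]
    # ... otherwise fall back on reasoning about goal accomplishment.
    return plan_for(goal)

print(agent_step("obstacle", "recharge-station"))  # ['turn-away']
print(agent_step("clear", "recharge-station"))     # stub plan toward the goal
```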

There are several features that contribute to the robustness of our architecture. We highlight them below (an in-depth discussion follows later):
