Artificial General Intelligence: A Gentle Introduction

Pei Wang

Temple University, Philadelphia, USA

 

This is the outline of a talk given at the "Mind and Machine" Seminar in Shanghai in June 2007.

1. AGI Overview

Historical background

Artificial Intelligence (AI) started with the "thinking machine" as its ultimate goal, as documented by the following literature:

In past years, there were some ambitious projects aiming at this goal, though they all failed. The best-known examples include the following ones:

Partly due to the recognized difficulty of the problem, in the 1970s and 1980s mainstream AI turned to domain-specific problems and special-purpose solutions, though there are opposing attitudes toward this change:

Consequently, the field currently called "AI" covers a group of loosely related topics without a common foundation or framework, which causes an "identity crisis":

Recent development

In recent years (since 2004), calls for research on general-purpose systems have returned, both inside and outside mainstream AI.

Anniversaries are a good time to review the big picture of the field. In the following collections and events, many people raised the topic of general-purpose and human-level intelligence:

More or less coincidentally, there are several recent books with novel approaches to AI as a whole (and bold titles), from outside mainstream AI:

These two currents are represented at several recent meetings, as well as one in the near future:

What is Artificial General Intelligence (AGI)

AGI research treats "intelligence" as a whole. See Aspects of Artificial General Intelligence for details.

"AI" and "AGI" originally meant the same thing, but are currently different. Other similar notions include "strong AI", "human-level intelligence", "real AI", and "thinking machine".

AGI research has a science (theory) aspect and an engineering (technique) aspect. A complete AGI work normally includes:

  1. a theory of intelligence,
  2. a formal model of the theory,
  3. a computational implementation of the model.
The current AGI projects are based on very different theories and techniques.

Fundamental AI/AGI questions

At the top level, theoretical questions every AI (AGI) researcher needs to answer include:
  1. What is AI, accurately specified?
  2. Is it possible to build the AI as specified?
  3. If AI is possible, what is the most efficient way to achieve it?
  4. Even if we know how to achieve AI, should we really do it?
Most AI (AGI) researchers answer "Yes" to the 2nd and 4th questions, though some people outside the field say "No" to one of them.

In the following we will compare the different answers to the 1st and 3rd questions, which are about the research goal and technical strategy of AI (AGI), respectively.

Answers to the 1st question

What is the concrete goal of AI research? Of course, it is "to make computers that are similar to the human mind", but at which level of description, generalization, or abstraction should this similarity be obtained? There are different opinions, which place the similarity in structure, behavior, capability, function, or principle.

All these are valid scientific research goals, but they lead to quite different results! See What Do You Mean by "AI"? for details.

Answers to the 3rd question, in AGI context

Though the goal is to achieve intelligence as a whole, each AGI project still needs to divide the problem into subproblems, to be solved one by one. In doing so, each AGI project follows a different technical strategy or path that roughly belongs to one of three types: hybrid, integrated, or unified.

Common techniques in AGI projects include, but are not limited to:

Though each of these techniques is also explored in mainstream AI, using it in a general-purpose system is very different from using it in a special-purpose system.

2. Representative AGI Projects

The following projects were selected because each of them (1) is clearly oriented to AGI, (2) is still very active, and (3) has ample publication of technical details.

Each project is linked to the project website and selected publications, from which the following descriptions are extracted. The focus of the descriptions is on the research goal (the 1st question) and technical path (the 3rd question).

Soar [Unified Theories of Cognition, A Gentle Introduction to Soar]

The ultimate in intelligence would be complete rationality which would imply the ability to use all available knowledge for every task that the system encounters. Unfortunately, the complexity of retrieving relevant knowledge puts this goal out of reach as the body of knowledge increases, the tasks are made more diverse, and the requirements in system response time more stringent. The best that can be obtained currently is an approximation of complete rationality. The design of Soar can be seen as an investigation of one such approximation.

For many years, a secondary principle has been that the number of distinct architectural mechanisms should be minimized. Through Soar 8, there has been a single framework for all tasks and subtasks (problem spaces), a single representation of permanent knowledge (productions), a single representation of temporary knowledge (objects with attributes and values), a single mechanism for generating goals (automatic subgoaling), and a single learning mechanism (chunking). We have revisited this assumption as we attempt to ensure that all available knowledge can be captured at runtime without disrupting task performance. This is leading to multiple learning mechanisms (chunking, reinforcement learning, episodic learning, and semantic learning), and multiple representations of long-term knowledge (productions for procedural knowledge, semantic memory, and episodic memory).

Two additional principles that guide the design of Soar are functionality and performance. Functionality involves ensuring that Soar has all of the primitive capabilities necessary to realize the complete suite of cognitive capabilities used by humans, including, but not limited to reactive decision making, situational awareness, deliberate reasoning and comprehension, planning, and all forms of learning. Performance involves ensuring that there are computationally efficient algorithms for performing the primitive operations in Soar, from retrieving knowledge from long-term memories, to making decisions, to acquiring and storing new knowledge.

ACT-R [The Atomic Components of Thought, An Integrated Theory of the Mind]

ACT-R is a cognitive architecture: a theory for simulating and understanding human cognition. Researchers working on ACT-R strive to understand how people organize knowledge and produce intelligent behavior. As the research continues, ACT-R evolves ever closer into a system which can perform the full range of human cognitive tasks: capturing in great detail the way we perceive, think about, and act on the world.

On the exterior, ACT-R looks like a programming language; however, its constructs reflect assumptions about human cognition. These assumptions are based on numerous facts derived from psychology experiments. Like a programming language, ACT-R is a framework: for different tasks (e.g., Tower of Hanoi, memory for text or for lists of words, language comprehension, communication, aircraft controlling), researchers create models (aka programs) that are written in ACT-R and that, besides incorporating ACT-R's view of cognition, add their own assumptions about the particular task. These assumptions can be tested by comparing the results of the model with the results of people doing the same tasks.

ACT-R is a hybrid cognitive architecture. Its symbolic structure is a production system; the subsymbolic structure is represented by a set of massively parallel processes that can be summarized by a number of mathematical equations. The subsymbolic equations control many of the symbolic processes. For instance, if several productions match the state of the buffers, a subsymbolic utility equation estimates the relative cost and benefit associated with each production and decides to select for execution the production with the highest utility. Similarly, whether (or how fast) a fact can be retrieved from declarative memory depends on subsymbolic retrieval equations, which take into account the context and the history of usage of that fact. Subsymbolic mechanisms are also responsible for most learning processes in ACT-R.
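
The conflict-resolution step described above can be illustrated with a minimal sketch. It is not ACT-R code: the Production class, the production names, the numbers, and the simplified form "utility = benefit - cost" are all assumptions made for illustration, and ACT-R's own utility equation differs in detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    """A hypothetical production rule with subsymbolic estimates attached."""
    name: str
    benefit: float  # estimated benefit of firing (illustrative units)
    cost: float     # estimated cost of firing (same units)

    @property
    def utility(self) -> float:
        # Assumed simplification: utility is benefit minus cost.
        return self.benefit - self.cost


def select_production(matching):
    """Return the matching production with the highest utility."""
    if not matching:
        raise ValueError("no production matches the current buffer state")
    return max(matching, key=lambda p: p.utility)


# Three productions that all match the (imaginary) current buffer state.
candidates = [
    Production("retrieve-fact", benefit=10.0, cost=4.0),
    Production("count-up", benefit=10.0, cost=7.5),
    Production("guess", benefit=3.0, cost=1.0),
]
chosen = select_production(candidates)
print(chosen.name, chosen.utility)
```

The symbolic level decides which productions match; the numeric level decides which of them fires.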

Polyscheme [Polyscheme, Adaptive Algorithmic Hybrids for Human-Level Artificial Intelligence]

Polyscheme is a cognitive architecture designed to model and achieve human-level intelligence by integrating multiple methods of representation, reasoning and problem solving.

A system will be said to have human-level intelligence if it can solve the same kinds of problems and make the same kinds of inferences that humans can, even though it might not use mechanisms similar to those in the human brain. The modifier "human-level" is intended to differentiate such systems from artificial intelligence systems that excel in some relatively narrow realm, but do not exhibit the wide-ranging cognitive abilities that humans do.

A key insight ... is that AI algorithms from different subfields based on different computational formalisms can all be conceived of as strategies guiding attention through propositions in the multiverse [the set of all possible worlds].

LIDA [The Lida Architecture, A Cognitive Theory of Everything]

Implementing and fleshing out a number of psychological and neuroscience theories of cognition, the LIDA conceptual model aims at being a cognitive "theory of everything." With modules or processes for perception, working memory, episodic memories, "consciousness," procedural memory, action selection, perceptual learning, episodic learning, deliberation, volition, and non-routine problem solving, the LIDA model is ideally suited to provide a working ontology that would allow for the discussion, design, and comparison of AGI systems. The LIDA technology is based on the LIDA cognitive cycle, a sort of "cognitive atom." The more elementary cognitive modules play a role in each cognitive cycle. Higher-level processes are performed over multiple cycles.

The LIDA architecture represents perceptual entities, objects, categories, relations, etc., using nodes and links .... These serve as perceptual symbols acting as the common currency for information throughout the various modules of the LIDA architecture.

SNePS [SNePS: A Logic for Natural Language Understanding and Commonsense Reasoning, Metacognition in SNePS]

The long term goal of the SNePS Research Group is to understand the nature of intelligent cognitive processes by developing and experimenting with computational cognitive agents that are able to use and understand natural language, reason, act, and solve problems in a wide variety of domains.

The SNePS knowledge representation, reasoning, and acting system has several features that facilitate metacognition in SNePS-based agents. The most prominent is the fact that propositions are represented in SNePS as terms rather than as logical sentences. The effect is that propositions can occur as arguments of propositions, acts, and policies without limit, and without leaving first-order logic.
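
To make the idea of "propositions as terms" concrete, here is a small sketch. It does not use SNePS itself: the Term class, the depth helper, and the example propositions are hypothetical. It only shows that once a proposition is an ordinary term, it can appear as an argument of another proposition to any depth.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Term:
    """A term: a functor applied to arguments, which may themselves be terms."""
    functor: str
    args: Tuple[Union["Term", str], ...] = ()

    def __str__(self) -> str:
        if not self.args:
            return self.functor
        return f"{self.functor}({', '.join(str(a) for a in self.args)})"


def depth(term) -> int:
    """Nesting depth of a term; plain strings and constants have depth 0."""
    if isinstance(term, str) or not term.args:
        return 0
    return 1 + max(depth(a) for a in term.args)


# A proposition, a proposition about it, and a proposition about that one.
flies = Term("Flies", ("Tweety",))
believes = Term("Believes", ("John", flies))
doubts = Term("Doubts", ("Mary", believes))

# Terms are hashable, so propositions at any level can sit in one belief set.
belief_set = {flies, believes, doubts}
print(doubts)
```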

Cyc [Building Large Knowledge-Based Systems, Common Sense Reasoning]

Vast amounts of commonsense knowledge, representing human consensus reality, would need to be encoded to produce a general AI system. In order to mimic human reasoning, Cyc would require background knowledge regarding science, society and culture, climate and weather, money and financial systems, health care, history, politics, and many other domains of human experience. The Cyc Project team expected to encode at least a million facts spanning these and many other topic areas.

The Cyc knowledge base (KB) is a formalized representation of a vast quantity of fundamental human knowledge: facts, rules of thumb, and heuristics for reasoning about the objects and events of everyday life. The medium of representation is the formal language CycL. The KB consists of terms -- which constitute the vocabulary of CycL -- and assertions which relate those terms. These assertions include both simple ground assertions and rules.
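
The distinction between ground assertions and rules can be sketched in a few lines. This is not CycL and not the Cyc inference engine: the tuple encoding, the constants, and the single hard-coded rule are assumptions for illustration; only the predicate names isa and genls are borrowed from Cyc's vocabulary.

```python
# Ground assertions as (predicate, arg1, arg2) tuples.
ground_assertions = {
    ("isa", "Fido", "Dog"),
    ("genls", "Dog", "Mammal"),
    ("genls", "Mammal", "Animal"),
}


def apply_rule(kb):
    """One rule: isa(x, c) and genls(c, d) together imply isa(x, d)."""
    derived = set()
    for (p1, x, c) in kb:
        if p1 != "isa":
            continue
        for (p2, c2, d) in kb:
            if p2 == "genls" and c2 == c:
                derived.add(("isa", x, d))
    return derived - kb  # keep only assertions that are actually new


def forward_chain(kb):
    """Apply the rule repeatedly until no new assertion is derived."""
    kb = set(kb)
    while True:
        new = apply_rule(kb)
        if not new:
            return kb
        kb |= new


closure = forward_chain(ground_assertions)
print(sorted(closure - ground_assertions))
```

Starting from three ground assertions, the rule adds that Fido is a Mammal and an Animal.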

AIXI [Universal Artificial Intelligence, Universal Algorithmic Intelligence: A mathematical top->down approach]

An important observation is that most, if not all known facets of intelligence can be formulated as goal driven or, more precisely, as maximizing some utility function.

Sequential decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameter-free theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible.

The major drawback of the AIXI model is that it is uncomputable, ... which makes an implementation impossible. To overcome this problem, we constructed a modified model AIXItl, which is still effectively more intelligent than any other time t and length l bounded algorithm.
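
The induction half of this idea can be illustrated with a computable toy. The sketch below is neither AIXI nor AIXItl: it replaces the space of all programs with four hand-written predictors, and the hypothesis names and "description lengths" are invented. What it keeps is the weighting scheme, in which each hypothesis consistent with the data counts in proportion to 2 to the power of minus its length.

```python
from fractions import Fraction

# Hypothetical hypothesis class: (name, description length in bits, predictor).
# Each predictor maps the observed history (a list of bits) to the next bit.
HYPOTHESES = [
    ("all-zeros", 2, lambda h: 0),
    ("all-ones", 2, lambda h: 1),
    ("alternate", 3, lambda h: len(h) % 2),
    ("repeat-last", 4, lambda h: h[-1] if h else 0),
]


def consistent(predictor, history):
    """True if the predictor would have produced every observed bit."""
    return all(predictor(history[:i]) == bit for i, bit in enumerate(history))


def predict_one(history):
    """Probability that the next bit is 1 under the length-weighted mixture."""
    weights = [
        (Fraction(1, 2 ** length), predictor(history))
        for _, length, predictor in HYPOTHESES
        if consistent(predictor, history)
    ]
    total = sum(w for w, _ in weights)
    if total == 0:
        raise ValueError("no hypothesis explains the history")
    return sum(w for w, bit in weights if bit == 1) / total


print(predict_one([]))         # before any data, shorter hypotheses dominate
print(predict_one([0, 1, 0]))  # only "alternate" survives this history
```

In the full theory the mixture ranges over all programs, which is what makes it uncomputable; a fixed finite class like this one is only a stand-in.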

OSCAR [Thinking about Acting: Logical Foundations for Rational Decision Making, Rational Cognition in OSCAR]

OSCAR is based on a schematic view of rational cognition according to which agents have beliefs representing their environment and an evaluative mechanism that evaluates the world as represented by their beliefs. They then engage in activity designed to make the world more to their liking.

The principal virtue of OSCAR's epistemic reasoning is not that it is an efficient deductive reasoner, but that it is capable of performing defeasible reasoning. Deductive reasoning guarantees the truth of the conclusion given the truth of the premises. Defeasible reasoning makes it reasonable to accept the conclusion, but does not provide an irrevocable guarantee of its truth. Conclusions supported defeasibly might have to be withdrawn later in the face of new information.
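
The difference from deduction can be seen in a toy example. The code below is not OSCAR: the function, the fact strings, and the bird/penguin scenario are hypothetical. It shows only the characteristic behavior, where a conclusion accepted on the available facts is withdrawn once a defeating fact arrives.

```python
def concludes_flies(facts):
    """Defeasibly conclude that Tweety flies.

    Reason: being a bird makes it reasonable to accept that Tweety flies.
    Defeater: learning that Tweety is a penguin removes that support.
    """
    supported = "bird(Tweety)" in facts
    defeated = "penguin(Tweety)" in facts
    return supported and not defeated


facts = {"bird(Tweety)"}
before = concludes_flies(facts)   # conclusion accepted on current information

facts.add("penguin(Tweety)")      # new information arrives
after = concludes_flies(facts)    # the same conclusion is now withdrawn

print(before, after)
```

Nothing was removed from the fact set; adding information alone was enough to retract the conclusion, which cannot happen in deductive reasoning.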

NARS [Rigid Flexibility: The Logic of Intelligence, From NARS to a Thinking Machine]

What makes NARS different from conventional reasoning systems is its ability to learn from its experience and to work with insufficient knowledge and resources. NARS attempts to uniformly explain and reproduce many cognitive facilities, including reasoning, learning, planning, etc., so as to provide a unified theory, model, and system for AI as a whole. The ultimate goal of this research is to build a thinking machine.

The development of NARS takes an incremental approach consisting of four major stages. At each stage, the logic is extended to give the system a more expressive language, a richer semantics, and a larger set of inference rules; the memory and control mechanism are then adjusted accordingly to support the new logic.

In NARS the notion of "reasoning" is extended to represent a system's ability to predict the future according to the past, and to satisfy unlimited resource demands using a limited resource supply, by flexibly combining justifiable micro steps into macro behaviors in a domain-independent manner.

Novamente [The Hidden Pattern: A Patternist Philosophy of Mind, An Integrative Architecture for General Intelligence]

Novamente incorporates aspects of many previous AI paradigms such as agent systems, evolutionary programming, reinforcement learning, automated theorem-proving, and probabilistic reasoning. However, it is unique in its overall architecture, which confronts the problem of creating a holistic digital mind in a direct way that has not been done before.

General Intelligence is the ability to achieve complex goals in complex environments.

Novamente essentially consists of a framework for tightly integrating various AI algorithms in the context of a highly flexible common knowledge representation, and a specific assemblage of AI algorithms created or tweaked for tight integration in an integrative AGI context.

Cog [Alternative Essences of Intelligence, The Cog Project: Building a Humanoid Robot]

We believe that human intelligence is a direct result of four intertwined attributes: developmental organization, social interaction, embodiment and physical coupling, and multimodal integration. Development forms the framework by which humans successfully acquire increasingly more complex skills and competencies. Social interaction allows humans to exploit other humans for assistance, teaching, and knowledge. Embodiment and physical coupling allow humans to use the world itself as a tool for organizing and manipulating knowledge. Integration allows humans to maximize the efficacy and accuracy of complementary sensory and motor systems.

Avoiding flighty anthropomorphism, you can consider Cog to be a set of sensors and actuators which tries to approximate the sensory and motor dynamics of a human body. Except for legs and a flexible spine, the major degrees of motor freedom in the trunk, head, and arms are all there. Sight exists, in the form of video cameras. Hearing and touch are on the drawing board. Proprioception in the form of joint position and torque is already in place; a vestibular system is on the way. Hands are being built as you read this, and a system for vocalization is also in the works. Cog is a single hardware platform which seeks to bring together each of the many subfields of Artificial Intelligence into one unified, coherent, functional whole.

CAM-Brain [Artificial Brains, The CAM-Brain Machine (CBM)]

An artificial brain is defined to be a collection of interconnected neural net modules (10,000-50,000 of them), each of which is evolved quickly in special electronic hardware, downloaded into a PC, and interconnected according to the designs of human BAs (brain architects). The neural signaling of the artificial brain (A-Brain) is performed by the PC in real time (defined to be 25Hz per neuron). Such artificial brains can be used for many purposes, e.g. controlling the behaviors of autonomous robots.

Neural networks are based on cellular automata, and are evolved using a Genetic Algorithm (GA) at electronic speeds using the latest in FPGAs (field programmable gate arrays) ... CA based neural circuits can be grown and evaluated totally in hardware in microseconds, making possible a complete run of a GA (i.e. tens of thousands of circuit growths and evaluations (fitness measurements)) in less than a second.
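
For readers unfamiliar with genetic algorithms, the grow-evaluate-select loop can be sketched in software. This is a generic GA, not the CBM's hardware procedure: the bit-string genome, the count-the-ones fitness function, and every parameter value are assumptions chosen to keep the example short. In CAM-Brain the genome would encode a neural circuit and the fitness would measure that circuit's behavior.

```python
import random


def fitness(bits):
    # Stand-in fitness measurement: the number of 1s in the genome.
    return sum(bits)


def evolve(pop_size=20, length=16, generations=30, mutation_rate=0.05, seed=0):
    """Run a small GA and return the best genome plus the best fitness per step."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0][:]]           # elitism: keep the best unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)      # one-point crossover
            child = mother[:cut] + father[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
print(history[0], history[-1])
```

Because the best genome is always copied into the next generation, the best fitness never decreases from one generation to the next.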

Up to 64,000 evolved neural net modules can be assembled into humanly designed artificial brain architectures, and each CA cell in the whole brain of millions of cells (stored in RAM) can be updated (using the CBM) thousands of times a second, which is easily fast enough for real time control of robots.

HTM [On Intelligence, Hierarchical Temporal Memory]

The brain uses vast amounts of memory to create a model of the world. Everything you know and have learned is stored in this model. The brain uses this memory-based model to make continuous predictions of future events. It is the ability to make predictions about the future that is the crux of intelligence.

Hierarchical Temporal Memory (HTM) is a technology that replicates the structural and algorithmic properties of the neocortex. HTM therefore offers the promise of building machines that approach or exceed human level performance for many cognitive tasks.

HTMs are organized as a tree-shaped hierarchy of nodes, where each node implements a common learning and memory function. HTMs store information throughout the hierarchy in a way that models the world. All objects in the world, be they cars, people, buildings, speech, or the flow of information across a computer network, have structure. This structure is hierarchical in both space and time. HTM memory is also hierarchical in both space and time, and therefore can efficiently capture and model the structure of the world.

A rough classification

The above AGI projects are roughly classified in the following table, according to the type of their answers to the previously listed 1st question (on research goal) and 3rd question (on technical path).

goal \ path   hybrid   integrated                    unified
principle     -        -                             AIXI, NARS, OSCAR
function      -        LIDA, Novamente, Polyscheme   Cog, SNePS, Soar
capability    -        -                             Cyc
behavior      -        -                             ACT-R
structure     -        -                             CAM-Brain, HTM

Since this classification is made at a high level, projects in the same entry of the table are still quite different in the details of their research goals and technical paths.