Layered Language Understander:

Generic Abductive Inference for Language Processing


by

Kevin Lenzo +

John R. Josephson +*

Christopher Bailey-Kellogg *


June 1995


Abstract

This report describes the design and current implementation of LLU (Layered Language Understander). LLU is application-building software (a "shell") for natural language processing using abduction, but designed to be independent of any particular language or level of processing (syllables, words, sentences, etc.). It is a specialization of a generic abduction mechanism with support for layered abduction and for handling temporally-bounded hypotheses.


+ ATR Interpreting Telecommunications Research Laboratories, Kyoto, Japan.

* Laboratory for Artificial Intelligence Research, Department of Computer and Information Science, The Ohio State University, Columbus, Ohio

Introduction

Abduction, or inference to the best explanation, is a pervasive phenomenon. It appears in medical diagnosis, legal reasoning, perception, and hypothesis testing, to mention just a few areas. The first use of the word "abduction" is generally ascribed to the American philosopher Charles Sanders Peirce.

The general pattern of reasoning has been described as follows:

D is a collection of data (facts, observations, givens).

H explains D (would, if true, explain D).

No other hypothesis can explain D as well as H does.

_______________________________________________

Therefore, H is probably true. (Josephson & Josephson, 1994)

While this description covers a lot of territory, our work focuses on implementing a generic machine that is procedurally abductive, and is applicable to a variety of domains and knowledge sources. In particular, we are currently focusing on LLU, or Layered Language Understander, which is intended to be a generic, layered abduction framework for language processing.

LLU is implemented in C++, and is the sixth in a series of abduction machines (see Josephson & Josephson, ch. 9, 10). It is specifically targeted towards speech recognition and language understanding, and contains vestiges of the previous machines, as well as specific improvements. The first machines were designed for diagnosis, and LLU is a direct descendant of that legacy.

The current implementation runs on a Power Macintosh 8100/80AV, and is compatible with MetroWerks CodeWarrior 6.0. The core code is ANSI compatible, and so is largely portable to other systems.

Abduction

While the general form was given above, abduction itself bears closer examination. The term "best explanation," in particular, needs exploring. We take it that the "best" explanation is one that is parsimonious, confident, and consistent: parsimonious, in the sense that the data is explained with as few hypotheses as possible; confident, in that the component parts surpass their competitors to a convincing degree; and consistent, in that the accepted parts of a composite explanation do not contradict each other.

During processing, hypotheses go through a cycle of evocation, instantiation, and composition.

Evocation generally occurs bottom-up, as a hypothesis becomes "stimulated" for consideration; however, a hypothesis may also be evoked from above, as an expectation. In LLU, the bottom-up evocation is done using triggers, while the top-down evocation is performed using expectations.
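As an illustration only, the two directions of evocation can be sketched as a pair of lookup tables. The type EvocationTables, the function evoke, and the use of string labels are assumptions made for this sketch; they are not taken from the LLU sources.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical evocation tables; names are illustrative, not LLU's API.
struct EvocationTables {
    // Bottom-up: a finding label triggers candidate hypotheses.
    std::multimap<std::string, std::string> triggers;
    // Top-down: an accepted hypothesis raises expectations for others.
    std::multimap<std::string, std::string> expectations;
};

// Collect every hypothesis evoked by the findings (bottom-up) and by the
// already accepted hypotheses (top-down). The set removes duplicates.
inline std::set<std::string> evoke(const EvocationTables& t,
                                   const std::vector<std::string>& findings,
                                   const std::vector<std::string>& accepted) {
    std::set<std::string> evoked;
    for (const auto& f : findings) {
        auto range = t.triggers.equal_range(f);
        for (auto it = range.first; it != range.second; ++it)
            evoked.insert(it->second);
    }
    for (const auto& a : accepted) {
        auto range = t.expectations.equal_range(a);
        for (auto it = range.first; it != range.second; ++it)
            evoked.insert(it->second);
    }
    return evoked;
}
```

A hypothesis evoked from both directions appears only once in the result; how such double evocation should affect its score is left open here.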

Instantiation is the process by which a hypothesis gains an initial confidence score and determines how much of the data it can account for or cover. It combines a priori probabilities with how well the data "fits" the hypothesis. At instantiation time, no consideration has yet been given to rival hypotheses that may offer to explain or account for some or all of the data covered by the particular hypothesis; it is a logically parallel act.
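A minimal sketch of instantiation follows. It assumes that confidence is the product of the prior and a fit score in [0, 1], and that coverage is the overlap between what a hypothesis could explain and what was observed. Neither rule is stated in this report, and the names Hypothesis, canExplain, and instantiate are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Hypothetical record; field names are illustrative, not LLU's.
struct Hypothesis {
    std::string name;
    double prior;              // a priori plausibility in [0, 1]
    std::set<int> canExplain;  // finding ids this hypothesis could account for
    std::set<int> covers;      // filled in by instantiate()
    double confidence;         // filled in by instantiate()
};

// Assumed combination rule: confidence = prior * fit.
// The report says only that the two are combined, not how.
inline void instantiate(Hypothesis& h, const std::set<int>& observed, double fit) {
    assert(fit >= 0.0 && fit <= 1.0);
    h.covers.clear();
    // Coverage: the observed findings this hypothesis offers to explain.
    std::set_intersection(h.canExplain.begin(), h.canExplain.end(),
                          observed.begin(), observed.end(),
                          std::inserter(h.covers, h.covers.begin()));
    h.confidence = h.prior * fit;
}
```

Because instantiate looks at one hypothesis and the data only, and never at rival hypotheses, calls for different hypotheses are independent of each other, which is what makes the step logically parallel.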

Composition is the phase in which hypothesis interactions come into play, and, under good conditions, a coherent, "best" explanation emerges. LLU uses a least-commitment strategy: it exploits essential hypotheses and makes easy decisions first, then propagates the results of acceptance. Initial confidence scores, set at instantiation, are modified as the abducer tentatively accepts the essentials and clear-best explainers, and the interactions between hypotheses are leveraged to update the confidence scores dynamically. This is discussed in more detail later.

Note that abduction, as a logical form, encompasses many hypothesis relation topologies that already exist, such as neural networks (NNs), rule-based resolution systems, hidden Markov models (HMMs), and directed acyclic graphs. Each of these could be a particular instantiation of the truly generic abduction machine.

Figure 1. Neural Net-type topology

In the neural-net type topology (see Figure 1), the abduction machine can implement a feed-forward net by constraining the hypothesis relationships to be those of the appropriate network; i.e., findings (shown as F1 through F4) stimulate covering hypotheses ("nodes"), and the covering nodes may have damping relations toward each other. However, transition probabilities are not considered.

Figure 2. HMM-type topology

In the HMM-type topology (see Figure 2), the hypothesis under examination (here, H2) depends on the transition probabilities of temporally adjacent hypotheses. Of course, for the topology to actually be an HMM, the required statistics must be initialized and then updated through learning.

In our work, an "abducer" is a software agent that manages problem solving within a defined hypothesis space, known as an agora. An agora may be considered a "marketplace of ideas," where hypotheses gather and contend for the data to be explained. The collection of abducers is, in turn, managed by an agent called an "abducer manager."
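The relationship among these three notions might be sketched as below. The class names, member functions, and the choice of one abducer per processing level are assumptions made for illustration; they do not reproduce the LLU class hierarchy.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical skeleton; every name here is an assumption.
struct Hypothesis {
    std::string name;
    double confidence;
};

// An agora: the hypothesis space in which rivals contend for the data.
struct Agora {
    std::string level;                   // e.g. "syllable" or "word"
    std::vector<Hypothesis> hypotheses;
};

// An abducer manages problem solving inside exactly one agora.
class Abducer {
public:
    explicit Abducer(std::string level) : agora_{std::move(level), {}} {}
    void propose(const std::string& name, double confidence) {
        agora_.hypotheses.push_back({name, confidence});
    }
    const Agora& agora() const { return agora_; }
private:
    Agora agora_;
};

// The abducer manager owns the abducers (assumed here: one per level).
class AbducerManager {
public:
    // A deque keeps earlier references valid when new layers are appended.
    Abducer& addLayer(const std::string& level) {
        abducers_.emplace_back(level);
        return abducers_.back();
    }
    std::size_t layerCount() const { return abducers_.size(); }
private:
    std::deque<Abducer> abducers_;
};
```

For example, a manager could hold one abducer for a "syllable" agora and another for a "word" agora, with each abducer receiving proposals only for its own level.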

The EFLI strategy

The Essentials-First Leveraging-Incompatibility strategy, or EFLI, is used for control of hypothesis assembly (Josephson & Josephson, 1994, ch. 9). EFLI may be briefly described as:

* Find the data with the lowest ambiguity. This may be either a datum with only one possible explainer (an "essential hypothesis"), or a datum with one explanation that is much better than all others.

* Accept the best explanation for each low-ambiguity data point. That is, make a local, confident abduction.

* Propagate the consequences of acceptance by using known hypothesis relationships. This includes the rejection of incompatible hypotheses which compete with accepted hypotheses, and rescoring those that have expectations towards the accepted hypothesis.

* If necessary, lower the standards for acceptance and continue. (Lowering the standard for acceptance means accepting hypotheses that are still the best explanations, but that surpass their competitors by a narrower margin.)
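The four steps above can be sketched as a single loop. This is a simplified illustration under stated assumptions, not the LLU implementation: the Hypothesis record, the efli function, and the list of decreasing margins are hypothetical, confidence scores are assumed to be non-negative, a sole remaining explainer is treated as having an infinite margin, and the rescoring of hypotheses with expectations toward an accepted hypothesis is omitted.

```cpp
#include <cstddef>
#include <limits>
#include <set>
#include <string>
#include <vector>

enum class Status { Open, Accepted, Rejected };

// Hypothetical record; field names are illustrative, not LLU's.
struct Hypothesis {
    std::string name;
    double confidence;             // non-negative score from instantiation
    std::set<std::size_t> covers;  // indices of findings it would explain
    std::set<std::size_t> rivals;  // indices of incompatible hypotheses
    Status status;
};

// Returns accepted hypothesis indices in order of acceptance.
// `margins` lists the required lead over the runner-up, strictest first.
inline std::vector<std::size_t> efli(std::vector<Hypothesis>& hyps,
                                     std::size_t findingCount,
                                     const std::vector<double>& margins) {
    const std::size_t none = hyps.size();
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<bool> explained(findingCount, false);
    std::vector<std::size_t> accepted;
    std::size_t level = 0;

    while (level < margins.size()) {
        // Step 1: find the unexplained finding with the lowest ambiguity.
        double bestMargin = -1.0;
        std::size_t bestHyp = none;
        for (std::size_t f = 0; f < findingCount; ++f) {
            if (explained[f]) continue;
            std::size_t top = none;
            double first = -1.0, second = -1.0;
            for (std::size_t h = 0; h < hyps.size(); ++h) {
                const Hypothesis& c = hyps[h];
                if (c.status != Status::Open || c.covers.count(f) == 0) continue;
                if (c.confidence > first) {
                    second = first; first = c.confidence; top = h;
                } else if (c.confidence > second) {
                    second = c.confidence;
                }
            }
            if (top == none) continue;  // nothing offers to explain f
            // A sole explainer is essential: treat its margin as infinite.
            const double margin = (second < 0.0) ? inf : first - second;
            if (margin > bestMargin) { bestMargin = margin; bestHyp = top; }
        }
        if (bestHyp == none) break;  // all findings explained or unexplainable
        // Step 4: nothing is confident enough, so lower the standard.
        if (bestMargin < margins[level]) { ++level; continue; }
        // Step 2: accept the best explanation of the least ambiguous finding.
        hyps[bestHyp].status = Status::Accepted;
        accepted.push_back(bestHyp);
        for (std::size_t f : hyps[bestHyp].covers)
            if (f < findingCount) explained[f] = true;
        // Step 3: propagate by rejecting incompatible rivals.
        for (std::size_t r : hyps[bestHyp].rivals)
            if (r < hyps.size() && hyps[r].status == Status::Open)
                hyps[r].status = Status::Rejected;
    }
    return accepted;
}
```

Suppose hypothesis A is the only explainer of one finding and is incompatible with B, while B and C both offer to explain another finding. A is accepted first as an essential, B is then rejected, and C becomes the sole remaining explainer of its finding even though B originally scored higher. This is the sense in which incompatibility is leveraged.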

Please refer to ATR ITL Tech Report TR-IT-0075, "Generic Software for Language Understanding: a design based on layered abduction," for a more complete discussion of the LLU design.

Acknowledgments

This work was supported mostly by ATR Interpreting Telecommunications Research Laboratories.

Bibliography

Fujimura, O. (1994). Syllable Timing Computation in the C/D Model. In Proceedings of the 3rd International Conference on Spoken Language Processing, Yokohama, Japan. Forthcoming.

Josephson, J. R., & Josephson, S. G. (Eds.). (1994). Abductive Inference: Computation, Philosophy, Technology. New York: Cambridge University Press.

Josephson, J. R., Smetters, D., Fox, R., Oblinger, D., Welch, A., & Northrup, G. (1989). The Integrated Generic Task Toolset, Fafner Release 1.0. Technical Report, The Ohio State University, Laboratory for Artificial Intelligence Research.

Tanner, M. C., Fox, R., Josephson, J. R., & Goel, A. K. (1991). On a Strategy for Abductive Assembly. In Notes from the AAAI Workshop on Domain-Independent Strategies for Abduction, (pp. 80-83). Anaheim. (Josephson & Dasigi, 1991).

