CSE 473 Autumn 1998, Copyright, S. Tanimoto, Univ. of Washington 
Introduction to Artificial Intelligence (Nov 23, 1998)

"Introduction to Natural Language Understanding"

Motivation:
 

1. More natural human/computer interfaces.
 "Find me a good deal on a new bicycle at a dealer near my home."

2. Enable computers to learn from books and audio.
  "Columbus sailed the ocean blue in the year 1492."

3. Intelligent aids to communication.
  "Merci beaucoup." -->  "Thank you very much."

4. Content analysis: computer programs that summarize documents and make notes of significant points.

5. Intelligent search for information on the web.
 
 
 

Stages in NLU

Levels of analysis for NLU:

Signal
Phoneme
Lexical
Syntax
Semantic
Speech Act (Pragmatic)
   Also: Dialog

Speech understanding deals with all levels, including signal and phoneme levels.

Text understanding begins at the lexical level.
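
As a rough illustration of the lexical level, the sketch below splits raw text into word and punctuation tokens. The function name tokenize and the regular expression are assumptions made for this example; a real lexical analyzer would also handle abbreviations, hyphenation, and morphology.

```python
import re

def tokenize(text):
    """Split raw text into lowercase word tokens and single punctuation tokens."""
    # Words are runs of letters/digits (optionally with an apostrophe suffix);
    # any other non-space character becomes its own token.
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?|[^\sa-z0-9]", text.lower())

tokens = tokenize("Columbus sailed the ocean blue in the year 1492.")
print(tokens)
```

The resulting token list is what a syntax-level component would take as input.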
 
 

Aspects of Communication with Language
 

Language is one-dimensional (linear), but is used to describe multidimensional situations and events.

People seek economy of expression, often at the expense of clarity: concise utterances are frequently ambiguous.  In fact, ambiguity is a pervasive aspect of NLU.

Ambiguity is often resolved using "context" -- knowledge about the situation.
 

Syntax

A language (from a formal syntactic point of view) is a set of strings over a given finite alphabet.
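
To make the definition concrete, here is a minimal sketch that treats a language as a set of strings over a finite alphabet. The alphabet {a, b} and the example language { a^n b^n : n >= 1 } are chosen only for illustration; they are not taken from the course material.

```python
from itertools import product

ALPHABET = ("a", "b")  # assumed example alphabet

def in_language(s):
    """Membership test for the example language { a^n b^n : n >= 1 }."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n

def strings_up_to(max_len):
    """Enumerate every string over ALPHABET of length 1..max_len."""
    for length in range(1, max_len + 1):
        for chars in product(ALPHABET, repeat=length):
            yield "".join(chars)

# The language is the subset of all strings that passes the membership test.
members = [s for s in strings_up_to(4) if in_language(s)]
print(members)  # ['ab', 'aabb']
```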

In order to map strings into meanings, there must be a way to group the elements of a string into units and phrases.  This is usually done by means of grammars -- sets of rewrite rules (productions), each of which replaces a symbol or substring by another substring.

The Chomsky hierarchy gives a well-known taxonomy for classes of languages described by grammars.

The process of mapping a sentence in a language into a syntactic description according to a grammar is called parsing.
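
The sketch below shows one simple way parsing can be carried out: a top-down, backtracking parser for a toy context-free grammar. The grammar, the lexicon, and the function names are all assumptions made for this example, and the approach only works for grammars without left recursion.

```python
# Toy grammar: each nonterminal maps to a list of right-hand sides.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Name"], ["Det", "N"]],
    "VP": [["V", "NP"]],
}
# Toy lexicon: each word maps to one lexical category.
LEXICON = {"columbus": "Name", "sailed": "V", "the": "Det", "ocean": "N"}

def parse(symbol, tokens, start):
    """Yield (tree, next_position) for each way symbol derives a prefix of tokens[start:]."""
    if symbol not in GRAMMAR:  # lexical category: match a single word
        if start < len(tokens) and LEXICON.get(tokens[start]) == symbol:
            yield (symbol, tokens[start]), start + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(symbol, rhs, tokens, start, [])

def expand(symbol, rhs, tokens, pos, children):
    """Match the remaining right-hand-side symbols one at a time, left to right."""
    if not rhs:
        yield (symbol,) + tuple(children), pos
        return
    for child, nxt in parse(rhs[0], tokens, pos):
        yield from expand(symbol, rhs[1:], tokens, nxt, children + [child])

def parse_sentence(tokens):
    """Return every parse tree for S that covers the whole token list."""
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]

trees = parse_sentence(["columbus", "sailed", "the", "ocean"])
print(trees)
```

Each tree is a nested tuple whose first element is the category name. An ambiguous sentence would produce more than one tree; a sentence outside the grammar produces an empty list.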

The most important classes of languages for NLU are context-free and context-sensitive.  Building grammars that can handle English in a general way is a major challenge.   Typical "industrial" grammars have hundreds of production rules.  Boeing uses a large parser to help improve the technical writing in airplane user manuals.

Because methods for syntax analysis tend to be simpler than those for semantic analysis, the fuzzy line between syntax and semantics has sometimes been pushed towards the semantics side in order to perform more of the processing with parsing technology.  For example, "semantic grammars" use standard syntactic mechanisms to process phrases into semantically specific categories.
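
A minimal sketch of the semantic-grammar idea follows. A single production is written with semantically specific categories (PRODUCT, CONDITION, PLACE) in place of general ones such as noun or adjective, so that matching the rule directly fills in a meaning frame. The rule, the category names, and the word lists are hypothetical and serve only to illustrate the technique.

```python
# Semantically specific categories and the words that belong to each (assumed).
SEMANTIC_LEXICON = {
    "PRODUCT": {"bicycle", "car", "computer"},
    "CONDITION": {"new", "used"},
    "PLACE": {"home", "work"},
}

# One production for a shopping request; uppercase symbols are categories.
RULE = "find me a good deal on a CONDITION PRODUCT at a dealer near my PLACE".split()

def parse_request(sentence):
    """Match the sentence against RULE; return a frame of filled slots, or None."""
    tokens = sentence.lower().strip(".").split()
    if len(tokens) != len(RULE):
        return None
    frame = {}
    for expected, token in zip(RULE, tokens):
        if expected in SEMANTIC_LEXICON:
            # Category symbol: the word must belong to that semantic category.
            if token not in SEMANTIC_LEXICON[expected]:
                return None
            frame[expected] = token
        elif expected != token:
            # Literal word in the rule: must match exactly.
            return None
    return frame

frame = parse_request("Find me a good deal on a new bicycle at a dealer near my home.")
print(frame)  # {'CONDITION': 'new', 'PRODUCT': 'bicycle', 'PLACE': 'home'}
```

The matching machinery is purely syntactic, but because the categories are domain-specific, the output is already a semantic description of the request.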
 
 
 
 


 
 

Last modified: November 23, 1998

Steve Tanimoto

tanimoto@cs.washington.edu