Soderland, S. and Lehnert, W. (1994)
"Wrap-Up: a Trainable Discourse Module for Information Extraction",
Journal of Artificial Intelligence Research, Volume 2, pages 131-158.
Abstract: The vast amounts of on-line text now available have led to
renewed interest in information extraction (IE) systems that analyze
unrestricted text, producing a structured representation of selected
information from the text. This paper presents a novel approach that
uses machine learning to acquire knowledge for some of the higher
level IE processing. Wrap-Up is a trainable IE discourse component
that makes intersentential inferences and identifies logical relations
among information extracted from the text. Previous corpus-based
approaches were limited to lower level processing such as
part-of-speech tagging, lexical disambiguation, and dictionary
construction. Wrap-Up is fully trainable, and not only automatically
decides what classifiers are needed, but even derives the feature set
for each classifier automatically. Performance equals that of a
partially trainable discourse module requiring manual customization
for each domain.