Efficient Content-Based Image Retrieval
PI: Linda G. Shapiro
Department of Computer Science and Engineering
University of Washington
Contact Information
Linda G. Shapiro
Department of Computer Science and Engineering
University of Washington
PO Box 352350
Seattle, WA 98195-2350
Phone: (206) 543-2196
Fax: (206) 543-2969
Email: shapiro@cs.washington.edu
WWW PAGE
Project URL
List of Supported Students
- Andrew Berman
- Yu-Yu Chou
- Yi Li
Project Award Information
IRI-9711771
09/15/1997 -- 08/31/2000
Efficient Content-Based Image Retrieval
Keywords
content-based image retrieval, image database, image indexing, image matching,
distance measures
Project Summary
The focus of our work is the development of a general, scalable architecture
to support fast querying of very large image databases with user-specified
distance measures. We have developed algorithms and data structures for
efficient image retrieval from large databases with multiple distance measures.
We are investigating methods for merging our general,
distance-measure-independent method with other useful techniques that may be
distance-measure-specific, such as keyword retrieval and relational indexing.
We are developing both new methods for combining
distance measures and a framework in which users can specify their queries
without detailed knowledge of the underlying metrics. We have built a prototype
system to test our methods and evaluated it on both a large general image
database and a smaller controlled database.
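As an illustration of what combining distance measures can look like, the
following Python sketch forms a weighted sum of two simple measures over
feature vectors. It is a minimal example under stated assumptions, not the
method used in our prototype: the function names (l1, linf, combine), the
feature vectors, and the weights are all hypothetical.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def l1(a: Vector, b: Vector) -> float:
    """City-block distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def linf(a: Vector, b: Vector) -> float:
    """Largest per-coordinate difference between two feature vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def combine(measures: List[Distance], weights: List[float]) -> Distance:
    """Build a new measure as a non-negative weighted sum of base measures."""
    if len(measures) != len(weights):
        raise ValueError("one weight is required per measure")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    def combined(a: Vector, b: Vector) -> float:
        return sum(w * d(a, b) for d, w in zip(measures, weights))

    return combined


# Hypothetical query: weight the two measures equally.
blended = combine([l1, linf], [0.5, 0.5])
print(blended([0.0, 1.0], [1.0, 3.0]))  # 0.5 * 3.0 + 0.5 * 2.0 = 2.5
```

A weighted sum with non-negative weights preserves the triangle inequality
whenever each component measure satisfies it, which is why this form is a
convenient one to illustrate.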
Publications and Products
A. Berman and L. G. Shapiro. "Efficient image retrieval with multiple
distance measures." Proceedings of the SPIE Conference on Storage and
Retrieval for Image and Video Databases, February, 1997.
A. P. Berman and L. G. Shapiro. "Selecting good keys for triangle-inequality-based
pruning algorithms." IEEE International Workshop on Content-Based Access
of Image and Video Databases, January 1998.
A. P. Berman and L. G. Shapiro. "A Flexible Image Database System for Content-Based Retrieval."
17th International Conference on Pattern Recognition, August, 1998.
A. P. Berman and L. G. Shapiro. "Triangle-Inequality-Based Pruning Algorithms
with Triangle Tries." Proceedings of the SPIE Conference on Storage and
Retrieval for Image and Video Databases, January, 1999.
A. P. Berman and L. G. Shapiro, "A Flexible Image Database System
for Content-Based Retrieval," Computer Vision
and Image Understanding, Vol. 75, Nos. 1-2, 1999, pp. 175-195.
A. P. Berman and L. G. Shapiro, "Efficient Content-Based Retrieval:
Experimental Results," Proceedings of the IEEE Workshop on
Content-Based Access of Image and Video Databases, June 1999, pp. 55-61.
Demo: Flexible Image Database System Using the Groundtruth Database
Groundtruth Database
Goals, Objectives, and Targeted Activities
In the first two years of the project, we developed
a prototype system, FIDS, with which the user can conduct
searches using complex combinations of several dozen distance measures.
We incorporated a new set of data structures and algorithms into the
latest version of FIDS, which has led to a marked speedup of the elimination
phase of the search in large image databases.
Currently, FIDS contains over 37,000 images. A complex search through
the database can be performed in just a few seconds.
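The elimination phase can be pictured with a small sketch of
triangle-inequality-based pruning, the general idea named in the titles of
the publications listed earlier. Distances from every database image to a few
key images are computed ahead of time; at query time, the query is compared
to the keys only, and any image whose lower bound already exceeds the search
threshold is discarded without a full comparison. This is a minimal sketch
under stated assumptions, not the FIDS implementation: the class KeyIndex,
its methods, the L1 measure, and the sample data are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def l1(a: Vector, b: Vector) -> float:
    """City-block distance; a metric, so the triangle inequality holds."""
    return sum(abs(x - y) for x, y in zip(a, b))


class KeyIndex:
    """Hypothetical index storing the distance from every image to each key."""

    def __init__(self, images: Dict[str, Vector], keys: List[Vector],
                 dist: Distance) -> None:
        if not keys:
            raise ValueError("at least one key is required")
        self.images = images
        self.keys = keys
        self.dist = dist
        # Precomputed offline: one row of key distances per database image.
        self.table = {name: [dist(vec, k) for k in keys]
                      for name, vec in images.items()}
        self.full_comparisons = 0  # true distance computations in last query

    def query(self, q: Vector, threshold: float) -> List[Tuple[str, float]]:
        """Return (name, distance) for images within threshold, nearest first."""
        q_to_keys = [self.dist(q, k) for k in self.keys]
        self.full_comparisons = 0
        hits = []
        for name, row in self.table.items():
            # |d(q,k) - d(x,k)| <= d(q,x) for every key k, so the largest
            # such difference is a lower bound on the true distance.
            lower_bound = max(abs(dq - dx) for dq, dx in zip(q_to_keys, row))
            if lower_bound > threshold:
                continue  # eliminated without computing d(q, x)
            self.full_comparisons += 1
            d = self.dist(q, self.images[name])
            if d <= threshold:
                hits.append((name, d))
        return sorted(hits, key=lambda hit: hit[1])


images = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [10.0, 10.0], "d": [9.0, 11.0]}
index = KeyIndex(images, keys=[[0.0, 0.0], [10.0, 10.0]], dist=l1)
results = index.query([0.5, 0.0], threshold=1.0)
print(results)                 # images "a" and "b" are returned
print(index.full_comparisons)  # 2 of the 4 images needed a full comparison
```

The bound is only valid when the distance measure satisfies the triangle
inequality. The same bookkeeping extends to a non-negative weighted sum of
several measures, because the weighted sum of the per-measure lower bounds is
itself a lower bound on the combined distance.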
Furthermore, we have created a ground-truth
database that currently contains 11 datasets of about 48 images each,
plus an ASCII file for each dataset that lists the names of the objects
that appear in each image. Our web demo system uses this groundtruth
database, to which both we and other researchers are adding data and
associated descriptions. This database is meant for use in classification
and object recognition retrieval experiments.
Our goals for the current year are related to two new aspects of
the work:
- keyword indexing
- relational indexing
Since manual keyword indexing is very tedious and relational indexing
requires image analysis to find regions of interest, we are concentrating
on automatic segmentation and object recognition techniques. We are
developing techniques for recognition of such objects as vehicles, boats,
office buildings, and houses. The goal is to produce a large set of very
different features, ranging from standard color and texture features to
regions from different kinds of segmentation and linear features such as line
segments and circles. The features and relationships among them will
be combined into a large feature vector that will be the input to a
new hierarchical learning technique that was the subject of Yu-Yu
Chou's 1999 dissertation at the University of Washington.
Project Impact and Output
The results we have obtained will be used to make image retrieval
faster, easier, and more useful. The ideas should translate to use
in complex multimedia information systems, making them useful to
the scientific community as a whole, laymen with scientific interests,
and students at all levels.
This project has supported the Ph.D. research of Andrew Berman, who
received his degree in March 1999, and has partially supported the work
of Yu-Yu Chou, who received his Ph.D. in December 1999. Three undergraduate
students, Eva Brezin, Kent Schliter, and Marsha Eng,
participated in the project during the summers.
Yi Li, a second-year CSE graduate student, is now supported by the project.
Meanwhile, Andrew Berman has founded his own company, QueryPlus, in New Jersey
and is working on commercial applications of the funded work. Yu-Yu
Chou has just joined Numeritech in San Jose as their senior software
engineer.
Area Background
The area of content-based image retrieval is a hybrid research area that
requires knowledge of both computer vision and database systems. Large
image databases are being collected, and images from these collections
made available to users in advertising, marketing, entertainment, and other
areas where images can be used to enhance the product. These images are
generally organized loosely by category, such as animals, natural scenes,
people, and so on. Access to them depends on a user being willing to browse
large collections in order to select appropriate images.
Researchers in computer vision and computer graphics have developed
image distance measures that can compare a sample image or sketch provided
by a user to the images in the database and retrieve those that are judged
similar by the measure being used. Commercial systems like QBIC and Virage
utilize measures that are based on low-level attributes of the image itself,
including color histograms, color composition, and texture. State-of-the-art
research focuses on more powerful measures that can find regions of an
image corresponding to known objects that users wish to retrieve. There
has been some success in finding human faces of different selected sizes,
human bodies, horses, zebras and other textured animals with known patterns,
and such backgrounds as jungles, water, and sky.
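To make the notion of a low-level image distance measure concrete, the
following Python sketch builds a coarse color histogram from a list of RGB
pixels and compares two histograms with an L1 distance. It is a generic
illustration, not the measure used by QBIC, Virage, or our own system: the
function names, the choice of four bins per channel, and the tiny synthetic
"images" are assumptions made for the example.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in 0..255


def color_histogram(pixels: Iterable[Pixel], bins: int = 4) -> List[float]:
    """Return a normalized histogram with bins**3 color cells."""
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(ri * bins + gi) * bins + bi] += 1.0
        count += 1
    if count == 0:
        raise ValueError("an image must contain at least one pixel")
    return [h / count for h in hist]


def histogram_l1(h1: List[float], h2: List[float]) -> float:
    """L1 distance between normalized histograms; ranges from 0.0 to 2.0."""
    if len(h1) != len(h2):
        raise ValueError("histograms must use the same number of bins")
    return sum(abs(a - b) for a, b in zip(h1, h2))


# Three synthetic four-pixel "images".
red = [(255, 0, 0)] * 4
blue = [(0, 0, 255)] * 4
mixed = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2

h_red, h_blue, h_mixed = map(color_histogram, (red, blue, mixed))
print(histogram_l1(h_red, h_blue))   # 2.0: no colors in common
print(histogram_l1(h_red, h_mixed))  # 1.0: half of the pixels agree
```

A measure like this ignores where colors appear in the image, which is one
reason such low-level attributes are limited for finding specific objects.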
Standard database systems, whether they be relational or object-oriented,
depend heavily on the ability to index the data according to keywords or
key phrases that are stored in the data. While images can be retrieved
in this way, doing so requires human classifiers to look at each image and select
a suitable set of keywords. Even if this could be done for millions of
images, it would be insufficient, as the keywords would only be one person's
ideas of the concepts in that image. Instead, both keywords and a large
and powerful set of image distance measures are needed.
Area References
J. Barros, J. French, W. Martin, P. Kelley, and M. Cannon, "Using the
triangle inequality to reduce the number of comparisons required for similarity-based
retrieval," IS&T/SPIE Storage and Retrieval for Still Image and
Video Databases, Volume IV, (1996).
D. A. Forsyth, J. Malik, M. M. Fleck, H. Greenspan, T. Leung, S. Belongie,
C. Carson, and C. Bregler, "Finding pictures of objects in large collections
of images," Proceedings of the 2nd International Workshop on Object
Representation in Computer Vision, (1996).
T. Kato, T. Kurita, N. Otsu, K. Hirata, "A sketch retrieval method for
a full color image database," 11th International Conference on Pattern
Recognition, pp 530-533, (1992).
A. Del Bimbo, M. Campanai, P. Nesi, "3D visual query language for image
databases," Journal of Visual Languages and Computing, Vol 3, (1992).
A. Gupta, "Visual information retrieval: a Virage perspective," white
paper available on the World Wide Web,
http://www.virage.com/literature/wpaper.html, (1995).
M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M.
Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, P. Yanker, "Query by
image and video content: the QBIC system," IEEE Computer, Volume 28,
number 9, pp 23-32, (1995).
A. Pentland, R. W. Picard, S. Sclaroff, "Photobook: tools for content-based
manipulation of image databases," Technical Report, Volume 255,
MIT Media Lab, (1993).
R. K. Srihari, "Automatic indexing and content-based retrieval of captioned
images," IEEE Computer, Volume 28, number 9, pp 49-56, (1995).