Efficient Content-Based Image Retrieval
Linda G. Shapiro and Andrew P. Berman
Department of Computer Science and Engineering
University of Washington
Contact Information
Linda G. Shapiro
Department of Computer Science and Engineering
University of Washington
PO Box 352350
Seattle, WA 98195-2350
Phone: (206) 543-2196
Fax : (206) 543-2969
Email: shapiro@cs.washington.edu
Keywords
content-based image retrieval, image database, image indexing, image matching,
distance measures
Project Award Information
-
IRI-9711771
-
Efficient Content-Based Image Retrieval
-
Three years
-
September 15, 1997 -- August 31, 2000
Project Summary
The focus of our work is the development of a general, scalable architecture
to support fast querying of very large image databases with user-specified
distance measures. We will develop algorithms and data structures for efficient
image retrieval from large databases with multiple distance measures.
We will also investigate methods for merging our general, distance-measure-independent
method with other useful techniques that may be distance measure specific,
such as keyword retrieval and relational indexing. Finally, we will look
at the problem of providing users with multiple distance measures
of many different varieties. We will develop both new methods for combining
distance measures and a language in which users can specify their queries
without detailed knowledge of the underlying metrics. We will build a prototype
system to test our methods and evaluate it on both a large general image
database and a smaller controlled database.
List of Supported Students
Goals, Objectives, and Targeted Activities
First Year Goals:
-
Develop initial search methods that can handle multiple distance measures
and multiple keys -- Done.
-
Design and test methods for selecting a good set of keys with which to
prune the database during searches -- Done.
-
Design and implement suitable data structures for efficient manipulation
of the stored key-image distances in order to speed up the elimination
phase of the search. The elimination phase should be sublinear in
the number of images; that is, for a database of size n, we would
like even the small intermediate computations to be performed on n^e
images, where e < 1. Minimizing e is our goal. -- Done.
-
Collect a database of ground-truth images with known classifications for
use in testing the developed structures and methods. -- Done.
-
Perform initial testing on the ground-truth database and on our larger
general collection of 1800 images, which was obtained from the internet. -- Done.
-
Two conference papers presented.
Second Year Goals:
-
Develop methods for merging the distance-measure-based techniques
with keyword search.
-
Develop methods for mergine the distance-measure-based techniques with
relational indexing techniques.
-
Design methods by which users can create distance measures as combinations
of primitive metrics and keywords and relational indexing.
-
Design and implement an interface for users to develop their queries.
We have developed a prototype interface and are in the process of
integrating it into the system.
-
Begin construction of a much larger test database of at least 100,000
images for a system evaluation. 37,000 images have been placed in the
database so far.
-
Two conference papers accepted, one journal paper submitted,
one dissertation in final stages.
Indication of Success
We have met our first-year goals and are on schedule to meet our
second-year goals. We have developed
a prototype system, FIDS, with which the user can conduct
searches using complex combinations of several dozen distance measures.
We incorporated a new set of data structures and algorithms into the
latest version of FIDS that has led to a marked speed up of the elimination
phase of the search in large image databases.
Currently, FIDS contains over 37,000 images. A complex search through
the database can be performed in just a few seconds.
We have a designed and begun implementation of a prototype interface to
enable the non-expert user to query the database in an expressive manner.
We have collected several hundred images for the ground-truth
database.
We have two new conference papers, one
published in the 1998 International Conference
on Pattern Recognition describing the overall system, and one
published in the 1999 SPIE workshop on
Content-Based Image and Video Retrieval
on performance evaluation. We have submitted a journal
paper to Computer Vision and Image Understanding.
Our long range goal is to facilitate the creation of easy-to-use database
systems consisting of very large numbers of images. Image databases
are growing both in size and number. A database of one million high-quality
digitized photographs can use tens of terabytes of storage.
Searching through such a database can be prohibitively time-consuming.
There is thus a definite need for the types of services we are developing.
Project Impact and Output
The project has been in place for a year and a half. It has facilitated
the PhD candidacy of Andrew Berman, who is expected to received his
degree sometime in February. Last summer, two undergraduate
students, Eva Brezin from Columbia University, and Marsha Eng from the
University of Washington participated in the project. This past
summer, University of Washington undergraduate Kent Schliter participated
in the project. Currently, two new graduate students, Yi Li and Zhenrong Qian,
are participating in the project.
A new interdisciplinary KDI proposal related to this research is
being prepared.
Project References
A. P. Berman. "A new data structure for fast approximate matching." Technical
Report 1994-03-02. Department of Computer Science, University of Washington.
March, 1994.
A. Berman and L. G. Shapiro. "Efficient image retrieval with multiple
distance measures." Proceedings of the SPIE Conference on Storage and
Retrieval for Image and Video Databases. February, 1997.
A. P. Berman and L. G. Shapiro. "Selecting good keys for triangle-inequality-based
pruning algorithms." IEEE International Workshop on Content-Based Access
of Image and Video Databases, January 1998..
A. P. Berman and L. G. Shapiro. "A Flexible Image Database System for Content-Based Retrieval."
17th International Conference on Pattern RecognitionAugust, 1998
A. P. Berman and L. G. Shapiro. "Triangle-Inequality-Based Pruning Algorithms
with Triangle Tries."Proceedings of the SPIE Conference on Storage and Retrieval for Image and Video Databases. January, 1999.
GPRA Outcome Goals
The results we have obtained will be used to make
image retrieval faster and easier for people in all walks of life.
Furthermore, the ideas should translate to use in
complex multimedia information systems, making them useful to
the scientific community as a whole, to laymen with scientific interests,
and to students at all levels.
Area Background
The area of content-based image retrieval is a hybrid research area that
requires knowledge of both computer vision and of database systems. Large
image databases are being collected, and images from these collections
made available to users in advertising, marketing, entertainment, and other
areas where images can be used to enhance the product. These images are
generally organized loosely by category, such as animals, natural scenes,
people, and so on. Their access is dependent on a user being willing to browse
large collections in order to select appropriate images.
Researchers in computer vision and computer graphics have developed
image distance measures that can compare a sample image or sketch provided
by a user to the images in the database and retrieve those that are judged
similar by the measure being used. Commercial systems like QBIC and Virage
utilize measures that are based on low-level attributes of the image itself,
including color histograms, color composition, and texture. State-of-the-art
research focuses on more powerful measures that can find regions of an
image corresponding to known objects that users wish to retrieve. There
has been some success in finding human faces of different selected sizes,
human bodies, horses, zebras and other texture animals with known patterns,
and such backgrounds as jungles, water, and sky.
Standard database systems, whether they be relational or object-oriented,
depend heavily on the ability to index the data according to keywords or
key phrases that are stored in the data. While images can be retrieved
in this way, it requires human classifiers to look at each image and select
a suitable set of keywords. Even if this could be done for millions of
images, it would be insufficient, as the keywords would only be one person's
ideas of the concepts in that image. Instead, both keywords and a large
and powerful set of image distance measures are needed.
Most of the work in content-based image retrieval has focused on particular
distance measures, not on organization or indexing. While some researchers
have provided indexing techniques for their own distance measures, these
are not suitable for a general system that employs tens or even hundreds
of such measures. Our work on this project focuses on the organization
of a database of images and new indexing techniques that will allow a user
to search for images similar to a given query image or sketch, using a
dynamic combination of the distance measures provided by the system. Furthermore,
the search will take place in in sublinear time; that is, most of the images
in the database will be ruled out by our indexing techniques and
only a small subset will be compared to the query image with the
user-developed distance measure to select the final small subset
of best matches through which the user can browse. We expect this
work to be an important part of the image retrieval process.
Area References
R. Baeza-Yates, W. Cunto, U. Manber, and S. Wu, "Proximity matching using
fixed-queries trees," Combinatorial Pattern Matching, pp 198-212,
Springer-Verlag, June (1994).
J. Barros, J. French, W. Martin, P. Kelley, and M. Cannon, "Using the
triangle inequality to reduce the number of comparisons required for similarity-based
retrieval," IS&T/SPIE- Storage and Retrieval for Still Image and
Video Databases, Volume IV, Jan (1996).
D. A. Forsyth, J. Malik, M. M. Fleck, H. Greenspan, T. Leung, S. Belongie,
C. Carson, and C. Bregler, "Finding pictures of objects in large collections
of images," Proceedings of the 2nd International Workshop on Object
Representation in Computer Vision, April, 1996.
T. Kato, T. Kurita, N. Otsu, K. Hirata, "A sketch retrieval method for
a full color image database," 11th International Conference on Pattern
Recognition, pp 530-533, (1992).
A. Del Bimbo, M. Campanai, P. Nesi, "3D visual query language for image
databases," Journal of Visual Languages and Computing, Vol 3, (1992).
A. Gupta, "Visual information retrieval: a Virage perspective," white
papaer available on the World Wide Web, http://www.virage.com/literature/wpaper.html
(1995).
M. Flickner, H. Sawhnew, W. Niblack, J. Ashley, Q. Huang, B. Dom, M.
Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steel, P. Yanker,"Query by
image and video content: the QBIC system," Computer, pp 23-32, Vol
3, number 9, (September 1995).
A. Pentland, R. W. Picard, S. Sclaroff, "Photobook: tools for content-based
manipulation of image databases," Technical Report, Volume 255,
MIT, Media Lab., (1993)
R. K. Srihari, "Automatic indexing and content-baseds retrieval of captioned
images," IEEE Computer, Volume 28, number 9, pp 49-56, (1995).
Potential Related Projects
The PI is involved in a collaborative effort with Oceanography
Professor Jeffrey Richey on the PRIM project for a Puget Sound
Regional Information System. A KDI proposal is being written that
links our current NSF-funded results with PRISM.