Efficient Content-Based Image Retrieval

Linda G. Shapiro and Andrew P. Berman
 

Department of Computer Science and Engineering
University of Washington
 

Contact Information

Linda G. Shapiro
Department of Computer Science and Engineering
University of Washington
PO Box 352350
Seattle, WA 98195-2350
Phone: (206) 543-2196
Fax : (206) 543-2969
Email:  shapiro@cs.washington.edu Keywords content-based image retrieval, image database, image indexing, image matching, distance measures

Project Award Information

Project Summary

The focus of our work is the development of a general, scalable architecture to support fast querying of very large image databases with user-specified distance measures. We will develop algorithms and data structures for efficient image retrieval from large databases with multiple distance measures. We will also investigate methods for merging our general, distance-measure-independent method with other useful techniques that may be distance measure specific, such as keyword retrieval and relational indexing. Finally, we will look at the problem of providing users with multiple distance measures of many different varieties. We will develop both new methods for combining distance measures and a language in which users can specify their queries without detailed knowledge of the underlying metrics. We will build a prototype system to test our methods and evaluate it on both a large general image database and a smaller controlled database.

List of Supported Students

Goals, Objectives, and Targeted Activities

First Year Goals: Second Year Goals:

Indication of Success

We have met our first-year goals and are on schedule to meet our second-year goals.  We have developed a prototype system, FIDS, with which the user can conduct searches using complex combinations of several dozen distance measures. We incorporated a new set of data structures and algorithms into the latest version of FIDS that has led to a marked speed up of the elimination phase of the search in large image databases. Currently, FIDS contains over 37,000 images. A complex search through the database can be performed in just a few seconds. We have a designed and begun implementation of a prototype interface to enable the non-expert user to query the database in an expressive manner. We have collected several hundred images for the ground-truth database. We have two new conference papers, one published in the 1998 International Conference on Pattern Recognition describing the overall system, and one published in the 1999 SPIE workshop on Content-Based Image and Video Retrieval on performance evaluation. We have submitted a journal paper to Computer Vision and Image Understanding.

Our long range goal is to facilitate the creation of easy-to-use database systems consisting of very large numbers of images.  Image databases are growing both in size and number. A database of one million high-quality digitized photographs can use tens of terabytes of storage.   Searching through such a database can be prohibitively time-consuming.  There is thus a definite need for the types of services we are developing.

Project Impact and Output

The project has been in place for a year and a half.  It has facilitated the PhD candidacy of Andrew Berman, who is expected to received his degree sometime in February.  Last summer, two undergraduate students, Eva Brezin from Columbia University, and Marsha Eng from the University of Washington participated in the project. This past summer, University of Washington undergraduate Kent Schliter participated in the project. Currently, two new graduate students, Yi Li and Zhenrong Qian, are participating in the project. A new interdisciplinary KDI proposal related to this research is being prepared.

Project References

A. P. Berman. "A new data structure for fast approximate matching." Technical Report 1994-03-02. Department of Computer Science, University of Washington. March, 1994.

A. Berman and L. G. Shapiro. "Efficient image retrieval with multiple distance measures." Proceedings of the SPIE Conference on Storage and Retrieval for Image and Video Databases. February, 1997.

A. P. Berman and L. G. Shapiro. "Selecting good keys for triangle-inequality-based pruning algorithms." IEEE International Workshop on Content-Based Access of Image and Video Databases, January 1998..

A. P. Berman and L. G. Shapiro. "A Flexible Image Database System for Content-Based Retrieval." 17th International Conference on Pattern RecognitionAugust, 1998

A. P. Berman and L. G. Shapiro. "Triangle-Inequality-Based Pruning Algorithms with Triangle Tries."Proceedings of the SPIE Conference on Storage and Retrieval for Image and Video Databases. January, 1999.
 

GPRA Outcome Goals

The results we have obtained will be used to make image retrieval faster and easier for people in all walks of life. Furthermore, the ideas should translate to use in complex multimedia information systems, making them useful to the scientific community as a whole, to laymen with scientific interests, and to students at all levels.

Area Background

The area of content-based image retrieval is a hybrid research area that requires knowledge of both computer vision and of database systems. Large image databases are being collected, and images from these collections made available to users in advertising, marketing, entertainment, and other areas where images can be used to enhance the product. These images are generally organized loosely by category, such as animals, natural scenes, people, and so on. Their access is dependent on a user being willing to browse large collections in order to select appropriate images.

Researchers in computer vision and computer graphics have developed image distance measures that can compare a sample image or sketch provided by a user to the images in the database and retrieve those that are judged similar by the measure being used. Commercial systems like QBIC and Virage utilize measures that are based on low-level attributes of the image itself, including color histograms, color composition, and texture. State-of-the-art research focuses on more powerful measures that can find regions of an image corresponding to known objects that users wish to retrieve. There has been some success in finding human faces of different selected sizes, human bodies, horses, zebras and other texture animals with known patterns, and such backgrounds as jungles, water, and sky.

Standard database systems, whether they be relational or object-oriented, depend heavily on the ability to index the data according to keywords or key phrases that are stored in the data. While images can be retrieved in this way, it requires human classifiers to look at each image and select a suitable set of keywords. Even if this could be done for millions of images, it would be insufficient, as the keywords would only be one person's ideas of the concepts in that image. Instead, both keywords and a large and powerful set of image distance measures are needed.

Most of the work in content-based image retrieval has focused on particular distance measures, not on organization or indexing. While some researchers have provided indexing techniques for their own distance measures, these are not suitable for a general system that employs tens or even hundreds of such measures. Our work on this project focuses on the organization of a database of images and new indexing techniques that will allow a user to search for images similar to a given query image or sketch, using a dynamic combination of the distance measures provided by the system. Furthermore, the search will take place in in sublinear time; that is, most of the images in the database will be ruled out by our indexing techniques and only a small subset will be compared to the query image with the user-developed distance measure to select the final small subset of best matches through which the user can browse. We expect this work to be an important part of the image retrieval process.

Area References

R. Baeza-Yates, W. Cunto, U. Manber, and S. Wu, "Proximity matching using fixed-queries trees," Combinatorial Pattern Matching, pp 198-212, Springer-Verlag, June (1994).

J. Barros, J. French, W. Martin, P. Kelley, and M. Cannon, "Using the triangle inequality to reduce the number of comparisons required for similarity-based retrieval," IS&T/SPIE- Storage and Retrieval for Still Image and Video Databases, Volume IV, Jan (1996).

D. A. Forsyth, J. Malik, M. M. Fleck, H. Greenspan, T. Leung, S. Belongie, C. Carson, and C. Bregler, "Finding pictures of objects in large collections of images," Proceedings of the 2nd International Workshop on Object Representation in Computer Vision, April, 1996.

T. Kato, T. Kurita, N. Otsu, K. Hirata, "A sketch retrieval method for a full color image database," 11th International Conference on Pattern Recognition, pp 530-533, (1992).

A. Del Bimbo, M. Campanai, P. Nesi, "3D visual query language for image databases," Journal of Visual Languages and Computing, Vol 3, (1992).

A. Gupta, "Visual information retrieval: a Virage perspective," white papaer available on the World Wide Web, http://www.virage.com/literature/wpaper.html (1995).

M. Flickner, H. Sawhnew, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steel, P. Yanker,"Query by image and video content: the QBIC system," Computer, pp 23-32, Vol 3, number 9, (September 1995).

A. Pentland, R. W. Picard, S. Sclaroff, "Photobook: tools for content-based manipulation of image databases," Technical Report, Volume 255, MIT, Media Lab., (1993)

R. K. Srihari, "Automatic indexing and content-baseds retrieval of captioned images," IEEE Computer, Volume 28, number 9, pp 49-56, (1995).

Potential Related Projects

The PI is involved in a collaborative effort with Oceanography Professor Jeffrey Richey on the PRIM project for a Puget Sound Regional Information System. A KDI proposal is being written that links our current NSF-funded results with PRISM.