We highlight the kernel view of SIFT, HOG, and bag of visual words, and show that histogram features are a special, rather restricted case of efficient match kernels. This novel insight allows us to design a family of kernel descriptors. Kernel descriptors avoid the need to discretize pixel attributes and can turn any pixel attribute into compact patch-level features. Match kernels are extremely flexible and make it easy to incorporate domain knowledge, since the similarity measure between pixel attributes can be any positive definite kernel, such as the popular Gaussian kernel. Computing kernel descriptors requires working in the feature space induced by the kernel function, which can be high- or even infinite-dimensional. For computational efficiency and representational convenience, we therefore reduce the dimensionality by projecting the high- or infinite-dimensional feature vector onto a finite set of basis vectors using kernel principal component analysis. This procedure approximates the original match kernels very well.
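
The two claims above (histograms as a restricted match kernel, and a finite-dimensional projection that approximates the full match kernel) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration rather than the KDES implementation: the function names, the scalar pixel attribute in [0, 1], the uniform grid of 25 basis vectors, and the value of `gamma` are all hypothetical. The projection uses an uncentered eigendecomposition of the Gram matrix over the basis vectors, which is a simplification of the kernel PCA step described above.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Gaussian kernel matrix between rows of A (n, d) and rows of B (m, d)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def gaussian_match_kernel(P, Q, gamma):
    """Exact match kernel: sum of k(x, y) over all pairs of pixel attributes."""
    return float(gaussian_kernel(P, Q, gamma).sum())


def bin_index(x, n_bins):
    """Hard-assign scalar attributes in [0, 1] to one of n_bins equal bins."""
    return np.minimum((x * n_bins).astype(int), n_bins - 1)


def delta_match_kernel(P, Q, n_bins):
    """Match kernel whose pixel-level kernel is 1 for same-bin pairs, else 0."""
    ip = bin_index(P[:, 0], n_bins)
    iq = bin_index(Q[:, 0], n_bins)
    return float((ip[:, None] == iq[None, :]).sum())


def histogram_dot(P, Q, n_bins):
    """Dot product of the two patch histograms (the usual histogram feature)."""
    hp = np.bincount(bin_index(P[:, 0], n_bins), minlength=n_bins)
    hq = np.bincount(bin_index(Q[:, 0], n_bins), minlength=n_bins)
    return float(hp @ hq)


def fit_projection(basis, gamma, rel_tol=1e-8):
    """Return G with G.T @ G equal to the truncated pseudo-inverse of K(basis, basis)."""
    K = gaussian_kernel(basis, basis, gamma)
    w, V = np.linalg.eigh(K)
    keep = w > rel_tol * w.max()  # drop numerically negligible directions
    return (V[:, keep] / np.sqrt(w[keep])).T


def patch_feature(P, basis, G, gamma):
    """Finite-dimensional patch feature: projected sum of kernel evaluations."""
    return G @ gaussian_kernel(basis, P, gamma).sum(axis=1)


# Hypothetical data: two patches with one scalar attribute per pixel.
rng = np.random.default_rng(0)
P = rng.random((50, 1))
Q = rng.random((60, 1))
basis = np.linspace(0.0, 1.0, 25)[:, None]  # assumed grid of basis vectors
gamma = 5.0                                 # assumed kernel width

G = fit_projection(basis, gamma)
exact = gaussian_match_kernel(P, Q, gamma)
approx = float(
    patch_feature(P, basis, G, gamma) @ patch_feature(Q, basis, G, gamma)
)
print("exact match kernel:", exact)
print("low-dimensional approximation:", approx)
```

In this sketch, `delta_match_kernel` and `histogram_dot` return the same value, which is the sense in which a histogram feature is a match kernel with a hard, discretized similarity. Swapping in the Gaussian kernel removes the binning step, and the inner product of two `patch_feature` vectors stands in for the full sum over all pixel pairs.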

We have developed eight types of kernel descriptors for RGB-Depth images, a relatively complete feature set, to capture rich cues for robust object recognition. Our kernel descriptors outperform carefully tuned recognition algorithms built on top of SIFT on many benchmarks: USPS, extended Yaleface, Scene-15, Caltech-101, CIFAR-10, CIFAR-10-ImageNet, and the RGB-D object dataset. More importantly, our kernel descriptors have exhibited very robust performance in several real-world recognition systems, including the object-aware situated interactive system (OASIS). The OASIS Lego demo was shown live at the Consumer Electronics Show 2011.

Please visit the KDES Website for more details.