1. Feature Detector

    The detector was implemented according to the lecture notes.
     Also, I computed their scale by applying 3 level pyramid of  Gaussian images.
 
        Level 0: Original image
        Level 1: Blurred and sub-sampled image from the original image
        Level 2: Blurred and sub-sampled image from the lower level (level 2)

    For each level, Harris operator was applied and thus each pixel has 3 Harris values.
    The maximum was selected as final Harris value for each pixel and the level with the maximum value was assigned as the scale of a feature point.

2. Feature Descriptor

    I tried 3 different descriptors

    (1) Simple 5X5 windows
           
            It simply samples 25 pixels from 5X5 window centered on detected interesting points.
   
    (2) MOPS

            I have implemented MOPS feature descriptor as explained in [1] and lecture notes.

            (1) Compute the angle of a feature point using X- and Y-axis gradients.
            (2) Put 40X40 window on a feature point.
            (3) Align the window horizontally, i.e, rotate the patch according to the angle.
            (4) Subsample 9X9 pixels from the 40X40 patches. Originally, they used 8X8 sub-sampled pixels.
                 If the interesting point is detected at pyramid level l, pixels are sub-sampled from (l+1) level pyramid image.
            (5) Normalize the descriptor by subtracting the mean from its elements and dividing them by the standard deviation.

    (3) Simple Histogram of Gradient

            I have implemented a simple version of Histogram of Gradient (SHoG).

            (1) Compute X- and Y-axis gradients for an input image.
            (2) Put 10X10 window on a feature point.
            (3) Construct 18-bin orientation histogram by accumulating the magnitudes according to angles of feature points.
                  If the interesting point is detected at pyramid level l, magnitudes are obtained from l-level pyramid image.
            (4) Circularly shift the bins so that a bin with the maximum magnitude becomes a first bin in a histogram.
            (5) Normalize the histogram by its maximum value.


3. Design and Implementation Issue

(1) For the feature detector, I used pyramid of Gaussian images.
At first, I applied the feature detection procedure without pyramid of Gaussian images,
but it could barely detect interesting points from bike image 5 and 6 because they are too much blurred.
After thresholding the images, almost no feature points were obtained from the image of zero level (the left-most image).
After applying Pyramid of Gaussian images, I could obtain some feature points.
The following images show pyramid of images and their Harris values. They are multiplied by 1,000 to make them more visible.

     
                                                Level 0                                                               Level 1                         Level 2
                                                                   <  Pyramid of Gaussian Images: Bikes 6>
 
     
                                                Level 0                                                               Level 1                         Level 2
                                                             <  Harris Images: Bikes 6>



(2) To calculate gradients, I used first derivative of Gaussian instead of the Sobel operator.
It is well known that the derivative of Gaussian is more robust to noise than the Sobel operator.
In real implementation, 5X5 window was employed.

                               
                        <First Derivative of Gaussian Filter>


(3) The given skeleton codes had some bugs in 'Convolve()' function.
It probably did not affect the algorithm's performance too much,  but the original codes did not work at all with 'UpLevel()' function in Pyramid class.
I had modified the codes using my own found bugs and with help from the posting of Jefferey.

        Modified Codes:

                double sum = 0;
                for (int k = 0; k < kShape.height; k++)
                {
                    for (int l = 0; l < kShape.width; l++)
                    {
                        /* Original Codes
                        if ((x-kernel.origin[0]+k >= 0) && (x-kernel.origin[0]+k < sShape.width) && (y-kernel.origin[1]+l >= 0) && (y-kernel.origin[1]+l < sShape.height))
                        {
                            sum += kernel.Pixel(k,l,0) * src.Pixel(x-kernel.origin[0]+k,y-kernel.origin[1]+l,c);
                        }
                        */
                        // Modified Codes
                        if ((x+kernel.origin[0]+l >= 0) && (x+kernel.origin[0]+l < sShape.width) && (y+kernel.origin[1]+k >= 0) && (y+kernel.origin[1]+k < sShape.height))
                        {
                            sum += kernel.Pixel(l,k,0) * src.Pixel(x+kernel.origin[0]+l,y+kernel.origin[1]+k,c);
                        }
                    }
                }
                dst.Pixel(x,y,c) = (T) __max(dst.MinVal(), __min(dst.MaxVal(), sum));



4. Performance Report for Graf and Yosemite

(1) Harris Images for Graf and Yosemite

The following images show pyramid of images and their Harris values. They are multiplied by 1,000 to make them more visible.
Harris scores was multiplied by 1,000 to make it more visible.

     
                             Level 0                                               Level 1                    Level 2                                        Level 0                                              Level 1                    Level 2

     
                             Level 0                                               Level 1                    Level 2                                        Level 0                                              Level 1                   Level 2



(2) AUC and ROC for Graf and Yosemite with MPOS Descriptor
 
There are the test results using the MPOS descriptor.
Simple 5X5 Window MPOS
SSD Ratio SSD Ratio
Graf 0.601380 0.716935 0.767863 0.840617
Yosemite 0.949337 0.871425 0.893986 0.922928

                                                                      < AUC with MPOS Feature Descriptor>


                                                             <Graf>                                                                                                                     <Yosemite>

                                                                                               < ROC with MPOS Descriptor>


(3) AUC and ROC for Graf and Yosemite with SHoG Descriptor

There are the test results using the SHoG descriptor.

Simple 5X5 Window SHoG
SSD Ratio SSD Ratio
Graf 0.601380 0.716935 0.707428 0.505639
Yosemite 0.949337 0.871425 0.806369 0.776603

                                                                      < AUC with MPOS Feature Descriptor>





                                                                                               < ROC with SHoG Descriptor>


5. Performance Report for Benchmark Datasets using MOPS Descriptor

(1) AUC for MOPS

Bikes Graf Leuven Wall
Simple 5X5 MOPS Simple 5X5 MOPS Simple 5X5 MOPS Simple 5X5 MOPS
SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio
1 to 2 0.520477 0.646873 0.836848 0.878549 0.601380 0.716935 0.767863 0.840617 0.400132 0.615837 0.856909 0.899898 0.588455 0.698919 0.913984 0.919485
1 to 3 0.537866 0.682949 0.858690 0.845195 0.533435 0.691536 0.785318 0.823008 0.197313 0.634088 0.869978 0.850670 0.561754 0.685365 0.906628 0.899969
1 to 4 0.630400 0.721574 0.916821 0.920720 0.564011 0.647527 0.631655 0.721040 0.352941 0.693137 0.885851 0.934788 0.543599 0.616714 0.875732 0.884777
1 to 5 0.567745 0.573099 0.881388 0.829583 0.448628 0.594624 0.552447 0.672306 0.045010 0.678082 0.934653 0.950786 0.415668 0.612473 0.796353 0.817903
1 to 6 0.469473 0.489226 0.928649 0.872378 0.631706 0.468822 0.745902 0.525956 0.037182 0.580235 0.858076 0.888039 0.442719 0.469540 0.769981 0.651870
Average 0.545192 0.622744 0.884479 0.869285 0.555832 0.623889 0.696637 0.716585 0.206516 0.640276 0.881093 0.904836 0.510439 0.616602 0.852536 0.834801


(2) Average Pixel Error for MOPS


Bikes Graf Leuven Wall
Simple 5X5 MOPS Simple 5X5 MOPS Simple 5X5 MOPS Simple 5X5 MOPS
SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio
1 to 2 252.216521 * 192.850547 * 216.762120 * 214.352871 * 320.073363 * 208.402465 * 287.496260 * 245.638696 *
1 to 3 287.574315 * 218.918309 * 215.334316 * 216.680018 * 348.562424 * 238.799857 * 302.244337 * 269.444685 *
1 to 4 291.947181 * 251.195576 * 249.902211 * 258.476088 * 364.481320 * 272.263070 * 326.551455 * 321.830911 *
1 to 5 316.373814 * 271.870703 * 231.255573 * 279.467612 * 357.773007 * 278.659521 * 397.561260 * 348.350584 *
1 to 6 323.488355 * 292.244289 * 250.132000 * 293.201983 * 387.338809 * 304.072435 * 384.408618 * 386.268992 *
Average 294.320037 * 245.415885 * 232.677244 * 252.435714 * 355.645785 * 260.439469 * 339.652386 * 314.306774 *

Note: * indicates that a value is equal to that of its left cell.


6. Performance Report for Benchmark Datasets using SHoG Descriptor

(1) AUC for SHoG


Bikes Graf Leuven Wall
Simple 5X5 SHoG Simple 5X5 SHoG Simple 5X5 SHoG Simple 5X5 SHoG
SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio
1 to 2 0.520477 0.646873 0.759675 0.779799 0.601380 0.716935 0.707428 0.505639 0.400132 0.615837 0.812513 0.834165 0.588455 0.698919 0.640223 0.721789
1 to 3 0.537866 0.682949 0.780198 0.824691 0.533435 0.691536 0.608422 0.465565 0.197313 0.634088 0.783544 0.811488 0.561754 0.685365 0.612144 0.690030
1 to 4 0.630400 0.721574 0.796617 0.838308 0.564011 0.647527 0.586530 0.504566 0.352941 0.693137 0.781254 0.726508 0.543599 0.616714 0.503173 0.562073
1 to 5 0.567745 0.573099 0.833663 0.729997 0.448628 0.594624 0.799040 0.740055 0.045010 0.678082 0.736864 0.761121 0.415668 0.612473 0.425575 0.452103
1 to 6 0.469473 0.489226  0.885151 0.790604 0.631706 0.468822 0.159153 0.304645 0.037182 0.580235 0.712928 0.718311 0.442719 0.469540 0.548355 0.713902
Average 0.545192 0.622744 0.811061 0.792680 0.555832 0.623889 0.572115 0.504094 0.206516 0.640276 0.765421 0.770319 0.510439 0.616602 0.545894 0.627979


(2) Average Pixel Error for SHoG


Bikes Graf Leuven Wall
Simple 5X5 SHoG Simple 5X5 SHoG Simple 5X5 SHoG Simple 5X5 SHoG
SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio SSD Ratio
1 to 2 252.216521 * 224.492398 * 216.762120 * 258.299126 * 320.073363 * 229.540048 * 287.496260 * 294.649230 *
1 to 3 287.574315 * 232.872365 * 215.334316 * 249.911122 * 348.562424 * 272.130749 * 302.244337 * 301.220350 *
1 to 4 291.947181 * 262.912727 * 249.902211 * 285.863968 * 364.481320 * 303.092765 * 326.551455 * 321.936495 *
1 to 5 316.373814 * 277.228958 * 231.255573 * 290.280118 * 357.773007 * 336.119879 * 397.561260 * 353.715023 *
1 to 6 323.488355 * 295.312835 * 250.132000 * 290.160905 * 387.338809 * 344.569287 * 384.408618 * 383.460122 *
Average 294.320037 * 258.563857 * 232.677244 * 274.903048 * 355.645785 * 297.090545 * 339.652386 * 330.996244 *

Note: * indicates that a value is equal to that of its left cell.


7. My Own Pictures

I took 4 pictures and performed object search experiment.
In the following pictures, the image in the first row is a query image and a database has 3 images.
The image in the last row shows the search results with a query.


                                                                 
                                                                                     <Query>


   
                                                                                 <Images in Database>

                     
                                                                                        <Search result>


8. Strengths and Weaknesses

(1) The feature detector is scale-invariant since pyramid of Gaussian images was applied. Please, refer to Section 3.(1).

(2) Instead of the Sobel operator, the derivative of Gaussian was used to calculate image gradients, which is more robust to noise.
Note that image gradients are used in many parts in this project. It is used to calculate Harris values, the orientation of feature points and feature descriptors.
Please, refer to Section 3.(2).

(3) The MOPS descriptor is robust to changes:
      in scale because pyramid of Gaussian images was applied. It can be confirmed by the test results for Bikes datasets in Section 5.(1)
      in position because the information contained in a pixel (gray value) is spread to its neighbors by blurring pixels using Gaussian filter. 
      in orientation because the 40X40 window is aligned horizontally according to the angle of a feature point.
      in illumination (or contrast) because subsampled pixels are normalized by their mean and standard deviation. It can be confirmed by the test results for Leuven datasets in Section 5.(1).
                  
(4) The SHoG descriptor is robust to changes:
      in scale because pyramid of Gaussian images was applied. It can be confirmed by the test results for Bikes datasets in Section 5.(2)
      in position because the descriptor is based on histogram. However, it loses spatial information.
      in orientation because the feature descriptor is aligned so that a bin with the maximum magnitude becomes a first bin in a histogram.
      in illumination (or contrast) because the descriptor is based on image gradients, which are calculated by relative difference between pixels. It can be confirmed by the test results for Leuven datasets in Section 5.(2)
     

9. Extra Credit

Please refer to the explanation in Section 8.

(1) The implemented features are contrast invariant.
(2) The feature detector is scale invariant.
(3) Two additional feature descriptors were implemented and compared with the simple 5X5 window descriptor.

10. Command Argument

Feature Type Option Value
Simple 5X5 Window 2
MOPS 3
SHoG 7


Match Type Option Value
SSD 1
Ratio Test 2


Reference

[1] Multi-Image Matching using Multi-Scale Oriented Patches