Project 3: EigenFaces
Ji Hoo
Testing Recognition with Cropped Class Images
My plot of recognition percentage versus the number of eigenfaces used is shown below.
There are some obvious anomalies in this plot. For example, I got 12/21 (~57%) recognition using 10 eigenfaces, but only 11/21 (~52%) with 11 eigenfaces. I believe such kinks are probably random in nature.
I added a trend-line to show the seemingly logarithmic growth of % recognition, which seems to plateau at 21 eigenfaces. With added eigenfaces, we are probably not going to see % recognition go much higher than ~62%.
There is no clear answer to how many eigenfaces one should use. We can, however, probably conclude that the % recognition vs. # eigenfaces curve tends toward an asymptotic value, so the recognition gain per additional eigenface gets smaller and smaller as more are added. A good practical metric is the slope of this curve: once we detect the plateau, we can be confident that we have a reasonable number of eigenfaces.
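A quick sketch of that plateau test (Python; the recognition rates below are hypothetical placeholders, not my measured numbers):

```python
def plateau_point(rates, eps=0.01):
    """Index of the first eigenface count whose recognition gain
    over the previous count drops below eps (the plateau)."""
    for k in range(1, len(rates)):
        if rates[k] - rates[k - 1] < eps:
            return k
    return len(rates) - 1

# Hypothetical recognition rates for 1..8 eigenfaces.
rates = [0.20, 0.35, 0.45, 0.52, 0.57, 0.595, 0.61, 0.615]
print(plateau_point(rates))  # → 7, i.e. gains flatten out around 8 eigenfaces
```

In practice one might smooth the slope over a small window first, so that random kinks like the 10-vs-11 eigenface dip above do not trigger a false plateau.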
The mistakes are pretty reasonable; most were close to being the top-ranked match… but there are some impossible cases, like the following pictures:
It will take a huge leap to match these two! ;p
Here are my 10 eigenfaces, followed by my average face.
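For reference, the average face is just the per-pixel mean of the training images. A toy sketch in Python, with flat lists of pixel intensities standing in for real images:

```python
def average_face(faces):
    """Per-pixel mean of a list of equal-sized grayscale images
    (each image a flat list of pixel intensities)."""
    n = len(faces)
    return [sum(img[i] for img in faces) / n for i in range(len(faces[0]))]

# Two tiny 2x2 "faces", flattened row-major.
faces = [[0, 100, 200, 50],
         [100, 100, 0, 150]]
print(average_face(faces))  # → [50.0, 100.0, 100.0, 100.0]
```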
Cropping and Finding Faces
This is elf.tga (in JPEG format)
This is my crop:
I was hoping to catch the elf instead.
Here’s group1:
And here is my face-find on group1:
Thank goodness this worked.
Here’s… someone else’s family!
And here are my face-finding results on it:
Some hits and misses here… To catch the two missed faces, I would probably need a training set with more feminine faces. Then again, my software is optimized for our training images (discussed below).
This is me:
For a very long time, I found this (I was thinking that my shirt looked more like a face, as per our training set, than my face):
After getting the scaling correct, I finally found me!
Discussion
Two major hurdles with this aspect of the project (for me) were the overlapping of windows and false-positive hits on regions of low texture.
An example of overlapping windows is shown below:
It took me quite a while to figure this out… and I am glad I did.
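One standard way to resolve overlapping detections (sketched here; not necessarily the exact fix I used) is greedy non-maximum suppression: keep the lowest-error window, then drop every other candidate whose center lies too close to a kept one.

```python
def suppress_overlaps(hits, min_dist):
    """Greedy non-maximum suppression. Keep the best (lowest-error)
    hit first; drop any hit whose center is within min_dist of a
    kept hit. Each hit is an (x, y, error) tuple."""
    kept = []
    for x, y, err in sorted(hits, key=lambda h: h[2]):
        if all((x - kx) ** 2 + (y - ky) ** 2 >= min_dist ** 2
               for kx, ky, _ in kept):
            kept.append((x, y, err))
    return kept

# Two overlapping candidates at (10,10)/(12,11) plus one distant hit.
hits = [(10, 10, 1.0), (12, 11, 2.5), (80, 40, 1.8)]
print(suppress_overlaps(hits, min_dist=8))  # → [(10, 10, 1.0), (80, 40, 1.8)]
```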
Here is an example of low-texture areas firing as false positives:
To solve this problem, I explored the characteristics of false positives versus actual hits. I compared average intensities and intensity variances, with outputs as shown below:
1st Face Error: 8.567979E+001
average I: 8.156785E+001
Intensity Variance: 2.177204E+003
average gFaceI: 2.167664E+001
Saved Targa image '1stFaceGFace.tga'
1st false Error: low intensity 2.312451E+002
:average I: 4.356662E+001
Intensity Variance: 4.544872E+002
average gFaceI: 7.457051E+000
Saved Targa image 'lowIntensityFalse1.tga'
4th false Error; ~high intensity 4.863546E+002
:average I: 8.374566E+001
Intensity Variance: 1.790331E+003
average gFaceI: 1.700599E+001
Saved Targa image 'HighIntensityFalse2.tga'
...
I discovered that, among the characteristics I examined, the best metric for deciding whether the algorithm is processing a false positive is the Average Absolute Gradient (AAG) value (the average gFaceI value above). For each sampling window I process, I compute the corresponding gradient map and save the absolute value of the gradient at each pixel location into another image. I then calculate the AAG value across the entire sampling window. Given that false positives are triggered by regions of low texture, such regions should have a relatively low AAG value. By averaging the absolute values of the pixel gradients instead of the raw gradients, I prevent positive and negative values from cancelling each other out (of course, the average of the squared gradients works as well).
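A minimal sketch of the AAG computation (Python, using simple forward differences; my actual implementation details may differ):

```python
def average_absolute_gradient(window):
    """Mean of |dI/dx| + |dI/dy| over a grayscale sampling window
    (a list of rows), using forward differences so each interior
    pixel contributes one horizontal and one vertical gradient."""
    h, w = len(window), len(window[0])
    total, count = 0.0, 0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = window[y][x + 1] - window[y][x]
            gy = window[y + 1][x] - window[y][x]
            total += abs(gx) + abs(gy)
            count += 1
    return total / count

flat = [[50] * 4 for _ in range(4)]  # low-texture region
checker = [[(x + y) % 2 * 255 for x in range(4)] for y in range(4)]  # high texture
print(average_absolute_gradient(flat))     # → 0.0
print(average_absolute_gradient(checker))  # → 510.0
```

As expected, the flat (low-texture) window scores far lower than the textured one, which is exactly the separation the filter relies on.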
After determining the threshold value heuristically, I added it as a filter on candidate points before they are added to the array of hits, and it dramatically improved the fidelity of my face-finding. However, given that the threshold is tailored to our class images, it is not necessarily optimized, or even suitable, for other sets of pictures with different characteristics such as lighting, focus, etc.
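The filtering step amounts to something like the following sketch (the threshold and the candidate tuples are hypothetical stand-ins, loosely patterned on the log values above):

```python
def filter_hits(candidates, aag_threshold):
    """Keep only candidate windows whose average absolute gradient
    clears the heuristic texture threshold.
    Each candidate is an (x, y, error, aag) tuple."""
    return [c for c in candidates if c[3] >= aag_threshold]

# Hypothetical candidates: only the well-textured one survives.
candidates = [(10, 12, 85.7, 21.7),   # real face, high AAG
              (40, 3, 231.2, 7.5),    # low-texture false positive
              (62, 30, 486.4, 17.0)]  # another false positive
print(filter_hits(candidates, aag_threshold=20.0))
```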
There is the option of using color cues, as hinted; maybe next time! : )