Jen Smith

Computer Vision (CSE 455), Winter 2012

Project 4: Eigenfaces

Project Abstract

Objectives

This project uses eigenfaces to build a face recognition system. It works by taking a set of face images as input and calculating their eigenvectors to define a "face space." Once the face space has been constructed, the program can use it as a reference for finding faces in any input image.

Challenges

One of the hardest parts was simply wrapping my head around what exactly I was supposed to be implementing. I particularly struggled with getting the eigenfaces class to work, and was never able to get my program to perform completely correctly.

Lessons Learned

After doing this project, I have a much better insight into how face recognition systems work, and I also now understand (or can guess at) some of the reasons why they fail. The thing that surprised me the most was the relative simplicity of the eigenfaces concept. It took a while for me to grasp, but now that I understand it better, it really makes a lot of sense.

Implementation

This project takes a database of face images and uses them to construct a set of eigenfaces. An eigenface is essentially a vector indicating a single direction in "face space", the subspace of the set of all images in which the faces lie. Once the eigenfaces are computed, the program can take any input image, project it onto the face space, and use the eigenvectors to attempt to reconstruct a face image. The distance from the reconstructed image to the actual input image is then compared against a threshold to decide whether or not the image contains a face.
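
For illustration, the projection-and-reconstruction step looks roughly like this in numpy (a sketch rather than the project's actual code; the layout with one unit-length eigenface per row is an assumption):

```python
import numpy as np

def reconstruction_error(image, mean_face, eigenfaces):
    """Project a flattened image onto face space, reconstruct it,
    and return the mean squared error between the two."""
    centred = image - mean_face
    coeffs = eigenfaces @ centred            # coordinates in face space
    reconstruction = eigenfaces.T @ coeffs   # back into image space
    return np.mean((reconstruction - centred) ** 2)

# A window would then be declared a face when its error falls below
# some threshold: reconstruction_error(img, mean, E) < THRESHOLD
```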

The remainder of this report walks through the three experiments (recognition, face finding, and face verification) and closes with the bells and whistles.

Experiment: Recognition

Methodology

First, I computed 10 eigenfaces using the set of cropped, nonsmiling images. The leftmost image below is the average face.

Average face and eigenfaces

I then used a script to run recognizeface on the smiling_cropped face set, still using the nonsmiling images as the training set. I tried it with 1 to 33 eigenfaces and plotted the results.
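
The sweep itself can be mocked up end to end. In the sketch below, synthetic vectors stand in for the real image sets and nearest-neighbour matching on projection coefficients stands in for recognizeface, so it only illustrates the shape of the experiment:

```python
import numpy as np

rng = np.random.default_rng(1)
train = rng.normal(0, 50, (33, 625))         # stand-ins for nonsmiling faces
test = train + rng.normal(0, 5, (33, 625))   # perturbed "smiling" variants

mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)

for k in range(1, 34):                       # 1 to 33 eigenfaces
    E = vt[:k]                               # rows are unit eigenfaces
    train_coeffs = (train - mean) @ E.T      # each person's signature
    test_coeffs = (test - mean) @ E.T
    # Match every test face to its nearest training signature.
    d = ((test_coeffs[:, None, :] - train_coeffs[None, :, :]) ** 2).sum(-1)
    correct = int((d.argmin(axis=1) == np.arange(33)).sum())
    print(k, correct)
```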

Questions

Plot for Question 1

It is pretty obvious from the plot that all is not quite right with the code. Rather than showing a clear correspondence between the number of eigenfaces used and the number of correct matches, the data instead jumps around quite a bit. It plateaus at only 3 correct matches for the entire interval from 8 to 26 eigenfaces, then jumps up to a maximum of 22 correct matches at 27 eigenfaces. This set of results makes it very hard to determine what the best number of eigenfaces might be. The lowest number that produces anything near a desirable level of accuracy is 27. Unfortunately, that is over 80% of the size of the training set and thus requires a fair amount of computation.

Recognition Errors for Question 2

With 10 eigenfaces, the recognition rate should have been around 79%, but with this implementation only 3 of the 33 faces were correctly matched. That is about a 9% success rate, which is terribly inaccurate.

Input face | false match, true match

Obviously, this match is pretty far off. The main similarity is that both images are roughly equally out of focus. The correct image actually shows up 9th in the sorted list of results. What's going on here?

One immediate observation is that the input face does not look very much like its true match. The girl has her face turned at a slight angle away from the camera, whereas the neutral image is taken from virtually straight on. The true match actually looks more like the false match than the input image does! It is possible that there is simply no way to reconstruct this face from the computed eigenfaces. It is also likely that, once the images are converted to grayscale, the skin tones are similar enough to cause recognition errors.

Input face | false match, true match

In this example, we see that even though the faces are not the same, there are still enough similarities between the two images to make the mistake not altogether unreasonable. Both people are wearing glasses and have similar skin tone, hair colour, and mouth shape. The true match actually shows up 3rd in the sorted list of matches, so the program is not really that far off. The input face also has a very extreme expression and is tilted downward.

Discussion

The results of this experiment were definitely not what I expected. My assumption was that the recognition rate would more or less increase with the number of eigenfaces used, eventually leveling off at some value. Since each eigenvector represents a direction in face space, including more eigenfaces should let the program capture more variation between faces when computing matches. The plot hints at this, in that the rate for fewer than 8 eigenfaces is generally low and the rate for 27 or more is generally higher, but the large number of instances where the recognition rate dives or plateaus at a very low value gives cause for concern.

Also worth noting is that while recording these results, I kept track of which faces were matched at each iteration. Interestingly, the subset of faces matched with a lower number of eigenfaces was not always included in the set of faces matched with a higher number. In other words, a face that is matched when, say, 5 eigenvectors are used may no longer be matched when the number of eigenvectors increases to 10. In theory, the eigenvectors should be sorted by decreasing eigenvalue, so each additional eigenvector captures less and less of the variance and should only refine the results. This leads me to believe that the program has a bug either in the sorting of the eigenface set, or in how the eigenfaces themselves are being generated. In fact, looking at the eigenfaces, it appears that every even-numbered eigenface is the same, which seems to bear this out. If that is the case, then we are really only working from n/2 + 1 distinct eigenfaces, which means we can construct considerably fewer image variations. This may help account for the low recognition rate.
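
One place such a bug can hide is the sort itself. As a minimal numpy illustration (not my actual code): eigh returns eigenvalues in ascending order, so forgetting to reverse them silently scrambles which eigenfaces come first.

```python
import numpy as np

def top_eigenfaces(covariance, k):
    """Eigenvectors of a symmetric matrix, sorted by decreasing
    eigenvalue, returned as the columns of the result."""
    vals, vecs = np.linalg.eigh(covariance)   # ascending order!
    order = np.argsort(vals)[::-1]            # largest eigenvalue first
    return vecs[:, order[:k]]

# Toy check on a diagonal matrix: columns for eigenvalues 5.0 and 3.0.
print(top_eigenfaces(np.diag([1.0, 5.0, 3.0]), 2))
```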

For the recognition experiment, it seems likely that variations such as turning the head or making extreme changes in facial expression would cause issues for the recognition algorithm. This may be especially true if, as mentioned in the previous paragraph, the number of distinct eigenfaces being generated is not actually equal to the number specified.

Experiment: Find Faces

Methodology

Experiment #1 involved cropping elf.tga to the best face in the image. This was less than successful...

elf.tga with 10 eigenfaces

Sadly, not quite the desired output. However, when I reran it using the 27 eigenfaces file, which returned the highest face recognition rate in the previous experiment, the program fared a little better:

elf.tga with 27 eigenfaces

This was supposed to be repeated with another portrait. I didn't have a photo of myself, so Google came in handy.

Original image, cropped

Finally, we were supposed to find the faces in two different photos:

Group image from class

Another group image with at least 4 faces

Yeup. Pretty broken...

Questions

I used min_scale = 0.45, max_scale = 0.55 and step = 0.01 for each of the above images. I tried a few different step sizes, but the output was still so far off that I felt it didn't make much sense to spend a lot of time fiddling with inputs to a non-working program.

Every attempt to find faces resulted in false positives. There are a couple of issues here. First, sorting doesn't work in my program, so it is a virtual guarantee that the "best" match returned isn't actually the best. Second, I didn't implement any way to weed out non-face matches (for example, by requiring candidate windows to fall within a range of skin-coloured pixel values, or some other heuristic).
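
For reference, the search I was aiming for looks roughly like the following sketch (hypothetical numpy, not my implementation; mse_at stands in for the projection-and-reconstruction score):

```python
import numpy as np

def find_best_face(image, mse_at, min_scale=0.45, max_scale=0.55,
                   step=0.01, face_h=25, face_w=25):
    """Scan face-sized windows over several scales of the image and
    return (mse, (x, y, w, h)) for the best-scoring window, with the
    box mapped back to full-size image coordinates."""
    best = (np.inf, None)
    for scale in np.arange(min_scale, max_scale + step / 2, step):
        h, w = int(image.shape[0] * scale), int(image.shape[1] * scale)
        if h < face_h or w < face_w:
            continue
        # Nearest-neighbour downsample to the current scale.
        scaled = image[(np.arange(h) / scale).astype(int)][
            :, (np.arange(w) / scale).astype(int)]
        for y in range(h - face_h + 1):       # scan the *whole* image,
            for x in range(w - face_w + 1):   # not just one half of it
                mse = mse_at(scaled[y:y + face_h, x:x + face_w])
                if mse < best[0]:
                    # Divide by the scale to map back to full size.
                    best = (mse, (int(x / scale), int(y / scale),
                                  int(face_w / scale), int(face_h / scale)))
    return best
```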

Discussion

This set of experiments was a bit odd, since my 10-eigenfaces file barely worked as specified. I chose to run the experiments with this file anyway to see if it would shed any more light on what was going wrong. I also ran findface() with the 27-eigenface file against the same images. The theory was that since face matching partially works, the program should be able to find at least one face, especially in the larger group images. Unfortunately, this wasn't the case. In addition to the findface() function just plain not working, I noticed a few additional issues. First, matches are never found beyond the halfway point in any of the images. Second, the results almost always overlap each other. The overlapping results point to a bug in how I implemented the check for overlapping matches. As for the matches "clustering" in one half of the image, there may be a mistake in how I scale the pixel positions of the found faces back to the full-size image.
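
The overlap check itself is just an axis-aligned box test, and it has to run on boxes already mapped back to full-size coordinates (positions divided by the scale), not on per-scale coordinates. A hypothetical helper:

```python
def overlaps(a, b):
    """True if two (x, y, w, h) boxes, both already in full-size
    image coordinates, overlap at all."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah
```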

Experiment: Verify Faces

Methodology

I used the set of cropped, nonsmiling images to compute 6 eigenfaces and a corresponding userbase, then wrote a script to verify the nonsmiling images against the corresponding smiling face and against a different person's smiling face.
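
Verification boils down to projecting the face and comparing its coefficients against the stored userbase entry. A minimal numpy sketch under that assumption (the project's actual interface may differ):

```python
import numpy as np

def verify(face, mean_face, eigenfaces, user_coeffs, mse_threshold):
    """Project a flattened face into face space and accept it when the
    MSE between its coefficients and the stored user's is small enough.

    eigenfaces  : k x (w*h) array, one unit-length eigenface per row
    user_coeffs : the claimed user's stored projection coefficients
    """
    coeffs = eigenfaces @ (face - mean_face)
    return np.mean((coeffs - user_coeffs) ** 2) < mse_threshold
```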

Questions

I started with an MSE threshold of 60000 and kept manually increasing it until I started getting matches. Since the MSE values being returned were so large, my search was basically trial and error. The threshold that correctly verified all the faces was a ridiculous 5.0e16, at the cost of 27 false positives when verifying against the set of different faces. For this set, the best threshold is somewhere around 5.0e15: 26 faces are correctly verified, while 15 are incorrectly verified. While still not great, that is a slightly better ratio than any of the others.
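
In hindsight, the trial and error could have been scripted: given the recorded verification MSE for every true pair and every impostor pair, the whole table falls out of one loop. A small sketch (same_mse and diff_mse are assumed arrays of those recorded scores):

```python
import numpy as np

def error_rates(same_mse, diff_mse, thresholds):
    """For each threshold, count false negatives (true pairs rejected)
    and false positives (impostor pairs accepted)."""
    same = np.asarray(same_mse)
    diff = np.asarray(diff_mse)
    return [(t, int((same >= t).sum()), int((diff < t).sum()))
            for t in thresholds]

# e.g. error_rates(same, diff, [1.0e13, 1.0e14, 1.0e15, 5.0e15, 1.0e16])
```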

Below is a table of the number of false negatives and false positives for each threshold I tried. At the threshold of 5.0e15, the false negative rate is 7/33 and the false positive rate is 15/33.

MSE       False Negatives   False Positives
60000     0/33              0/33
80000     0/33              0/33
100000    0/33              0/33
200000    0/33              0/33
1.0e13    32/33             1/33
1.0e14    26/33             4/33
1.0e15    17/33             12/33
5.0e15    7/33              15/33
1.0e16    3/33              18/33
5.0e16    0/33              27/33

Table of MSE threshold values

Discussion

It is evident from the table above that the calculated MSE values are coming out astronomically high. Matches aren't even identified until the threshold reaches the scale of 1.0e13. Hooray, another bug to look for in my code! My best guess is that I've forgotten to normalize some values somewhere, possibly in the projectFace() routine.
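
To see why that guess is plausible, here is a synthetic numpy demo (not the project's code): skipping the mean-face subtraction before projecting inflates the reconstruction error by orders of magnitude, the same flavour of blow-up as the thresholds above.

```python
import numpy as np

rng = np.random.default_rng(0)
base = np.full(625, 127.5)                      # stand-in "average face"
faces = base + rng.normal(0, 10, (33, 625))     # 33 flattened 25x25 faces

mean_face = faces.mean(axis=0)
_, _, vt = np.linalg.svd(faces - mean_face, full_matrices=False)
E = vt[:10]                                     # 10 unit-length eigenfaces

f = faces[0]
good = mean_face + E.T @ (E @ (f - mean_face))  # centred projection
bad = E.T @ (E @ f)                             # mean never subtracted
print(np.mean((good - f) ** 2))                 # small: tens
print(np.mean((bad - f) ** 2))                  # huge: tens of thousands
```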

This also sheds some light on another reason why the findFace() routine doesn't work. I have a hard-coded MSE in there that I picked before doing any significant testing, and it is orders of magnitude smaller than the values returned in this experiment. It is highly likely that this low MSE value is partially responsible for the findFace() function's consistent lack of results.

Bells & Whistles

Speedup

Hoping to make testing easier, I implemented the eigenface calculation speedup, or at least attempted it. The speedup allows the eigendecomposition to be performed on an NxN matrix, where N is the number of images in the training set.

Normally, every image of width w and height h is flattened into a vector of dimension w*h, and these vectors become the rows of an N x (w*h) data matrix A (after the average face is subtracted). The eigenfaces are the eigenvectors of the (w*h) x (w*h) matrix A^T * A, so even a set of tiny 25x25 face images would require the eigenface calculations to be performed on a 625x625 matrix: more than 390,000 entries!

The trick is that the nonzero eigenvalues of the small NxN matrix L = A * A^T are exactly the eigenvalues of the large matrix A^T * A, and the eigenvectors correspond directly: if v is an eigenvector of L, then A^T * v is an eigenvector of A^T * A with the same eigenvalue (it only needs to be rescaled to unit length). This is what allows us to run the decomposition on the NxN matrix instead. Conveniently, each (i, j) entry of L is just the dot product of the ith and jth images in the data set.
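
A compact numpy sketch of the trick, checking the mapped-up eigenvector against the direct computation (the shapes and variable names are mine, not the project's):

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.uniform(0, 255, (33, 625))   # 33 flattened 25x25 images
A = images - images.mean(axis=0)          # rows: mean-subtracted faces

# Small decomposition: L = A A^T is only N x N (here 33 x 33).
vals, vecs = np.linalg.eigh(A @ A.T)
v = vecs[:, np.argsort(vals)[-1]]         # top eigenvector of L

# Map up to an eigenface u = A^T v and rescale to unit length,
# without ever forming the 625 x 625 matrix A^T A.
u_fast = A.T @ v
u_fast /= np.linalg.norm(u_fast)

# Direct (expensive) computation for comparison.
big_vals, big_vecs = np.linalg.eigh(A.T @ A)
u_direct = big_vecs[:, np.argsort(big_vals)[-1]]

print(np.isclose(abs(u_direct @ u_fast), 1.0))  # True: same direction
```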