The purpose of this project was to create a facial recognition system that uses principal component analysis (PCA) to identify faces.
The most challenging parts for me were constructing the covariance matrix, understanding that the eigenvectors computed were
in fact the eigenfaces we needed, and implementing the findFace method.
It was interesting to learn about principal component analysis as a way to characterize a distribution of data points and to use
the results to test whether other data fall within the range we want.
The project is structured as follows:
-
Faces::eigenFaces - Given a set of training face vectors, this method computes the covariance matrix and finds its eigenvectors, which
are the principal components of the distribution of faces. It then sorts the eigenvectors by their eigenvalues and saves the n
eigenvectors with the largest eigenvalues, since these capture the most variance in the training data. These n eigenvectors are our
eigenfaces.
-
EigFaces::projectFace - With the eigenfaces computed, this method takes a face and computes the coefficients that weight each
eigenface in the linear combination of eigenfaces that represents the given face.
-
EigFaces::constructFace - Given the coefficients from projectFace, this method reconstructs a face by representing it as a linear
combination of the average face and the eigenfaces.
-
EigFaces::isFace - Given a face, this method uses projectFace to project it onto face space and then uses constructFace to
reconstruct it. It then computes the mean squared error (MSE) between the given face and the reconstructed face.
If the error falls within an acceptable range for the distribution of faces, the input is determined to be a face.
-
EigFaces::verifyFace - The intended use of this method is to determine whether a face belongs to a given user. It projects the
given face onto face space and calculates the mean squared error between the resulting coefficients and the user's previously
computed coefficients. If the MSE falls below a certain threshold, the given face is considered to belong to the user.
-
EigFaces::recognizeFace - This method extends verifyFace: instead of checking only one user to verify a face, it
searches through the entire user database, using verifyFace to calculate the MSE between the given face and each user. It then
returns the top n users with the smallest MSE, i.e. the users that most closely match the input face.
-
EigFaces::findFace - Finally, there is findFace, which is tasked with finding n faces within an image. It scales the
image based on user input and then iterates through each pixel, calling isFace on the subimage at that position. Depending on what
the user has specified, it returns a cropped image or an image with the faces marked.
Realizing the large number of experiments that needed to be run, I wrote a bash script that called eigfaces to compute 10
eigenfaces and then to compute from 1 to 33 eigenfaces in steps of 2. At each step, the script then computed the associated userbase.
Next, the script called recognizeFace on each smiling user and directed the output to text files. To count the number of correctly
identified users, I wrote a Java program to parse the text files and return a count of correctly recognized faces.
-
From the graph, I see a sharp drop-off in correctly recognized faces when the number of eigenfaces is below five.
Above five, the results are largely on par with each other for each number of eigenfaces. Interestingly, there is a slight
drop between 15 and 20 eigenfaces. I would have expected an overall positive trend as the number of eigenfaces increases, with maybe
a slight drop toward the high end, since including more eigenfaces captures more variation, which should allow more
room for faces to be identified. Given these results, it's hard to determine what number of eigenfaces is best to use.
-
Below is one of the recognition errors I had, where the program tried to identify smiling-12 and gave back neutral-20 with an MSE of
27944.4. Looking at the original images, the error looks somewhat reasonable given the hairline, similarly shaped eyebrows, and
similarly shaped eyes. Looking at the reconstructions, it becomes a little more plausible that these two images could have been
mistaken for one another when you look at the facial features I mentioned earlier. On the bright side, when I tried recognizeFace
with different numbers of eigenfaces, the correct person would usually be in the top three results.
Here is another example recognition error, between smiling-15 and neutral-1, but this one has a much higher MSE of 88884.
This one looks less reasonable than the first error, and the higher MSE reflects that. We can see some similarities in the projections,
like the appearance of a beard, but not much else. In this case, the top few results returned by recognizeFace didn't change much,
but the correct face usually did not even show up in the top 15 results.
As I mentioned above, I would have expected the graph to show an overall positive trend with maybe a slight drop toward the end.
Since the eigenfaces represent the variance of the training data, I think that with too few eigenfaces each face needs to resemble the
average face in order to be recognized, whereas with too many eigenfaces there is too much variation to accurately identify faces.
Further tests would probably be needed. There was a case where I was trying to run findFace on the elf.tga image. At 10 eigenfaces,
it was able to correctly identify the face of the baby. On the other hand, at 30 eigenfaces, it found an arm instead. With too many
eigenfaces, smooth faces may actually start looking less like faces to the algorithm. So choosing a good number of eigenfaces
can be difficult and would depend on a number of factors, like the size of your training set, how representative the training faces
are, and how well they are normalized to line up facial features. Regarding my anecdote, I think it would be worth running further
experiments with findFace to measure how many faces it can find at each number of eigenfaces, and if I had more time, I would have done so.
On the topic of my false matches, I think the second example shows some limitations of the algorithm. The smiling face has a head that's tilted
upward and a gaping mouth that takes up most of the image. A face made to look less like a typical face would fail under this
algorithm.
A plot of the number of correctly identified smiling faces for several different numbers of eigenfaces.
For the elf.tga picture, I ran findFace with the parameters as directed. For the image of myself and the group photos, I ran a script that executed findFace
at individual scales so that I could get an idea of which scales produced correctly identified faces. After getting a rough idea of the range of scales,
I ran findFace again on each image, this time using a range of scales.
I also tried the technique suggested in the write-up where I multiply the MSE by the face's distance from the average face and divide by the variance. In
all of my cases, this produced results that were worse than or on par with not using the technique. In the following results, I'll present each image,
answer the questions, and discuss what I think may have gone wrong.
-
Here are the results from running findFace on elf.tga using the directed parameters. The algorithm was able to find and crop the baby's face as expected.
I also tried marking the image to find more faces, but it never found the man's face. What usually resulted was the marking of an arm or a shirt. It seems possible
that his slightly tilted head, glasses, and smile drastically affected how his face was projected onto face space.
-
For the image of myself, I was never able to find my face, no matter what scale or number of eigenfaces I tried. This image was produced using 10 eigenfaces
with a scale that ranged from .4 to .6 with a step of .01. The non-face actually looks like a face in the projection, probably because of the average face that's
added in. Unfortunately, I think there may be a bug in my findFace algorithm and how it orders the results. If I run isFace on the non-face image and the
face image, the face image has a lower MSE and should have been picked over the non-face. This must be a corner case, because it didn't show up in my other tests.
-
Finally, here are the results for the group photos. For the in-class photo, I used a scale from .7 to .8 with a step of .01. For the group photo of Microsoft
employees, I used a scale from .5 to .6 with a step of .01. With the in-class photo, I was only able to find all three faces when I expanded the number of
marked faces to four instead of three; a shirt somehow took precedence over my own face in the group. I think this may be related to the bug
I mentioned in the previous discussion. The Microsoft photo was fairly successful, finding seven of the eleven faces.
To accomplish this experiment, I first had to compute six eigenfaces and then generate a userbase using those eigenfaces. Next, I wrote a bash script
to automate execution of the program with the verify-face flag. There were two loops. The first verified each non-smiling face against its smiling
counterpart. In the second loop, I tried verifying each student against a smiling face that was not his/her own. The bash script also redirected all
the output from the executable to a file.
I now had several text files that needed parsing, so I wrote a Java program that read the output of the executable and counted whether a given face
was said to match a smiling face. So the steps I took were to decide on a threshold, execute the script with that threshold, and parse the files.
-
I began with a threshold of 60000, executed my script, and parsed the files to get the count of matches. I made the Java program
also output the minimum MSE that was above the threshold used, and I used this value and the number of matches to inform my choice
of the next threshold: that value would allow at least one more face to be correctly identified, which I had to weigh against
the new false positives the higher threshold would let in. Overall, I tried thresholds of 47000, 50000, 60000, 70000, 80000, 90000, and 100000.
-
There were actually two thresholds that appeared to work fairly well. At 80000, I had 7/33 false negatives and 1/33 false positives. Increasing
the threshold to 90000 resulted in 2/33 false negatives and 5/33 false positives.
As you increase the MSE threshold, you allow more room for error between the projected non-smiling and smiling faces, so this experiment was intended to find
a balance between the number of false negatives and false positives. As you can see from the graph below, increasing the threshold has the expected
result of decreasing the number of false negatives while allowing more false positives to creep in. Some pairs are genuinely hard for the algorithm to tell apart.
The strongest false positive I had was between neutral-33 and smiling-3, which had an MSE of 28645.6. If you look at their reconstructions, they're
very similar, and they actually look similar to the average face as well.
Above you can see the results of all the thresholds I tried.
From left to right: average face, smiling-3, neutral-33, reconstruction-3, reconstruction-33. The reconstructions are quite similar and one can see
why the algorithm might identify them as being the same person.