Matthew Barrie

Computer Vision (CSE 455), Winter 2012

Project 4: Eigenfaces

Project Abstract

Objectives

In this project, I created a face recognition system. My program uses Principal Component Analysis (PCA) to find a space of faces. This space is spanned by just a few vectors, which means that each face can be defined by a small set of coefficients weighting those vectors. The objective is that, after building a face space from a set of input face images, the program can detect and verify faces in new images.
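The core idea can be sketched in a few lines of NumPy. This is only an illustration with synthetic data, not the project's actual code: the "faces" here are random vectors standing in for flattened 25x25 cropped face images.

```python
import numpy as np

# Synthetic stand-in data: 33 "faces", each a flattened 25x25 = 625-pixel image.
rng = np.random.default_rng(0)
faces = rng.random((33, 625))

mean_face = faces.mean(axis=0)           # the average face
centered = faces - mean_face             # PCA works on mean-subtracted data

# SVD of the centered data gives the principal directions (the eigenfaces).
_, _, vt = np.linalg.svd(centered, full_matrices=False)
k = 10
eigenfaces = vt[:k]                      # top-k eigenfaces, one 625-vector each

# Any face is then summarized by k coefficients: its projection onto the eigenfaces.
coeffs = (faces[0] - mean_face) @ eigenfaces.T
reconstruction = mean_face + coeffs @ eigenfaces
```

A face of 625 pixels is reduced to 10 coefficients, and `reconstruction` is the best approximation of the original face within the span of those 10 eigenfaces.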

Challenges

This project had many challenges. Some of them came simply from implementing the complex equations required for PCA.

Lessons Learned

As a result of this project I have learned that it is best to implement things in the most straightforward way. Attempting to optimize early or to combine many complex steps led to more headaches than improvements.

Implementation

My implementation breaks all of the steps of PCA and face recognition into many different functions. This modularity lends itself to the reuse of functions and the expandability of the application. Functions for finding the average face and the eigenfaces were implemented following the formulas listed in the write-up and the lecture slides.
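One standard way to compute eigenfaces efficiently (and what the lecture-slide formulas typically suggest) is the small-covariance trick: with n training faces of d pixels each, where n is much smaller than d, eigen-decompose the n x n matrix A A^T instead of the d x d covariance A^T A. The sketch below uses synthetic data and my own names, not the project's actual function names:

```python
import numpy as np

# n training faces of d pixels each (synthetic stand-in data).
rng = np.random.default_rng(1)
n, d = 33, 625
A = rng.random((n, d))
A = A - A.mean(axis=0)                   # mean-subtract before PCA

small = A @ A.T                          # n x n instead of d x d
vals, vecs = np.linalg.eigh(small)       # eigh returns ascending eigenvalues
order = np.argsort(vals)[::-1]           # reorder to largest-first
vals, vecs = vals[order], vecs[:, order]

# Map each small eigenvector back to pixel space and normalize: an eigenface.
# (If u is an eigenvector of A A^T, then A^T u is an eigenvector of A^T A.)
k = 10
eigenfaces = (A.T @ vecs[:, :k]).T       # k x d, one eigenface per row
eigenfaces /= np.linalg.norm(eigenfaces, axis=1, keepdims=True)
```

The payoff is working with a 33x33 matrix rather than a 625x625 one; the resulting eigenfaces are orthonormal vectors in pixel space.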

The project is structured as follows:

Experiment: Recognition

Methodology

I used the command "main --eigenfaces 10 25 25 nonsmiling_cropped/list.txt eig10.face" to compute 10 eigenfaces from the nonsmiling_cropped input images. I then used "main --constructuserbase eig10.face nonsmiling_cropped/list.txt base.user" to construct a userbase of coefficient vectors. Finally, I used "for %%A in (smiling_cropped/*.tga) do (main --recognizeface smiling_cropped/%%A base.user eig10.face 1)" to attempt to match all of the faces. These three commands accomplish the required experiments.

To experiment with how the number of computed eigenfaces affects recognition, I wrote a script that ran the three commands above many times, using the mean face plus 1 through 33 eigenfaces in steps of 2. After running the long script I looked over the results to count how many faces matched and how many did not, and then used Excel to plot the data as a graph.
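The sweep can be sketched as a small script that builds the command lines for each eigenface count (actually running them requires the project's `main` binary, so this sketch only constructs the strings):

```python
# Eigenface counts used in the sweep: 1, 3, 5, ..., 33 (steps of 2).
counts = list(range(1, 34, 2))

commands = []
for k in counts:
    face_file = f"eig{k}.face"
    # Build the face space with k eigenfaces, then the coefficient userbase.
    commands.append(f"main --eigenfaces {k} 25 25 nonsmiling_cropped/list.txt {face_file}")
    commands.append(f"main --constructuserbase {face_file} nonsmiling_cropped/list.txt base.user")
    # The third step would loop --recognizeface over every smiling_cropped/*.tga image.
```

Each setting rebuilds the face space from scratch before attempting recognition, so the results at different eigenface counts are independent of one another.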

Questions

Question 1: The graph shows how the number of correct recognitions is directly related to the number of eigenfaces computed. As the number of eigenfaces increases, the number of correctly recognized faces also increases. The graph peaks at 11 eigenfaces, then dips slightly and shows little improvement as the number of eigenfaces grows further. This is likely because many of the smiling_cropped faces are too different from their nonsmiling_cropped counterparts, so additional eigenfaces do not help much.

The goal is to find the minimum set of eigenfaces that best recognizes all of the faces. You want to minimize the data you need to store by keeping only a few eigenfaces plus the coefficients that can reconstruct each of the input faces. The coefficients are vectors of length n, where n is the number of eigenfaces, so they take up little space. By storing fewer eigenfaces you save on the information computed and stored, but you will miss some matches. If you keep many eigenfaces you will recognize more faces but waste a lot of space, and if you keep one eigenface per input face it is little better than keeping all of the original faces and comparing them directly. For this data set the best number of eigenfaces is 11: it is a reasonably low number and gave the best results.
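Recognition in coefficient space amounts to a nearest-neighbor search: rank the stored users by the distance between their coefficient vector and the query's. A minimal sketch with synthetic coefficients (11 eigenfaces, as in the best setting above):

```python
import numpy as np

# Synthetic userbase: 33 stored coefficient vectors, 11 eigenfaces each.
rng = np.random.default_rng(2)
userbase = rng.random((33, 11))

def recognize(query_coeffs, userbase):
    # Rank stored users by squared distance to the query's coefficients,
    # best match first.
    errors = np.sum((userbase - query_coeffs) ** 2, axis=1)
    return np.argsort(errors)

# A query that is a slightly perturbed copy of user 7 (e.g. the same
# person with a different expression) should rank user 7 first.
query = userbase[7] + 0.01 * rng.standard_normal(11)
ranking = recognize(query, userbase)
```

This is also why a mismatch can still have the "correct" user ranked high in the sorted results: the ranking is over all users, not a single yes/no answer.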

Question 2: There were many mismatched images in step 3. Some of the mismatches seemed very reasonable: the mismatched image actually looked very similar to the input image. A good example of this is smiling-24 matched to neutral-18. These images have similar framing and look alike; they are shown and discussed in the discussion section. The correct match for these images did rank high in the sorted results but had a greater error than the mismatch.

Discussion

Results of Increasing the Number of Eigenfaces

Average Face

10 Eigenfaces

Matches vs. Eigenfaces Plot

Here is the plot of how increasing the number of eigenfaces increases the number of correct matches. It shows that a very low number of eigenfaces produces few matches. I found that 11 eigenfaces yielded the best results, finding 24 out of 33 faces. Adding more than 11 eigenfaces never gets above 24 matches. This is because of the input set and the set it is being matched against: some of the images are drastically different between the two sets and are hard to match, and some are very similar to other people's faces and so match incorrectly.

Mis-Match Results

Here is the correct match.

The above result is interesting because, at a glance, these are two images I might match myself. They look very similar to me: the composition and the expression in the two images are very alike. It seems very reasonable that these two images matched, and that they matched better than the correct answer did. The correct answer differs more because the mouth is more open, which drastically changes the expression and makes it harder to match. If the input set had included more smiling images, the correct match might have been found.

Experiment: Find Faces

Methodology

To do this experiment I created the 10 eigenfaces first. Then I used findFace with crop=true to crop the face out of elf.tga. I then found images of myself online and attempted to crop them, trying various scales and steps. Next I ran a script looking for the best scale to use for the sample group images and tested that I could find all 3 faces in the group1.tga photo. Finally, I tried finding faces in some photos of my family.
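The search itself is a sliding window: at each position (and, in the real program, at each scale between min_scale and max_scale), score the window and keep the best one. The sketch below simplifies this to a single scale and scores each window by MSE against a planted template; the actual project scores windows by their distance to the face space after projecting onto the eigenfaces.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def find_face(image, template):
    # Slide the template over every position; return the (row, col)
    # with the lowest MSE, plus that error.
    h, w = template.shape
    H, W = image.shape
    best_err, best_pos = np.inf, None
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            err = mse(image[r:r + h, c:c + w], template)
            if err < best_err:
                best_err, best_pos = err, (r, c)
    return best_pos, best_err

# Tiny synthetic check: plant the template at (3, 5) inside a noise image.
rng = np.random.default_rng(3)
image = rng.random((20, 20))
template = rng.random((5, 5))
image[3:8, 5:10] = template
pos, err = find_face(image, template)
```

Scanning every position at every scale is why the choice of min_scale, max_scale and step matters so much for run time in the experiments below.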

Questions

Test 1: Elf.tga

Test 2: My Wife and I in our car

Test 3: group2.tga

Test 4: Family Photo

Test 5: Large Group Photo

Question 1:

For Test1 I used: min_scale = .45, max_scale = .55, step = .01 .

For Test2 I used: min_scale = .2, max_scale = .3, step = .01 .

For Test3 I used: min_scale = .3, max_scale = .8, step = .1 .

For Test4 I used: min_scale = .2, max_scale = .3, step = .05 .

For Test5 I used: min_scale = .1, max_scale = .2, step = .02 .

As I experimented with different testing scales for the group images, I found that most faces would match between .5 and .7. Because all of the faces are at roughly the same distance from the camera, this min and max worked for all of the group photos.

Question 2: My attempts did find false positives and false negatives. In the photo of my family there are both. I think the false negatives come from the fact that I am the only one in my family whose face is in the input data set that the eigenfaces are computed from, so it is harder to match the others' faces. The family members in the photo also span a range of ages, while the input photos are mostly of college-age students, so faces of under-represented ages can be missed. The angle of the faces in the group photo also led to false negatives, as did the difference in lighting between my family photos and the input photos: the lighting across the input photos is fairly uniform, so faces under different lighting are harder to match.

I found some false positives in the family image and in some of the group shots. The detector seems to favor things that are vertically banded: if there is a dark band above lighter bands, it often identifies this as a face. I think this is because many of the input faces have dark hair, so a dark band on top matches hair. Roughly symmetrical shapes also match, because the input faces are roughly symmetrical. Below I have posted an image from the group shots that has a false positive on the jacket of the girl on the left.

False Positive: group2.tga

Discussion

The results of this experiment are about what I expected. I expected the program to do a fairly good job of matching faces that were in the input set, even when checked in a group photo, and I expected faces of people who were not in the input set to be somewhat harder to match.

Experiment: Verify Faces

Methodology

To do this experiment I first created the 6 eigenfaces using the cropped non-smiling faces. Then I created a userbase of the images. Then I went through and tried verifying various images against themselves, and tried verifying whether they would pass as other students. I tried many different thresholds until I found a good cutoff on the MSE.
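Verification reduces to a single threshold test: accept the claimed identity when the MSE between the query's coefficient vector and the stored user's is below the cutoff. A minimal sketch (the coefficient values here are made up for illustration; the real ones come from projecting onto the 6 eigenfaces):

```python
import numpy as np

def verify(query_coeffs, stored_coeffs, max_mse):
    # Accept when the mean squared error between the coefficient
    # vectors is at or below the threshold.
    diff = np.asarray(query_coeffs) - np.asarray(stored_coeffs)
    return float(np.mean(diff ** 2)) <= max_mse

stored = [10.0, -4.0, 2.5, 0.0, 7.0, 1.0]   # 6 stored coefficients (made up)
same   = [10.5, -3.8, 2.0, 0.4, 6.5, 1.2]   # same person, different expression
other  = [-2.0, 9.0, -5.0, 3.0, 0.5, 8.0]   # a different person

accepted = verify(same, stored, max_mse=1.0)    # genuine pair
rejected = verify(other, stored, max_mse=1.0)   # impostor pair
```

Raising max_mse admits more genuine-but-different images (fewer false negatives) at the cost of admitting more impostors (more false positives), which is exactly the trade-off explored below.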

Sample Outputs

True Positive

False Positive

False Negative

Questions

Question 1: I started at 1000 and tried working my way down, but I found that 1000 was already far too low; I had to go way up, toward 80000. I found that 65000 produced the fewest false positives and false negatives in my testing. I wrote a script that tried to verify every smiling_cropped/smiling* image against every neutral* image. When there was not a drastic difference between the smiling and the neutral image, an MSE threshold of 65000 accurately matched most of the faces, although it still produced a few false negatives and false positives. Thresholds lower than 65000 missed many true positives.
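The threshold sweep described above can be sketched as follows. The MSE values here are made-up stand-ins (loosely in the range reported in this section), just to show how true positives, false negatives and false positives are tallied per threshold:

```python
import numpy as np

# Made-up MSE values standing in for the script's real output:
# genuine = smiling image vs. the same person's neutral image,
# impostor = smiling image vs. a different person's neutral image.
genuine_mse  = np.array([22000, 31000, 48000, 64000, 70000, 90000])
impostor_mse = np.array([30000, 58000, 66000, 95000, 120000])

def tally(threshold):
    tp = int(np.sum(genuine_mse <= threshold))   # genuine pairs accepted
    fn = int(np.sum(genuine_mse > threshold))    # genuine pairs rejected
    fp = int(np.sum(impostor_mse <= threshold))  # impostors accepted
    return tp, fn, fp

results = {t: tally(t) for t in (30000, 65000, 80000)}
```

Sweeping the threshold and picking the value that balances fn against fp is how 65000 was chosen.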

Question 2: Using an MSE threshold of 65000 I found 20 true positives, 12 false negatives and 12 false positives.

Discussion

These results were a little surprising. Because of the vast variation between people's faces in the neutral set compared to the smiling set, a high MSE threshold was needed to pass so many true positives. Only a few faces matched their smiling counterpart with a low MSE of around 20000-30000, and a few faces matched other people's faces (false positives) with an MSE around 30000. This variation is due mainly to the image sets and to people turning their faces or having similar features, like hair, glasses or facial hair.