Computer Vision (CSE 455), Winter 2003

Project 4: Eigenfaces

Assigned: Thursday, Feb 27, 2003
Due: Wednesday, March 12, 2003 (by 11:59 pm)

Synopsis

In this project you will create a facial recognition system. The program reduces each facial image to a vector, then uses Principal Component Analysis to find the space of faces. This space is spanned by just a few vectors, which means each face can be defined by just a set of coefficients weighting these vectors.

You are given a skeleton program that provides a sophisticated command-line interface as well as most of the image and vector classes and operations you will need. Your job will be to fill in the functions that do PCA, projection into face space, determining if a vector represents a face, verifying a user based on a face, finding a face match given a set of user face information, and finding the size and position of a face in an image.

Principal Component Analysis

PCA is a technique by which we reduce the dimensionality of data points. For example, consider the space of all 20 by 30 pixel grayscale images. This is 600 dimensional space because 600 data points are required to represent the values of the 600 pixels. But suppose we only consider images that are valid faces. Such images may lie in a small subspace of the set of all images. If this subspace is a hyperplane spanned by k vectors, then we can represent any point that lies on this plane with only a set of k coefficients.

To use PCA to do pattern recognition you first take a representative sample of data points and find the subspace with the desired dimensionality that best fits the sample data. This step needs to be done only once. Once it has been done, you can determine if a new point is a valid data point by projecting it onto the subspace and measuring the distance between the point and its projection.

To use PCA for facial recognition we must represent each face image as a vector of pixel values. To generate this vector the face image must be cropped and scaled, and its intensity must be normalized. The cropping can either be done by hand or by searching images for faces. This second option is only possible once the face recognition algorithm has already been trained with some input faces.

Input Data

The input data you will be using is divided into several sets. They are included as folders in the skeleton code zip file:

class_smiling_cropped: Already cropped images of smiling vision students, taken in class.
class_nonsmiling_cropped: Already cropped in-class images of vision students wearing their normal sullen expressions.
class_ugrads_cropped: The images of most of the students in the class from the undergrads database, already cropped.
ugrads_cropped: The first (alphabetically) hundred undergrads, already cropped.
ugrads_uncropped: 391 uncropped undergrads. You can try cropping some (or all) of these using the face finding algorithm you will write.

Skeleton Code

eigenfaces_skeleton.zip: Skeleton code
files.zip: All the images (66.6 MB). You should unzip them and put them in the files folder of the skeleton code.

The skeleton code is large, but please take the time to get some familiarity with the classes and methods it provides. There is a lot of useful functionality included, like vector and image operations, which would take a lot of time to write yourself. The program uses a simple command line interface, and it is simple to add your own functionality in case you want to tackle some extra credit, or add commands to facilitate your experimentation. At the minimum you will only need to modify two files, faces.cpp, and eigfaces.cpp, but you will still need to understand the contents of most of the other classes.

Classes

Here is a description of the classes in the skeleton code. You will need to read this page to understand the organization of the code. It also gives information on numerous methods that you will need to call in your code, so please read it very carefully.

Functions to code

All the required work can be done in eigfaces.cpp and faces.cpp. The functions you must fill in are members of classes Faces and EigFaces. A Faces object stores a set of user faces loaded from images. EigFaces is derived from Faces. An EigFaces object stores a set of eigenfaces as well as the average face.

Most of the methods you will have to implement are called from main.cpp, so you should study main.cpp closely to see how these methods are used.

Each of the items below have an upper bound on the number of lines of code needed, not including comments. If you have many more lines of code for a given step than the upper bound, then you may be doing something wrong, or at least making more work for yourself. Be sure to look in vector.h, image.h, and face.h to see what useful functions are already provided for you.

Note that points will not be taken off for making your code longer than necessary, as long as it is correct and well coded. However, you should try your best not to go too far over the upper bound, as it means unnecessary work for you and will make your code more difficult to read and harder to grade.

Implement Faces::eigenFaces. This method uses PCA to generate eigenfaces from the set of user faces. It takes two parameters, n, the number of eigenFaces to compute, and results, an EigFaces reference parameter where the n computed results should be stored. Recognition slides 17 to 19 show how this computation is done. Use Jacobi::jacobi() to find the eigenvectors, almost exactly like you did in project 2. This time instead of just looking for one eigenvector, you are looking for the top n eigenvectors, so you will need to sort the eigenvectors by eigenvalue. The skeleton provides a function sortEigenValues which sorts the eigenvalues and returns an ordering so you can find the positions of the top eigenvectors in the matrix. However, you can make your own sort routine if you prefer. Once you've computed the ordering, put the top n eigenfaces into results.

Hints: You will need to store the average face in the results by calling setAverage. The matrixes and vectors jacobi uses always start at one, so you will need to adjust for this when going between these and the Vector objects used everywhere else in the program.

Lines required: 36
Implement EigFaces::projectFace. This method projects a face onto the face space to generate a vector of k coefficients, one for each of the k eigenfaces. The parameters are Face, the face to be projected, and coefficients, a reference parameter in which you must store the coefficients computed.

Lines required: 5
Implement constructFace. This method constructs a face from a vector of coefficients. The coefficients are given in a parameter and the constructed face should be stored in the reference parameter result. You should use only the first k coefficients to produce the reconstruction.

Lines required: 6
Implement isFace. This method decides whether an image is a face. It works by projecting the face onto face space and computing the MSE between projection and original. The parameter max_reconstructed_mse is the threshold to use in making the decision. Please return the computed MSE in the reference parameter mse.

Lines required: 7
Implement verifyFace (extra credit). This method decides if a face is a given user's, given coefficients computed from a different face image from that user. It works by computing the MSE between coefficients computed by projecting the face onto face space and the user's coefficients. Return the MSE in the mse reference parameter

Lines required: 4
Implement recognizeFace. This method is like verifyFace, but instead sorts the userbase by closeness of the match with the face. The userbase is simply a set of names matched with coefficient vectors. Set the mse for each user in the userbase, then sort it. The best match will then be the first user in the userbase.

Lines required: 8
Implement findFace. This method searches an image and finds the first n best faces. A struct FacePosition is included to help you out. You can use std::list to keep a list of the top best face positions. Whenever a position is found that is better than the last position in the list, insert it into the proper place in the list such that the list stays sorted, and remove the last element in the list. The exception is if there is already a face position in the list that overlaps the one you just found. In that case you must choose the one with the better face and throw the other away.

If the parameter crop is true, then you should return the image cropped to the best face in result. Note that this result must be the same scale and resolution as the original image. If crop is false, then draw bright green boxes around each found face in the original image and return this in result. You can assume that crop will only be true when n is one.

The function should search various scales by scaling the input image for every scale between min_scale and max_scale (simply call the resample method to scale an image). The step parameter controls the distance between scale factors searched. For example, with arguments min_scale=0.1, max_scale=0.5, and step=0.05, the image scales searched will be 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45 and 0.5. You should return the best x and y face position and scale in the reference parameters provided.

Hints: To minimize processing time, only consider faces completely inside the image. Put the scaling in the outermost loop so you only have to resample the image once for every scale factor. You might want to construct and output a debug image that shows the score for every pixel position. Since your scores probably won't happen to be between 0.0 and 255.0, normalize the debug image before you save it.

You may have to do some special processing to prevent the algorithm from getting fooled by low-texture areas or areas close to face space but far from the facial mean. I found that multiplying the MSE by the distance of the face from the face mean and then dividing by the variance of the face works pretty well. Maybe you can come up with something better than this for extra credit.

A good debugging technique is to find the correct face scale yourself, and then run the search for only that scale. Once you get it working like that, then move on to getting your code working for a range of scales.

This is a complicated problem, so don't expect to get perfect results. As you can see, I'm the only CSE 142 TA without a face:

Lines required: 78

Hints, Tips, Requirements, Warnings

Here are general tips, requirements, and things to watch out for that may come in handy as you are coding your project or running your experiments.

See the usage function in Main for a list of commands and their parameters. Or you can just run the program with no arguments and this usage info will be output. There are examples of every command listed in ReadMe.txt. If you add any of your own commands please modify the usage info and ReadMe.txt.
Use eigenface width and height of 25 by 25 or greater for your experiments. You can use smaller eigenfaces such as 15 by 15 for debugging so you won't have to wait a long time. Note that if the eigenfaces you use are too large, Jacobi will run too slowly to get results in any reasonable number of days (or even years!).
Always resize a Face to the proper size before loading a targa file into it. You don't need to worry about this when loading into an Image, as images just resize automatically to the size of the loaded image.
If you do anything non-obvious in your code, comment it. Otherwise your TA won't be able to figure out what you were trying to do and won't be able to give you partial credit if it has a bug.
To aid you in debugging it is helpful to output debug images that you can look at and verify that the image contains what you expect. Use the saveTarga function to save Image or Face objects to file. For images such as eigenfaces which may not have pixel values arranged neatly from zero to 255, you will have to normalize to between 0.0 and 255.0 first by calling the normalize method.
The working directory when running from inside Visual Studio is eigenfaces\files. Thus any filename arguments you give should be relative to this directory. Of course you can change the working directory if you want to.
It is recommended that you don't try computing eigenfaces or doing face detection in debug mode, your code will run too slowly. If you need to debug something you might want to modify the code to skip to the point you want to debug so that you don't have to wait too long.
Your turned-in executable MUST be compiled in Release mode, NOT debug mode!

Experiments

You must complete all of the following experiments to test your code. The results of each experiment, including the answers to all the questions, should be written in your Write-up, along a discussion of why you think the experiment turned out the way it did, as well as any unexpected results.

Note that some of the experiments may require you to run a command over and over again. One way to accomplish this is to use a batch file. Or, if you wish, you can easily add your own command that takes the place of a string of commands. See the description of class Main above to find out how to do this.

Testing recognition with cropped class images

Procedure

Use the cropped, non-smiling students (in class_nonsmiling_cropped) to compute 7 eigenfaces.
Use the same set of images to compute a userbase.
Have the program recognize the cropped, smiling student images (in class_smiling_cropped) in the smiling userbase.

Questions

How many faces did the program recognize correctly? Incorrectly?
For instances where the program was wrong, what was the average position of the correct answer in the list of closest matches?
If you recognize a female face, will the second, third and fourth matches usually be female? How about for male faces? Give supporting data.
In a real-life scenario with thousands of users would you use the entire user set to compute the face space? Why or why not?
Why might it be better to use a face set independent of the user set to compute the eigenfaces?

Recognizing the undergraduate faces

Procedure

Use the cropped undergraduate students (in ugrads_cropped) to compute 12 eigenfaces.
Use the small set of class undergraduate images (in class_ugrads_cropped) to compute a userbase.
Have the program recognize the cropped, non-smiling student images in this userbase.
Repeat your experimentation with 5, 10, 15, and 20 eigenfaces. Remember to regenerate the userbase each time.
Why is it best to use as few eigenfaces as possible while still getting good results?

Questions

Of the students in the class set who are also in the class undergraduate set, how many did the program recognize correctly? Incorrectly?
Are the incorrect identifications reasonable? Do they look similar to the actual person? Give some example images.
Why does the program perform more poorly in this recognition task than the previous one? Give at least three reasons.
How did changing the number of eigenfaces used change your results? What number worked best?

Cropping the undergraduate faces

Use the eigenfaces file computed in the previous problem
Use your program to crop at least ten of the uncropped undergraduate images
Experiment with min_scale, max_scale, and step parameters to find ones that work robustly and accurately without taking more than 10 seconds or so to run on each image

Questions

What min_scale, max_scale, and scale step did you end up using?
How many of your crop results look correct (cropped to the same part of the face as the pre-cropped images)? How many look incorrect?
What is the problem with using a min_scale that is too small?

Finding faces in a group photo

Procedure

Find two group photos, one of family or friends and one from the web of TV or movie characters, or famous people. Each image should have at least four faces.
Use your program to find the faces in the photos (use the crop=false option)
Does the program tend to think any non-face items in the image are actually faces? Give some examples.

Questions

Show the results of the face detections
What min_scale, max_scale, and scale_step did you use for each image?
If there are any errors, explain why the program might have failed and how you could improve the input or the algorithm to correct this.

Write-up

Your write-up should be an HTML document. Besides describing the results of your experiments and answering the questions, you must also include sample output images. Make sure any images that need it are normalized so they can be seen clearly. As a minimum you should include:

The eigenfaces created from class images and the undergraduate images along with the average faces for these sets.
Your two images showing the results of face detection
Some results of successful and unsuccessful automatic image cropping attempts
Some examples of image recognition attempts that failed (if there were any) and some that succeeded.

Feel free to put your write-up in multiple files if you want, but make sure that they are all accessible via links from the main page.

Extra Credit

Implement morphing. Make a video from your morphed faces using Adobe Premier. Try morphing an image away from another image instead of towards it.

Try using eigenfaces to recognize images other than faces or to recognize animal faces.

Experiment with using different numbers of eigenvectors. Find the minimum number required to still get good reconstruction results.

Implement verifyFace as described in the list of methods to implement above. Perform the following experiment:

Procedure

Use the cropped, non-smiling students to compute 6 eigenfaces.
Use the same set of images to compute a userbase.
Have the program verify the cropped, smiling student images in the smiling userbase. Test by verifying each student against his or her smiling face, as well as each student against a smiling face which is of someone else.
Experiment with various thresholds to find the one that gives the fewest false positives and fewest false negatives

Questions

What MSE thresholds did you try? Which one worked best? What search method did you use to find it?
Using the best MSE threshold, what was the false negative rate? What was the false positive rate?
In a real-life verification scenario, why might it be better to have a low false positive rate than a low false negative rate?

Replace Jacobi with a routine that is fast for the huge matrices we are using. Note that since we're only using the first few eigenvectors it is a waste to compute all the others as well. You can look into using routines that only compute the eigenvectors with the largest eigenvalues. A good starting point is this copy of Numerical Recipes in pdf. You can copy and paste C code directly from the pdf files. Show some results of computing nice large eigenfaces! How big can you reasonably get before things get too slow to be practical again?

Try using PCA to do voice recognition. Talk to Dewey if you would like to borrow his voice recognition book, or a copy of the TIMIT speech database for testing. Credit will be assigned based on the quality of the result and amount of effort put into the project. Dewey did voice recognition work at JPL, so ask him if you have questions.