CSE557:

Morphing

 

By Michael Eckert and Mirco Stern

 

Instructor: Brian Curless

 

INTRODUCTION

 

For our final project, we developed an application to create a morph effect between two still images. We then further enhanced our application to allow a morph between two videos.

 

Here’s an example of how we can improve Mirco’s looks within a few seconds:

click to play video 

 

After achieving these promising results, and motivated by the cool effect, we decided to enhance our program to handle morphing between two videos.

 

For everyone who missed the famous Michael Jackson video “Black or White”, here’s a remake of the best part of the video using much better actors, of course:

click to play video

 

Creating these morphs has been a lot of fun. Below we give some details about the algorithm we used and its implementation. We then describe the user interface of our program and some of its features. Finally, we end the boring part of this text by showing our amazing results.

 

Algorithm

 

Our implementation follows the morphing algorithm Beier and Neely describe in [1]. To morph between two images, the user has to identify corresponding features in the two images; these correspondences are specified as user-supplied pairs of feature lines, as shown in Figure 1.

 

Figure 1

 

Beier and Neely’s method divides morphing into two subproblems. The first component distorts a single image according to a set of feature line pairs: the image is warped so that the original feature lines align with their counterparts in the target set.

To obtain a morph between two input images, we distort both toward a common intermediate shape and then simply cross-dissolve them.

The second component defines this intermediate shape by interpolating between the two sets of feature lines, yielding the target lines used to distort each input image.

We first briefly review these two components in the order the morphing process actually uses them, i.e. starting with the second. Then we turn to some of the interesting implementation issues.
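To make the overall flow concrete, here is a minimal per-frame sketch in Python/numpy. It is only a sketch, not our actual code; interpolate_lines and warp stand in for the two components and are themselves sketched in the following sections.

```python
import numpy as np

def morph_frame(img_a, img_b, lines_a, lines_b, t):
    """Produce one morph frame at time t in [0, 1] (sketch only)."""
    # Second component: interpolate the feature lines to define
    # the intermediate shape (see "Feature Line Interpolation").
    lines_t = interpolate_lines(lines_a, lines_b, t)
    # First component: distort each input image so that its
    # feature lines align with the interpolated ones.
    warped_a = warp(img_a, lines_a, lines_t)
    warped_b = warp(img_b, lines_b, lines_t)
    # Cross-dissolve the two distorted images (float images assumed).
    return (1.0 - t) * warped_a + t * warped_b
```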

 

Feature Line Interpolation

 

We implemented two different methods for interpolating feature lines. The first one is a simple linear interpolation of the start and end points of each line pair. Although very easy to implement, this approach has a major drawback: if corresponding feature lines undergo a significant change in orientation, their interpolated versions can be heavily contracted, so a distortion using this interpolation contracts unnaturally.

To avoid this artifact, Beier and Neely suggest an alternative way to interpolate the feature lines: instead of interpolating the endpoints, we interpolate each line’s center point, orientation, and length.

In practice, we found that cases in which the orientation of features changes drastically between the two input images are rare, so the difference between the two interpolation methods is not very significant.
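For illustration, here is a sketch of both interpolation schemes in Python/numpy (endpoints as 2D numpy arrays). The handling of the angle wrap-around is our own assumption; [1] does not spell it out.

```python
import numpy as np

def interpolate_endpoints(p0, p1, q0, q1, t):
    """Naive method: linearly interpolate the two endpoints.
    Lines that rotate a lot shrink toward their midpoint."""
    return (1 - t) * p0 + t * q0, (1 - t) * p1 + t * q1

def interpolate_center_angle_length(p0, p1, q0, q1, t):
    """Beier-Neely's alternative: interpolate center, orientation, length."""
    ca, cb = (p0 + p1) / 2, (q0 + q1) / 2
    da, db = p1 - p0, q1 - q0
    ang_a, ang_b = np.arctan2(da[1], da[0]), np.arctan2(db[1], db[0])
    # Take the shorter way around the circle (our assumption).
    dang = (ang_b - ang_a + np.pi) % (2 * np.pi) - np.pi
    ang = ang_a + t * dang
    length = (1 - t) * np.linalg.norm(da) + t * np.linalg.norm(db)
    c = (1 - t) * ca + t * cb
    half = 0.5 * length * np.array([np.cos(ang), np.sin(ang)])
    return c - half, c + half
```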

 

Distortion Algorithm

 

The distortion algorithm takes as inputs the original image, the original image’s feature lines, and the corresponding feature lines that we want to match. It then decides, for each pixel in the output image, where to sample from in the source image. This process is also called backward mapping, as we iterate over the output image’s coordinates.

Beier and Neely’s basic idea is to calculate a point’s position relative to a feature line. Expressing the same relative position with respect to the corresponding feature line in the source image gives us the coordinates to sample from.

For multiple feature lines, the algorithm blends the displacements the individual lines suggest for a point, weighting each line by its length and its distance to the point. We will not reproduce the complete description here; it can be found in [1]. The section reviewing Beier and Neely’s approach in [2] is also helpful.
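For reference, a sketch of the multi-line point transformation from [1] in Python/numpy. The weighting constants p, a, b are the ones named in the paper, but the default values below are just plausible choices, not the ones we used.

```python
import numpy as np

def perp(v):
    """Perpendicular of a 2D vector (90-degree rotation)."""
    return np.array([-v[1], v[0]])

def transform_point(x, dst_lines, src_lines, p=0.5, a=1.0, b=2.0):
    """Map an output-image point x to a source-image sample position.

    dst_lines / src_lines are matching lists of (P, Q) endpoint pairs.
    """
    dsum = np.zeros(2)
    wsum = 0.0
    for (P, Q), (Ps, Qs) in zip(dst_lines, src_lines):
        d, ds = Q - P, Qs - Ps
        # Position of x relative to the destination line.
        u = np.dot(x - P, d) / np.dot(d, d)
        v = np.dot(x - P, perp(d)) / np.linalg.norm(d)
        # Same relative position with respect to the source line.
        xs = Ps + u * ds + v * perp(ds) / np.linalg.norm(ds)
        # Distance from x to the destination line segment.
        if u < 0:
            dist = np.linalg.norm(x - P)
        elif u > 1:
            dist = np.linalg.norm(x - Q)
        else:
            dist = abs(v)
        # Weight grows with line length and shrinks with distance.
        w = (np.linalg.norm(d) ** p / (a + dist)) ** b
        dsum += w * (xs - x)
        wsum += w
    return x + dsum / wsum
```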

 

Implementation Considerations

 

Transforming a single pixel’s coordinates using the algorithm of the preceding paragraph leaves open the problem of how to sample from the original image. Pure nearest-neighbor sampling results in ugly aliasing artifacts.

Instead of sampling points, it is easier to think of sampling whole rectangles. We transform each rectangle into an arbitrary quadrilateral in the source image by simply transforming its corners. The quadrilateral is only an approximation of the transformed rectangle, but it works very well in practice.

Working this way, we only need to transform a regular grid of quads. We paint the output image by drawing this regular grid and texture mapping the quads with the original image, using texture coordinates taken from the transformed grid.

With this method we leave the sampling issues to the graphics hardware, which results in a major speedup and a simpler implementation. We can also adjust the grid resolution from fine to coarse, trading output accuracy against rendering time: a coarse grid gives a quick preview while specifying feature lines, and a fine grid is used for the final rendering. For our results we used a grid size of 2x2 pixels.
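A sketch of the grid transformation, building on transform_point from the previous sketch; the actual textured-quad drawing is left to the graphics API and omitted here.

```python
import numpy as np

def warp_grid(width, height, dst_lines, src_lines, cell=2):
    """Transform only the corners of a regular grid of cell x cell pixels.

    Returns per-corner source coordinates; each grid cell is then drawn
    as a textured quad, with these coordinates (normalized by the image
    size) as texture coordinates. Assumes width and height are multiples
    of cell, for simplicity.
    """
    xs = np.arange(0, width + 1, cell)
    ys = np.arange(0, height + 1, cell)
    tex = np.empty((len(ys), len(xs), 2))
    for j, y in enumerate(ys):
        for i, x in enumerate(xs):
            tex[j, i] = transform_point(np.array([x, y], float),
                                        dst_lines, src_lines)
    return tex
```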

 

 

Video-to-video Morphing

 

Adding the capability to deal with videos poses two challenges. First of all, having to set feature lines in every single frame of the videos would be an immense amount of work and error prone. We solved this problem by requiring the user to set feature lines only in some key frames. The program interpolates the lines for the frames in between from the two surrounding key frames. This comes at the cost of having to use the same number (and basic position) of feature lines throughout the video, which we found to be a noticeable restriction.

We decided to use simple linear interpolation between the key frames. Depending on the frame rate of the underlying video, the rate of change between frames can be quite high, so no interpolation scheme would place the lines appropriately over a longer distance. Moreover, more sophisticated interpolation schemes, such as C2-continuous Bezier curves, come at the cost of local control. Local control, however, is very important for the user; a linear interpolation scheme provides it and is more intuitive.
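A sketch of this key frame interpolation in Python; keyframes is a hypothetical mapping from frame index to the list of feature lines, with line endpoints as numpy arrays.

```python
def lines_for_frame(frame, keyframes):
    """Linearly interpolate feature lines between the two surrounding
    key frames. Every key frame must hold the same number of lines,
    and frame is assumed to lie between the first and last key frame."""
    if frame in keyframes:
        return keyframes[frame]
    prev = max(k for k in keyframes if k < frame)
    nxt = min(k for k in keyframes if k > frame)
    t = (frame - prev) / (nxt - prev)
    return [((1 - t) * p0 + t * q0, (1 - t) * p1 + t * q1)
            for (p0, p1), (q0, q1) in zip(keyframes[prev], keyframes[nxt])]
```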

The user can work on a video-to-video morph using a “divide and conquer” approach: start with the first and last key frames and insert additional key frames where necessary. In our implementation, the user does not have to place the feature lines from scratch, but starts from the set of interpolated feature lines and makes adjustments.

 

The second challenge in implementing video-to-video morphing is the memory requirement of the program. In our implementation we input and output videos as numbered stills in the Windows BMP format. It is not possible to keep all images in memory at once, so we repeatedly fetch bitmaps from the hard disk and release them later. Using a simple caching approach, we do not free the memory of an unused bitmap right away, but instead keep it and only discard a bitmap when we have to make space for a new one. Our implementation caches a maximum of only 5 bitmaps per video; a higher number does not seem to yield significant improvement, as the user usually only works on two or three key frames at once.
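The cache can be sketched as follows (Python). A least-recently-used eviction policy is our assumption of a reasonable rule; the description above only says that a bitmap is discarded when space is needed.

```python
from collections import OrderedDict

class BitmapCache:
    """Keeps at most `capacity` decoded bitmaps in memory, discarding
    the least recently used one when space is needed."""

    def __init__(self, load_fn, capacity=5):
        self.load_fn = load_fn          # e.g. loads "frame0042.bmp"
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, frame):
        if frame in self.cache:
            self.cache.move_to_end(frame)       # mark as recently used
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)  # evict the oldest entry
            self.cache[frame] = self.load_fn(frame)
        return self.cache[frame]
```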

 

 

Our User Interface

 

The observation that motivates our UI’s design is that there is little reason to edit feature lines on more than one pair of key frames at a time, since that pair is all that contributes to the corresponding part of the result. An example of our UI in action is given in Figure 1. We use a browser to manage the set of already edited pairs of key frames.

 

Feature lines are placed using the mouse. A new line is created by dragging in one of the input image windows while holding down the shift key; its companions are placed automatically at the same position in the other image and in all other key frames. A line can be edited by dragging its endpoints, and deleted by clicking one of its endpoints while holding down the control key.

 

Our program is capable of saving and loading feature line sets for reuse or later refinement, which turns out to be very important as editing a complete video can take a while.

 

A further very useful feature is the capability of showing just the warp of one of the input images, without blending the second one on top of it. This is very helpful for “debugging” feature lines, as it is sometimes hard to see from the morphed image where a weird look originates.

Furthermore, the ability to render just the current frame turns out to be crucial for editing feature lines.

 

Last but not least… Results

 

As we already mentioned, playing with our program is a lot of fun…

click to play video
click to play video
click to play video
click to play video

 

 

 

 

REFERENCES

 

[1]  Thaddeus Beier and Shawn Neely; “Feature-Based Image Metamorphosis”; In SIGGRAPH ‘92

[2]  Robert Szewczyk, Andras Ferencz, Henry Andrews, Brian C. Smith; “Motion and Feature-Based Video Metamorphosis”; ACM Multimedia 97 - Electronic Proceedings; see http://www.cs.cornell.edu/zeno/projects/vmorph/MM97/VMorph-MM97.html#bna