
Performance Evaluation of Vortex-Compiled Applications

1 Maximizing Normal Application Performance


Vortex's default initial compilation mode is non-optimizing compilation (o0); to enable optimization, the user must set the compiler option[1] optimization_level (by convention, larger values denote more aggressive combinations of optimizations). Typically, this is done by issuing the o1 (sets optimization_level to 1) or o2 (sets optimization_level to 2) command at the Cecil> prompt. During application development, we typically keep most of the application's source files compiled with optimization, while files that are actively being edited or debugged are compiled without optimization to minimize turnaround time and maximize debuggability. Periodically (over lunch, for example) we use the pmakeo2 Vortex command to recompile with optimization all files that are currently unoptimized.

A number of compiler options control which optimizations Vortex performs during optimizing compilation. They default to the settings that we have found most appropriate in our daily use of Vortex. Thus, the typical user can simply choose a built-in combination of optimizations (o0, o1, or o2) without needing to fine-tune other optimization-related compiler options.

Two (often important) optimizations require additional effort by the user, however. The first, profile-guided class prediction, can be quite effective for some applications, but requires that the user provide profile-derived class distributions to guide the optimization. We first describe how to gather these distributions from programs and then describe how to make them available to Vortex for exploitation.

Unfortunately, gathering profile data is a somewhat tedious process. Suppose we want to gather profile data for my_program.cecil, which has already been compiled by Vortex using either C or assembly code generation. The first step is to build an instrumented executable from the Vortex-generated files by typing make pic at the Unix prompt (if you have the pm utility for spawning parallel C compiles, then you can give mc the -pic flag). This produces an executable named my_program.pic. To gather the profile data, run the instrumented executable on a representative input with the additional command-line argument --picstats. The profile data is printed to stdout when the program terminates normally and must be captured in a file for later processing, so we typically redirect the program's output to a pipe or file. For example,

(Unix%) my_program.pic --picstats [other arguments] > my_program.data

The raw profile data must be processed before it can be utilized by Vortex, and a script called call-chain.perl has been provided to do this. For example,

(Unix%) call-chain.perl < my_program.data > my_program.nCCP

will format the profile data gathered in the previous step into a profile file called my_program.nCCP that can be utilized by Vortex. Finally, we read the profile data into Vortex by saying:

Cecil> load_profile "my_program.nCCP"

Once profile data has been read into the compiler, it will become part of the persistent program database and will be utilized during all subsequent optimizing compilations unless explicitly flushed.
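
The two Unix-side steps above (run the instrumented executable, then format its output) can be wrapped in a small shell function so that each profiling round is a single command. This is only a sketch: gather_profile is a hypothetical helper name, not part of Vortex, and it assumes that the instrumented executable sits in the current directory and that call-chain.perl is on your PATH.

```shell
#!/bin/sh
# Sketch only: gather_profile is a hypothetical helper, not a Vortex command.
# Assumes ./PROGRAM.pic exists and call-chain.perl is on PATH.
# Usage: gather_profile PROGRAM [program arguments...]
gather_profile() {
    prog=$1
    shift
    # Run the instrumented executable; the profile data arrives on stdout.
    "./$prog.pic" --picstats "$@" > "$prog.data" || return 1
    # Format the raw data into the profile file that load_profile reads.
    call-chain.perl < "$prog.data" > "$prog.nCCP" || return 1
    # Report the name of the file that was written.
    echo "$prog.nCCP"
}
```

For example, gather_profile my_program [other arguments] would write my_program.data and my_program.nCCP; the resulting file still has to be read into Vortex with load_profile at the Cecil> prompt.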

For the best results, one should iterate this process a few times (gather profiles, use them to build an optimized executable, gather new profiles from the optimized executable, and so on), because the call chain context associated with the profile data grows with each iteration, making the data more useful for optimization. After a few iterations, the profile data should stop changing; a diff of the my_program.nCCP files generated by successive iterations shows when this point has been reached. Our experience has been that profiles derived from optimized executables are much more effective than those from non-optimized executables, and that several repeated iterations can increase performance by 10-20%, depending on the application.
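
The stopping test for this iteration can be scripted. The sketch below assumes you keep a copy of the previous round's profile file before regenerating it; profiles_converged is a hypothetical helper name, and it uses the standard cmp utility rather than diff because only a yes/no answer is needed.

```shell
#!/bin/sh
# Sketch only: profiles_converged is a hypothetical helper, not a Vortex command.
# Usage: profiles_converged PREVIOUS.nCCP CURRENT.nCCP
profiles_converged() {
    # cmp -s prints nothing and exits 0 only if the files are byte-identical.
    if cmp -s "$1" "$2"; then
        echo "converged"
    else
        echo "changed"
    fi
}
```

One way to use it is to copy my_program.nCCP to a backup name of your choosing before each new profiling round and then compare the backup against the newly generated file; once the function reports converged, further iterations are unlikely to help.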

Specialization is the second optimization that relies on the presence of profile data and must be invoked explicitly. After loading profile data into Vortex, typing graphs "my_program.cecil"; specialize at the Cecil> prompt will invoke profile-guided method specialization. This optimization increases performance by around 10-15%, again depending on the application. Unfortunately, a specialized application is not itself suitable for profiling, so save specialization for last, once profile iteration has been completed. (In the future, we will try to integrate specialization better into the rest of the compiler infrastructure.)

Currently specialization and static class prediction are only implemented for Cecil and Java programs.


[1] See the Vortex user manual for a description of the various compiler options and how to set them. A list of all the compiler options, their current values, and a brief description of each option can be obtained by typing options all at the Cecil> prompt.
Performance Evaluation of Vortex-Compiled Applications - 25 MARCH 1997