CSE503: Software Engineering
Lecture 20  (February 24, 1999)

David Notkin

v     Slicing

Ø      Program slicing is an approach to selecting semantically related statements from a program [Weiser]

Ø      In particular, a slice of a program with respect to a program point is a projection of the program that includes only the parts of the program that might affect the values of the variables used at that point

Ø      The slice consists of a set of statements that are usually not contiguous

v     Basic ideas

Ø      If you need to perform a software engineering task, selecting a slice will reduce the size of the code base that you need to consider

Ø      Debugging was the first task considered

Ø      Weiser even performed some basic user studies

Ø      Claims have been made about how slicing might aid program understanding, maintenance, testing, differencing, specialization, reuse and merging

v     Example

read(n)
i := 1;
sum := 0;
product := 1;
while i <= n do begin
    sum := sum + i;
    product := product * i;
    i := i + 1;
end;
write(sum);
write(product);


read(n)
i := 1;

product := 1;
while i <= n do begin
    product := product * i;
    i := i + 1;
end;
write(product);

 

Ø      In this example, the second program is a slice of the first, slicing on the product variable at the write statement.

v     For Weiser, a slice was a reduced, executable program obtained by removing statements from a program

Ø      The new program had to share parts of the behavior of the original

Ø      Weiser computed slices using a dataflow algorithm, given a program point (criterion)

§        Using data flow and control dependences, iteratively add sets of relevant statements until a fixpoint is reached

v     Ottenstein & Ottenstein defined an alternative approach that clearly dominates

Ø      Build a program dependence graph (PDG) representing a program

Ø      Select node(s) that identify the slicing criterion

Ø      The slice for that criterion is the reachable nodes in the PDG

Ø      PDG for the example (at the bottom of the file)

§        Thick lines are control dependences

§        Thin lines are (data) flow dependences

v     True PDGs are somewhat more complicated than this

Ø      Vertices in the graph represent (a) assignment states and (b) predicates in the program

Ø      Edges represent control and data flow dependences

Ø      Control dependences always start at a predicate (or the entry node, a special node representing the beginning of the computation)

§        They are labeled with a boolean

§        Intuitively, node w is control dependent on node v if the predicate of node v evaluates to the label on the edge from v to w

·        That is, what happens at w controls whether or not v executes

§        An assignment statement followed immediately by another assignment statement have no control dependence between them, since the second one always executes when the first one does

·        This is roughly similar to the notion of a basic block

Ø      Data dependences represent the possible flow of values through the program

§        (Roughly) there is a data dependence (edge) from node v to node w if v includes an assignment to some variable x, and then w includes a use of (that specific) x.

·        These can be separated into (at least) loop-independent and loop-carried dependences, which roughly distinguish whether the relationship is across iterations of a loop or not

Ø      Def-order dependences can also be used; these aren't needed for all analyses, but ensure that only equivalent programs have isomorphic PDGs.

v     Procedures

Ø      What happens when you have procedures and still want to slice?

Ø      Weiser extended his dataflow algorithm to interprocedural slicing

Ø      The PDG approach also extends to procedures

Ø      But interprocedural PDGs are a bit hairy (Horwitz, Reps, Binkley used SDGs)

§        Representing conventional parameter passing is not straightforward

Ø      In the Reps/Horwitz paper is an example of an SDG

program
    P := 3.14;
    rad := 3;
    if DEBUG then rad :=4 fi;
    call Mult3(P,rad,rad,area);
    call Mult3(2,P,rad,circ);
    output(area);
    output(circ);
end;

procedure Mult3(op1,op2,op3,result)
    result := 0p1*op2*op3;
end;

v     Essentially a linked collection of PDGs (one per procedure), with linking vertices for calling protocols

Ø      Value-result parameter passing is used; it's much harder to handle by-reference mechanisms

Ø      If you slice an SDG-like structure using standard reachability, you can get a very large slice

§        Specifically, if you slice from inside a procedure, with reachability you trace back to all possible calls sites of the procedure; since calls and returns are represented by separate edges, it's roughly like taking a return to a call site that didn't make the call

§        There are a number of solutions to this, include the summary edges imposed in SDGs

v     Technical issues

Ø      How to slice in the face of unstructured control flow?

Ø      Must slices be executable?

Ø      What about slicing in the face of pointers?

Ø      What about those pesky preprocessor statements?

v     Dynamic slicing

Ø      These algorithms have all been static

Ø      They work for all possible inputs

Ø      There is also work in dynamic slicing, trying to find slices that satisfy some execution streams over sets of inputs

§        Korel and Laski characterize dynamic slices in terms of a trajectory that captures the execution history of a program in terms of a sequence of statements and control predicates

v     (Potential) applications of slicing

Ø      Debugging

Ø      Program differencing

Ø      Semantic versions of diff

Ø      Program integration

Ø      Merging versions together

Ø      Testing

§        Slicing can be used to define more rigorous testing criterion than a conventional data flow testing criterion

v     Recap

Ø      Cool idea

Ø      Cool supporting technology

Ø      Difficult on practical programs

Ø      May be coming closer to feasible after almost 20 years of research

Ø      Little data on the size of slices

Ø      Will it be more than a cool idea?

§        My guess?  No (but I wouldn’t bet the farm on it)