A.Shaw "

Chapter 4 ASSERTIONAL SPECIFICATIONS

Assertions are used to describe directly the required properties of a system; this is in contrast to state machines which generate behaviors. Properties are specified in terms of system events, time, or the values of state variables. The properties are asserted to hold at various times during systems execution, such as for all time, eventually, before or after a given time, within a given time interval, or before or after some function is performed; assertions for the latter are called pre- and post-conditions, respectively. Applications of assertions include documentation, informal reasoning, and verification of requirements, designs, and implementations; as well as debugging and monitoring of executable specifications and implementations.

This chapter discusses three different and useful classes of assertional methods: regular expressions and grammars, and their extensions, which have a straightforward and formal relationship with state machines; various formal logics - propositional, predicate, and temporal - that deal generally with the truth values of expressions; and a particular logic, real-time logic, that explicitly includes time and events.

4.1 Regular Expressions and Extensions

There is a natural formal correspondence between machines or automata, on the one hand, and formal languages, on the other [Hopcroft&Ullman79]. For each of various well-defined classes of machines, there exist an equivalent class of grammars such that the symbol or event sequences accepted by the machines are identical to those in the languages generated by that class of grammar. Finite state machines correspond to regular grammars in this classification, and regular grammars can be represented elegantly as regular expressions (REs). Because of their straightforward relationship with state machines and because they describe sequences directly, REs and their extensions are useful notations for specifying behavioral properties.

We review the definition of REs using the fsm for the Train Crossing Gate presented in Figure 3.5 in the last chapter. Assume that the state Closed is both the start and halt state of the machine. One set of paths through the states goes from Closed to Opening, then back and forth through Closing and Opening, and finally through Closing to Closed. Ignoring the self loops at these states, the possible sequences of events tracing these paths are the set:

{ og cg c-c, og cg og cg c-c, og cg og cg og cg c-c, . . . } .

This unbounded set can be represented by the RE

og cg (og cg)* c-c ,

where the notation a* means zero or more instances of sequences or strings taken from a. Each self loop can be characterized as a "starred" expression, for example, og* at state Open.

Not including the self loops, the complete RE G for the fsm is:

G = og (cg og)* ( cg È ( (o-o cg og)* o-o cg (og cg)* ) ) c-c .

The "union" symbol È means a selection of a string or sequence from one of its arguments. G represents all event sequences accepted by the Gate, except those produced by self loops.

Generally, an RE S is formed from a given set S of basic symbols and denotes a regular set L(S), defined more formally as follows:

1. a) For any x in S, x is an RE. L(x) = {x} .

b) Æ and l are REs. L(Æ) = the empty set. L(l) = {l}, where l is the empty string.

2. Let S, S1, and S2 be REs.

a) The concatenation S1S2 is an RE.

L(S1S2) = { z=xy: x in L(S1) and y in L(S2) } .

b) The union S1 È S2 is an RE.

L(S1 È S2)= { z in either L(S1) or L(S2) } = L(S1) È L(S2) .

c) The Kleene closure S* is an RE.

L(S*) = È S i = {l} È L{S} È L(SS) È L(SSS) È . . .

The RE definitions can often be used directly to formally manipulate expressions in order to prove various properties of event sequences, such as set membership or set inclusion.

Analysis Example: The Train Crossing Gate

The RE G above for the Gate has the form:

G = A ( B È C) D ,

where A = og (cg og)* , B = cg , C = ( (o-o cg og)* o-o cg (og cg)* ) , and D = c-c.

The "normal" event sequence s starting from the closed state is

s = og o-o cg c-c .

We want to show that s in L(G). This can be done by first simplifying G. G is equivalent to the RE

G' = ABD È ACD ,

in the sense that L(G) = L(G'). The manipulation to G' works because RE concatenation distributes over union. If s in L(G), then it must also be the case that s in L(G'). We now argue that s in L(ACD) and hence in L(G'). The desired string from L(ACD) is obtained simply by selecting the empty string from each of the starred terms in A and C.

A convenient way to describe event sequences that may occur in parallel is through an interleaving model. For example, the sequence a b in parallel with the sequence c d can be represented by the set of all possible interleaves or shuffles of the two sequences:

{ a b c d, a c d b, a c b d, c d a b, c a b d, c a d b} .

REs are extended with a shuffle operator, denoted Å, so that the set above is represented by the expression: ( a b ) Å ( c d ) . Expressions with this operator and additional ones to be mentioned are called flow expressions (FEs) [Shaw78, Chi&Shaw86]. In general, a Å b will mean the set of strings or sequences obtained by performing all possible interleaves of an arbitrary string from a with one from b, where a and b are FEs.

Traffic Light Example

Suppose that there are two pairs of traffic lights at the intersection of a street and an avenue, one pair to control the street traffic and one for the avenue. Each pair runs through a green-yellow-red sequence cyclically. The unrestricted operations of the pairs can be specified by the FE:

( rs ( gs ys rs )* ) Å ( ra ( ga ya ra )* ) ,

where the g, y, and r refer to the colors green, yellow, and red, respectively, s denotes street, a means avenue, and each symbol designates the event of turning the light to the indicated color. It is assumed that the first event for each light after "power-on" turns the light red, hence the initial rs and ra.

In order for this system to be safe, the pairs of lights should never be green or yellow at the same time - when one pair is green or yellow, the other pair must be red. Thus the set of interleaves of events must be restricted so that the unsafe cases are eliminated. Globally, the system must then satisfy the constraint or property given by the following simple FE:

( rs Å ra ) ( ( gs ys rs ) È ( ga ya ra ) )* .

The shuffle operator doesn't make the notation more expressive theoretically. It does allow more convenient and less complex descriptions. More power, formally and practically, is obtained if the closure of the Å operator is added. The shuffle closure operator, denoted Ä, is the interleave equivalent of the * operator. The notation aÄ will represent zero or more interleaves of strings from a, i.e.

l È a È (a Å a ) È (a Åa Åa ) È . . . .

The shuffle operator and its closure were first defined and applied to message sequences in [Riddle72].

Gate Controller Example

Consider again the extended fsm for the Gate Controller presented in the last chapter (Figure 3.6). Let cg, og, te, and tl be the events Close_Gate, Open_Gate, Train_Entering, and Train_Leaving, respectively. Assume that Area _Empty is both the start and halt state for the machine. Following is an FE that specifies exactly the possible sequences of events recognized by this machine:

( ( te cg ( te tl )Ä tl ) og )* .

Every string s from (te tl)Ä has the desirable characteristic that in any initial substring of s, the number of te's is greater than or equal to the number of tl's. It should be evident from the FE that the open gate command og is issued only after the last train has left and the area is empty. This property is guaranteed in the fsm by using the count variable.

Discrete time can be included in REs or FEs in a straightforward way by assuming the existence of a clock that generates tick events periodically. Let t be the symbol for a tick event. Then, two events a and b that are separated by n ticks can be represented by a tn b . (The expression actually specifies an interval between a and b bounded by (n-1) and (n+1) ticks.)

Mouse Clicker

Recall the mouse clicker example described in Section 3.5.3 and Figure 3.19. The user M's events with time included are represented by the RE

( tx D ty U )* ,

where x and y are in the time intervals [a, a+d] and [b, b+d], respectively. The recognizer R can be specified as the union of four REs, one corresponding to each path through the CRSM. The simple selection path is defined by the RE

t* D ts t* SS t* U t* SE

with s = tSC. A simple single click has an associated RE

D t<s U td SC ,

where t<s means less than s events of type t in sequence and d = tDC. These REs clearly describe the desired properties correctly.

The flow expression notation also has some operators, similar to locks and binary semaphores, that permit the specification of arbitrary restrictions on interleaving of symbols. In fact, the complete notation is computationally universal. A survey of various extensions of regular expressions, such as path expressions, event expressions, and message transfer expressions, and their applications in non-real-time software appears in [Shaw80]. A similar notation called constrained expressions has also been used for representing and analyzing the behavior of distributed systems [Dillon_et_al88] and real-time systems.

4.2 Propositional, Predicate, and Temporal Logics

Logics are formal notations that deal with the truth or falsity of statements and relations. They provide precise methods for specifying, checking, manipulating, and verifying properties and behaviors. In this section, we briefly review and evaluate three mainstream logics. The train crossing gate application is employed as an example to illustrate each language.

The simplest logic is the propositional calculus. It defines a system for expressing and reasoning about the truth values of statements or propositions that are composed of basic objects and logical connectives. The basic objects are the constants T and F, for true and false respectively, and a set of variables that can take on the values T or F. Such a variable or constant is often called propositional or Boolean. The standard logical connectives include Ù, Ú, Ø, and Þ, meaning and, or, not, and implication, respectively. Recall, for example, that aÞb is equivalent to ØaÚb , where a and b are propositional variables or constants.

Let Trains_In and Gate_Closed be Boolean variables whose truth values denote whether or not there are trains in the crossing area and if the gate is closed, respectively. Then, the global requirement for a gate system can be expressed:

( Trains_In Þ Gate_Closed ) Ù ( ØTrains_In Þ ØGate_Closed ) .

This can be manipulated into the equivalent expression:

( Trains_In Ù Gate_Closed ) Ú ( ØTrains_In Ù ØGate_Closed ).

The first term is a safety assertion - if there are trains in the area, the gate must be closed. The second term is a statement about the progress of the system, assuming that the gate is open when the condition ØGate_Closed is true. Note that if the gate is always closed, the system is guaranteed to be ÒsafeÓ, but no cars will ever cross the track, i.e., there will be no progress for this part of the system.

The statements can be interpreted as conditions that change truth values as a result of state changes in the system; the state changes are triggered by events. Alternatively, any state S can be represented by an expression E such that E is true whenever the system is in state S and false otherwise. However, one serious limitation of propositional logic is that it doesn't provide any more detailed way to say what it means to be in a given state. For example, the variable or state Trains_In could be true or false, but it may be more useful to describe itÕs meaning more precisely.

Predicate logic adds such facilities in a complete, if not always convenient, fashion. This more powerful logic allows predicates which are assertions containing variables and functions that may range over general domains such as integers, strings, or sets. There are also universal and existential quantifiers, denoted " ("for all") and $ ("there exist(s)"), respectively.

Let the variable count, ranging over the non-negative integers, contain the number of trains in the crossing area (as in Figure 3.6). Then the proposition Trains_In can be refined to the predicate count>0. A predicate logic expression of the gate requirement is

"count ( ( count>0 Þ Gate_Closed ) Ù ( count=0 Þ ØGate_Closed ) ) .

A major limitation is that the logic has no convenient way to express ordering. For example, the intent in the requirement is that ØGate_Closed be true (the gate is to be open) sometime after count becomes zero.

Temporal logic solves this problem [Manna&Pnueli93]. Despite its name, temporal logic does not deal with time, but only with the ordering of events and states in a system. The operation of a system from a given state is characterized by sequences of states, each sequence representing a possible execution or path through the system starting from the given state. Assertions on properties of the sequences and states can be made with the predicate calculus plus some new temporal operators.

The most widely used operators are the unary operators [] and ×, which have the meaning:

[]A : The assertion A is true for all future states.

×A : The assertion A is eventually true.

The meaning is interpreted relative to some given current state or "time". Thus, the assertion ×A at some state S specifies that every sequence of reachable states starting from S will eventually contain a state where A is true. Another useful operator is the binary until operator, denoted U, which is defined for the assertions A and B:

A U B : A holds (at least) until the first occurrence of B

and B will eventually occur .

Using these operators, we express some of the requirements for the train crossing gate. A, perhaps natural, safety assertion is

[] ( count>0 Þ × Gate_Closed ) .

This assertion could be relative to the start state or any intermediate state. It states that every state with count>0 coincides with or is followed eventually by a state such that Gate_Closed is true. However, the assertion is flawed - there may be any number of intermediate states between the time that count>0 and Gate_Closed where either Ø Gate_Closed or count=0 holds. A more satisfying statement is

[] ( count>0 U Gate_Closed) Ù (count=0 U ØGate_Closed ) .

Other operators, such as Next-State and Previous-State, are also defined. Temporal logic is a very powerful extension of conventional logic, and has proven quite effective for reasoning about concurrent programs. However, it does not specify time or describe events directly. For example, it would be desirable to specify the maximum amount of time permitted to close the gate, i.e., a deadline, once a train enters the area. Some extensions with time intervals and absolute time have been proposed, but none seem as useful as the real-time logic presented in the next section.

4.3 Real-Time Logic

Real-time logic (RTL) was invented by Jahanian and Mok in the 1980's (e.g. [Jahanian&Mok86]). RTL was designed specifically as a logic for reasoning about the timing properties of real-time systems. This was accomplished through a very simple but clever addition to predicate logic - a function for describing events and their occurrence times.

This function, denoted @, is called the occurrence function. @ maps events and non-negative integers into non-negative integers, with the interpretation:

@(E, i) yields the time of the ith occurrence of event E.

Here, an event name E designates a class of events whose occurrences are ordered both by instance (i) and by time (@); instances of events in the same class cannot occur simultaneously. Time is discrete and is translated to the positive integers 1, 2, 3, ... .

With the @ function, one can talk about the time of the 10th depression of button a, @(a,10), or the time of the 43rd clock tick, say @(tick,43). Thus, if the 0th instance of event E occurs at time 16, the 1st at time 23, the 2nd at time 24, and no further instances of E have occurred, we have @(E,0)=16, @(E,1)=23, @(E,2)=24, and @(E,i) undefined for i>2. It is also convenient to define initiation and termination events for actions or computations. The start of an action of type A will correspond to an event class A ; the completion of A will correspond to an event ¯A .

Examples

1. Suppose that upon pressing Button a, a computation Test must be completed within 30 seconds. This requirement can be expressed in RTL as

"x (@(a,x) < @(Test,x)) Ù (@(¯Test,x) £ @(a,x) + 30) .

Note that the a and Test events are matched; the xth a has a corresponding xth Test.

2. Two actions A1 and A2 are to be mutually exclusive; i.e., during the execution of A1, A2 cannot be executed, and vice versa. This standard constraint can be stated in RTL using the start and end events of the actions:

"i,j ( ( @(¯A1,i) < @(A2,j) ) Ú ( @(¯A2,j) < @(A1,i) ) ) .

3. Consider a standard sporadic timing constraint [Jahanian&Mok86]: On event E with minimum event separation p, execute A with deadline d. An RTL specification of these constraints is

"i $j ( (@(E,i) £ @(A,j)) Ù (@(¯A,j) £ (@(E,i) + d) ) )

Ù "i ( (@(E,i) + p) £ @(E,i+1) ) .

In this specification, other instances of A are possible. It is required only that there exist an instance, a "jth", associated with each E. Note also that a unique A does not necessarily correspond to each E; the specification is satisfied, for example, if two successive EÕs are followed by a single associated A and d is sufficiently large. This problem cannot occur if d £ p .

4. Let the train crossing gate (Figure 3.5) operate so that the gate will close within 5 time units of receiving a close command, provided that an intervening open command has not been issued. The transition events are classes in the RTL sense. The constraint may be stated precisely in terms of these events:

If a cg command is issued at some time t1, then either

a) the gate will close (c-c event) at some time t2 £ t1 +5, and there will have been no og command issued between t1 and t2, or

b) the gate receives an og command at some time t2 £ t1 +5 and the gate will not close (no c-c event) at any time between t1 and t2.

This can be translated in a straightforward fashion to RTL. (See Exercises.)

5. Using the gate example again, let in be the event associated with a train entering the area and out be the event occurring when a train leaves. The following RTL expression gives the conditions for at least one train in the area between times t1 and t2:

"t ( (t1 £ t £ t2) Þ $i Ø$j ( (@(in,i) £ t) Ù (@(out,j) £ t) Ù (j=i ) ) )

The assertion states that between t1 and t2 there always exists at least one previous train entering event without a subsequent matching exit.

RTL has been used by its creators for some manual and automatic theorem proving of properties of specifications; in their methodology, they translate modecharts, a state-based scheme similar to statecharts, into RTL formulas for subsequent analysis. RTL also been applied to proving the correctness of computer programs, where the actions are program statements.

Some recent more practical applications include using RTL as the assertional language for run-time monitoring of timing constraints in distributed programs [Jahanian_et_al94] and for checking the behavior of CRSMs (Section 3.3.2) during simulations [Raju&Shaw94]. One variation of this type of monitoring and checking includes the program form:

when <event> { assert( <RTL_constraint>) } .

The semantics is that the <RTL_constraint> is checked whenever an instance of <event> occurs. If the check, fails an error exception is raised.

The constraint contains an expression in RTL without quantifiers and referring to the past history of events. It is most common to refer to the most recent event of a given class, the second most recent, and so on, rather than to the events starting from the first occurrence. The notation @(E, -i) , i>0, will denote the time of the ith most recent event of class E, where @(E,-1) returns the time of the most recent E event. Thus, if n instances of E have occurred with times @(E,1), . . . , @( E,n), then @(E,-i) = @(E,n-i+1) , (i>0).

Examples

1. Suppose that a traffic light controller at an intersection gives special service and priority to emergency vehicles, such as ambulances or police cars. When such a vehicle approaches the intersection, the controller switches all lights to red with enough time so that the area can be cleared safely before the emergency vehicle arrives. Let the vehicle send two types of messages (events) to the controller - an approaching event indicating that the vehicle will be crossing the intersection shortly, and a before event when the vehicle is just at the intersection. There must be at least 5 seconds between these events, in order for the controller to switch the lights and for the intersection to clear. To check that the system adheres to this constraint, the following statement can be used:

when before { assert( @(before,-1) ³ @(approaching,-1) + 5 ) }

Each time that an instance of the event before occurs, the assertion is checked.

2. Recall the Mouse Clicker example presented in Section 3.5.3. A DC message (event) is emitted every time that a double click is recognized. The correctness of an implementation or of an executable CRSM specification can be checked on each instance of DC with the following :

when DC

{ assert( (@(U,-1) - @(D,-1)) £ tSC Ù

(@(D,-1) - @(U,-2)) £ tDC Ù

(@(U,-2) - @(D,-2)) £ tSC } .

A machine M accepts or recognizes a sequence or string s if input of s to M causes M to terminate in a halt state.

t <s is an abbreviation for ( l È t È t2 . . . È ts-1 ) .

One particular subtlety about the @ function is ignored here and in subsequent examples. Namely, it is only a partial function, defined possibly only for a finite number of instances of the event. Thus, in this example, one should really specify that for each (a,x) that is defined (rather than for all x), there exist corresponding (Test,x) and (¯Test,x) events that meet the constraints. This can be expressed as follows:

"x"t1 ( @(a,x)=t1 Þ $t2,t3 ( @(Test,x)=t2 Ù @(¯Test,x)=t3 Ù t1<t2 Ù t3£t1+30 ) )

Chapter 4