----------------------------------------------------------------------

CSE 401 Compilers Class

[Home] [Admin] [Details] [Help] [Other]

Details / Project Description

----------------------------------------------------------------------

You will make the following modifications to the PL/0 language, by modifying the original PL/0 compiler. This will happen in stages: lexing, parsing, semantic analysis, and one or more stages at the back-end. Start with the original lexical and BNF description.
  • Extend the PL/0 language to allow comments. A comment begins with a pound sign ("#") and extends to the end of the line. Comments are treated as whitespace (not tokens). The following code illustrates some comments:
    i := x + y; # this is a comment
    j := a - b; # this computes "a + (-b)"
    # a comment; still in the comment # and more comment
    

  • Extend the PL/0 language to allows underscores ("_") to appear anywhere that a letter can appear in an identifier. Underscores are significant. For example, the following are all distinct, legal identifiers:
    foo1 _foo1 f_o__o1 foo_1__ _ __ ___1
    

  • Replace the if statement with an if-then-else statement with the following syntax: <if stmt> ::= if <test> then <stmt list-1> [else <stmt list-2>] end The semantics are as expected: if the <test> evaluates to true, execute the statements in <stmt list-1>; otherwise, execute the statements in <stmt list-2>, if any. (Note: the -1 and -2 do not represent new syntactic entitites, but rather distinguish two entities that have the same syntax.)

  • Add a for loop to the language by augmenting the <stmt> production and by adding the production: <for stmt> ::= for <id> := <expr-1> to <expr-2> do <stmt list> end

    The semantics of the for statement are identical to those of the following while statement:

    <lvalue> := <expr-1> while <lvalue> <= <expr-2> do <stmt list> <lvalue> := <lvalue> + 1; end Note that this means that expr-1 is evaluated only once before entering the loop, and expr-2 is evaluated at each iteration. Also, the loop variable must be declared as an integer variable. It is legal for the initial expression to be greater than the final expression; in this case, the body of the loop is never executed. Assignments to the loop index variable within the loop body ***are*** allowed.

  • The extended language supports the break statement. The break statement has the effect of terminating the immediately enclosing while or for loop. It is an error for a break statement to appear not nested inside a loop.

  • Add first-class boolean types to the language. This means that the PL/0 programmer can declare variables to be booleans, using them in all the ways in which integers (the only previously available first-class type) can be used, except that only integers are allowed in input and output statements. A type, bool, is supported as the type of these variables.

    Two new boolean constants are supported, true and false. In addition, two boolean operators are introduced, and and or. Both operators take boolean expressions as arguments and return a boolean value. Both are short-circuiting, meaning that the second (rightmost) argument is not evaluated if the result of the first argument fully determines the result. They are both left-associative, and and has higher precedence than or; but and has lower precedence that all other operators. Example:

    if a = 0 or x / a > 3 and b then  end;
    # test parses as "(a = 0) or (((x / a) > 3) and b)"
    # if a = 0, then don't evaluate the rest of the test
    

  • The extended language supports arrays. The type of an array is described as: array [ <const expr> ] of <type> where the constant expression is constrained to be a positive integer, indicating the size of the array for the given type. The type of the elements can be any type, including another array. For example, the following are some legal array variable declarations: var a:array [10] of int; var b:array [25] of array [100] of bool; const size: int = 5 * 100; var c:array[size] of array[size + 1] of int;

    In the extended language, the following notation references an array element:

    <lvalue> [ <expr> ] where the <lvalue> production has been modified to: <lvalue> ::= IDENT | <lvalue> [ <expr> ] The <lvalue> should evaluate to some array a, of size n, and the index expression should evaluate to some interger i. This expression returns the ith element of the array a, where the first element of the array is at index 0. It is an unchecked run-time error for i to be out of the range [0..n). The <lvalue> may be another array indexing operation, in the case of accessing a multi-dimensional array. Assignment of whole arrays or array slices is disallowed. Use of whole arrays or array slices in input and output statements is also forbidden. Examples:
    var a:array [5] of array [3] of int;
    var b: array [3] of int;
    a[1][i+j] := b[k-1] * 2;
    a[1] := b;      # This is illegal
    b := input;     # So is this
    
  • In the extended language, a formal parameter can be annoated as being passed by reference by prefixing it with the var keyword (much as in Pascal). The following is an example of such a procedure:
    procedure foo(var x: int, var y: int, var z: array [10] of int);
    begin
        x := x+y;
        y := y+x;
        x[5] := x*y;
    end foo;
    
    The caller of a procedure containing a call-by-reference formal parameter must use an <lvalue> as the corresponding actual parameter to the procedure; other expressions as actuals to a call-by-reference formal are illegal. Examples:
    var a: int, b : array [10] of int;
    const c: int = 5;
    var d: array [5] of int;
    
    foo(a,17,b);            #legal
    foo(b[5],a+b[7],b);     #legal
    foo(c,17,b);            #illegal
    foo(17,a,b);            #illegal
    foo(a+b[7],20,b);       #illegal 
    foo(a,17,d);            #illegal
    
    Any formal can be assigned within the body of a procedure. However, only if the formal is call-by-reference will the assignment affect the caller's variables.

    Arguments of scalar types can be passed by either value or reference. Arrays must be passed by reference, and as a result var must precede arrays which are declared as formal parameters. The size of the array being passed must match the type of the array formal.

  • A procedure may optionally return a result; such a value-returning procedure is called a function. To indicate that a procedure returns a result, a result type declaration is added after the list of formal parameters in the procedure declaration. (This is optional, since procedures are still supported as well as functions.)

    The following are legal function declarations:

    procedure foo(x:int, y:bool) : int;
    begin
       
    end foo;
     
    procedure bar():bool;
    var x:int;
    begin
       
    end bar;
    
    Function calls can be used anywhere that an expression of the appropriate type can be used; syntactically, function call expression shave higher precedence than other operators. Functions may also be called like procedures, as a separate statement. For example, given the above function declarations, the following is a legal statement:
    foo(foo(3,bar()), i < foo(x+y,bar()));
    
    The type of the result of a function must be a scalar type; whole arrays cannot be returned. The result is returned by value.

  • The return statement can be used to exit a procedure, immediately transferring control to the end of the procedure body. It is legal for a procedure to return by "falling off" the procedure body without encountering an explicit return statement.

    To return a value from a function, the return statement is used with an expression. Both forms are captured by

    <return stmt> ::= return [ <expr> ] The type of the expression returned must match the type declared for the enclosing function. It is illegal for a value-returning return statement to appear in a procedure not declared to return a result. it is similarly illegal for a non-value-returning return statement to appear in a function. It is an error if it is possible for execution to reach the end of a function's body without executing a return statement. In particular, a function whose only return statement is in a then clause of an if statement is invalid, since if the then clause is not taken the function will not return anything.

    The following illustrates some legal uses of return statements:

    procedure foo(x:int,y:bool):int;
    begin
       if y then return x; end;
       return x * 5;
    end foo;
    
  • ----------------------------------------------------------------------

    401admin at cs.washington.edu (Last modified: 11/19/96)

    Netscape-HTML Checked!