CSE341 Notes for Monday, 5/6/24

I spent a little time mentioning specific Racket/Scheme functions that might be helpful for the homework. I reminded people that last week's section included a Scheme overview and there is a nice language summary linked from the class web page in notes written by Alan Borning, who used to teach cse341 that would also be helpful to review while working on the homework.

I first showed a function called member that can be used to find out whether a particular value is in a list:

        > (member 3 '(7 8 9 10))
        #f
It doesn't follow the usual Scheme convention of having a ? in its name because it is a predicate. That is because it serves double duty. It can be used as a predicate, but it also gives the functionality of a "find" function, returning the part of the list that begins with the value you are searching for:

        > (member 8 '(7 8 9 10))
        '(8 9 10)
There is a convention in Scheme that anything other than #f is considered to be true, so you don't need to return #t to indicate that something is true.

I said that the append function is equivalent to the OCaml @ operator and is actually more flexible because it takes an indefinite number of arguments to append together:

        > (append '(3 4 5) '(10 20 30) '(a b c) (list (range 1 4)))
        '(3 4 5 10 20 30 a b c (1 2 3))
I reminded people that you can use the list function to form a list and the length function to ask for the length of a list:

        > (define a (list 3 4 5 'a))
        > a
        '(3 4 5 a)
        > (length a)
        4
And I mentioned that for the homework we will be using a function called random that returns a random real value n with 0 <= n < 1:

        > (random)
        0.9876574623474741
        > (random)
        0.06153977378277876
        > (random)
        0.7587938671533774
If you provide an integer parameter n to the call on random, it will produce an integer between 0 and n-1:

        > (random 10)
        8
        > (random 10)
        2
        > (random 10)
        5
We had already discussed functions for computing logical and and logical or and we discussed the fact that they have a "short circuit" behavior where they stop evaluating arguments as soon as the answer is clear (#f for and and #t for or).

I then switched to an example that involved finding factors of 24. We started with this test data and a function that computed whether a number is a factor of 24:

(define test1 '(2 1 -1 4 -2 6 12)) (define test2 '(4 12 -6 -2 18 -12)) (define test3 (append test1 (list 0))) (define (factor24 n) (= (modulo 24 n) 0)) I asked how we could write a function to test whether all values in a list are a factor of 24. We can write this using standard list recursion:

        (define (all-factor24 lst)
          (if (null? lst)
              #t
              (and (factor24 (car lst)) (all-factor24 (cdr lst)))))
When we asked the interpreter to run this function for our three test, we saw this result:

        > (all-factor24 test1)
        #t
        > (all-factor24 test2)
        #f
        > (all-factor24 test3)
        . . modulo: division by zero
This is the behavior we would expect. The first list has all factors of 24, the second includes the value 18 which is not a factor of 24, and the third list includes a 0, which causes an error. We could fix the factor24 function to do the right thing for 0, but I said I wanted to explore a different idea. What if we defined the third list with a value in front of the zero that is not a factor of 24:

        (define test3 (append test1 (list 10 0)))
When we ran this version, our function correctly return #f. Once it encountered the value 10, it had a #f result and our recursive function stopped making recursive calls because of the short-circuit property of the and function. That's good.

I then asked if we could do something similar with a higher-order function. Someone suggested using a folding operation and we wrote this code:

        (define (all-factor24 lst)
          (foldl (lamda (num bool) (and (factor24 num) bool)) #t list))
When we used this version, we again got a division by 0. Someone pointed out that we could do better by reversing the order of our two parameters in the call to and:

        (define (all-factor24 lst)
          (foldl (lamda (num bool) (and bool (factor24 num))) #t list))
That prevented the division by 0, which is great. But it still continues processing values even after it has determined the result, which can be inefficient, particularly for large lists.

Scheme offers an alternative with a function called andmap. It solves exactly the problem we're exploring here. It applies a function to each value in a list and computes the overall logical and, but it stops processing once it encounters a value of #f, as in:

        > (andmap factor24 test3)
        #f
There is a similar function called ormap that stops evaluating list elements once it encounters one that evaluates to #t (or equivalent).

Then I switched topics to point out an important difference between Scheme and OCaml. I typed the following into Scheme and got the expected result:

        > (define x 4)
        > (define (f n) (+ x n))
        > (f 6)
        10
This function definition is making reference to a free variable x, which picks up the value 4 from the top-level environment. I then asked people what would happen if we were to redefine x. In OCaml, we found that the function was unchanged. But that wasn't the case in Scheme:

        > (define x 12)
        > (f 6)
        18
So in OCaml, the closure that is formed for a function keeps track of the bindings that existed when the function was defined. Some people wondered whether Scheme is using dynamic scope, but that's not the case. In fact, Scheme was the first Lisp dialect to use lexical scoping. So when Scheme forms a closure for a function, it keeps a reference to the scope in which this is defined (the top-level environment), but it doesn't keep track of the state of that set of bindings. So if we rebind x to a different value, Scheme will use the new value rather than the old value.

Someone asked whether this works with elements that haven't even been defined. The answer is yes. For example, we can define a function in terms of a variable and a function that haven't been defined:

        > (define (g n) (+ x y n (h n)))
This defines a function g in terms of a parameter n, a free variable x that is currently defined in the top-level environment, a free variable y that hasn't been defined yet, and a function h that hasn't been defined yet.

Of course, we can't call g without generating an error until we've defined both y and h:

        > (g 6)
        y: undefined
        > (define y 7)
        > (g 6)
        h: undefined
        > (define (h n) (* n 3))
        > (g 6)
        43
Because Scheme is designed this way, it is easier to define mutually recursive functions and you have more flexibility about the order you can use for defining functions and variables. Of course, the cost is that we get less predictable behavior because we might redefine variables or functions, but it simplifies the syntax of the language. In OCaml, for example, we had a special "and" construct for defining mutually recursive functions. Scheme does not need such a construct.

I briefly mentioned that you can redefine global values using a mutating function called set! (pronounced "set bang"):

        > (define x 12)
        > x
        12
        > (define (reset n) (set! x n))
        > (reset 10)
        > x
        10
We want to be careful about our use of mutating functions because they make our code less predictable and more challenging to analyze. Scheme is very clear to use the exclamation mark (bang) as part of the name for mutating functions.

I then spent some time discussing how to implement a repl (read/eval/print loop). In Scheme, the distinction between data and code is very loose. We know that we use list notation to tell the interpreter to call built-in functions like +:

        > (+ 3 4)
        7
But that same list can be stored as data, as in:

        > (define a '(+ 3 4))
        > a
        '(+ 3 4)
The symbol a stores a reference to a list that could be thought of as storing data, but could also be thought of as code that can be executed. In Scheme, you can execute such code by evaluating it with the function eval:

        > (eval a)
        7
Similarly, we can set a variable to refer to the symbol +:
        > (define b '+)
        > b
        '+
And if we ever want to turn the symbol + into the function +, we eval it:

        > (eval b)
        #<procedure:+>
I then asked people to think about how the top-level read-eval-print loop is written. I first wrote it this way:

        (define (repl)
          (display "expression? ")
          (let ([exp (read)])
            (display exp)
            (display " --> ")
            (display exp)
            (newline)
            (repl)))
This sets up a continuous loop that prompts, then reads an expression, then echos the expression. It behaved like this:

        > (repl)
        expression? 2.8
        2.8 --> 2.8
        expression? (+ 2 2)
        (+ 2 2) --> (+ 2 2)
Of course, the actual read-eval-print loop evaluates expressions like (+ 2 2). It was easy to change our code to do this by calling eval on the expression:

        (define (repl)
          (display "expression? ")
          (let ((exp (read)))
            (display exp)
            (display " --> ")
            (display (eval exp))
            (newline)
            (repl)))
This had the usual behavior of the read-eval-print loop:

        > (repl)
        expression? 3.4
        3.4 --> 3.4
        expression? (+ 2 2)
        (+ 2 2) --> 4
        expression? (* 3.4 (+ 7 9) (- 18 4))
        (* 3.4 (+ 7 9) (- 18 4)) --> 761.6
There is a similar function called apply that takes two arguments: a function and a list of arguments, as in:

        > (apply + '(3 8 14.5))
        25.5
We don't have this kind of capability in Java or OCaml. For example, it might be nice to say something like this in Java:

        String s = "System.out.println(48);";
        execute(s);
Java doesn't have any such capability. It is much more difficult in a statically typed language like Java or OCaml to dynamically execute code like this at runtime. We'd have to somehow invoke the compiler to check the types of values mentioned in the expression. But in a language like Scheme that is designed to do its type checking at runtime, it is much easier to allow this.


Stuart Reges
Last modified: Mon May 6 14:36:07 PDT 2024