CSE413 Notes for Friday, 5/3/24

I spent a few minutes pointing out that to understand Scheme, it is helpful to understand the motivation of the people who designed it. Some people just want to be consumers of programming languages. Others are interested in understanding the details of how programming languages are implemented. The designers of Scheme are very interested in language implementation. On the MIT Scheme page the first sentence about the language talks about its simplicity:

It was designed to have an exceptionally clear and simple semantics and few different ways to form expressions.

but the second sentence emphasizes the ability to implement programming language constructs in Scheme:

A wide variety of programming paradigms, including imperative, functional, and message passing styles, find convenient expression in Scheme.

Structure and Interpretation of Computer Programs includes an extensive discussion of how to implement a metacircular interpreter in Scheme. This is a Scheme interpreter written in Scheme. That may sound like a useless thing to have ("I already have Scheme, so why build another version on top of it?"), but it can prove quite useful. Probably the most useful byproduct of writing a metacircular interpreter is that you learn a lot about how Scheme itself works. It also gives you the ability to create your own variations to the language. Programming languages like Java are not very flexible. You can't, for example, add new control structures to the language. That's not true in Scheme. The language gives you a lot of room to create your own language constructs. And with a metacircular interpreter, you can create slight variations of Scheme that behave just the way you want them to. This is one explanation for the fact that there are so many different versions (often called "flavors") of Scheme.

Then we discussed equality operators in Scheme. In OCaml the = operator was used to compare equality of many types of data, including structured data. In Scheme the = operator is used for numerical equality. So if you know that the two values you are comparing are numbers, then you should use the = operator.

If you aren't guaranteed to be comparing numbers, then you have a range of equality operators to choose from. The strictest of these is the eq? predicate. It does a pointer comparison (are these references to the exact same object?). This is similar to Java's == operator for objects. This can have surprising results for numbers. For example, we got the expected result when comparing integer values:

        > (define a 13)
        > (define b 13)
        > (eq? a b)
        #t

and when comparing floating point constants:

        > (define c 13.8)
        > (define d 13.8)
        > (eq? c d)
        #t

but not when comparing floating point numbers computed with an expression:

        > (define c (+ 1 12.8))
        > (define d (+ 1 12.8))
        > (eq? c d)
        #f

Similar issues come up in Java as well. If you execute this code:

	Integer i1 = 13;
        Integer i2 = 13;
        System.out.println(i1 == i2);
        Double d1 = 12.4;
        Double d2 = 12.4;
	System.out.println(d1 == d2);
	Integer i3 = 666;
	Integer i4 = 666;
	System.out.println(i3 == i4);
	Boolean b1 = true;
	Boolean b2 = true;
	System.out.println(b1 == b2);
	Boolean b3 = true;
	Boolean b4 = new Boolean(true);
	System.out.println(b3 == b4);

The output is:

        true
        false
        false
        true
        false

In Java, the floating point numbers are always stored as distinct objects, so when you ask whether they are "==" to each other (whether they are exactly the same object), the answer is no. In the case of integer values, Java sometimes uses separate objects and sometimes uses a cached version of the object. The language standard says that only a certain range of integes will be cached, which is why we get different results using 13 versus 666. Boolean values are generally cached unless you specifically ask Java for a new Boolean object.

This issue comes up with lists as well. Scheme will, in general, try to avoid making copies when it doesn't have to. For example, given these definitions:

        > (define x '(1 2 3))
        > (define y x)
        > (define z '(1 2 3))

Scheme will create two lists objects. The first list is pointed to by both x and y. The second is pointed to by z. So when we compare with the eq? predicate, we find that x and y are equal (are the same pointer) but not x and z:

        > (eq? x y)
        #t
        > (eq? x z)
        #f

The eqv? predicate is slightly stronger than the eq? predicate. It returns true in the same cases but also returns true for simple values like numbers. So with eqv, we see that the floating point comparisons now return true, but the list comparisons still do not:

        > (define a (+ 36.4 2))
        > (define b (+ 35.4 3))
        > (eqv? a b)
        #t
        > (define x '(1 2 3))
        > (define y x)
        > (define z '(1 2 3))
        > (eqv? x y)
        #t
        > (eqv? x z)
        #f

The third variation is the equal? predicate, which can be thought of as a deep equality operator. It recursively compares the values in each structure. So with the equal? predicate, even our list example works:

        > (equal? x y)
        #t
        > (equal? x z)
        #t

Then I spent a few minutes reviewing boolean operators. We wrote a predicate to determine if one number divides another:

        ; predicate to test whether m divides n
        (define (divides m n) (= (modulo n m) 0))

But this failed if we asked whether 0 divides a number. By definition, 0 does not divide anything, so we fixed this up with an if:

        (define (divides m n)
          (if (= m 0)
              #f
              (= (modulo n m) 0)))

I then discussed the fact that something interesting is going on here. Imagine, for example, that you tried to write your own version of if as a simple procedure:

        (define (if test val1 val2) ...)

Would it have the correct behavior? The answer is no. When you call a simple procedure, Scheme evaluates the values being passed as parameters to the procedure and then executes the procedure. This is known as eager evaluation.

Consider what would happen if if used eager evaluation. It would first evaluate the test and the two expressions. In the case of our divides predicate, it would produce the error by trying to compute a value modulo zero. In the case of a recursive definition, it is even worse. We count on the if being able to stop the recursive call from taking place when we reach a base case. If the if fully evaluated both arguments, we couldn't use it to write recursive definitions because they would produce infinite recursion.

In Scheme, the if is a special form that delays the evaluation of the two expressions that come after the test. It evaluates only one of these expressions depending upon what the test evaluates to. This is sometimes referred to as a form of lazy evaluation.

We decided that this definition of divides was violating boolean zen for Scheme. Scheme has procedures called "and", "or" and "not" that perform boolean operations. So we rewrote the divides predicate as:

        (define (divides m n)
          (and (not (= m 0)) (= (modulo n m) 0)))

Then we discussed higher-order functions in Scheme. They are very similar to what we saw in OCaml. There is a function called map that takes a function and a list as parameters and that returns the list obtained by applying the function to each value in the list:

        > (define (sqr n) (* n n))
        > (map sqr (range 1 20))
        '(1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 361 400)

And there is a procedure called filter that takes a predicate and a list as argument and that returns the values from the list that satisfy the predicate:

        > (define (mult3? n) (divides 3 n))
        > (filter mult3? (range 1 20))
        '(3 6 9 12 15 18)

For OCaml we used a function called reduce, but for Scheme we will be using procedures called foldl and foldr. OCaml has the same functions with slightly different names (List.fold_left and List.fold_right). Recall that our reduce function took a 2-argument collapsing function like addition that takes two numbers and reduces them to one number. In the case of foldl and foldr there is an extra parameter which represents the initial value to use in the computation (this is also the value returned when the list is empty). For example, when we do a cumulative sum, we initialize the sum to 0 because 0 is the additive identity (won't affect the result of an addition calculation). For a cumulative product we use 1. For string concatenation we'd start with an empty string.

We used foldr to define the factorial function:

        (define (factorial n)
          (foldr * 1 (range 1 n)))

We tested this in the interpreter by calling map:

        > (map factorial (range 1 10))
        '(1 2 6 24 120 720 5040 40320 362880 3628800)

For many computations like addition and multiplication, it doesn't matter if you fold from the left or from the right. But it matters when you do something like constructing a list. The foldl version incorporates values from left to right in the computation. For example:

        > (foldl cons '() '(1 2 3))
        '(3 2 1)

We discussed how this is computed step by step as the values are incorporated from the left. First we include the 1 (the leftmost value):

        (cons 1 '())

Then we include the next value from the left, which is 2:

        (cons 2 (cons 1 '()))

Then we include the 3:

        (cons 3 (cons 2 (cons 1 '())))

When we used foldr, the list was built-up taking values from right to left:

        > (foldr cons '() '(1 2 3))
        '(1 2 3)

I gave another example that involved concatenating strings together either from the left or from the right:

        > (foldl string-append "" '("four" "score" "and" "seven" "years" "ago"))
        "agoyearssevenandscorefour"
        > (foldr string-append "" '("four" "score" "and" "seven" "years" "ago"))
        "fourscoreandsevenyearsago"

Then I mentioned that Scheme has anonymous functions just like OCaml. They are known as lambdas. John McCarthy first used this term all the way back in 1958 when he defined Lisp. It is a nod to the historical importance of Alonzo Church's lambda calculus, which provides the mathematical foundations for functional programming.

In OCaml we defined anonymous functions with this syntax:

fun <parameters> -> <expression> The syntax is similar in Scheme:

(lambda <parameters> <expression>) For example, you can define a lambda for doubling a number and map it over a list of numbers:

        > (map (lambda (n) (* 2 n)) (range 1 20))
        '(2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40)

Below is an example of an expression that finds all the factors of 24:

        > (filter (lambda (n) (divides n 24)) (range 0 100))
        '(1 2 3 4 6 8 12 24)

And this one finds the sum of the factors of 24:

        > (foldl + 0 (filter (lambda (n) (divides n 24)) (range 0 100)))
        60

As one last example I asked how we could produce a "countdown" string along the lines of "10-9-8-7...2-1-0-blastoff!". Getting the numbers 0 to 10 is easy:

        > (range 0 10)
        '(0 1 2 3 4 5 6 7 8 9 10)

We mapped a function over this list to convert the integers to strings:

        > (map number->string (range 0 10))
        '("0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "10")

We can concatenate these together using string-append and folding from the left to get the numbers in reverse order:

        > (foldl string-append "" (map number->string (range 0 10)))
        "109876543210"

But this didn't include our dashes or the blastoff string. We introduced a lambda for that and changed the initial value in the call to foldl:

        > (foldl (lambda (s1 s2) (string-append s1 "-" s2)) "blastoff!" (map number->string (range 0 10)))
        "10-9-8-7-6-5-4-3-2-1-0-blastoff!"

This works, but it's inefficient in that it traverses the list to map the converstion procedure and then traverses again in the call on foldl. We can avoid the double traversal by giving a more sophisticated lambda that takes a number and a string as parameters:

        > (foldl (lambda (n s) (string-append (number->string n) "-" s)) "blastoff!" (range 0 10))
        "10-9-8-7-6-5-4-3-2-1-0-blastoff!"

This example is worth studying carefully. We are used to using operations like addition where the folding function takes two values of the same type. Here we have a function that takes a number as its first parameter and a string as its second parameter. The folding procedures are flexibile enough in both Scheme and OCaml to allow this. Our simple reduce function did not, mostly because it didn't have that extra parameter that indicated what initial value to use.

Stuart Reges

Last modified: Fri May 3 14:38:13 PDT 2024