Notes for June 29, 2005

(courtesy of Stuart Reges)

LinkedIntList

In studying linked lists, we are getting our first good example of the concept of "abstraction" and particularly "data abstraction". We have spent a lot of time studying how to create a class called IntList whose instances each store a list of integers by keeping track of an array and a size. Now, we are going to see how the same functionality can be built using a linked list, in which case the only data field you need is a reference to the front of the list. So the old IntList had the following data fields declared:

    public class IntList {
        private int[] elementData;
        private int size;
    
        <methods>
    }
For these new lists, we just need a data field of type ListNode:
    public class LinkedIntList {
        private ListNode front;

        <methods>
    }
So these are two fundamentally different ways to get the same kind of behavior. From the point of view of a client, these two classes accomplish the same thing and in some sense we don't care how each does its job as long as it does it correctly. But from the point of view of the implementor of the class, the details are quite different.
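
The notes use ListNode without showing its declaration. As a rough sketch, assuming fields named data and next (as in the diagrams) and the one- and two-argument constructors that the later code calls, the node class might look something like this. The class actually used in the course may differ in detail, and ListNodeDemo is a hypothetical name used only for illustration:

```java
// Hypothetical sketch of the node class assumed by these notes.
class ListNode {
    int data;        // value stored in this node
    ListNode next;   // link to the next node, or null at the end of the list

    // Constructs a node with the given data and no next node.
    ListNode(int data) {
        this(data, null);
    }

    // Constructs a node with the given data and the given next node.
    ListNode(int data, ListNode next) {
        this.data = data;
        this.next = next;
    }
}

class ListNodeDemo {
    public static void main(String[] args) {
        // Build the list (3, 5, 2) by chaining constructor calls.
        ListNode front = new ListNode(3, new ListNode(5, new ListNode(2)));
        // prints: 3 5 2
        System.out.println(front.data + " " + front.next.data + " " + front.next.next.data);
    }
}
```

The only thing a LinkedIntList would have to remember is the reference to the first of these nodes; everything else is reachable by following next links.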

Print

How does one write code that would print out the contents of one of these lists? We have just one variable to work with, so that's clearly where we have to start (the variable front or this.front). We could use it to move along the list and print things out, but then we would lose the original value of the variable, which would mean that we would have lost the list. Instead, we declare a local variable of type ListNode that we will use to access the different nodes of the list:

    ListNode current = front;
This initializes current to point to the same value as front (the first node in the list). We want to have a loop that prints the various values and we want it to keep going as long as there is more data to print. Suppose that the list stores the values (3, 5, 2). Then after executing the statement above, we have the following situation:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+
                           ^
         +---+             |
 current | +-+------>------+
         +---+
So how do we structure our loop? We want to keep going while there is more data to print. The variable current will end up referring to each different node in turn. The final node has the value null in its next field, so eventually the variable current will become null and that's when we know we're done. So our basic loop structure will be:
    ListNode current = front;
    while (current != null) {
        <process next value>
    }
To process a node, we need to print out its value, which we can get from current.data, and we need to move current to the next node over. The position of the next node is stored in current.next, so moving to that next node involves setting current to current.next:
    ListNode current = front;
    while (current != null) {
        System.out.print(current.data + " ");
        current = current.next;
    }
The first time through this loop, current is referring to the node with the 3 in it. It prints this value and then resets current, which causes current to refer to (or point to) the second node in the list:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+
                                                ^
         +---+                                  |
 current | +-+------>------>------>------>------+
         +---+
Some people prefer to visualize this differently. Instead of thinking of current as sitting still while its arrow moves, some people prefer to think of the variable itself moving. So for the initial situation, they'd draw this picture:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+
                           ^
                           |          
                         +-+-+
                 current | + |
                         +---+
And after executing the statement "current = current.next", we'd have this situation:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+
                                                ^
                                                |          
                                              +-+-+
                                      current | + |
                                              +---+
Either way of thinking about this works. Because in this new situation, current is not null, we once again go into the loop and print out current.data (which is now 5), and move current along again:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+
                                                                     ^
         +---+                                                       |
 current | +-+------>------>------>------>------>------>------>------+
         +---+
Once again, current is not null, so we go into the loop a third time and print the value of current.data (2) and reset current. But this time current.next has the value null, so when we reset current we get:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+

         +---+
 current | / |
         +---+
Because current has become null, we break out of the loop having produced the following output:
    3 5 2
I pointed out that the corresponding array code would look like this:
    int i = 0;
    while (i < size) {
        System.out.print(elementData[i] + " ");
        i++;
    }
Assuming you have some comfort with array-style programming, this might give you some useful insight into linked list programming. There are direct parallels here in terms of typical code:

Array/List Equivalents

    Description              Array code        Linked list code
    -----------------------  ----------------  -------------------------
    go to front of the list  int i = 0;        ListNode current = front;
    test for more elements   i < size          current != null
    current value            elementData[i]    current.data
    go to next element       i++;              current = current.next;

In fact, knowing that we like to use for loops for array processing, you can imagine writing for loops for the processing of linked lists as well. The code above could be rewritten as:

    for (ListNode current = front; current != null; current = current.next) {
        System.out.print(current.data + " ");
    }
Some people like to write their list code this way. It's an issue of personal taste.
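
To see the traversal as a whole, here is a small self-contained sketch of both loop styles. It assumes a ListNode class with data and next fields, and it adds a hypothetical varargs constructor and two hypothetical methods (toText and toTextWithForLoop) that return the text instead of printing it, so that the result is easy to check:

```java
// Assumed node class; the course's version may differ.
class ListNode {
    int data;
    ListNode next;

    ListNode(int data, ListNode next) {
        this.data = data;
        this.next = next;
    }
}

class LinkedIntList {
    private ListNode front;

    // Hypothetical convenience constructor: builds the list back to front.
    LinkedIntList(int... values) {
        for (int i = values.length - 1; i >= 0; i--) {
            front = new ListNode(values[i], front);
        }
    }

    // While-loop traversal: a local variable walks the list so that
    // front itself is never changed.
    String toText() {
        StringBuilder sb = new StringBuilder();
        ListNode current = front;
        while (current != null) {
            sb.append(current.data).append(" ");
            current = current.next;
        }
        return sb.toString();
    }

    // The same traversal written as a for loop.
    String toTextWithForLoop() {
        StringBuilder sb = new StringBuilder();
        for (ListNode current = front; current != null; current = current.next) {
            sb.append(current.data).append(" ");
        }
        return sb.toString();
    }
}

class PrintDemo {
    public static void main(String[] args) {
        LinkedIntList list = new LinkedIntList(3, 5, 2);
        System.out.println(list.toText());             // 3 5 2
        System.out.println(list.toTextWithForLoop());  // 3 5 2
    }
}
```

Both methods visit the nodes in the same order, and an empty list simply produces no output because the loop test fails immediately.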

Append

Next I turned to the question of how we would implement the appending add operation from the old IntList class for our new LinkedIntList class. The method is supposed to append the new value at the end of the list, which means we have to locate the end of the list. Suppose we have the list above and want to add the value 4 at the end of the list. First, we have to get there. So here's a start:
    ListNode current = front;
    while (current != null) {
        current = current.next;
    }
What happens is that the variable current moves along the list from the first to the last node until it becomes null, leaving us in this situation:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+

         +---+
 current | / |
         +---+
Some people think that we could then execute this line of code to complete the task:
    current = new ListNode(4);
But that won't work! It leaves us in this situation:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+

                    +------+------+
         +---+      | data | next |
 current | +-+--->  |   4  |   /  |
         +---+      +------+------+
This allocates a new node, but this new node has no connection to the original list. The list is still composed of 3 nodes linked together. This fourth node has been constructed, but hasn't been properly linked into the list. Here's a way to think about it. If you want to know what is in the list, you start with the variable front and then from each node you follow its next link to figure out where to go next. That means that there are only two ways to change the contents of the list: change the value of front, or change the value of <something>.next, where <something> refers to a node in the list. Assigning a new value to a local variable like current does neither.

To solve this problem, we have to stop one position early. We don't want to run off the end of the list as we did with the printing code. Instead, we want to position current at the final node. We can do this by changing our test. Instead of going until current becomes null, we want to go until current.next is null, because only the last node of the list will have a next field that is null:
    ListNode current = front;
    while (current.next != null) {
        current = current.next;
    }
This will get us to the following situation:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+
                                                                     ^
         +---+                                                       |
 current | +-+------>------>------>------>------>------>------>------+
         +---+
At which point we want to change that null reference to point to the new node we want to construct:
    current.next = new ListNode(4);
Which leaves us in this situation:
                                         +------+------+      +------+------+
         +---+                           | data | next |      | data | next |
   front | +-+--->     ...     +--+--->  |   2  |   +--+--->  |   4  |   /  |
         +---+                           +------+------+      +------+------+
                                                ^
         +---+                                  |
 current | +-+------>------>------>------>------+
         +---+
The node with 2 in it is pointing at the node with 4 in it. Notice that the line of code we executed changed the value of current.next, not current itself. But now this code presents a new problem. The while loop has a test to see if current.next is null. But what if there is no current.next? What if current is null? This is one of the biggest headaches in writing linked list code: we have to constantly think about this possibility. When we don't think about it, we're likely to have our programs terminate with a NullPointerException. This situation can, in fact, arise. If the list is empty initially, then front is null, so we will set current to null and evaluating current.next will throw this exception. So we have to have a special test for that case. Here is the complete code written in a general way as the add method:
    public void add(int value) {
        if (front == null) {
            front = new ListNode(value);
        } else {
            ListNode current = front;
            while (current.next != null) {
                current = current.next;
            }
            current.next = new ListNode(value);
        }
    }
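
The following sketch puts the add method into a small runnable class, next to the tempting but incorrect version that only reassigns the local variable. The ListNode class, the addBroken method, and the toString helper are assumptions added for the demonstration and are not part of the original notes:

```java
// Assumed node class; the course's version may differ.
class ListNode {
    int data;
    ListNode next;

    ListNode(int data) {
        this.data = data;
    }
}

class LinkedIntList {
    private ListNode front;

    // The add method from the notes: special-case the empty list,
    // otherwise stop at the last node and change its next link.
    public void add(int value) {
        if (front == null) {
            front = new ListNode(value);
        } else {
            ListNode current = front;
            while (current.next != null) {
                current = current.next;
            }
            current.next = new ListNode(value);
        }
    }

    // Incorrect version (hypothetical name): it runs off the end of the
    // list and then changes only the local variable, so the list is unchanged.
    public void addBroken(int value) {
        ListNode current = front;
        while (current != null) {
            current = current.next;
        }
        current = new ListNode(value);  // never linked into the list
    }

    // Hypothetical helper for displaying the list, e.g. [3, 5, 2].
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        for (ListNode current = front; current != null; current = current.next) {
            sb.append(current.data);
            if (current.next != null) {
                sb.append(", ");
            }
        }
        return sb.append("]").toString();
    }
}

class AppendDemo {
    public static void main(String[] args) {
        LinkedIntList list = new LinkedIntList();
        list.add(3);
        list.add(5);
        list.add(2);
        list.addBroken(4);
        System.out.println(list);  // [3, 5, 2]
        list.add(4);
        System.out.println(list);  // [3, 5, 2, 4]
    }
}
```

The first call to add goes through the if branch and changes front; every later call goes through the else branch and changes the next field of the last node.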

addSorted

Suppose the list is in sorted order and we want to add a new value so as to preserve the sorted order. So suppose we have a variable called value that is 10 and we want to insert it into this list:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   2  |   +--+--->  |   5  |   +--+--->  |  12  |   /  |
         +---+      +------+------+      +------+------+      +------+------+
How do we do it? We need to find the right spot to insert it and that is going to depend on the value we are inserting (10 in our example). We have to compare it against the various data values stored in this list. The new node does not belong in front of the node with 2 in it because 2 is less than 10. Similarly, it does not belong in front of the node with 5 in it, because 5 is less than 10. But it does belong in front of the node with 12 in it, because 12 is not less than 10. An initial attempt at the code is as follows:
    ListNode current = front;
    while (current.data < value) {
        current = current.next;
    }
This has the core of the right idea, but it has many problems. First of all, it ends up positioning us in the wrong spot. As in the appending case, we want to stop one position early to be able to add something to the list. We do not want to have our variable current end up referring to the node that has 12 in it. We have to have current pointing to the node that has 5 in it. So we have to modify the code to stop one position early. This can be done by changing the test to involve current.next instead of current:
    ListNode current = front;
    while (current.next.data < value) {
        current = current.next;
    }
This theoretically stops with current referring to the node with 5 in it, which means we can link in the new node by changing current.next. This new node should have a data value of value (10 in our example). What should its next link refer to? The answer is that it should refer to the node that has 12 in it, which is stored in current.next. So we want to construct the node in this way:
    new ListNode(value, current.next)
Just calling the constructor leaves us in this situation:
                                                      +------+------+
                                                      | data | next |
                                                      |  10  |   +  |
                                                      +------+---|--+
                                                                 |
                                                                 V
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   2  |   +--+--->  |   5  |   +--+--->  |  12  |   /  |
         +---+      +------+------+      +------+------+      +------+------+
                                                ^
         +---+                                  |
 current | +-+------>------>------>------>------+
         +---+
This isn't enough. We've constructed a node that points at the list, but nothing in the list points at the node. So we've taken care of half of what we have to do. The other half is to change a link of the list to point to the new node. The link to change is current.next:
    current.next = new ListNode(value, current.next);
which leads to this situation:
                                                      +------+------+
                                                      | data | next |
                                                      |  10  |   +  |
                                                      +------+---|--+
                                                           ^     |
                                                           |     V
                    +------+------+      +------+------+   |  +------+------+
         +---+      | data | next |      | data | next |   |  | data | next |
   front | +-+--->  |   2  |   +--+--->  |   5  |   +--+---+  |  12  |   /  |
         +---+      +------+------+      +------+------+      +------+------+
                                                ^
         +---+                                  |
 current | +-+------>------>------>------>------+
         +---+
This isn't the easiest picture to read, but if you follow the links carefully, you'll see that starting at front the sequence of values you see is: 2, 5, 10, 12, which is what we want.
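
As a check on the picture, here is a small sketch of just this middle-of-the-list insertion, run on the example list (2, 5, 12) with the value 10. It deliberately handles only this case: it assumes the list is non-empty, that the value does not belong at the front, and that some node holds a value that is not less than the one being inserted. The class and method names (InsertMiddleDemo, insertInMiddle, describe) are hypothetical:

```java
// Assumed node class; the course's version may differ.
class ListNode {
    int data;
    ListNode next;

    ListNode(int data, ListNode next) {
        this.data = data;
        this.next = next;
    }
}

class InsertMiddleDemo {
    // Stops one node early, then links the new node in after current.
    // Only safe under the assumptions stated above.
    static void insertInMiddle(ListNode front, int value) {
        ListNode current = front;
        while (current.next.data < value) {
            current = current.next;
        }
        // The right-hand side is evaluated first, so the new node's next
        // link gets the old value of current.next (the node with 12).
        current.next = new ListNode(value, current.next);
    }

    // Returns the values from front to back, separated by spaces.
    static String describe(ListNode front) {
        StringBuilder sb = new StringBuilder();
        for (ListNode current = front; current != null; current = current.next) {
            if (sb.length() > 0) {
                sb.append(" ");
            }
            sb.append(current.data);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ListNode front = new ListNode(2, new ListNode(5, new ListNode(12, null)));
        insertInMiddle(front, 10);
        System.out.println(describe(front));  // 2 5 10 12
    }
}
```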

So the code sort of works. But there are some special cases we haven't thought about. What if you want to insert the value 13? Remember our loop test:

    while (current.next.data < value)
This depends on finding a value in the list that is greater than or equal to the value we are trying to insert, because that is the only thing that stops the loop. What if there is no such value, as in the case of inserting 13? This code keeps moving current forward until this test eventually throws a NullPointerException when current is pointing at the last node in the list. But if the value is greater than everything else in the list, then it belongs after the last node in the list. So we want to stop when current gets to the last node in the list (similar to what we did with the appending add). So a second attempt at the test would be:
    while (current.next.data < value && current.next != null)
But even this doesn't work. It would stop current at the right place, but when current.next is null, we can't ask for the value of current.next.data. That test will throw a NullPointerException. This is another example of a combination of a sensitive and robust test:
    while (current.next.data < value && current.next != null)
           ~~~~~~~~~~~~~~~~~~~~~~~~~    ~~~~~~~~~~~~~~~~~~~~
                sensitive test              robust test
We need to switch the order of these tests to make them work right.
    while (current.next != null && current.next.data < value)
Java uses short-circuited evaluation, which means that if the first test evaluates to false, Java doesn't bother to perform the second test. So the first test, in effect, protects you from the potential problem generated by the second test (the NullPointerException). Putting this all together, we have the following solution to the problem:
    ListNode current = front;
    while (current.next != null && current.next.data < value) {
        current = current.next;
    }
    current.next = new ListNode(value, current.next);
But even this code is not enough. The first test in this loop is the robust test, but it isn't all that robust. If current is null, then it throws a NullPointerException. So we want to execute this code only in the case where front isn't null.

There was another case as well. If the value belongs at the front of the list, then this code places it in the wrong spot. It always inserts after a node currently in the list, never placing it in front of all nodes. The solution is to check if the list is currently empty or if the value of the first node is greater than or equal to the value to be inserted:

    if (front == null || front.data >= value) {
        front = new ListNode(value, front);
    } else {
        ListNode current = front;
        while (current.next != null && current.next.data < value) {
            current = current.next;
        }
        current.next = new ListNode(value, current.next);
    }
The initial test deals with the two special cases just mentioned. If the list is currently empty (front == null) or if the value belongs at the front of the list (front.data >= value), then we insert at the front of the list rather than using the other code we developed. This if/else matches the description above that states that changing the list either involves changing the value of front (the if part) or the value of <something>.next (the else part).

Order is important in this test as well because the test involving front.data will throw a NullPointerException if front is null.
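
The effect of the ordering can be seen in a tiny experiment. The sketch below evaluates the front-of-list test in both orders on an empty list (front == null). The class and method names are hypothetical and exist only for this demonstration:

```java
// Assumed node class; the course's version may differ.
class ListNode {
    int data;
    ListNode next;

    ListNode(int data, ListNode next) {
        this.data = data;
        this.next = next;
    }
}

class ShortCircuitDemo {
    // Robust test first: front.data is evaluated only when front is not null.
    static boolean belongsAtFront(ListNode front, int value) {
        return front == null || front.data >= value;
    }

    // Sensitive test first: front.data is evaluated even when front is null.
    static boolean belongsAtFrontUnsafe(ListNode front, int value) {
        return front.data >= value || front == null;
    }

    public static void main(String[] args) {
        System.out.println(belongsAtFront(null, 10));  // true
        try {
            belongsAtFrontUnsafe(null, 10);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");  // this line is printed
        }
    }
}
```

With || the second operand is skipped when the first is true, just as with && it is skipped when the first is false, so in both kinds of test the null check has to come first.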

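This is a good example to study because it has so many special cases. In writing our code we had to deal with four cases: an empty list, a value that belongs at the front of the list, a value that belongs at the back of the list, and a value that belongs in the middle of the list.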

The first two of these cases are handled by the if branch of the code:
    if (front == null || front.data >= value)
        ~~~~~~~~~~~~~    ~~~~~~~~~~~~~~~~~~~
         empty list         front of list
and the second two cases are handled in the else branch of the code:
    while (current.next != null && current.next.data < value)
           ~~~~~~~~~~~~~~~~~~~~    ~~~~~~~~~~~~~~~~~~~~~~~~~
               back of list             middle of list
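
To exercise all four cases, the code can be wrapped in a method and called on a list that starts out empty. This is only a sketch: the method name addSorted is taken from the section title, while the ListNode class and the toString helper are assumptions added so that the example runs on its own:

```java
// Assumed node class; the course's version may differ.
class ListNode {
    int data;
    ListNode next;

    ListNode(int data, ListNode next) {
        this.data = data;
        this.next = next;
    }
}

class LinkedIntList {
    private ListNode front;

    // Inserts value so that a sorted list stays sorted.
    public void addSorted(int value) {
        if (front == null || front.data >= value) {
            // empty list, or value belongs at the front: change front
            front = new ListNode(value, front);
        } else {
            // back or middle of the list: change <something>.next
            ListNode current = front;
            while (current.next != null && current.next.data < value) {
                current = current.next;
            }
            current.next = new ListNode(value, current.next);
        }
    }

    // Hypothetical helper for displaying the list, e.g. [2, 5, 12].
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        for (ListNode current = front; current != null; current = current.next) {
            sb.append(current.data);
            if (current.next != null) {
                sb.append(", ");
            }
        }
        return sb.append("]").toString();
    }
}

class AddSortedDemo {
    public static void main(String[] args) {
        LinkedIntList list = new LinkedIntList();
        list.addSorted(5);   // empty list
        list.addSorted(2);   // front of list
        list.addSorted(12);  // back of list
        list.addSorted(10);  // middle of list
        System.out.println(list);  // [2, 5, 10, 12]
    }
}
```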
In another approach to this kind of problem, you keep a 2-element window on the list using variables called prev and current:
                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
   front | +-+--->  |   3  |   +--+--->  |   5  |   +--+--->  |   2  |   /  |
         +---+      +------+------+      +------+------+      +------+------+

                           ^                    ^
                           |                    |
                         +-+-+                +-+-+
                    prev | + |        current | + |
                         +---+                +---+
The variables move in lockstep. The code you'd use to move this pair of variables forward one spot:
    prev = current;
    current = current.next;
By using this pair of pointers, the while loop test can be worded in a more intuitive way in terms of current:
    while (current != null && current.data < value)
And we still have the option after the loop to use the prev variable to refer to the node just before this one:
    prev.next = new ListNode(value, prev.next);
The code for this approach was included in handout #8.
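
Handout #8 is not reproduced in these notes, so the following is only a sketch of how the prev/current approach might be written, and the handout's code may well differ. It assumes that prev starts out as null, which means that prev is still null after the loop exactly when the value belongs at the front (including the empty-list case). The ListNode class and the toString helper are likewise assumptions:

```java
// Assumed node class; the course's version may differ.
class ListNode {
    int data;
    ListNode next;

    ListNode(int data, ListNode next) {
        this.data = data;
        this.next = next;
    }
}

class LinkedIntList {
    private ListNode front;

    // Sorted insertion using a two-node window (prev, current).
    public void addSorted(int value) {
        ListNode prev = null;
        ListNode current = front;
        while (current != null && current.data < value) {
            prev = current;          // the pair moves forward in lockstep
            current = current.next;
        }
        if (prev == null) {
            // The loop never advanced: insert at the front.
            front = new ListNode(value, front);
        } else {
            // prev.next is the same node as current here.
            prev.next = new ListNode(value, prev.next);
        }
    }

    // Hypothetical helper for displaying the list, e.g. [2, 5, 12].
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        for (ListNode current = front; current != null; current = current.next) {
            sb.append(current.data);
            if (current.next != null) {
                sb.append(", ");
            }
        }
        return sb.append("]").toString();
    }
}

class PrevCurrentDemo {
    public static void main(String[] args) {
        LinkedIntList list = new LinkedIntList();
        list.addSorted(5);
        list.addSorted(12);
        list.addSorted(2);
        list.addSorted(10);
        System.out.println(list);  // [2, 5, 10, 12]
    }
}
```

The loop test reads more naturally than the current.next version, at the cost of a second variable and a check of prev after the loop.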