CSE143 Notes for Wednesday, 10/19/05

I began by discussing a new feature of Java 5 known as "generics". I started by reviewing typical code that we would have written with earlier versions of Java to use an ArrayList to store Strings:

        ArrayList list = new ArrayList();
        list.add("hello");
        list.add("there");

In this case I'm adding String objects to the ArrayList, but I can add any kind of object to the list. I asked people how this works and several people mentioned that it is defined in terms of a class called "Object". Every reference type in Java (i.e., every object) derives from the Object class one way or another. The "add" method works because its parameter type is Object:

        public void add(Object value) {
            ...
        }

But what happens when we try to get something back out of an ArrayList? This turns out to be trickier because the return type for methods like "get" is also defined in terms of Object:

        public Object get(int index) {
            ...
        }

So with my example above, if I wanted to print the length of the first String that I added to the list, I can't just say:

        System.out.println("length = " + list.get(0).length());

I know that list.get(0) returns a reference to a String object because I know that I put a String into the structure, but Java doesn't know that. From Java's point of view, the value is just of type Object, and Object has no length method, so this code won't compile. To fix this we need a cast, but even this doesn't work:

        System.out.println("length = " + (String) list.get(0).length());

The problem here is that the cast has a lower level of precedence than the dot notation, so this is interpreted as "Figure out what list.get(0).length() is, then cast the result to String." So we'd need an extra set of parentheses:

        System.out.println("length = " + ((String) list.get(0)).length());

This is finally code that compiles and reports the length of the first String in the list. I pointed out that all this casting is tedious and confusing. This is where the old style of coding ends up being ugly.
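
As a minimal sketch, the old-style fragments above can be assembled into one complete program. The class name RawListDemo and the helper method lengthAt are hypothetical names chosen for illustration, and the @SuppressWarnings annotations are only there to quiet the warnings that a Java 5 or later compiler gives for this style of code.

```java
import java.util.ArrayList;

// Hypothetical demo class; the name is not from lecture.
class RawListDemo {
    // Returns the length of the String stored at the given index.
    // The cast is needed because get returns Object for an old-style list,
    // and the extra parentheses make the cast apply before the call on length.
    @SuppressWarnings("rawtypes")
    static int lengthAt(ArrayList list, int index) {
        return ((String) list.get(index)).length();
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void main(String[] args) {
        ArrayList list = new ArrayList();
        list.add("hello");
        list.add("there");
        System.out.println("length = " + lengthAt(list, 0));  // length = 5
    }
}
```

Note that the cast is only checked when the program runs: if the list held something other than a String at that index, the cast would fail with a ClassCastException.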

Java has a new way to do this starting with Java 5. Instead of declaring ordinary ArrayList objects, we declare ArrayList<E> where E is some type (think of E as being short for "Element type"). This is similar to what are known as templates in C++. The "E" is a type parameter that can be filled in with the name of any class.

In our case, we want an ArrayList of Strings. We describe the type as:

        ArrayList<String>

Consider our old line of code for constructing this (I've underlined the type information):

        ArrayList list = new ArrayList();
        ~~~~~~~~~            ~~~~~~~~~
          type                 type

So we replace the type "ArrayList" with the type "ArrayList<String>" and we end up with:

        ArrayList<String> list = new ArrayList<String>();
        ~~~~~~~~~~~~~~~~~            ~~~~~~~~~~~~~~~~~
              type                         type

This line of code can be confusing, but when you remember that the "<String>" is part of the type, it makes sense. The rest of the code is the same except for the fact that we no longer need a cast for the call on get because Java knows that we're dealing with an ArrayList of String objects:

        list.add("hello");
        list.add("there");
        System.out.println("length = " + list.get(0).length());

So with generics we don't have to worry about casting, but we have a more complex type to deal with in variable declarations and in calls on the constructor.
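
Here is a small self-contained sketch of the generic version. The class name GenericListDemo and the method totalLength are hypothetical, made up for this example; the point is only that no cast appears anywhere.

```java
import java.util.ArrayList;

// Hypothetical demo class; the name is not from lecture.
class GenericListDemo {
    // Adds up the lengths of all the Strings in the list.
    // Because the parameter type is ArrayList<String>, get returns a String
    // and we can call length on it directly.
    static int totalLength(ArrayList<String> list) {
        int total = 0;
        for (int i = 0; i < list.size(); i++) {
            total += list.get(i).length();
        }
        return total;
    }

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<String>();
        list.add("hello");
        list.add("there");
        // list.add(42);  // would not compile: 42 is not a String
        System.out.println("length = " + list.get(0).length());  // length = 5
        System.out.println("total = " + totalLength(list));      // total = 10
    }
}
```

The commented-out line shows the other benefit of the more specific type: a mistake that the old style would only reveal at runtime is now rejected by the compiler.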

I then spent a little time discussing the issue of primitive data versus objects. Even though we can construct an ArrayList<E> for any class E, we can't construct an ArrayList<int> because int is a primitive type, not a class. To get around this problem, Java has a set of classes known as "wrapper" classes that "wrap up" primitive values like ints to turn them into objects. It's very much like taking a candy and putting a wrapper around it. For the case of ints, there is a class known as Integer that can be used to store an individual int. Each Integer object has a single data field: the int that is wrapped up inside.

Java 5 also has quite a bit of support that makes a lot of this invisible to programmers. If you want to put int values into an ArrayList, you have to remember to use the type ArrayList<Integer> rather than ArrayList<int>, but otherwise Java does a lot of things for you. For example, you can construct such a list and add simple int values to it:

        ArrayList<Integer> list = new ArrayList<Integer>();
        list.add(18);
        list.add(34);

In the two calls on add, we are passing simple ints as arguments to something that really requires an Integer. This is okay as of Java 5 because Java will automatically "box" the ints for us (i.e., wrap them up in an Integer object). We can also refer to elements of this list and treat them as simple ints, as in:

        int product = list.get(0) * list.get(1);

The calls on list.get return references to Integer objects and normally you wouldn't be allowed to multiply two objects together. In this case Java automatically "unboxes" the values for you, unwrapping the Integer objects and giving you the ints that are contained inside.
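
To make the boxing and unboxing visible, this sketch computes the same product twice: once letting Java do the conversions, and once with the conversions written out by hand using Integer.valueOf and intValue. The class and method names (BoxingDemo, productAuto, productExplicit) are hypothetical.

```java
import java.util.ArrayList;

// Hypothetical demo class; the name is not from lecture.
class BoxingDemo {
    // Relies on auto-unboxing: each call on get returns an Integer that
    // Java converts to an int before multiplying.
    static int productAuto(ArrayList<Integer> list) {
        return list.get(0) * list.get(1);
    }

    // The same computation with the unboxing written out by hand.
    static int productExplicit(ArrayList<Integer> list) {
        return list.get(0).intValue() * list.get(1).intValue();
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<Integer>();
        list.add(18);                   // boxed automatically
        list.add(Integer.valueOf(34));  // boxed by hand
        System.out.println(productAuto(list));      // 612
        System.out.println(productExplicit(list));  // 612
    }
}
```

One thing to keep in mind is that unboxing calls a method on the Integer object behind the scenes, so unboxing a null reference throws a NullPointerException.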

Every primitive type has a corresponding wrapper class: Integer for int, Double for double, Character for char, Boolean for boolean, and so on.
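
The same pattern applies to the other wrapper classes. The sketch below uses ArrayList<Double> and ArrayList<Character>; the class name WrapperDemo and the methods average and countVowels are hypothetical examples, not code from lecture.

```java
import java.util.ArrayList;

// Hypothetical demo class; the name is not from lecture.
class WrapperDemo {
    // Averages the values in the list. Each Double is unboxed to a double
    // as the loop variable is assigned. An empty list yields NaN (0.0 / 0).
    static double average(ArrayList<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    // Counts the lowercase vowels in the list. Each Character is unboxed
    // to a char as the loop variable is assigned.
    static int countVowels(ArrayList<Character> letters) {
        int count = 0;
        for (char c : letters) {
            if ("aeiou".indexOf(c) >= 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        ArrayList<Double> values = new ArrayList<Double>();
        values.add(1.5);  // boxed to a Double
        values.add(2.5);
        System.out.println(average(values));  // 2.0

        ArrayList<Character> letters = new ArrayList<Character>();
        for (char c : "hello".toCharArray()) {
            letters.add(c);  // boxed to a Character
        }
        System.out.println(countVowels(letters));  // 2
    }
}
```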

Then I discussed another Java concept. In Java you can declare what is known as an "interface" as a way to describe a set of behaviors that an object can perform. In programming terms, we define a set of methods that the object will be able to perform. This turns out to be a very useful way to generalize about different classes that we write.

For example, we have seen an IntList class that stores an integer list using an array and a LinkedIntList class that does the same thing in a different way using a linked list. If we apply the notion of abstraction, we'd conclude that there is a concept of an integer list that seems to be independent of these two implementations. This is one of the most common forms of abstraction:

what something does versus
how it does it

The Java interface provides a mechanism for listing a series of methods that an object is required to implement. For example, we can take our original IntList class and the LinkedIntList variation and pull out a list of basic operations that should exist:

        public interface IntList {
            public void add(int value);
            public void add(int index, int value);
            public int indexOf(int value);
            ...
        }

This looks a lot like a class. Instead of the word "class" we use the special Java word "interface". Inside we have a series of method headers. But notice that each method is "empty". Instead of having a set of curly braces with code inside, each header ends with a semicolon. This is a way of saying, "This method should exist, but I'm not going to tell you how it is implemented." This syntax is borrowed from C and C++. In those languages these are known as "function prototypes" because they describe the name, parameters and return type of the function (method), but don't say anything about how it is implemented.

Once an interface has been defined, individual classes can declare that they implement the interface, as in:

        public class LinkedIntList implements IntList {
            ...
        }

Java classes are allowed to implement multiple interfaces.
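
As a quick illustration of a class implementing more than one interface, here is a sketch that uses made-up names throughout: the interfaces Shape and Named and the class Square are hypothetical and have nothing to do with the list classes above. They are declared without "public" only so that everything fits in one file.

```java
// Two small hypothetical interfaces, each describing one behavior.
interface Shape {
    double area();
}

interface Named {
    String name();
}

// A single class can promise to implement both sets of methods.
class Square implements Shape, Named {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    public double area() {
        return side * side;
    }

    public String name() {
        return "square";
    }
}

// Hypothetical demo class; the name is not from lecture.
class MultiInterfaceDemo {
    public static void main(String[] args) {
        Square sq = new Square(3.0);
        Shape asShape = sq;  // the same object viewed as a Shape
        Named asNamed = sq;  // and viewed as a Named
        System.out.println(asNamed.name() + " has area " + asShape.area());
        // prints: square has area 9.0
    }
}
```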

Writing code using interfaces allows us to write more flexible code. For example, suppose that in addition to having the LinkedIntList above that implements IntList, we turn the array-based version of IntList into something called ArrayIntList and have it implement our IntList interface:

        public class ArrayIntList implements IntList {
            ...
        }

Then we can declare variables of type IntList, which is less specific than using a type like ArrayIntList or LinkedIntList:

        IntList list = new ArrayIntList();

It always seems odd to novices that we'd want to use the interface for the variable type rather than the class that implements the interface. But being less specific means that the rest of our code is more flexible. For example, suppose we write another two thousand lines of code that all refer to IntList. Then if we found ourselves wanting to switch to a LinkedIntList instead, we would just have to change the line of code where the object is constructed:

        IntList list = new LinkedIntList();

The rest of the code would not have to be modified because it was written in terms of the interface. This is a very useful technique when programming with data structures.
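
The sketch below shows this flexibility in a form that can actually be compiled and run. It makes several assumptions: the interface is trimmed down and given get and size methods so that client code has something to call, the bodies of ArrayIntList and LinkedIntList are guesses at simple implementations rather than the versions from lecture, and IntListDemo with its sum and fill methods is hypothetical.

```java
import java.util.Arrays;

// Trimmed-down interface; get and size are assumed here for the demo.
interface IntList {
    void add(int value);
    int get(int index);
    int size();
    int indexOf(int value);
}

// Array-based sketch: doubles the array when it fills up.
class ArrayIntList implements IntList {
    private int[] elementData = new int[10];
    private int size = 0;

    public void add(int value) {
        if (size == elementData.length) {
            elementData = Arrays.copyOf(elementData, 2 * size);
        }
        elementData[size++] = value;
    }

    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return elementData[index];
    }

    public int size() {
        return size;
    }

    public int indexOf(int value) {
        for (int i = 0; i < size; i++) {
            if (elementData[i] == value) {
                return i;
            }
        }
        return -1;
    }
}

// Linked sketch: keeps references to the front and back nodes.
class LinkedIntList implements IntList {
    private static class Node {
        int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    private Node front;
    private Node back;
    private int size = 0;

    public void add(int value) {
        Node node = new Node(value);
        if (front == null) {
            front = node;
        } else {
            back.next = node;
        }
        back = node;
        size++;
    }

    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        Node current = front;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.data;
    }

    public int size() {
        return size;
    }

    public int indexOf(int value) {
        int i = 0;
        for (Node current = front; current != null; current = current.next) {
            if (current.data == value) {
                return i;
            }
            i++;
        }
        return -1;
    }
}

// Client code written only in terms of the interface.
class IntListDemo {
    static IntList fill(IntList list) {
        list.add(3);
        list.add(7);
        list.add(12);
        return list;
    }

    static int sum(IntList list) {
        int total = 0;
        for (int i = 0; i < list.size(); i++) {
            total += list.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        IntList list = fill(new ArrayIntList());  // only this line names a class
        System.out.println(sum(list));            // 22
        list = fill(new LinkedIntList());         // swap in the other implementation
        System.out.println(sum(list));            // 22
    }
}
```

The methods fill and sum never mention ArrayIntList or LinkedIntList, which is why they work unchanged with either one.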

Then I discussed two classic data structures known as stacks and queues. The two structures are similar in that they each store a sequence of values in a particular order. But stacks are what we call LIFO structures while queues are FIFO structures:

        stacks        queues

        L-ast         F-irst
        I-n           I-n
        F-irst        F-irst
        O-ut          O-ut

The analogy for stacks is to think of a cafeteria and how trays are stacked up. When you go to get a tray, you take the one on the top of the stack. You don't bother to try to get the one on the bottom, because you'd have to move a lot of trays to get to it. Similarly if someone brings clean trays to add to the stack, they are added on the top rather than on the bottom. The result is that stacks tend to reverse things. Each new value goes to the top of the stack, and when we take them back out, we draw from the top, so they come back out in reverse order.

The analogy for queues is to think about standing in line at the grocery store. As new people arrive, they are told to go to the back of the line. When the store is ready to help another customer, the person at the front of the line is helped. In fact, the British use the word "queue" the way we use the word "line", telling people to "queue up" or to "go to the back of the queue".

For a minimal stack, we'd need:

a way to put a value on the top of the stack (something we call "pushing" a value on the top)
a way to remove a value from the top of the stack (something we call "popping" the stack)
and a way to test whether the stack is empty

I showed people an interface with these three operations plus a convenient fourth one that tells you how many values are currently in the stack:

        public interface Stack<E> {
            public void push(E value);
            public E pop();
            public boolean isEmpty();
            public int size();
        }

Notice that we are using Java generics to define the Stack in terms of an unspecified element type E. That way we'll be able to have a Stack<String> or Stack<Integer> or a Stack of any other kind of element type we are interested in.
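
To see the LIFO behavior in action, here is a runnable sketch. The interface repeats the one above (without "public" so that everything fits in one file). The class ArrayStack below is a throwaway implementation built on an ArrayList purely so the demo can run; its internals, the choice to throw IllegalStateException on an empty pop, and the StackDemo class with its reverse method are all assumptions rather than the code from lecture.

```java
import java.util.ArrayList;

interface Stack<E> {
    void push(E value);
    E pop();
    boolean isEmpty();
    int size();
}

// Throwaway implementation: the end of the ArrayList acts as the top.
class ArrayStack<E> implements Stack<E> {
    private final ArrayList<E> elements = new ArrayList<E>();

    public void push(E value) {
        elements.add(value);
    }

    public E pop() {
        if (elements.isEmpty()) {
            throw new IllegalStateException("stack is empty");
        }
        return elements.remove(elements.size() - 1);
    }

    public boolean isEmpty() {
        return elements.isEmpty();
    }

    public int size() {
        return elements.size();
    }
}

// Hypothetical demo class; the name is not from lecture.
class StackDemo {
    // Pushes every word, then pops them all, joining the results with spaces.
    static String reverse(String[] words) {
        Stack<String> s = new ArrayStack<String>();
        for (String word : words) {
            s.push(word);
        }
        StringBuilder result = new StringBuilder();
        while (!s.isEmpty()) {
            if (result.length() > 0) {
                result.append(' ');
            }
            result.append(s.pop());
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse(new String[] {"a", "b", "c"}));  // c b a
    }
}
```

The values come back out in the opposite order from how they went in, which is the reversing effect described in the cafeteria tray analogy.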

For queues, we have a corresponding set of operations but they have different names. When values go into a queue we refer to it as "enqueueing" a value. When values are removed from a queue we refer to it as "dequeueing" a value. So the Queue interface looks like this:

        public interface Queue<E> {
            public void enqueue(E value);
            public E dequeue();
            public boolean isEmpty();
            public int size();
        }

These are interfaces I have written and I have developed an implementation for each. For the stack, I have a class called ArrayStack<E> and for the queue I have a class called LinkedQueue<E>. We don't really care how these implementations work. For now, we are interested in the general properties of stacks and queues, so the only thing we need to know about ArrayStack and LinkedQueue is that we use those classes when we call "new" to construct an actual object.
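
For comparison with stacks, here is a runnable sketch of the FIFO behavior of a queue. The interface repeats the Queue interface above (without "public" so that everything fits in one file). The LinkedQueue class below is a stand-in written only so the demo can run; its linked-node internals, the IllegalStateException on an empty dequeue, and the QueueDemo class with its drain method are assumptions and should not be read as the contents of LinkedQueue.java.

```java
interface Queue<E> {
    void enqueue(E value);
    E dequeue();
    boolean isEmpty();
    int size();
}

// Stand-in implementation: values enter at the back and leave from the front.
class LinkedQueue<E> implements Queue<E> {
    private static class Node<T> {
        final T data;
        Node<T> next;
        Node(T data) { this.data = data; }
    }

    private Node<E> front;
    private Node<E> back;
    private int size = 0;

    public void enqueue(E value) {
        Node<E> node = new Node<E>(value);
        if (back == null) {
            front = node;
        } else {
            back.next = node;
        }
        back = node;
        size++;
    }

    public E dequeue() {
        if (front == null) {
            throw new IllegalStateException("queue is empty");
        }
        E value = front.data;
        front = front.next;
        if (front == null) {
            back = null;  // the queue just became empty
        }
        size--;
        return value;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public int size() {
        return size;
    }
}

// Hypothetical demo class; the name is not from lecture.
class QueueDemo {
    // Removes every value from the queue, joining the results with spaces.
    static String drain(Queue<String> q) {
        StringBuilder result = new StringBuilder();
        while (!q.isEmpty()) {
            if (result.length() > 0) {
                result.append(' ');
            }
            result.append(q.dequeue());
        }
        return result.toString();
    }

    public static void main(String[] args) {
        Queue<String> q = new LinkedQueue<String>();
        q.enqueue("a");
        q.enqueue("b");
        q.enqueue("c");
        System.out.println(drain(q));  // a b c
    }
}
```

Notice that the variable is declared with the interface type Queue<String> and that LinkedQueue appears only in the call on "new", which matches the way the notes say these classes are meant to be used.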

I did not have time to review the example code from handout #12, but I mentioned that we'd be reviewing stacks and queues in section.

Below are links to the interface files and implementation files for stacks and queues:

Stack.java, the Stack interface
Queue.java, the Queue interface
ArrayStack.java, one Stack implementation (we could imagine defining many such implementations)
LinkedQueue.java, one Queue implementation (we could imagine defining many such implementations)

Stuart Reges

Last modified: Tue Oct 25 15:05:52 PDT 2005