CSE143 Notes for Wednesday, 10/5/05

I began the lecture by reviewing the first version of the IntList class. I pointed out that we had been looking at it for several lectures and discussing it in section, but I wanted to review some of the important details.

I also repeated my point that we are using the IntList class as a case study to understand how objects work in Java and, in particular, how data structure objects work in Java (what are known as "collections" or "collections classes"). I reminded people of the idea of using IntList as a kind of "software cadaver" that we are cutting open to examine. We're trying to reach a point where we understand the Java class known as ArrayList. That class is sufficiently complex that I'm using IntList instead and I'm going to be showing successively more complex versions of it as we learn more about how these classes work.

I reminded people that we started out with two variables and five code fragments that we examined in our first discussion section. We turned the two variables into data fields and we turned the five code fragments into methods and we put all of this inside a class called IntList.

I took a moment to remind people that in Java you can "overload" a method. The idea is that you can have more than one method of the same name. For example, among those five original methods, we have two add methods:

        // post: appends the given value to the end of the list
        public void add(int value)

        // post: inserts the given value at the given index, shifting
        //       subsequent values right
        public void add(int index, int value)

The first is the appending add that takes just a value and that appends it to the end of the list. The second is the inserting add that takes both an index and a value and that inserts the value in the middle of the list. Both methods are called add. This is okay because they have different signatures. The signature of a method is defined by the method name and the number and types of its parameters. These methods both have the same name, but they still have different signatures because one takes a single int argument and the other takes two int arguments. The compiler can tell from the call on the method which one you want to use.
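
As a rough illustration of overloading, here is a stripped-down stand-in for IntList with both add methods filled in. The class name OverloadedAddSketch, the fixed capacity of 10, and the toString method are assumptions made for this sketch. It does no error checking and is not the handout code.

```java
import java.util.Arrays;

// Hypothetical stand-in for IntList: fixed capacity, no error checking.
// It exists only to show two methods named add with different signatures.
class OverloadedAddSketch {
    private int[] elementData = new int[10];
    private int size = 0;

    // post: appends the given value to the end of the list
    public void add(int value) {
        elementData[size] = value;
        size++;
    }

    // post: inserts the given value at the given index, shifting
    //       subsequent values right
    public void add(int index, int value) {
        for (int i = size; i > index; i--) {
            elementData[i] = elementData[i - 1];
        }
        elementData[index] = value;
        size++;
    }

    // post: returns the first size elements in bracketed form
    public String toString() {
        return Arrays.toString(Arrays.copyOf(elementData, size));
    }
}
```

Calling add(3), then add(9), then add(1, 5) on a new object leaves the list as [3, 5, 9]. The compiler picks the one-argument version for the first two calls and the two-argument version for the third.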

In addition to the five original methods, we added two constructor methods that would allow clients to create actual IntList objects. One is a "real" constructor that specifies how to initialize the data fields of the object:

        public IntList(int capacity) {
            this.elementData = new int[capacity];
            this.size = 0;
        }

The other constructor calls this constructor using the "this" keyword. Whenever Java sees the keyword "this" followed by parentheses, it interprets that as one constructor calling another constructor. You can include such a call only as the first statement of a constructor, as in the other IntList constructor:

        public IntList() {
            this(DEFAULT_CAPACITY);
        }

Java can tell that this is a call on the other constructor because it includes an int inside parentheses. In other words, this is the zero-argument constructor calling the one-argument constructor. This is the normal pattern in Java classes. There is normally one "real" constructor that has the detailed code and any other constructors simply call this primary constructor.

To make sure that the IntList was properly encapsulated, we declared the two variables to be private data fields. We spent a few minutes discussing the good and bad aspects of this. On the good side, this prevents people from changing the variables in an inappropriate way. For example, if the size variable were public, someone could set it to -8. Obviously the size should never be a negative number. By declaring the size variable to be private, we prevent this from happening.

Another good thing about this is that by making the variables private, we make it possible to change the data fields later. We can choose to implement the list in a different way and the client never needs to know about it. I mentioned that Joshua Bloch, who designed the collections classes at Sun, has written about this in his book Effective Java. He describes how Sun has been unable to optimize certain programs because of some data fields that they decided many years ago to make public in a class called Dimension. That decision has prevented them from changing the class to be more efficient.

But what about the downside? The most obvious one is that by declaring the variables to be private, we have prevented any client from using them. But clients might have good reasons to want to manipulate these values. For example, we want clients to be able to find out how many elements are in the list. But instead of providing direct access to a variable like "size", we provide a method that allows the client to access the value. That's why the original IntList class had two extra methods called get and size that were included to allow the client to have limited access to the private data fields.
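
A minimal sketch of this tradeoff follows, assuming a hypothetical class named EncapsulationSketch with three hard-coded values. The fields are private, so a client cannot set size to -8, but the get and size methods still give read access.

```java
// Hypothetical class for illustration only; the initial contents are made up.
class EncapsulationSketch {
    private int[] elementData = {4, 8, 15};
    private int size = 3;

    // post: returns the current number of elements in the list
    public int size() {
        return size;
    }

    // pre : 0 <= index < size() (not checked in this sketch)
    // post: returns the value at the given index
    public int get(int index) {
        return elementData[index];
    }
}
```

A client can write list.size() or list.get(1), but a line like list.size = -8 does not compile because size is private.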

I then spent a few minutes talking about the first programming assignment. It is a minor variation of the IntList class, so it has a lot of code in common. One big difference is that the new class keeps the list in sorted order. We call this "nondecreasing" order because there might be duplicates. We refer to the fact that the list is always sorted as a "data invariant" of the class. A data invariant is a relationship that never changes. We can guarantee that this invariant is true because we have control over how the client accesses the object. In particular, we no longer want to have a public method for inserting a value in the middle of the list. We have a single "add" method that adds the value in the appropriate place to preserve sorted order.
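
To make the idea of a data invariant concrete, here is one possible sketch of an add method that keeps values in nondecreasing order. It assumes a hypothetical class named SortedAddSketch with a fixed capacity of 10, and it finds the insertion point with a simple linear scan. It is not the assignment's required approach and it ignores the uniqueness setting.

```java
// Hypothetical sketch: the only way to put a value in the list is add,
// so the list is in nondecreasing order after every call.
class SortedAddSketch {
    private int[] elementData = new int[10];
    private int size = 0;

    // post: inserts the value so that the list stays in nondecreasing order
    public void add(int value) {
        // find the first position holding something larger than value
        int index = 0;
        while (index < size && elementData[index] <= value) {
            index++;
        }
        // shift larger values right to open a slot
        for (int i = size; i > index; i--) {
            elementData[i] = elementData[i - 1];
        }
        elementData[index] = value;
        size++;
    }

    // pre : 0 <= index < size() (not checked in this sketch)
    public int get(int index) {
        return elementData[index];
    }

    public int size() {
        return size;
    }
}
```

Adding 5, 2, 9 and 5 in that order leaves the list as 2, 5, 5, 9. Because there is no public inserting add, a client has no way to break the ordering.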

Another difference about the SortedIntList is that it has an extra piece of state information. We allow the client to ask for a unique list. But this is something that can be turned on or off, like a toggle switch or an on/off button that you press. The client can flip the switch one way to say, "Please make sure the values are unique", or the client can flip the switch the other way to say, "Now I don't care whether the values are unique." That's the purpose of the setUnique method and the various constructors--to allow the user to specify when to limit the list to a unique set of values. We also include a getUnique method to allow the client to examine the current setting for this extra bit of state.

Then I passed out a new version of IntList (handout #5) and spent the rest of the hour discussing various aspects of the code. I spent some time talking about the concept of preconditions and postconditions. This is a way of describing the contract that a method has with the client. Preconditions are assumptions the method makes. They are a way of describing any dependencies that the method has ("this has to be true for me to do my work"). Postconditions describe what the method accomplishes assuming the preconditions are met ("I'll do this as long as the preconditions are met.").

I have included pre/post comments on all of my IntList methods. I encourage people to use this style of commenting. It is not required, but if you use a different style, be sure that you have addressed the preconditions and postconditions of each method in the comments for the method.

As an example, I pointed out that methods like "get" that are passed an index assume that the index is legal. The method wouldn't know how to get something that is at a negative index or at an index that is beyond the size of the list. Whenever you find a method that has this kind of dependence, you should document it as a precondition of the method. It is also a common practice in Java to "throw" an exception if a precondition is not met. This is a clean way to stop the execution of the program and force the client of the code to fix their call on the method.

I pointed out that the new version of IntList throws exceptions in many places when preconditions are violated. The general syntax for throwing an exception is:

        throw <exception>;

Exceptions are objects and they are constructed at the time that the error is discovered. This means that we'll always need to include "new" and we have to decide what kind of object to construct when we throw an exception. There are many standard exception classes in the Java class libraries. It would be good to start to learn these standard classes. In the IntList class, we see two of them. We throw an IllegalArgumentException when we are passed an inappropriate value as an argument (like a negative capacity). We throw an IndexOutOfBoundsException when we are passed an illegal index.

As an example, we looked at the add method that takes an index and saw the following code at the beginning of the method:

        if (index < 0 || index > this.size)
            throw new IndexOutOfBoundsException("illegal index");

Notice that we pass some text to the constructor as a String. This is optional. All of the standard Java exception classes allow you to specify extra information like this or you can just call the zero-argument constructor for the class.
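
The two ways of constructing an exception can be sketched as follows. The class and method names (ExceptionMessageSketch, checkWithMessage, checkWithoutMessage) and the message text are made up for this example. Only IllegalArgumentException and its two constructors come from the Java libraries.

```java
// Hypothetical helper methods showing the optional message argument.
class ExceptionMessageSketch {
    // throws with a descriptive message if capacity is negative
    static void checkWithMessage(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("negative capacity: " + capacity);
        }
    }

    // throws using the zero-argument constructor if capacity is negative
    static void checkWithoutMessage(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException();
        }
    }
}
```

In the first case, calling getMessage() on the caught exception returns the text that was passed in. In the second case it returns null, so the client gets less help tracking down the bad call.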

Someone asked if that is the right test. Should we allow index to be equal to this.size? This is an interesting detail of the IntList class. The legal indexes are 0 through (size - 1), but for this particular method, it actually makes sense to use the index "size". It would imply that you want to append to the end of the list. So this method has a slightly different precondition than the other methods like "get" that deal with an index.
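
The difference between the two preconditions can be written out as a pair of boolean tests. The class IndexRangeSketch and its method names are hypothetical, chosen only for this sketch.

```java
// Hypothetical helpers contrasting the two index ranges.
class IndexRangeSketch {
    // legal for methods like get: 0 <= index <= size - 1
    static boolean legalForGet(int index, int size) {
        return index >= 0 && index < size;
    }

    // legal for the inserting add: 0 <= index <= size,
    // where index == size means "append at the end"
    static boolean legalForAdd(int index, int size) {
        return index >= 0 && index <= size;
    }
}
```

For a list of size 3, index 3 is legal for the inserting add but not for get, and index 4 is illegal for both.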

I mentioned that in some cases it is not practical to check the precondition of a method. For example, the binary search technique we are using relies on the list being in sorted order. But our very fast binary search technique would grind to a halt if each time we called it we checked to make sure the list is in sorted order. So we don't always check preconditions, but when it's easy to do so, we should.

I spent some time discussing the idea that there are three kinds of methods: constructors, accessors and mutators. I pointed out that in the documentation for the IntList class I list them in this order with a blank line between each group of methods. Constructors are special methods that can be called only in conjunction with "new". Accessors are "read only" methods that examine the state of an object but don't change it. Sometimes people refer to these as "getters". Mutators are methods that potentially change the state of an object (in other words, they are read/write operations). They are sometimes referred to as "setters".

I pointed out that in addition to the "get" method, this version of IntList also has a "set" method. I also pointed out that both of these methods and the remove method have the same test for an illegal index. I moved this into a private method called checkIndex that each of them calls. This is an important point. You can have as many private methods as you want in a class. They are not considered part of the interface of the class because a client can't access them. They are considered to be part of the "innards" of the encapsulated object, not visible to the outside.
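
Here is a minimal sketch of that pattern, assuming a hypothetical class named CheckIndexSketch with a few hard-coded values. The bodies of get, set and remove are written for this example and may differ from the handout. The point is that all three share one private checkIndex method.

```java
// Hypothetical sketch of three public methods sharing one private check.
class CheckIndexSketch {
    private int[] elementData = {10, 20, 30, 0, 0};
    private int size = 3;

    // pre : 0 <= index < size() (throws IndexOutOfBoundsException if not)
    // post: returns the value at the given index
    public int get(int index) {
        checkIndex(index);
        return elementData[index];
    }

    // pre : 0 <= index < size() (throws IndexOutOfBoundsException if not)
    // post: replaces the value at the given index with the given value
    public void set(int index, int value) {
        checkIndex(index);
        elementData[index] = value;
    }

    // pre : 0 <= index < size() (throws IndexOutOfBoundsException if not)
    // post: removes the value at the given index, shifting subsequent
    //       values left
    public void remove(int index) {
        checkIndex(index);
        for (int i = index; i < size - 1; i++) {
            elementData[i] = elementData[i + 1];
        }
        size--;
    }

    public int size() {
        return size;
    }

    // private helper: not part of the interface, so clients can't call it
    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("illegal index");
        }
    }
}
```

If the rule for a legal index ever changes, only checkIndex has to be edited, and the client never sees the difference.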

I also pointed out that this new version "grows" if it needs to. It still has an internal capacity that you can set in the constructor, but it checks to see if it needs more space and increases the size of the array if necessary. The code for this is in a new method called ensureCapacity. This method is publicly available so that the client can make this request and it is used internally when methods like "add" are called to make sure that there is space for any values added to the list. If the array becomes full, the ensureCapacity method allocates a new array that is twice as large as the original and copies values from the old array to the new one.
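
The growth strategy might look roughly like the sketch below. The class name GrowingListSketch and the capacity() accessor are hypothetical, added so the effect can be observed. Falling back to the requested capacity when doubling is not enough is an assumption of this sketch, not something taken from the handout.

```java
// Hypothetical sketch of a list that doubles its array when it fills up.
class GrowingListSketch {
    private int[] elementData;
    private int size;

    // pre : capacity >= 0 (throws IllegalArgumentException if not)
    public GrowingListSketch(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("negative capacity");
        }
        elementData = new int[capacity];
        size = 0;
    }

    // post: the underlying array has room for at least capacity values
    public void ensureCapacity(int capacity) {
        if (capacity > elementData.length) {
            int newCapacity = elementData.length * 2;
            if (capacity > newCapacity) {
                // assumption: doubling was not enough (e.g. length 0)
                newCapacity = capacity;
            }
            int[] newData = new int[newCapacity];
            for (int i = 0; i < size; i++) {
                newData[i] = elementData[i];
            }
            elementData = newData;
        }
    }

    // post: appends the given value, growing the array if necessary
    public void add(int value) {
        ensureCapacity(size + 1);
        elementData[size] = value;
        size++;
    }

    // pre : 0 <= index < size() (not checked in this sketch)
    public int get(int index) {
        return elementData[index];
    }

    public int size() {
        return size;
    }

    // hypothetical accessor, included only to make the growth visible
    public int capacity() {
        return elementData.length;
    }
}
```

Starting with a capacity of 2 and adding three values forces one doubling, so the capacity becomes 4 while the size is 3.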

Stuart Reges

Last modified: Sun Oct 9 14:57:54 PDT 2005