Notes for July 8, 2005

(courtesy of Stuart Reges)

Consider the following hierarchy of different kinds of employees:

                     Employee
                   /         \
                 /             \
            Clerical        Professional
               |             /       \
               |            /         \
           Secretary    Lawyers    Engineers
               |
               |
        Legal Secretary
Everyone in the company is an employee, but we divide them into clerical versus professional employees. Among the professionals, we have lawyers and engineers. Among the clericals we have secretaries and a variation of a secretary known as a legal secretary. The idea is that this is part of a company's hierarchy of employees. Obviously, an actual company would have many other kinds of employees as well.

Most companies have an employee orientation that all employees attend. Suppose that at that orientation there is a 19-page booklet given out that describes general employee procedures (insurance, retirement, vacations, etc.). By default, an employee would assume that those are the policies that apply to that employee. But what tends to happen is that the day after the orientation, employees are told some more details. For example, the lawyers might have their own 3-page booklet. That booklet can have two kinds of things. It might have additional procedures that aren't part of the 19-page booklet. And it might have replacement procedures (what we call overriding). For example, they might tell a new lawyer, "I'm sure they told you yesterday at the orientation that when you want to take a vacation you fill out the yellow form. We don't use the yellow form here. We have our own form that's pink."

This is similar to what happens in Java with inheritance. You establish an inheritance relationship between two classes with the extends keyword in the class header:

    class A extends B {
        ...
    }
In our hierarchy diagram, we'd put A under B because it extends it:
        B
        |
        |
        A
We refer to A as a subclass of B and B as the superclass of A. This use of sub and super is somewhat unfortunate because it has the opposite meaning to how we use the words in English. With these inheritance hierarchies, we put the more specialized version below (as the "sub") and the more generic one above (as the "super"). For example, if we had a hierarchy for burgers and cheeseburgers, we'd put the burger on top because it's the more generic one:
      burger
        |
        |
   cheeseburger
But think about this: Someone says, "You can have a cheeseburger or you can have a super cheeseburger." If you really like cheese, you would tend to ask for a super cheeseburger. So imagine your surprise when you find that the super cheeseburger has no cheese! Yet in the standard terminology we'd refer to "burger" as the superclass of cheeseburger.

Another pair of terms that people sometimes use is base class for the superclass and derived class for the subclass. This is a little clearer because it suggests that the superclass is the simpler one (the base). In fact, the C# programming language has a keyword base that has the same meaning as the Java keyword super.

In an inheritance hierarchy, any state and behavior (i.e., any data fields and methods) in the superclass are automatically included in every subclass. In other words, saying extends Foo automatically gives a class all of the data fields and methods of the Foo class.
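As a minimal sketch of what "automatically included" means, consider the following. The classes Employee and Lawyer, the field vacationDays and its value, and the method vacationForm are all made up for illustration. They are nested inside one class only so that the example fits in a single file.

```java
// Sketch only: every name here is hypothetical, chosen to match the
// employee analogy in these notes.
class InheritanceDemo {
    static class Employee {
        int vacationDays = 10;          // state that every employee has

        String vacationForm() {         // behavior that every employee has
            return "yellow";
        }
    }

    // Empty body: everything a Lawyer has comes from "extends Employee".
    static class Lawyer extends Employee {
    }

    public static void main(String[] args) {
        Lawyer lawyer = new Lawyer();
        System.out.println(lawyer.vacationDays);    // inherited field
        System.out.println(lawyer.vacationForm());  // inherited method
    }
}
```

Even though Lawyer declares nothing itself, a Lawyer object has the field and the method because of the extends clause.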


Addendum by benson:

What does it mean to be "included"? It is true that all data fields and methods are "included" in the subclass, in the sense that each object of the subclass type contains all the information from its superclasses. Whether a method defined in the subclass can access a particular field or method is another question. As mentioned in class, access is restricted (for our purposes) to members declared public. As an example, consider the following classes:

    public class B {
        private int privateVar;
        public int publicVar;

        private int privateMethod() {
            return privateVar;
        }

        public int publicMethod() {
            return privateVar;
        }
    }

    public class A extends B {
        public void method() {
            int temp = publicVar + privateVar;
            System.out.println(publicMethod());
            System.out.println(privateMethod());
        }
    }
If you compile the code, it complains with:
    A.java:3: privateVar has private access in B
            int temp = publicVar + privateVar;
                                   ^
    A.java:5: cannot find symbol
    symbol  : method privateMethod()
    location: class A
            System.out.println(privateMethod());
                               ^
    2 errors
Notice that the compilation does not complain that A.method() calls the inherited method publicMethod() which accesses a private variable in B. That method is defined in class B which has legitimate access to the private variable.
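
For contrast, here is a sketch of a version that does compile, assuming we only want A to go through the public members of B. The initial values 7 and 3, the int return type of A.method(), and the class PrivateAccessDemo are assumptions added for illustration.

```java
// Demo class (hypothetical name) comes first so the file can be run directly.
class PrivateAccessDemo {
    public static void main(String[] args) {
        System.out.println(new A().method());   // prints 10
    }
}

class B {
    private int privateVar = 7;   // illustrative value, not from the notes
    public int publicVar = 3;     // illustrative value, not from the notes

    private int privateMethod() {
        return privateVar;
    }

    // Defined inside B, so it is allowed to use B's private members.
    public int publicMethod() {
        return privateMethod();
    }
}

class A extends B {
    public int method() {
        // Uses only public members of B, so the compiler accepts it.
        return publicVar + publicMethod();
    }
}
```

The private field is still part of every A object. A simply has to reach it through a public method that B provides.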

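A subclass can do two things:

- define new state and behavior (additional data fields and methods)
- override inherited behavior (give a new definition for an inherited method)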

In the employee analogy, this is like the 3-page booklet that the lawyers have that defines new procedures that apply only to lawyers and that overrides inherited behavior ("We use our own pink form instead of the standard yellow form").
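
The booklet analogy might look something like this in code. This is only a sketch: the class names, the method vacationForm, and especially the lawyer-only method fileBrief are invented for illustration.

```java
// Sketch only: all names are hypothetical, following the booklet analogy.
class OverrideDemo {
    // The 19-page booklet: procedures for every employee.
    static class Employee {
        String vacationForm() {
            return "yellow";
        }
    }

    // The lawyers' 3-page booklet.
    static class Lawyer extends Employee {
        // Replacement procedure: overrides the inherited behavior.
        @Override
        String vacationForm() {
            return "pink";
        }

        // Additional procedure: applies only to lawyers.
        String fileBrief() {
            return "brief filed";
        }
    }

    // Engineers have no booklet of their own, so the defaults apply.
    static class Engineer extends Employee {
    }

    public static void main(String[] args) {
        System.out.println(new Engineer().vacationForm());  // yellow
        System.out.println(new Lawyer().vacationForm());    // pink
        System.out.println(new Lawyer().fileBrief());       // brief filed
    }
}
```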

Substitutability

Suppose that you are running a temp agency and you charge people $10 per hour for a secretary and $12 an hour for a legal secretary. One day a customer asks you to send over a secretary and you find that you have no generic secretaries to send over, but you have a legal secretary that otherwise wouldn't be working that day. So you decide to send the legal secretary even though the request was for a generic secretary. This works because of the notion of substituting. A legal secretary can substitute for a secretary. But suppose that during a coffee break the employer figures out that the person you sent over is actually a legal secretary and the employer says, "Great, I have some legal secretarial work I want you to do."

This is not ok. Why not? That employer asked for a secretary and is paying $10 an hour. You happened to send over someone who can do more, but that doesn't mean that employer has the right to change the contract and ask the person to do more than the contract is for. If that employer wants the person to do legal secretary work, then we need to renegotiate the contract and that employer needs to pay $12 an hour for that work.

This is exactly what is happening in Java when you use a class cast. You are renegotiating the contract for what you can ask that object to do.
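
A rough sketch of the temp agency story in code follows. The classes Secretary and LegalSecretary and their methods are hypothetical names chosen to match the analogy.

```java
// Sketch only: class and method names are invented for the analogy.
class CastDemo {
    static class Secretary {
        String takeDictation() {
            return "dictation taken";
        }
    }

    static class LegalSecretary extends Secretary {
        String fileLegalBrief() {
            return "legal brief filed";
        }
    }

    public static void main(String[] args) {
        // The contract (variable type) is for a Secretary, but the
        // person actually sent over (the object) is a LegalSecretary.
        Secretary temp = new LegalSecretary();
        System.out.println(temp.takeDictation());

        // temp.fileLegalBrief();  // would not compile: not in the Secretary contract

        // The cast renegotiates the contract to LegalSecretary.
        LegalSecretary legal = (LegalSecretary) temp;
        System.out.println(legal.fileLegalBrief());
    }
}
```

The object never changes. Only the contract through which we talk to it changes.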

When exactly is a substitute appropriate? For example, you might imagine a hierarchy for vehicles that included bikes and cars. Among the cars you might have Hondas. Among the Hondas you might have Honda Accords. And among the Accords you might have a luxury version called an LX. And among the Accord LX models, you might have a variation that is known as "luxury package 319". We'd draw a hierarchy something like this to capture these variations:

                   Vehicle
                  /       \
                 /         \
               Car         Bike
                |
                |
              Honda
                |
                |
              Accord
                |
                |
            Accord LX
                |
                |
      Accord LX package 319

The point is that the most generic or simple description appears high in the hierarchy. The more complex, more sophisticated objects appear low in the hierarchy. That's because at each level we are adding potentially more and more state and behavior.

When can one object substitute for another? Inheritance should be used only when there is an is-a relationship where the more specialized object can substitute for the less specialized one. So in this hierarchy, the Accord LX luxury package 319 can take the place of anything above it but the things above it can't take its place. This matches our intuition in most cases. If we were expecting a generic Accord and we instead got a luxury Accord, we're not going to complain. But if we paid for the luxury car and instead got the generic car, then we wouldn't be happy.

We also can't substitute across. If we were expecting a luxury Accord, we aren't going to be happy with a bike. Of course, the analogy isn't perfect. In real life if we were expecting a bike, we might be satisfied with a luxury Accord instead, even though with inheritance hierarchies, that wouldn't be allowed.

We think of each of these different entries as "roles". The idea is that an object can fill many roles. An Accord LX luxury package 319 can fill the role of an Accord LX luxury package 319 because it is one. But it can also fill the role of an ordinary Accord LX, of a generic Accord, of a Honda, of a car and of a vehicle. In general, an object can fill every role that appears as you go up the inheritance chain to the top.
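
One way to see the roles an object can fill is to walk up its inheritance chain. The sketch below assumes a simplified version of the vehicle hierarchy (package 319 is left out to keep it short). The class names and the helper method roles are hypothetical. Note that Object shows up at the top because every Java class ultimately extends Object.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a trimmed-down, hypothetical version of the vehicle hierarchy.
class RolesDemo {
    static class Vehicle { }
    static class Bike extends Vehicle { }
    static class Car extends Vehicle { }
    static class Honda extends Car { }
    static class Accord extends Honda { }
    static class AccordLX extends Accord { }

    // Lists every role the object can fill, from its own class up to the top.
    static String roles(Object o) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
            names.add(c.getSimpleName());
        }
        return String.join(", ", names);
    }

    public static void main(String[] args) {
        System.out.println(roles(new AccordLX()));
        System.out.println(roles(new Bike()));
    }
}
```

The AccordLX object lists Car and Vehicle among its roles but not Bike, and the Bike object lists no car roles, which matches the rule that we can substitute up the chain but not across.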

Handout #13

By looking at the class headers and the extends clauses, we can figure out that the inheritance hierarchy looks like this:
    One
   /   \
  /     \
Two    Three
         |
         |
        Four
It is helpful to make a table that keeps track of what definition (if any) each class has for method1, method2 and method3. Starting with the class One, we find that it defines method1 as producing the output "One1". It has no definition for method2 and method3. That means that the "One" role does not include a method2 or method3. This will be important later in solving this problem.

The Two class provides a definition for method3 that prints out "Two3". It has no other definitions, but it inherits a method1 from One that prints "One1". It has no method2.

The Three class provides a definition for method2 that prints "Three2" and then calls method1. There is no definition for method1 in this class, but it inherits one from the One class that prints "One1". Since method2 prints "Three2" and then calls method1, you might be inclined to say that its output is two lines: "Three2/One1". But that won't always be the case because of what's known as polymorphism. Java uses dynamic binding: which version of a method runs is decided at runtime by the actual class of the object, and a subclass can redefine (override) a method. So for a Three object, method2 prints those two lines of output. But it won't necessarily behave that way for all objects because method1 might be redefined in a subclass. The Three class has no definition for method3.

Finally, the Four class defines a method1 and a method3. In method1 of the Four class we see something new, a use of the keyword super. In this context, super is being used to call an overridden method. This class is giving a new definition to method1, but in doing so, it can call the original version of the method in the superclass by using the keyword super. You can think of the keyword super as an alternative to this. If you say super.method1() you are asking for the version of method1 in the superclass. If you say this.method1() or just method1(), you're asking for the object's own version of method1, which is the version in this class unless a subclass has overridden it.

super is statically bound in that you know exactly what method is being called. Previously, we were careful not to make assumptions about which method1 would be called because that involved a call on this.method1 where polymorphism enters into things. Here we know exactly what method is being called, the version of method1 in the superclass of the Four class. Actually, there is no definition of method1 in the superclass of Four (class Three), so to find it, we keep looking up the inheritance chain until we find the definition in the class One.

So we know that method1 prints out "Four1/One1". method2 is the inherited method that prints out "Three2" and then calls method1. And method3 prints out "Four3". So the Four class is the only class that has all three methods defined.
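
The handout itself is not reproduced in these notes, so the following is a reconstruction of the four classes from the description above, not the original code. They are nested inside a hypothetical class HandoutClasses so the sketch fits in one file.

```java
// Reconstruction from the prose description; not copied from handout #13.
class HandoutClasses {
    static class One {
        public void method1() {
            System.out.println("One1");
        }
    }

    static class Two extends One {
        public void method3() {
            System.out.println("Two3");
        }
    }

    static class Three extends One {
        public void method2() {
            System.out.println("Three2");
            method1();          // polymorphic: depends on the actual object
        }
    }

    static class Four extends Three {
        public void method1() {
            System.out.println("Four1");
            super.method1();    // statically bound: found in One, by way of Three
        }

        public void method3() {
            System.out.println("Four3");
        }
    }

    public static void main(String[] args) {
        new Three().method2();  // Three2, One1
        new Four().method2();   // Three2, Four1, One1
    }
}
```

The same method2 code produces different output for a Three object and a Four object, because the call on method1 inside it is resolved against the actual object.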

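The rest of the problem involves several variables that are defined and a series of calls using those variables. There is a three step process to go through to figure out these calls:

1. Compiler check: look at the type of the variable (or the type named in the cast, if there is one) and see whether that role includes the method. If it doesn't, the answer is "compiler error".
2. Runtime check: if there is a cast, see whether the actual object can fill the role named in the cast. If it can't, the answer is "runtime error".
3. Execution: figure out what the actual object does when the method is called.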

The first six problems involved calling method1 on each of the six variables without any casting going on. The first question you have to consider is whether you pass the compiler check. To figure that out, you have to look at the types of the variables. The types of the objects don't matter to the compiler. The variables determine the contract. The variables of type One and Three are okay because both the One class and the Three class include a method1. But the two variables of type Object are a problem because the Object class does not include a method1. Even though the objects themselves can do this, the contract was for a generic Object, so the compiler is going to complain. This is an exact parallel of our employer who asked for a secretary and got a legal secretary. Even though the legal secretary can do more sophisticated work, the employer isn't allowed to ask for that because the contract is for a $10/hour generic secretary, not for a legal secretary.

So the fifth and sixth answers are "compiler error". The first four pass the compiler and have no casting, so we don't have to worry about runtime errors. The only thing left is to figure out what the individual objects do when method1 is called.
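
Here is a sketch of those six calls. The variable declarations are inferred from the discussion column of the answer table below (variable type and actual object type), and the four classes are reconstructed from the description earlier, so neither is a copy of the handout. VariablesDemo is a hypothetical wrapper class.

```java
// Reconstruction: declarations inferred from the answer table, not the handout.
class VariablesDemo {
    static class One {
        public void method1() { System.out.println("One1"); }
    }
    static class Two extends One {
        public void method3() { System.out.println("Two3"); }
    }
    static class Three extends One {
        public void method2() { System.out.println("Three2"); method1(); }
    }
    static class Four extends Three {
        public void method1() { System.out.println("Four1"); super.method1(); }
        public void method3() { System.out.println("Four3"); }
    }

    public static void main(String[] args) {
        One var1 = new Two();
        One var2 = new Three();
        One var3 = new Four();
        Three var4 = new Four();
        Object var5 = new Three();
        Object var6 = new One();

        var1.method1();   // One1
        var2.method1();   // One1
        var3.method1();   // Four1, then One1
        var4.method1();   // Four1, then One1

        // var5.method1();  // compiler error: the Object role has no method1
        // var6.method1();  // compiler error: the Object role has no method1
    }
}
```

Uncommenting either of the last two calls makes the whole file fail to compile, even though the objects themselves do have a method1.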

Answers to handout #13

Call                      Output             Discussion
------------------------  -----------------  ----------
var1.method1();           One1               variable is of type One, One role includes method1, no cast, actual object is a Two which writes out "One1" when method1 is called
var2.method1();           One1               variable is of type One, One role includes method1, no cast, actual object is a Three which writes out "One1" when method1 is called
var3.method1();           Four1/One1         variable is of type One, One role includes method1, no cast, actual object is a Four which writes out "Four1/One1" when method1 is called
var4.method1();           Four1/One1         variable is of type Three, Three role includes method1, no cast, actual object is a Four which writes out "Four1/One1" when method1 is called
var5.method1();           compiler error     variable is of type Object, Object role does not include method1
var6.method1();           compiler error     variable is of type Object, Object role does not include method1
var4.method2();           Three2/Four1/One1  variable is of type Three, Three role includes method2, no cast, actual object is a Four which writes out "Three2/Four1/One1" when method2 is called (note that method2 calls its method1 polymorphically, which is why this output includes "Four1")
var4.method3();           compiler error     variable is of type Three, Three role does not include method3 (even though the object itself is a Four that is capable of performing this action)
((Two)var1).method2();    compiler error     because of cast we pay attention to it rather than the variable type (because we have renegotiated the contract), cast is to Two, Two role does not include method2
((Three)var1).method2();  runtime error      cast is to Three, Three role includes method2 so we pass the compiler, but actual object is a Two which can't fill the Three role (casting across the hierarchy, like asking someone to accept a bike when they were expecting a car), so we get a runtime error
((Two)var1).method3();    Two3               cast is to Two, Two role includes method3, actual object is a Two which can fill the Two role, so the cast is okay, and a Two object writes "Two3" when its method3 is called
((Four)var2).method1();   runtime error      cast is to Four, Four role includes method1, actual object is a Three which can't fill the Four role; this was, in essence, a stupid cast to do because it isn't necessary, but if you tell this kind of lie, Java will complain
((Four)var3).method1();   Four1/One1         cast is to Four, Four role includes method1, actual object is a Four which can fill the Four role, so cast is okay and a Four object writes "Four1/One1" when method1 is called
((Four)var4).method3();   Four3              cast is to Four, Four role includes method3, actual object is a Four which can fill the Four role, so cast is okay and a Four object writes "Four3" when method3 is called
((Two)var4).method3();    compiler error     cast is to Two, variable is of type Three, compiler is smart enough to recognize that this won't work
((One)var5).method1();    One1               cast is to One, One role includes method1, actual object is a Three, which can fill the One role, so cast is okay and a Three object writes "One1" when method1 is called
((Four)var5).method2();   runtime error      cast is to Four, Four role includes method2, actual object is a Three which can't fill the Four role
((Three)var5).method2();  Three2/One1        cast is to Three, Three role includes method2, actual object is a Three which can fill the Three role and a Three object writes "Three2/One1" when method2 is called
((One)var6).method1();    One1               cast is to One, One role includes method1, actual object is a One which can fill the One role, so cast is okay and a One object writes "One1" when method1 is called
((One)var6).method2();    compiler error     cast is to One, One role does not include method2
((Two)var6).method3();    runtime error      cast is to Two, Two role includes method3, actual object is a One, which can't fill the Two role
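
A few of the cast cases can be tried out with the sketch below. As before, the classes and variable declarations are reconstructed from these notes rather than copied from the handout. The helper method attempt and the class CastCheckDemo are hypothetical additions that catch the ClassCastException Java throws for a bad cast.

```java
// Reconstruction plus a hypothetical helper; not the handout's own code.
class CastCheckDemo {
    static class One {
        public void method1() { System.out.println("One1"); }
    }
    static class Two extends One {
        public void method3() { System.out.println("Two3"); }
    }
    static class Three extends One {
        public void method2() { System.out.println("Three2"); method1(); }
    }
    static class Four extends Three {
        public void method1() { System.out.println("Four1"); super.method1(); }
        public void method3() { System.out.println("Four3"); }
    }

    // Runs the call and reports whether a cast inside it failed at runtime.
    static String attempt(Runnable call) {
        try {
            call.run();
            return "ok";
        } catch (ClassCastException e) {
            return "runtime error";
        }
    }

    public static void main(String[] args) {
        One var1 = new Two();
        Object var5 = new Three();

        System.out.println(attempt(() -> ((Two) var1).method3()));    // Two3, then ok
        System.out.println(attempt(() -> ((Three) var1).method2()));  // runtime error
        System.out.println(attempt(() -> ((Three) var5).method2()));  // Three2, One1, then ok
        System.out.println(attempt(() -> ((Four) var5).method2()));   // runtime error

        // ((Two) var1).method2();  // compiler error: the Two role has no method2
    }
}
```

When a cast fails, the method is never called at all, so a "runtime error" line appears with no method output before it.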