Types
Types are one of the central concepts in programming languages. Types
initially arose in programming languages to enable compilers to
know what instructions to generate, but have now become very important
to allow error checking of programs, and to increase the clarity of
programs by providing implicit documentation.
The basic definition of a type is that it is a set of values with a
set of operations. The abstract definition of a type does
not specify the representation.
Type Safety Type safety or strictness or strength refers to
whether or not the language guarantees that the operations applied to
instances of types are well defined. (The literature is not entirely
consistent on the details of the definition - but the basic idea is
that the system can determine whether or not instances have
appropriate types for operations to apply).
The four languages we have studied this quarter, ML, Java, Prolog, and
LISP are all strongly typed.
Examples of weakly typed languages include:
- Assembly language. (Although one could argue that the type
system of an assembly language consists of words and bytes - and all
operations apply to words, so it is strongly typed!)
- C. Assigning pointers to void* gets around everything.
- Pascal. Almost strongly typed. However, the undiscriminated
variant records breaks the type system.
Static vs. Dynamic Type Checking Static type checking is done
at compile time, and dynamic type checking is done at run time.
A language is statically typed if the types of all expressions can be
determined at compile time. ML is an example of a statically typed
language. A big motivation for statically typed languages is that
many errors can be detected -- so the programs which survive the
compilation step have a greater chance of being correct.
- Lisp, Prolog, and Smalltalk are dynamically typed.
- Pascal would be statically typed - except for the problem with
variant records.
- Java is dynamically typed - since the types of objects need to be
evaluated at runtime to determine whether or not methods can be used.
- The checking that C does is static -- since the checking that C
does is all at compile time. C is not considered statically typed -
since it is not possible to determine the types of expressions.
Type equivalence . When are two types the same? When is one
type a subtype of another? Languages differ on these points.
This is important, since type equivalence determines when values can
be assigned or passed as parameters.
Are array types equivalent? For example, is an integer array of size
10 the same type as an integer array of size 20? Pascal says no, most
other languages (including Java and C) say yes.
When are records equivalent? Two options are structural equivalence
vs. name equivalence. A standard example is cartesian points and
polar points, which are structurally equivalent, but not name
equivalent.
Polymorphism, Overloading, Parameterized Types
- Overloading means that the same operation applies to multiple
types. The implementation used is determined by signatures of the
arguments. Java provides overloading of methods, C++ provides
overloading of both functions and operators. Many languages provide
overloading of the built in arithmetic operators.
- In Polymorphism, a single implementation applies to multiple
types. The length function on a list is a good example, where the
type of elements doesn't really matter when determining how many
elements there are. ML supports polymorphism, since the type system
allows indeterminate types. The use of Object in Java can also be
interpreted as polymorphism, since classes such as Vector do not
depend on their element types.
- Parameterized Types. Some languages allows types to be defined
based on other types - such as a stack type which can be a stack of
ints, or a stack of floats, or even a stack of stacks of ints. Ada's
generics and C++'s templates are examples. These differ from
Polymorphism in that the compiler generates different code for the
different types.
Types in Object-Oriented Languages
Most object-oriented languages are type safe (C++ is a notable exception).
In Smalltalk-80, all type checking is done dynamically.
Some object-oriented languages have static type checking, for example
Emerald, Simula, Trellis, Eiffel (except that Eiffel's type system has a
loophole).
Cecil: optional type declarations and static checking
Type = Class?
In a class-based object-oriented language, an obvious thing to do is to
make types be classes. (This is done for example in Simula, the first
object-oriented language.)
Simula example:
begin
class vehicle(destination);
comment 'text' is the name of the string datatype in simula.
the default transmission mode for text is by reference, so this is
overridden below;
value destination;
text destination;
virtual: procedure start;
begin
procedure start;
begin
outtext("vehicle going to "); outtext(destination); outimage;
end;
comment notice that croak is not virtual;
procedure croak;
begin
outtext("this vehicle just died"); outimage;
end;
end vehicle;
vehicle class car;
begin
procedure start;
begin
outtext("vroom"); outimage;
end;
procedure croak;
begin
outtext("broken radiator"); outimage;
end;
end car;
vehicle class bus(max_passengers);
integer max_passengers;
begin
procedure start;
begin
outtext("85 cents, please"); outimage;
end;
comment no procedure definition for croak in bus;
end bus;
bus class articulated_bus(max_angle);
real max_angle;
begin
procedure start;
begin
outtext("move to the back of the bus, please"); outimage;
end;
procedure croak;
begin
outtext("rear section of bus fell off"); outimage;
end;
end articulated_bus;
vehicle class train;
begin
end train;
ref(vehicle) v;
ref(car) c;
ref(train) t;
ref(bus) b1, b2;
ref(articulated_bus) a;
comment construct a new car with destination = downtown;
v :- new car("downtown");
c :- new car("airport");
t :- new train("portland");
b1 :- new bus("ballard",60);
b2 :- new articulated_bus("u-district",100,33.5);
comment invoke associated procedures to see effects;
v.start;
v.croak;
c.start;
c.croak;
t.start;
t.croak;
b1.start;
b1.croak;
b2.start;
b2.croak;
comment and now some examples of legal assignments and accesses;
v :- c;
v :- b1;
b2 :- b1;
comment the following assignment would be illegal without the 'qua';
b2 :- v qua bus;
outtext(c.destination); outimage;
outint(b1.max_passengers, 10); outimage;
comment these assignments and accesses would cause compile-time errors:
c :- v
v.max_passengers
b1.max_angle
;
comment finally, an assignment that is ok at compile time but that
will create a run-time error;
v :- new vehicle("tukwila");
c :- v qua car;
end;
Types as sets of classes
Another definition of "type" sometimes used in type inference systems: a
type is a set of classes (see e.g. Palsberg and Schwartzbach).
Abstract Types in Object-Oriented Languages
It isn't necessary, however, to identify types and classes. In some
languages, we have a notion of an abstract type, which is different from a
class.
Definition of abstract type in an object-oriented language: An abstract
type is a collection of operation signatures, where each signature
consists of the operation name, the types of the arguments, and the
type of the result(s).
We will say that an object is of an abstract type if it has the properties
defined by that type. Abstract types can allow us to do static type
checking in an object-oriented language, while preserving the dynamic
binding of names to operations. For example, we can statically check that
some object will understand a message m, even though we don't know exactly
which method will be invoked.
Example:
The type Point might be defined as follows:
Type Point
x(): Number
y(): Number
set_x(a: Number)
set_y(a: Number)
r(): Number
theta(): Number
set_r(a: Number)
set_theta(a: Number)
+(p: Point): Point
We then might have two classes, CartesianPoint and PolarPoint, both of
which implement the operations specified by the type Point.
Contravariance
The contravariant rule for subtyping is used in Emerald, Cecil, Trellis, etc.
(There is an alternative rule, covariance, which is used in Eiffel, but
it's broken.)
Contravariant rule
S is a subtype of T if:
- S provides all the operations that T does (and maybe some more)
- For each operation in T, the corresponding operation in S has the same
number of arguments and results
- The types of the results of S's operations are subtypes of the types of
the corresponding results of T's operations
- The types of the arguments of T's operations are subtypes of the types
of the corresponding arguments of S's operations
(note the reversal of T and S here)
N.B. If S=T, then S is a subtype of T.
Example
Type Color
Type GrayScaleColor (a subtype of Color)
Type Point
x(): Number
y(): Number
Type ColoredPoint
x(): Number
y(): Number
mycolor() : Color
Type GrayScalePoint
x(): Number
y(): Number
mycolor() : GrayScaleColor
GrayScalePoint is a subtype of ColoredPoint -- anywhere a ColoredPoint is
needed, we can use a GrayScalePoint. (Also ColoredPoint is a subtype of
Point, and GrayScalePoint is a subtype of point.)
p : ColoredPoint;
c : Color
p := GrayScalePoint.new;
c := p.mycolor;
q , r : Point;
a : Number;
q := ColoredPoint.new;
r := GrayScalePoint.new;
a := q.x + r.x;
now add a message with an argument:
Type ColoredPoint
x(): Number
y(): Number
mycolor() : Color
setDotSize(c : Integer);
Type GrayScalePoint
x(): Number
y(): Number
mycolor() : GrayScaleColor
setDotSize(c : Number);
The contravariant rule says that GrayScalePoint is still a subtype of
ColoredPoint.
c : ColoredPoint;
c := GrayScalePoint.new;
c.setDotSize(3);
Now consider:
Type ColoredPoint
x(): Number
y(): Number
mycolor() : Color
setcolor(c : Color);
Type GrayScalePoint
x(): Number
y(): Number
mycolor() : GrayScaleColor
setcolor(c : GrayScaleColor);
Under the contravariant rule there is no longer a subtype relation between
ColoredPoint and GrayScalePoint. Example:
c : ColoredPoint;
c := GrayScalePoint.new; /* not permitted */
c.setcolor(yellow); /* an error would occur if we tried this */
Multiple Inheritance
A type can be a subtype of several other types -- 'multiple inheritance'
for abstract types introduces no additional complications (unlike multiple
implementation inheritance).
Parameterized types
What is the type of array? We want to be able to get reasonable type
information for the element access and set messages.
Statically-typed object-oriented languages usually have some notion of
parameterized types to handle this.
a : Array[Integer]
n : Integer;
...
a[1] := 4;
n := a[1];
For a type Array, we can have something like:
Type Array[T]
at(n : Integer) : T
put(n : Integer, x : T)
Question:What is the type relation between Array[Number]
and Array[Integer]?
Answer: under the contravariant rule, none.
Design Decision: should abstract type and concrete
implementation inheritance be separate or combined? (This question
only makes sense if there are type declarations.)
Reasons for separating them:
- they are different concepts
- allows additional flexibility (for example, multiple implementations of
the same abstract type)
Reasons against:
- it makes the language more complicated
- they usually go together anyway
Example of multiple implementations (two concrete classes with one abstract
type):
abstract type Stack
implementation classes ListStack, ArrayStack
both would conform to the abstract type Stack
Example of inheriting implementation but not abstract type:
Stack inherits from Deque (just masks off extra operations like front)