CSE503: Software Engineering

Lecture 2 (January 6, 1999)

David Notkin

Question from an interested student:

So, something that has been bugging me recently is the need for specification of implementation in the specification of an ADT (or some such).

Take, for example, a list/vector-like data structure that supports indexing and insertion/deletion of arbitrary elements. So the question is, does the structure have O(1) inserts/deletions and O(n) indexing, or O(n) insertions/deletions and O(1) indexing? Is it implemented over a linked list or an array? When I use a data structure out of the box, this is something I really want to know.

Consider two classes in the Java hierarchy, java.util.Vector and java.util.Stack. It appears that the Stack (a subclass of Vector) uses the same implementation as Vector. I would also assume, although it is not specified (but it should be!!), that the Vector class is implemented over an array. This means that the insertion operations are costly and the indexing operations are not. Now, the Stack object is implemented as a subclass of Vector, so I would assume it has the same implementation. This means that doing N push operations on an initially empty stack costs me a runtime of order N log N, assuming that the Vector is implemented to double its size whenever it needs more space than its capacity. However, in my opinion, any structure claiming to be a "stack" implemented on a system that supports dynamic allocation should have a runtime of order N for N push operations.

So, the only thing I know about this is that it doesn't seem like the typical way of specifying ADTs is good enough. Is an ADT being slower than is necessary a failure that people care about? Or is that something pushed under the carpet that the implementers don't tell their supervisors? Why is it acceptable to leave this sort of stuff out of specification?
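The cost estimate in the question can be checked with a small experiment. The sketch below is not java.util.Vector itself; it only counts how many element copies a growable array would perform over N pushes if it doubles its capacity whenever it is full, which is the growth policy the question assumes. The class name, method name, and initial capacity of 1 are illustrative assumptions.

```java
final class DoublingCost {
    /**
     * Counts the element copies caused by resizing while pushing n items
     * onto an array-backed stack that doubles its capacity when full.
     */
    static long copiesForPushes(int n) {
        int capacity = 1;   // assumed initial capacity, chosen only for illustration
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;   // growing copies every existing element once
                capacity *= 2;
            }
            size++;               // the push itself is a single write
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 1_000_000}) {
            long copies = copiesForPushes(n);
            System.out.println(n + " pushes, " + copies + " copies, "
                    + ((double) copies / n) + " copies per push");
        }
    }
}
```

Under this doubling assumption the resize sizes form a geometric series (1 + 2 + 4 + ...), so the total number of copies stays below 2N. That suggests N pushes cost order N overall rather than N log N, although an individual push that triggers a resize still costs order N. Whether that worst case should appear in the specification is exactly the question being asked.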

Notkin’s Top 10 Observations

About software engineering

With apologies and appreciation to many unnamed souls

We make a huge mistake by assuming similarity among software systems

Ex: Does (and should) the reliability of a nuclear power plant shutdown system tell us much about the reliability of an educational game program?
Ex: Does (and should) the design of a sorting algorithm tell us much about the design of an event-based GUI?
So, assume differences until proven otherwise

Intellectual tools still dominate mechanical tools in importance

How you think is more important than the notations, tools, etc. that you use

Ex: Information hiding is a key design principle

Interface mechanisms can enforce information hiding decisions but cannot help one make the decisions

Ex: The notion of design patterns is more important than languages that let you encode them

Analogies to other engineering disciplines are attractive but generally fall apart quickly

One key reason is the incredible rate of change in hardware and software technology
Another is that software seems to be constrained by few physical laws
But I’ll make them anyway, I’m sure (and you will, too)
This is a variation on #1

It is often too easy to estimate the benefits of a "better" approach to engineering software without assessing its costs

"If only everyone built software my way, it'd be great" is a common misrepresentation

Ex: The formal methods community is just starting to understand this

But at the same time, estimating the costs and the benefits is extremely hard, leaving us without a good way to figure out what to do

The properties that programming languages can ensure are distant from the properties we require software systems to have

Programming languages can help a lot, but they can't solve the "software engineering" problem

Ex: Contravariant type checking (such as in ML) has significant benefits, but regardless, it doesn’t eliminate all errors in ML programs
And covariant typing, with its flaws, may be useful in some situations

The total software lifecycle cost will always be 100%

Software development and maintenance will always cost too much
Software managers will always bitch and moan
Software engineering researchers will always have jobs

Software engineering draws on mathematics, cognitive psychology, management, etc., but it is engineering and not mathematics, nor cognitive psychology, nor management (nor etc.)

If somebody is talking about software engineering without ever mentioning "software", run away

Tradeoffs are at the heart of software engineering, but we're not very good at making them

Getting something for nothing is great, but it isn't usually possible
We almost always choose in favor of hard criteria (e.g., performance) over soft criteria (e.g., extensibility)
This makes sense, both practically and theoretically
Brooks’ Golden Rule doesn’t really work
But the situation leaves us up a creek to a large degree

It's always good to (re-)read anything written by Brooks, Jackson, and Parnas

Don’t fall into Mark Twain’s trap: "A classic is something everyone wants to have read, but nobody wants to read."

Software engineering researchers should have a bit of the practitioner in them, and software engineering practitioners should have a bit of the researcher in them

A very basic overview of software engineering

Software is critical to society

Economically important

Essential for running most enterprises

Key part of most complex systems

Essential for designing many engineering products

Sample code sizes (partly due to Jon Jacky)

Bar code scanners 10-50KLOC
4-speed transmissions 20KLOC
GNU Emacs 120KLOC
ATC ground system 130KLOC

GCC 280KLOC
Teller machine 600KLOC
Call router 2.1MLOC
B-2 Stealth bomber 3.5MLOC
Seawolf submarine combat 3.6MLOC
Space shuttle 26MLOC+1MLOC/flight
NT5.0 60MLOC

At 50 lines per page, double sided, 500 pages/ream (2 inches), a printed version of NT5.0 would be about 200’ tall

That’s about 33 Notkins high or about triple the height of the Suzzallo graduate reading room
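The arithmetic behind that estimate can be reproduced with a short sketch. It assumes that "500 pages/ream" means 500 sheets in a 2-inch ream and that double-sided printing puts two 50-line pages on each sheet; the class and method names are illustrative only.

```java
final class PrintedHeight {
    /** Height in feet of a double-sided printout of the given number of source lines. */
    static double heightInFeet(long sourceLines) {
        final int linesPerPage = 50;       // from the slide
        final int pagesPerSheet = 2;       // double sided
        final int sheetsPerReam = 500;     // assumption: a ream is 500 sheets
        final double inchesPerReam = 2.0;  // from the slide

        double pages = (double) sourceLines / linesPerPage;
        double sheets = pages / pagesPerSheet;
        double reams = sheets / sheetsPerReam;
        return reams * inchesPerReam / 12.0;   // 12 inches per foot
    }

    public static void main(String[] args) {
        // NT5.0 at 60 MLOC: 1,200,000 pages, 600,000 sheets, 1,200 reams, 2,400 inches
        System.out.println(heightInFeet(60_000_000L));   // prints 200.0
    }
}
```

Dividing 200 feet by 33 gives roughly 6 feet per Notkin, which is consistent with the comparison on the slide.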

Dominant discipline (Stu Feldman, through 10⁷)

As the size of the software system grows, the key discipline changes

Code size    Discipline
10³          Mathematics
10⁴          Science
10⁵          Engineering
10⁶          Social Science
10⁷          Politics
10⁸          ??

Delivered source lines per person

Common estimates are that a person can deliver about 1000 source lines per year

Including documentation, scaffolding, etc.

Obviously, most complex systems require many people to build
Even an order of magnitude increase doesn’t eliminate the need for coordination
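To put a scale on this, the rough sketch below divides a few of the sample code sizes listed earlier by the estimate of 1000 delivered source lines per person per year. Treating effort as a simple linear function of size is a simplifying assumption made only for illustration: it ignores the coordination overhead that the slide points to, and the class and method names are hypothetical.

```java
final class EffortEstimate {
    /** Delivered source lines per person per year, taken from the slide. */
    static final long LINES_PER_PERSON_YEAR = 1_000;

    /** Naive person-years for a system of the given size, ignoring coordination costs. */
    static long personYears(long sourceLines) {
        return sourceLines / LINES_PER_PERSON_YEAR;
    }

    public static void main(String[] args) {
        System.out.println("GNU Emacs (120 KLOC): " + personYears(120_000L) + " person-years");
        System.out.println("GCC (280 KLOC): " + personYears(280_000L) + " person-years");
        System.out.println("NT5.0 (60 MLOC): " + personYears(60_000_000L) + " person-years");
    }
}
```

Even if productivity were ten times higher, the largest of these would still need thousands of person-years, so no individual or small team could deliver it alone.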

Inherent & accidental complexity

Brooks distinguishes these kinds of software complexity

We cannot hope to reduce the inherent complexity
We can hope to reduce the accidental complexity

It’s not always easy to distinguish between these kinds of complexity

"The Software Crisis"

We’ve been in the midst of a "software crisis" ever since the 1968 NATO meeting

crisis -- (1) an unstable situation of extreme danger or difficulty; (2) a crucial stage or turning point in the course of something [WordNet]
We cannot produce or maintain high-quality software at a reasonable price and on schedule

Gibbs’ Scientific American article
"Software systems are like cathedrals; first we build them and then we pray" -- Redwine

My view—"mostly hogwash"

Given the context, we do pretty well
We surely can, should and must improve
Some so-called software "failures" are not

They are often management errors (Ariane, Denver airport, etc.)
Read comp.risks (far better than comp.software-eng)

In some areas, we may indeed have a looming crisis

Safety-critical real-time embedded systems
Y2K?

Some "crisis" issues

Relative cost of hardware/software
Low productivity
"Wrong" products
Poor quality

Importance depends on the domain

Constant maintenance

"If it doesn’t change, it becomes useless"

Technology transfer is slow

SE <> PL
Why is SE hard?

There is no single reason software engineering is hard—it’s a "wicked problem"
Lack of well-understood representations of software [Brooks] makes customer and engineer interactions hard
Relatively young field
Software intangibility is deceptive

Law XXIII, Norman Augustine [Wulf]
"Software is like entropy. It is difficult to grasp, weighs nothing, and obeys the second law of thermodynamics; i.e., it always increases."

Is it engineering?