CSE403 Software Engineering, Autumn 1999

David Notkin

 

Lecture #8 Notes

Information Hiding

 

1)      Objectives of today’s lecture

a)      Understand information hiding as a principle for decomposing systems into modules

b)      Be able to distinguish module decompositions that are based on information hiding from those that aren’t

2)      Background (reprise)

a)      “I assume the programmer’s genius matches the difficulty of his problem and assume that he has arrived at a suitable subdivision of the task.” —Dijkstra

b)      “Usually nothing is said about the criteria to be used in dividing the system into modules.” —Parnas

3)      Information hiding principle, motivation

a)      A fundamental cost in software engineering is accommodating change

b)      A change that requires modifying multiple modules is more costly than a change that is isolated in a single module

c)      Therefore

i)        Anticipate likely changes

ii)       Define interfaces that capture the stable aspects and implementations that capture the changeable aspects

4)     Small examples

a)      double sqrt (int)

b)      Can be implemented using bisection methods, factoring methods, Newton’s method

c)      The client doesn’t care, and this can change (requiring only relinking—and not even that for some dynamic linking systems)

d)    Very low level example, of course

e)     A historical aside: what was the original goal of procedures (with parameters)?

i)       Most people answer, “For this kind of abstraction, of course!”

ii)     It’s not true: the original goal was to save memory, which was the most precious resource
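
The sqrt example above can be sketched in C as follows. This is only an illustration: the names int_sqrt, sqrt_newton, and sqrt_bisect and the iteration counts are assumptions, not part of the lecture. The interface is called int_sqrt here only to avoid clashing with the standard library's sqrt.

```c
#include <assert.h>

/* Interface (stable): returns the non-negative square root of x.
   Precondition: x >= 0. */
double int_sqrt(int x);

/* Implementation A (hidden): Newton's method. */
static double sqrt_newton(int x) {
    if (x == 0) return 0.0;
    double guess = x;
    for (int i = 0; i < 60; i++)
        guess = 0.5 * (guess + x / guess);
    return guess;
}

/* Implementation B (hidden): bisection on the interval [0, max(1, x)]. */
static double sqrt_bisect(int x) {
    double lo = 0.0, hi = (x < 1) ? 1.0 : (double)x;
    for (int i = 0; i < 200; i++) {
        double mid = 0.5 * (lo + hi);
        if (mid * mid < x) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

double int_sqrt(int x) {
    assert(x >= 0);
    return sqrt_newton(x);   /* could call sqrt_bisect(x) instead */
}
```

Switching int_sqrt from one helper to the other changes nothing a client can name; clients would only need to be relinked.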

5)  Another simple example

a)  type intSet is
  intSet create();
  void insert(intSet,int);
  void delete(intSet,int);
  bool member(intSet,int);
  int size(intSet);
end intSet;
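
One way the intSet interface might look in C is sketched below. The intSet_ prefix, the choice of an unsorted growable array as the representation, and the behavior on duplicate inserts are all assumptions made for illustration. There is no destroy operation because the interface above does not list one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* ---- Interface: all a client sees. The struct layout is not exposed. ---- */
typedef struct intSet *intSet;

intSet intSet_create(void);
void   intSet_insert(intSet s, int i);   /* no effect if i is already present */
void   intSet_delete(intSet s, int i);   /* no effect if i is absent */
bool   intSet_member(intSet s, int i);
int    intSet_size(intSet s);

/* ---- Implementation: the secret. Here, an unsorted growable array. ---- */
struct intSet {
    int *items;
    int  count;
    int  capacity;
};

intSet intSet_create(void) {
    intSet s = malloc(sizeof *s);
    assert(s != NULL);
    s->items = NULL;
    s->count = 0;
    s->capacity = 0;
    return s;
}

/* Private helper: index of i in s->items, or -1 if absent. */
static int find(intSet s, int i) {
    for (int k = 0; k < s->count; k++)
        if (s->items[k] == i) return k;
    return -1;
}

bool intSet_member(intSet s, int i) { return find(s, i) >= 0; }

int intSet_size(intSet s) { return s->count; }

void intSet_insert(intSet s, int i) {
    if (find(s, i) >= 0) return;
    if (s->count == s->capacity) {
        int new_capacity = s->capacity ? 2 * s->capacity : 8;
        int *grown = realloc(s->items, new_capacity * sizeof *grown);
        assert(grown != NULL);
        s->items = grown;
        s->capacity = new_capacity;
    }
    s->items[s->count++] = i;
}

void intSet_delete(intSet s, int i) {
    int k = find(s, i);
    if (k < 0) return;
    /* Move the last element into the hole; order is not part of the interface. */
    s->items[k] = s->items[--s->count];
}
```

Replacing the array with a bit vector or a hash table would touch only the lower half of this sketch.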

6)      Hiding secrets

a)      These two examples show specific kinds of secrets that modules hide

i)        Algorithms

ii)       Data representations

b)      The interfaces capture stable decisions

i)        Clients depend on these interfaces

c)      The implementations encode the changeable parts

i)        Clients do not depend on these

7)      Interface

a)      An interface has two parts

i)        The signature: the names and type information about the exported functions

ii)       The specification: a precise description of the semantics of the elements in the module

b)      Most commonly, the signature is in a programming language and the specification is in natural language

c)      But you cannot neglect the specification

8)      Examples

a)  double sqrt (int x) {
   /* a legitimate, different implementation */
}

b)  double sqrt (int x) {
   return 3.14159;
}

c)  bool member (intSet s, int i) {
    return IsOdd(i);
}

d)  Ridiculous examples, you say?

e)  Sorry, that’s not true (although these examples are indeed extreme)

f)  At the very least, many assumptions are made when interfaces are not fully defined
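
A sketch of what writing the specification down buys you: once the "Ensures" clause is explicit, the constant-returning sqrt from example b can be rejected mechanically. The names sqrt_fn, sqrt_bogus, sqrt_newton, and meets_spec, and the 1e-6 tolerance, are assumptions chosen for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Signature plus specification:
   Requires: x >= 0.
   Ensures:  the result r satisfies r >= 0 and |r*r - x| <= 1e-6 * (x + 1). */
typedef double (*sqrt_fn)(int);

/* Matches the signature, ignores the specification (example b above). */
static double sqrt_bogus(int x) {
    (void)x;
    return 3.14159;
}

/* Matches both signature and specification: Newton's method. */
static double sqrt_newton(int x) {
    assert(x >= 0);
    if (x == 0) return 0.0;
    double r = x;
    for (int i = 0; i < 60; i++)
        r = 0.5 * (r + x / r);
    return r;
}

/* Checks the "Ensures" clause for a single input x. */
static bool meets_spec(sqrt_fn f, int x) {
    double r = f(x);
    double err = r * r - x;
    if (err < 0) err = -err;
    return r >= 0 && err <= 1e-6 * (x + 1.0);
}
```

A check like this only samples the specification; it does not replace stating it.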

9)      Design Level

a)      Information hiding is a design principle, not a coding principle

b)      Obviously, it can be reflected in code that is based on the design

10)  Anticipating change

a)      It’s “easy” to anticipate algorithmic and representational changes

b)      But you cannot and should not do this and only this

c)      By blithely anticipating these changes, you may not think about another kind of change that is more likely and potentially costly

d)      In general, you cannot build a design that effectively anticipates all changes

i)        A standard (albeit weak) analogy is that you can’t make everything in a car engine easily accessible

ii)       It’s expensive to replace a clutch not because it’s inherently hard, but rather because you have to yank out lots of the engine to get to it

iii)     This is intelligent design, because clutches tend to last a long time, so making them expensive to replace is OK

iv)     Making it expensive to replace an oil filter would not be sensible

11)  Data isn’t always abstracted

a)      Unix byte streams are pervasive

b)      Imagine trying to change Unix’s data model from byte streams to fixed-width records

c)      Good or bad decision?

12)  y2k problems arise because a date representation was exposed

a)      The USPS, the DJIA, McDonald’s, and Europe (for the Euro currency) have also faced similar problems

b)      Parnas argues strenuously that the y2k problem shouldn’t have happened, since the programs could easily be protected from the representation decisions

c)      Other examples?

13)  Other kinds of secrets

a)      An information hiding module can hide other secrets

b)      Characteristics of a hardware device

c)      Ex: whether an on-line thermometer measures in Fahrenheit or Centigrade

d)      Where information is acquired

e)      Ex: the Metacrawler (www.metacrawler.com) might hide what other web search engines it uses

f)        Other examples?
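
The thermometer example might be sketched as below. Everything about the device is hypothetical: the assumption is that it reports tenths of a degree Fahrenheit, and thermometer_simulate_raw stands in for the hardware so the sketch can run without a real sensor.

```c
#include <assert.h>

/* Interface: always reports degrees Celsius (Centigrade). */
double thermometer_read_celsius(void);

/* ---- Secret: this hypothetical device reports tenths of a degree F. ---- */
static int raw_reading_tenths_f = 720;   /* stand-in for a hardware register */

/* Stand-in for the physical world; not something a real client would call. */
void thermometer_simulate_raw(int tenths_f) {
    assert(tenths_f >= -4596);           /* not below absolute zero */
    raw_reading_tenths_f = tenths_f;
}

double thermometer_read_celsius(void) {
    double fahrenheit = raw_reading_tenths_f / 10.0;
    return (fahrenheit - 32.0) * 5.0 / 9.0;
}
```

If the device were swapped for one that reports Centigrade directly, only the body of thermometer_read_celsius would change.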

14)  KWIC: the classic example

a)     Input

i)    now is the time
for all good students
to come to the aid
of their professors

b)    Output

i)     aid to come to the
all good students for
come to the aid to
for all good students
good students for all
is the time now
now is the time
of their professors
professors of their
students for all good
the aid to come to
the time now is
their professors of
time now is the
to come to the aid
to the aid to come

15)  The classic decomposition (see details in paper, first figure at end of this document)

a)      Top-down functional decomposition

b)      Stepwise refinement

c)      Based on the steps the actual computation will take

16)  The data decomposition  (see details in paper, second figure at end of this document)

a)      Not based on the actual computation steps

b)      Hides decisions about data representation

c)      Could they be hidden in the previous decomposition?

d)      Hides decisions about the granularity of sorting

e)      The “sequence” relationship is hazier
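
A rough sketch of two modules from a data decomposition, in C. This is not the paper's code: the function names (cs_setup, cs_count, cs_text, alph_setup, alph_ith), the fixed limits, and both representation choices are assumptions for illustration. The circular shifter's secret is that it stores each shift as a line plus a starting word rather than as rotated text; the alphabetizer's secret is that it sorts everything eagerly in one pass.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SHIFTS 64    /* illustrative limits; longer lines are truncated */
#define MAX_LINE   128

/* ---- Circular shifter ----
   Secret: a shift is (source line, index of first word), not a text copy. */
static struct { const char *line; int first; } cs_shifts[MAX_SHIFTS];
static int cs_n = 0;

static int count_words(const char *line) {
    int n = 0, in_word = 0;
    for (; *line; line++) {
        if (*line == ' ') in_word = 0;
        else if (!in_word) { in_word = 1; n++; }
    }
    return n;
}

void cs_setup(const char *lines[], int nlines) {
    cs_n = 0;
    for (int l = 0; l < nlines; l++) {
        int words = count_words(lines[l]);
        for (int w = 0; w < words && cs_n < MAX_SHIFTS; w++) {
            cs_shifts[cs_n].line = lines[l];
            cs_shifts[cs_n].first = w;
            cs_n++;
        }
    }
}

int cs_count(void) { return cs_n; }

/* Writes the text of shift i into out, which must hold MAX_LINE chars. */
void cs_text(int i, char *out) {
    assert(i >= 0 && i < cs_n);
    char copy[MAX_LINE];
    char *words[MAX_LINE];
    int n = 0;
    strncpy(copy, cs_shifts[i].line, MAX_LINE - 1);
    copy[MAX_LINE - 1] = '\0';
    for (char *w = strtok(copy, " "); w != NULL; w = strtok(NULL, " "))
        words[n++] = w;
    out[0] = '\0';
    for (int k = 0; k < n; k++) {
        if (k > 0) strcat(out, " ");
        strcat(out, words[(cs_shifts[i].first + k) % n]);
    }
}

/* ---- Alphabetizer ----
   Secret: when and how sorting happens. Here: all at once, with qsort. */
static int alph_order[MAX_SHIFTS];

static int compare_shifts(const void *a, const void *b) {
    char ta[MAX_LINE], tb[MAX_LINE];
    cs_text(*(const int *)a, ta);
    cs_text(*(const int *)b, tb);
    return strcmp(ta, tb);
}

void alph_setup(void) {
    for (int i = 0; i < cs_count(); i++) alph_order[i] = i;
    qsort(alph_order, cs_count(), sizeof alph_order[0], compare_shifts);
}

/* Writes the i-th shift in alphabetical order into out. */
void alph_ith(int i, char *out) { cs_text(alph_order[i], out); }
```

The alphabetizer uses only cs_count and cs_text, so the shifter could switch to storing full copies, or the alphabetizer to sorting lazily, without the other module noticing.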

17)  If there is time, work in small groups to define an appropriate date interface that would handle the y2k issues

a)      That is, the client interface works for all dates

b)      The implementation can (choose to) represent its data using two bytes
