CSE 451: Operating Systems, Spring 2011
  CSE Home   About Us   Search   Contact Info 
 
Course Home
Home
Administrivia
Overview
Course email
 
Materials
Course Calendar
Sections
goPost Board
Anonymous feedback
View feedback
 
Assignments
Homework
Projects
Turnin
Gradebook
 
Information
Home Virtual Machine
Linux information
   

Project 2: factory.h in Detail

You shouldn't need the information on this page to complete the project. The briefer explanation of factory.h on the main assignment page should be enough to use it. However, if you run into trouble, or are just generally interested in what is happening, this page is for you.

Goal #1: Imitating Classes

The first goal of factory.h is to imitate classes in C code. Consider this bit of (what I hope is) Java:

    public class base {
       private int count;
       public base()          { count = 0; }
       public int increment() { count++; }
       private void helperFn() {}
    }
    public class foo extends base {
       private String name;
       public foo(String myname) { name = myname; }
       public String getName()   { return name; }
    }
    public class bar extends base {
       private float price;
       public bar()            { price = 0.0; }
       public float getPrice() { return price; }
    }
    ...
    foo myFoo = new foo("my foo");
    String s = myFoo.getName();
    myFoo.increment();

    bar myBar = new bar();
    float p = myBar.getPrice();
    myBar.increment();
To implement this in C, we need to content with the fact that C doesn't have classes.
  1. No classes means no subclasses.
    We get around that by injecting the code and instance variables of the parent class directly into the subclasses, like this:
    
        public class foo {
           private int count;
           private String name;
           public foo(String myname) { count=0; name = myname; }
           public String getName()   { return name; }
           public int increment()    { count++; }
           private void helperFn()   {}
        }
        public class bar {
           private int count;
           private float price;
           public bar()            { count = 0; price = 0.0; }
           public float getPrice() { return price; }
           public int increment()  { count++; }
           private void helperFn() {}
        }
    

  2. No classes means no classes.
    We'll represent an instance of a class by a struct. Here I'll be migrating from Java to C, so this is a mix, resulting in something that is neither quite yet:
    
        typedef struct foo {
           private int count;
           private String name;
        } * foo;
        foo(String myname) { count=0; name = myname; }
        String getName()   { return name; }
        int increment()    { count++; }
        private void helperFn() {}
    
        typedef struct bar {
           private int count;
           private float price;
        } * bar;
        bar()            { count = 0; price = 0.0; }
        float getPrice() { return price; }
        int increment()  { count++; }
        private void helperFn() {}
    

  3. C doesn't support polymorphic methods.

    Note that there are now two methods named increment in the global method name space. To resolve that, change the names to something class specific:

    
        typedef struct foo {
           private int count;
           private String name;
        } * foo;
        foo(String myname)  { count=0; name = myname; }
        String getName()    { return name; }
        int foo_increment() { count++; }
        private void foo_helperFn() {}
    
        typedef struct bar {
           private int count;
           private float price;
        } * bar;
        bar()               { count = 0; price = 0.0; }
        float getPrice()    { return price; }
        int bar_increment() { count++; }
        private void bar_helperFn() {}
    

  4. C doesn't support new, and in particular doesn't support the notion of a constructor.
    Resolve that by forcing the subclass implementer to separate subclass initialization from object creation:
    
        typedef struct foo {
           private int count;
           private String name;
        } * foo;
        foo foo_acquire()       { count=0; }
        void myFooInitializer(String myname) { name = myname; }
        String getName()        { return name; }
        int foo_increment()     { count++; }
        private void foo_helperFn() {}
    
        typedef struct bar {
           private int count;
           private float price;
        } * bar;
        bar bar_acquire()       { count = 0; }
        void myBarInitializer() { price = 0.0; }
        float getPrice()        { return price; }
        int bar_increment()     { count++; }
        private void bar_helperFn() {}
    
    Code like bar myBar = new bar(); now becomes bar myBar = bar_acquire(); myBar.myBarInitializer();

  5. No classes means no objects, and no objects means no method invocation on objects.
    You can't say myBar.myBarInitializer() in C. Since the method names are now all unique, we don't need the object to understand what code to invoke, the name alone sufficies. We still need the reference to the object once we get to that code, though. To achieve that, we explicitly pass the object reference as the first argument:
    
        typedef struct foo {
           private int count;
           private String name;
        } * foo;
        foo foo_acquire()           { foo this = malloc(sizeof(*foo); this->count=0; return this; }
        void myFooInitializer(foo this, String myname) { this->name = myname; }
        String getName(foo this)    { return this->name; }
        int foo_increment(foo this) { foo->count++; }
        private void foo_helperFn(foo this) {}
    
        typedef struct bar {
           private int count;
           private float price;
        } * bar;
        bar bar_acquire()           { bar this = malloc(sizeof(*bar); this->count = 0; return this; }
        void myBarInitializer(foo this)  { this->price = 0.0; }
        float getPrice(foo this)    { return this->price; }
        int bar_increment(foo this) { this->count++; }
        private void bar_helperFn(foo this) {}
    
    Now instead of saying myBar.myBarInitializer(), you instead say myBarInitializer(myBar).

  6. No public/private in C.
    Well... it has a crude form of these things. For methods, the keyword static means "local to this file". The same thing applies to global variables, but isn't relevant here because we don't have any global variables. What we do have are "the private instance variables of the base class", count and price inside the foo and bar structs. We can (effectively) make them private by not exposing them to the client code. We do that by breaking the code I've been showing into a part that goes into a .h file and a part that goes into a .c file. Only the part in the .h file is available to client code.

    Doing those things for the foo class, we have:

    foo.h

    
      // Part A
      typedef struct foo* foo;  // says type 'foo' is a pointer to a struct foo
                                // without giving the details of what's in a struct foo
      extern foo foo_acquire(); // it's okay for client code to call that...
      extern int foo_increment(foo this); // okay for them to call that as well
    

    foo.c

    
        // Part B
        typedef struct foo {
           int count;
           String name;
        } * foo;
    
        // Part C
        foo foo_acquire()           { foo this = malloc(sizeof(*foo); this->count=0; return this; }
        int foo_increment(foo this) { foo->count++; }
        static void foo_helperFn(foo this) {}
    
        // Part D
        void myFooInitializer(foo this, String myname) { this->name = myname; }
        String getName(foo this)    { return this->name; }
    
    Notice that parts A, and C can be generated automatically if we know the name of the class (foo) and the definition of the original base class. factory.h implements the latter (so knows it). The user supplies the former. Part B additionally requires information about the instance variables of the subclass. The user supplies those as well.

    The C preprocessor is basically a string-to-string translator, so we use that to generate the code, like this:

    foo.h

    
      // Part A
      EXPOSETYPE( foo )
    

    foo.c

    
        // Part B
        INSTANTIATETYPE( foo, char* name;  )  // Java's String translated by user to C's char*
    
        // Part C
        CREATEMANAGER( foo )
    
        // Part D -- subclass code, written by user
        void myFooInitializer(foo this, String myname) { this->name = myname; }
        String getName(foo this)    { return this->name; }
    
    The capitalized words are preprocessor macros, the definitions of which are in factory.h. When the preprocessor encounters those words in a file, it replaces the line they're on with the definition of the macro, taken from factory.h. The preprocessor macro facility is flexible enough to allow some string substitutions, which is used here to customize the result of the macro to strings that insert the type name, for instance, in various places.

    Note: Technically, this scheme doesn't make the instance variables of the base class private, since they're exposed in the .c file containing the subclass code. They're protected, though, which is about as well as we can do.

Goal #2: What the Base Class Does

The main purpose of the base class is garbage collection -- the (mostly) automatic process of determining when some previously allocated memory can no longer be used. In C, memory that has been malloc'ed must be free'd when it becomes garbage. Free'ing too soon is a bug, with random but usually catastrophic symptoms. Free'ing too late is a memory leak; by definition, the program no longer has any way to address garbage (because it no longer has any pointers to it), and so won't ever be able to free it.

In reference counting, a count of the number of pointers pointing to an object is kept as an instance variable in the object. For instance, suppose I say:


foo myFoo = foo_acquire();  // returns a foo (which is a pointer to a struct)
myFoo has a reference count that is initialized to 1.

If I now say:


foo currentFoo = foo;
foo_ref( currentFoo );
I (a) create another reference to the object (stored in currentFoo) and alert the refernece counting system to that by invoking the foo_ref method on the object. It's reference count is now 2.

I might now lose the pointer stored in currentFoo, because I store something else there. For example:


foo_release( currentFoo );
currentFoo = NULL;
The foo_release() call decrements the refernce count. If it drops to 0, the object is garbage, and is free (logically -- see what follows). Otherwise, there are still pointers to it, and so it isn't free'd. In this case the object isn't free'd, because its reference count after decrementing is 1.

Now I might do this:


foo_decrement( myFoo );
myFoo = NULL;
The reference count drops to 0, and the object is free'd.

Note that you must be careful to (a) make the ref and release calls each time you create/destroy a pointer to an object, and (b) must make those calls in the right order, relative to actually losing the pointer. If the reference count drops to 0 the object will be free, even if you actually still have a pointer to it.

Determining when to free an object in Project 2 can be very hard. There can be many pointers to it, held by many threads. The threads drop their reference to the object asynchronously with each other, so you have no way to know statically which one will be last, and so no way to know where in the code the free should occur. To address this, you need some garbage collection scheme, like reference counting. factory. provides that, which is easier than building it yourself.

The second memory allocation-related function of factory.h is a performance issue. malloc and free are kind of slow. A commonly used scheme to address that is to cache deallocated objects rather than to actually free them. These logically free'd objects are kept in a list. When the application wants to create one, it first checks if there is one on the cached list. If so, one is taken from there. If the list is empty, malloc is used to create a new one.

Implementation Details

Here's an example that shows the code actually produced (and so the code actually executed) for a subclass, myClass, that has a two int instance variables. (gcc -E was used to obtain what the preprocessor produces.)

File myClass.h has this:


EXPOSETYPE( myClass )

extern int setOne(myClass this, int val);
extern int getOne(myClass this);

extern int setTwo(myClass this, int val);
extern int getTwo(myClass this);
and becomes (after preprocessing) this:

typedef struct myClass* myClass;
extern int myClass_ref( myClass foo );
extern myClass myClass_acquire();
extern int myClass_release(myClass, void (*destructor)(myClass));

extern int setOne(myClass this, int val);
extern int getOne(myClass this);

extern int setTwo(myClass this, int val);
extern int getTwo(myClass this);

File myClass.c has this:


#include "myClass.h"

INSTANTIATETYPE( myClass, int myClassInstanceVar;
                          int anotherMyClassInstanceVar;
       )

CREATEMANAGER( myClass );

int setOne(myClass this, int val) {
  this->myClassInstanceVar = val;
  return val;
}

int getOne(myClass this) {
  return this->myClassInstanceVar;
}

int setTwo(myClass this, int val) {
  this->anotherMyClassInstanceVar = val;
  return val;
}

int getTwo(myClass this) {
  return this->anotherMyClassInstanceVar;
}
and becomes (leaving out the expansion of #include "myClass.h"):

struct myClass { myClass ar_next;
                 int ar_refCnt;
                 int ar_state;
                 pthread_mutex_t ar_lock;
                 const char* ar_typename;
                 int myClassInstanceVar;
                 int anotherMyClassInstanceVar;
               };

typedef struct { myClass head;
                 pthread_mutex_t lock;
                 int isRegistered;
               } myClass_manager;
myClass_manager myClass_Manager = { ((void *)0), { { 0, 0, 0, 0, 0, 0, { 0, 0 } } }, 0};
void myClass_Manager_destroy() {
   myClass el;
   pthread_mutex_lock( & myClass_Manager.lock );
   while ( (el= myClass_Manager.head) ) {
      myClass_Manager.head = el->ar_next;
      pthread_mutex_destroy(&el->ar_lock);
      free(el);
   }
  pthread_mutex_unlock( & myClass_Manager.lock );
}
myClass myClass_acquire() {
  pthread_mutex_lock( & myClass_Manager.lock );
  if ( ! myClass_Manager.isRegistered ) {
    atexit( myClass_Manager_destroy );
    myClass_Manager.isRegistered = 1;
  }
  myClass foo;
  if ( ! myClass_Manager.head ) {
    foo = (myClass)malloc(sizeof(struct myClass));
    pthread_mutex_init(&foo->ar_lock, ((void *)0) );
    foo->ar_typename = "myClass" ;
  }
  else {
    foo = myClass_Manager.head;
    myClass_Manager.head = foo->ar_next;
  }
  pthread_mutex_unlock( & myClass_Manager.lock );
  foo->ar_refCnt = 1;
  foo->ar_state = 1;
  return foo;
}
void myClass_typeerror( const char* methodname, myClass foo ) {
  fprintf(stderr, "Fatal error: called %s on something of type %s\n", methodname, foo->ar_typename);
  fprintf(stderr, "Causing a fatal error so you can see backtrace if running gdb.\n");
  fflush(stderr);
  asm("int $0x3\n");
  int* j = 0;
  int i = *j;
  fprintf(stderr, "%d", i);
}
int myClass_ref( myClass foo ) {
  if ( strcmp(foo->ar_typename, "myClass" ) ) myClass_typeerror( "myClass" "_ref", foo);
  pthread_mutex_lock(&foo->ar_lock);
  if ( !foo->ar_state ) {
    fprintf(stderr, "myClass" "_ref: Fatal error\n");
    fprintf(stderr, "Object seems to be on free list!\n");
    fprintf(stderr, "Current reference count = %d\n", foo->ar_refCnt);
    fprintf(stderr, "Causing a fatal error so you can see backtrace if running gdb.\n");
    fflush(stderr);
    int* j = 0;
    int i = *j;
    fprintf(stderr, "%d", i);
  }
  int val = ++foo->ar_refCnt;
  pthread_mutex_unlock(&foo->ar_lock);
  return val;
}
int myClass_release( myClass foo, void (*destructor)(myClass) ) {
  if ( strcmp(foo->ar_typename, "myClass" ) ) myClass_typeerror( "myClass" "_release", foo);
  pthread_mutex_lock(&foo->ar_lock);
  if ( !foo->ar_state ) {
    fprintf(stderr, "myClass" "_release: Fatal error\n");
    fprintf(stderr, "Object seems to be on free list!\n");
    fprintf(stderr, "Current reference count = %d\n", foo->ar_refCnt);
    fprintf(stderr, "Causing a fatal error so you can see backtrace if running gdb.\n");
    fflush(stderr);
    int* j = 0;
    int i = *j;
    fprintf(stderr, "%d", i);
  }
  int cnt = --foo->ar_refCnt;
  pthread_mutex_unlock(&foo->ar_lock);
  if ( cnt == 0 ) {
    if (destructor) destructor(foo);
    foo->ar_state = 0;
    pthread_mutex_lock( & myClass_Manager.lock );
    foo->ar_next = myClass_Manager.head;
    myClass_Manager.head = foo;
    pthread_mutex_unlock( & myClass_Manager.lock );
  }
  return cnt;
};

int setOne(myClass this, int val) {
  this->myClassInstanceVar = val;
  return val;
}

int getOne(myClass this) {
  return this->myClassInstanceVar;
}

int setTwo(myClass this, int val) {
  this->anotherMyClassInstanceVar = val;
  return val;
}

int getTwo(myClass this) {
  return this->anotherMyClassInstanceVar;
}

Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX
[comments to zahorjan at cs.washington.edu]