Click here for the paper. (If no paper appears, it is because none is available online.)
I don't have the main paper for the ISIS reading available online yet, but a pretty good paper describing ISIS's Process Group Approach can be found here.
Another paper of interest is the one published by Cheriton and Skeen in SOSP '93. They complain about CATOCS (causally and totally ordered communication support). The complaints are interesting. Paper is here. My slides on this are here.
(Message inbox:521)
Replied: Wed, 07 Sep 1994 07:06:45 PDT
Replied: birman@dag.uni-sb.de
Return-Path: birman@dag.uni-sb.de
Delivery-Date: Tue, 06 Sep 1994 22:35:50 PDT
Return-Path: birman@dag.uni-sb.de
Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.2.4])
    by whistler.cs.washington.edu (8.6.8/7.2ws+) with ESMTP id WAA22344 for; Tue, 6 Sep 1994 22:35:47 -0700
From: birman@dag.uni-sb.de
Received: from uni-sb.de (uni-sb.de [134.96.7.230])
    by june.cs.washington.edu (8.6.9/7.2ju) with SMTP id WAA08714 for ; Tue, 6 Sep 1994 22:35:45 -0700
Organization: Universitaet des Saarlandes D-66041 Saarbruecken, Germany
Received: from octavie.dag.uni-sb.de with SMTP
    by uni-sb.de (5.65++/UniSB-2.2/940830) id AA09123; Wed, 7 Sep 94 07:35:36 +0200
Received: by octavie.dag.uni-sb.de with SMTP; Wed, 7 Sep 94 07:35:38 +0200
Date: Wed, 7 Sep 94 07:35:22 +0200
Message-Id: <9409070535.AA02695@angelina.dag.uni-sb.de>
Received: by angelina.dag.uni-sb.de; Wed, 7 Sep 94 07:35:22 +0200
To: bershad@cs.washington.edu
Subject: Re: recent article in IEEE Computer (Or was it CACM?)

Brian,

With all the copyright issues I am generally a little careful about leaving things online. But the paper was in the Dec. 1993 CACM and is reprinted in the book RVR and I put out (with the figure fixed).

I am very far offline right now, at Dagstuhl, but in fact I did leave the .ps for this file on ftp.cs.cornell.edu in the pub area, either called TR-93-xxxx.ps.Z or perhaps in some sensible looking symbolic link to that name, probably in the pub/isis area. The paper itself was called "The process group approach to reliable distributed computing" and I agree that it made more sense than most Isis papers prior to it...

If you have a student hunt for the paper they should get the TR number from the Cornell TR list in the home page and if WWW cooperates the paper may still be online through the web. But if not, and if they note the TR number, the file is probably just lying there.

It might be TR91 or TR92 (92-1296?) but with that screwed up figure I think I convinced myself to leave the TR copy around for people who don't like to cut and paste. The CACM copy itself has a version of Figure 4 that the CACM people decided to "correct and simplify" for me. Bless their hearts...

Ken

(Message inbox:3226)
To: cs552@cs
Subject: The CATOCS Response
Date: Thu, 23 Jan 1997 10:36:13 PST
From: Brian Bershad

I mentioned the response paper from the Isis group to the paper I presented today. I'll have handouts of the OSR paper made. Robbert Van Renesse (from the ISIS group) has this to add.

------- Forwarded Message

Date: Thu, 23 Jan 1997 12:23:26 -0500
From: Robbert VanRenesse
To: bershad@cs.washington.edu
Subject: Re: tounge in cheek response to Cheriton's SOSP paper.

In retrospect, I think both our responses were wrong. The right response is really quite simple:

*) the end-to-end argument provides safety in the face of at most a single failure in an unreplicated setting: in case of the famous file server example it makes sure that the file is stored uncorruptedly on the server's disk. It does not deal with liveness--if the server is dead, no file is written. Two failures can cause a correct checksum to be sent to the client with a corrupted file on disk.

*) using replication and voting, no end-to-end argument is required. Not only does it tolerate t failures with 2t + 1 replicas, it does both safety and liveness. However, you need to make sure that updates are done in the same order at all replicas.

*) if you use active (non-centralized) replication, you need a total ordering protocol of some kind. Two-phase commit + two-phase locking will give you this, but it is unnecessarily complex and blocks more often than necessary.

*) if you use passive replication (primary backup), you need to make sure when you roll-over from a crashed primary to the backup, that updates from the old primary are not delayed beyond updates from the new one. This is exactly the causal ordering requirement.

Could you please forward this to your students, in addition to the response that I wrote. By the way, both my and Cooper's (and Ken's) responses were published in ACM SIGOPS OSR vol 28 no 1 (Jan 94). Another response by Santosh Shrivastava was published in OSR vol 28 no 4 (Oct 94).
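
The replication-and-voting point in Robbert's message can be made concrete with a small sketch. This is only an illustration under assumptions of mine, not anything taken from Isis: the names replicas_needed, vote and apply_in_order are hypothetical, the "replicas" are plain function calls, and the replicated state is a single integer register. It shows that 2t + 1 replicas outvote t faulty ones, and that voting stops working once a correct replica applies the same updates in a different order.

```python
from collections import Counter

def replicas_needed(t):
    """Replicas required so that t faulty ones can be outvoted (2t + 1)."""
    return 2 * t + 1

def vote(replies):
    """Return the value reported by a strict majority of replicas, else None."""
    value, count = Counter(replies).most_common(1)[0]
    return value if count > len(replies) // 2 else None

def apply_in_order(updates):
    """Apply (op, arg) updates, in the given order, to a register starting at 0."""
    state = 0
    for op, arg in updates:
        state = state + arg if op == "add" else state * arg
    return state

# Hypothetical workload: two updates that do not commute.
updates = [("add", 2), ("mul", 3)]

t = 1
n = replicas_needed(t)

# Every replica applies the updates in the same order; then one replica fails
# and reports garbage. The majority still agrees on the right answer.
replies = [apply_in_order(updates) for _ in range(n)]
replies[0] = -1
agreed = vote(replies)

# If a correct replica sees the updates in the opposite order, its state
# diverges, and together with the one faulty replica no majority remains.
reordered = apply_in_order(list(reversed(updates)))
diverged = vote([-1, reordered, apply_in_order(updates)])

print(n, agreed, reordered, diverged)
```

The last case is why the message insists on a total ordering protocol for active replication: the voting step can only mask failures if all correct replicas hold the same state.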
Note: some of the dept's UNIX machines won't display this PostScript. It will print, however, and it can be viewed from any NT machine.
Click here to see what other people in the class had to say about the paper.