CSE550 Problem Set #2

out: Monday October 17th, 2011
due: Friday November 4th, 2011 by 5:00pm.

[ summary | part a | part b | how to submit | grading ]

Summary.

For this problem set, you will be implemented a simple write-ahead log to add basic transaction support to a key-value store, and you will be solving a small design exercise. For the write-ahead log, you can use any language you like.

Part A -- implement simple transactions with a write-ahead log

Your job is to implement augment an existing key-value store to give the programmer simple support for transactional commit of multiple key/value updates. Of the ACID properties, the only ones we care about for this assignment are A and D: you have to make sure once a transaction commits, it is durable on disk, even after failure, and you have to make sure that either the entire transaction commits or none of it does. However, you do not need to worry about concurrency control; you can assume only a single transaction at a time occurs.

To do this, you will need to implement a write-ahead logging module. Your module will implement a NO-STEAL / NO-FORCE buffer management policy:

You need to find an existing key-value storage library implemented in the language of your choice. You can assume a few things about the key-value library to make your life easier -- (a) that it supports atomic updates of single keys (i.e., if a crash happens in the middle of a write to a key, you can assume that the entire write happened, or none of the write happened -- you don't need to worry about inconsistent state from a partial key write), and (b) that it has its own mechanism for durability / recovery after a crash. All you need to add is transactional support (atomicity for multiple actions), and post-crash recovery. You can also assume that mutations are idempotent -- it's OK to replay log entries multiple times, as long as they are replayed in the right order.

One issue is that you will need to figure out which subset of actions in your log to replay after a crash; this could mean you need to keep track of which committed transactions have been written into the key/value store, and which haven't. (Checkpoint records help with this.)

Don't worry about pruning the log, or hyper-optimizing performance. Assume your language supports a "sync" operation that forces any buffered file system writes to disk; use this operation if your language actually does support it, though!

Include a brief discussion of design issues that came up in a README.TXT file, as well as a rough overview (just a few sentences) of what your code does.

Part B -- answer a few questions

Write one or two paragraphs to answer each of the following:

What to turn in

When you're ready to turn in your assignment, do the following:

  1. Create a directory called "problemset2". In it, you should have three things: a subdirectory called "parta", a file called "partb.txt", and a README.TXT file.

  2. The README.TXT file should contain your name, student number, and UW email address, as well as instructions on how to launch your server.

  3. "parta" should contain your part a code, as well as a Makefile for compiling it. Running "make" should produce an executable called "waltest." (If you implement in a scripted language, waltest can be your main script file we run.) When we run waltest, it should issue (sequentially) a set of transactions against the key-value store. If there is anything we need to know about setting things up, include that in your README.TXT file. Your code should be clean, well commented, etc.

  4. "partb.txt" should contain your answers to the three questions in part b. Remember, just one or two paragraphs suffices for each question.

  5. Create a submission tarball by running the following command, but replacing "UWEMAIL" with your email account name:
    tar -cvzf problemset2_submission_UWEMAIL.tar.gz problemset2
    For example, since my email account is "gribble", I would run the command:
    tar -cvzf problemset2_submission_gribble.tar.gz problemset2

  6. Use the course dropbox (there is a link on the course homepage) to submit that tarball.

Grading

We will be basing your grade on several elements: