Notes on Use of the Hadoop Cluster

Google and IBM operate a large compute cluster that runs Hadoop, an open-source implementation of MapReduce and GFS. This cluster is used for teaching the department's Internet Scale Programming course (CSE 490H) and for independent study projects. Use of the Cluster carries specific usage requirements and is subject to US Export Control Law (in short: it may not be used for anything that is restricted from export, or criminal penalties could ensue). Therefore, use of this resource requires a special account on the cluster.

Access Policies

Access to the Hadoop Cluster is limited to students enrolled in courses that have been authorized to use the Cluster, and to students participating in an authorized independent study project sponsored by a regular CSE faculty member. To obtain approval for a course or an independent study project, the instructor or supervising faculty member must send a request to the CS Lab Director, Erik Lundberg (lundberg A T cs).

Privacy

UW will not share your name, your UW NetID, or your CSE NetID with IBM or Google. You will need to create a separate "Hadoop Account Name", which will be used for accessing the Hadoop Cluster and for accessing the Support Site. Your Hadoop Account Name will be known to Google and IBM. However, neither IBM nor Google will have a way to link your UW NetID, your CSE NetID, or your real name to your Hadoop Account Name.

UW CSE will maintain information that links your Hadoop Account Name to your identity, and will record the fact that you agreed to the Guidelines for Use. The University will not knowingly reveal those records to anyone outside the University unless legally required to do so.

Create Your Hadoop Account

To get an account on the Hadoop Cluster, you must first agree to follow a set of guidelines (presented as a "click-thru" agreement), and you must also answer a set of questions to verify that the programs you write and the data you use on the cluster are not restricted by US Export Control Law.

Please visit the Authorization site (operated by UW CSE), which requires that you log in with your UW NetID (not your CSE NetID). If you agree to abide by the guidelines, and your use of the Cluster is not restricted by US Export Control Law, you will be presented with a Registration Token. With your token in hand, you will then need to visit another site (operated by IBM) to create your Hadoop Account.

  1. Visit the UW Hadoop Authorization Site to read and accept the conditions for use, and to get a Registration Token.
  2. Visit IBM's University Support Site and follow the "Sign Up" link to create your Hadoop account. You will need to use the Registration Token from Step 1, and you will need to select a Username and a "support" password.

If you are enrolled in a course that is using this cluster (according to Course Computing Resources), but you do not have access, contact support A T cs.

Cluster Login

To access the Hadoop Cluster, you will use the account name you registered in Step 2 above. Your Hadoop password will be given to you at the end of Step 1. (You will NOT use the "support" password that you create in Step 2.)

Cluster Support

Support for the Hadoop Cluster is provided by IBM, via their online Support Site. You can access the site with your Hadoop Account Name and the "support" password you selected in Step 2.


Questions or comments to course-computing