Fishnet Assignment 5: The Web and Sockets
Due: Monday Dec. 3, 2001 at the beginning of class. Out: Monday Nov. 19, 2001.
CSE/EE461 Autumn 2001; Wetherall.
In this assignment, you will work in teams of two to integrate external Web browsers and servers into your existing Fishnet node. The goal of this assignment is for you to understand network applications and socket programming. "Sockets" are the programming interface most often used to write real-world distributed applications.
- What You Need To Write
Write a C program in a single file called hw5.c that does the following:
- Builds on top of the functionality from the first four assignments.
- Adds support for a Web server application associated with the node. What this means is that when your node receives Fishnet transport connection requests for destination port 80, then it should connect them to a real external Web server and return data sent by the Web server over the corresponding Fishnet transport connection. This is described in more detail below. It entails splicing together a Fishnet transport connection and a TCP connection via the socket programming interface.
- As well as the Web server support above, surfing the Fishnet with existing browsers such as IE or Netscape requires that we capture real external Web browser requests and connect them to Fishnet nodes. We are supplying you with this code (wrasse) and instructions for using it – you only need to write the server side, not the client side. See the section below for information on configuring your browser and using wrasse.
- To connect a Fishnet node to your Web server we assume that you have access to a place from where you can serve Web pages such as your home page. This place has a server hostname (e.g., www.cs.washington.edu) and a root prefix on that server (e.g., /education/courses/cse461/CurrentQtr/). You need to handle three different kinds of functionality: connections, data relaying, and HTTP manipulation. Each of these is described in turn below, from the point of view of your node on the server side.
- Connections.
Connections are identified by four-tuples as before (source and destination node addresses and ports) and are set up and torn down as before using SYN and FIN packets. Your job is to splice together Fishnet connection handling with TCP connection handling. You must also support MAX_CONNECTIONS concurrent connections. When you receive a packet with the SYN flag turned on for a new connection to destination port 80 (the Web server), you should make a new TCP connection to your captive Web server. When the new TCP connection is established you should send a Fishnet acknowledgement to establish the Fishnet connection. When the Web server closes the TCP connection and all data has been reliably transferred, you should send a Fishnet packet with the FIN flag set and tear down the Fishnet transport connection as before.
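One way to keep track of up to MAX_CONNECTIONS spliced connections is a fixed-size table keyed by the four-tuple. The following is only a sketch under stated assumptions: the struct layout, the field names, the helper names (conn_find, conn_alloc, conn_free), and the fallback value of MAX_CONNECTIONS are all hypothetical and are not part of the Fishnet API.

```c
#include <stddef.h>

#ifndef MAX_CONNECTIONS
#define MAX_CONNECTIONS 8   /* assumed fallback; use the value from your Fishnet headers */
#endif

/* Hypothetical per-connection record; field names are illustrative only. */
struct conn {
    int in_use;
    int src_addr, src_port;   /* remote Fishnet node and port */
    int dst_addr, dst_port;   /* local Fishnet node and port (80 for the Web) */
    int tcp_fd;               /* socket to the Web server, -1 if none yet */
};

static struct conn conns[MAX_CONNECTIONS];

/* Return the record matching the four-tuple, or NULL if none exists. */
struct conn *conn_find(int sa, int sp, int da, int dp)
{
    for (size_t i = 0; i < MAX_CONNECTIONS; i++) {
        struct conn *c = &conns[i];
        if (c->in_use && c->src_addr == sa && c->src_port == sp &&
            c->dst_addr == da && c->dst_port == dp)
            return c;
    }
    return NULL;
}

/* Claim a free slot for a new SYN, or return NULL when the table is full. */
struct conn *conn_alloc(int sa, int sp, int da, int dp)
{
    for (size_t i = 0; i < MAX_CONNECTIONS; i++) {
        struct conn *c = &conns[i];
        if (!c->in_use) {
            c->in_use = 1;
            c->src_addr = sa; c->src_port = sp;
            c->dst_addr = da; c->dst_port = dp;
            c->tcp_fd = -1;
            return c;
        }
    }
    return NULL;
}

/* Release the slot once the FIN exchange has completed. */
void conn_free(struct conn *c)
{
    c->in_use = 0;
    c->tcp_fd = -1;
}
```

On a SYN for port 80 you might call conn_find first (to detect a retransmitted SYN) and conn_alloc only if it returns NULL; the tcp_fd field is where the socket to the Web server would be stored once the TCP connection is made.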
- Data Relaying and Reliability.
Once the Fishnet and TCP connections are established, you relay data between them, ensuring that it is delivered reliably. What this means is that as you receive Fishnet data packets you send that data over the TCP connection as well as acknowledge them; TCP will take care of delivering the data reliably. When you receive data from the TCP connection you send that data over the Fishnet connection; you must ensure it is delivered by using timers and retransmissions as needed. Note that you will need to implement data transfer in both directions over a single connection, i.e., both sequence number and acknowledgement fields are needed in each direction.
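When relaying Fishnet data onto the TCP connection, keep in mind that a single write() on a socket may accept fewer bytes than requested. The helper below is a minimal sketch of the usual loop; the name write_all is hypothetical, and it assumes a blocking descriptor, which may or may not match how your node handles its sockets.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write exactly len bytes to fd, retrying after short writes and EINTR.
 * Returns 0 on success, -1 on error (errno is left as set by write()). */
int write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any byte was written: retry */
            return -1;
        }
        done += (size_t)n;
    }
    return 0;
}
```

The opposite direction (TCP to Fishnet) is not covered by this sketch; there your own timers and retransmissions provide the reliability.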
- Web Application.
The Web application should perform HTTP downloads for URLs of the form http://<nodeaddress>/<page.html>, e.g., http://8/index.html. HTTP refers to the real-world protocol used between browsers and servers to download HTML and other Web content. Downloads should progress along the following chain. First, your browser must send its HTTP request to one of our Fishnet nodes running the wrasse executable. You accomplish this by configuring your browser to use the node as a proxy server; after this, your browser can only be used to surf within the Fishnet. See the information in the next section. Once the wrasse node receives a proxy request, it will examine the URL to determine the Fishnet node to contact. It will then set up a Fishnet transport connection to that node (which is the node you write) and relay the request to it in the first data packet. Your node will see HTTP headers beginning with, for example, "GET index.html". The key characteristic is that the text after the first space is the name of the page without the node address, which was stripped off by wrasse. You must munge this request before relaying it over the TCP connection to the Web server by prepending your root prefix to the page name, e.g., changing "GET index.html" to "GET /education/courses/cse461/CurrentQtr/index.html". By this stage the request has traveled from the external browser, to the wrasse entry Fishnet node, to your final Fishnet node, to the external Web server. After you send this munged request data, you can assume that it is safe to read and relay data sent back by the Web server until the entire download is complete, at which stage you tear down the connections.
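The request munging described above can be sketched as a small string function. This is an illustration only: the name munge_request is hypothetical, it assumes the request has been copied into a NUL-terminated buffer, and the handling of a page name that already starts with "/" is an added assumption rather than something the assignment specifies.

```c
#include <stdio.h>
#include <string.h>

/* Insert prefix after the first space of req and write the result to out.
 * Returns the length of the munged request, or -1 if req contains no
 * space or out is too small to hold the result. */
int munge_request(const char *req, const char *prefix, char *out, size_t outsz)
{
    const char *sp = strchr(req, ' ');
    if (sp == NULL)
        return -1;

    size_t method_len = (size_t)(sp - req) + 1;   /* e.g. "GET " with the space */
    const char *page = sp + 1;
    size_t plen = strlen(prefix);

    /* Assumption: avoid "//" if the page name arrives with a leading slash. */
    if (plen > 0 && prefix[plen - 1] == '/' && page[0] == '/')
        page++;

    int n = snprintf(out, outsz, "%.*s%s%s", (int)method_len, req, prefix, page);
    if (n < 0 || (size_t)n >= outsz)
        return -1;   /* output error or truncated result */
    return n;
}
```

For example, calling it with the request "GET index.html" and the prefix "/education/courses/cse461/CurrentQtr/" yields "GET /education/courses/cse461/CurrentQtr/index.html"; everything after the page name is copied through unchanged.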
- During Web downloads, your program should print the same output as it did during file transfers in the previous assignment, but with the URL that is visible to your program in place of the filename.
- Development and Test Instructions
Here is a suggested set of tasks to help you break down and understand the required functionality.
- Understanding HTTP.
Try downloading a page from your Web server "by hand". Instead of using a browser, use the telnet Unix command to make a TCP connection to the Web server port on the Web server, e.g., "telnet server 80". Then type the HTTP download command as though you were the browser sending the request, e.g., "GET /homes/djw/mypage.html\n\n". This is the kind of request that will arrive at your Fishnet node in the first data packet. After typing, you should observe the raw response of HTTP headers and HTML source being sent from the server and then the connection being closed by the server. Once you have done this, you should have a good idea of what is happening during Web surfing at a very low level.
- Configuring your browser.
You must configure your browser in two ways. Set it up to use the wrasse node as a proxy by entering the host on which your wrasse node is running and the proxy port on which you told wrasse to listen. This will make your browser contact the wrasse node for every download, rather than the destination server. Also configure your browser to disable Web document caching. This ensures that every time you click on a link a Web request will be generated, making the setup easier to understand.
- Running wrasse.
Get wrasse running and working with your Web browser and Web server. Wrasse implements both the client side (which you do not have to write) and the server side (which you do) in one program. So you should be able to use your browser, wrasse, and Web server to download a page with a URL of the form, e.g., http://8/mypage.html. To determine the command line arguments for wrasse, run it and look at the error message. In addition to the usual Fishnet arguments you will need to tell it what TCP port to listen on for browser requests, and what Web server and root prefix to contact to provide Web pages that are served with URLs pointing directly at the wrasse node. You will also note that instead of a Web server and prefix you can tell wrasse to proxy the fishhead. This will cause downloads from http://<wrasse-node-address> to show a Fishnet entry page with one link to the top level of every node currently alive in the Fishnet. This provides an easy way to surf the Fishnet when you have your program working. To make sure you understand this, try running your own Fishnet with two wrasse nodes, one serving content from your Web pages and the other proxying the fishhead from where you start surfing.
- Improve your code to handle multiple concurrent Fishnet connections and data transfer in both directions. You should be able to sanity check this code by using the multiple sendfile commands from the previous assignment.
- Write the code that munges a request to add the prefix. You could put this in your Fishnet node, invoke it via a Web browser, print out the munged request, and return a canned reply, e.g., "<HTML><BODY><H1>I was here!</H1></BODY></HTML>", which should then appear in your Web browser.
- Write the code that makes a TCP connection to the Web server, sends the request, reads the response, and closes the connection. Make sure you read the socket handout before doing this. This code is logically quite separate from much of your Fishnet code. To test it, you could get all of this functionality running as a separate program, and have it send a canned request and print out the response. The effect should be similar to using telnet to observe the underlying HTTP behavior.
- Put all the pieces together and download pages over your local Fishnet.
- You’re done! Read and do any turnin work now.
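For the task above of making a TCP connection to the Web server, the sketch below shows one possible shape for the connect step as a standalone function. It uses getaddrinfo(), which may differ from the calls shown in the socket handout; the function name tcp_connect is hypothetical, and the host and port arguments are whatever server hostname and port you have chosen.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Connect to host:port over TCP.
 * Returns a connected socket descriptor, or -1 on failure. */
int tcp_connect(const char *host, const char *port)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;       /* accept IPv4 or IPv6 addresses */
    hints.ai_socktype = SOCK_STREAM;   /* TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Try each returned address until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                     /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

A standalone test program could call tcp_connect with your server name and "80", write a canned request, then call read() in a loop and print what arrives until read() returns 0, which indicates that the server has closed the connection.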
- For Fun.
Join the class Fishnet and surf it. We now have a complete, running network.
- For Fun.
If you really want to learn about sockets, implement the client side that is in wrasse too. You will need to have fish_main() upcall to your code when a new request arrives from the browser. There is a special Fishnet API function in fish.h, fish_internal_readhook(), that registers a function to be called when a given socket becomes readable.
- Turn In and Discussion Questions
- Turn in your program source code using turnin, as well as a printed copy for us.
- (14 points for the program above and its output here.) Use wrasse and your node to download the reference Web page that we will tell you about. Save the output of your program during the download, print it and turn it in to us.
- Question (2 points) Tell us how layering was able to simplify your program development. Tell us where layering did not translate into simple modularity.
- Question (2 points) Given that both your node and TCP/IP in the kernel are written in C and ignoring small implementation details, what protocol design factors cause surfing over the Fishnet to be slower than surfing the Web?
- Question (2 points) Compare the socket API to the Fishnet API. What are the key differences? Give one advantage of each over the other.
—END—