Adaptive Web Sites

Download web server logs for experiments

These logs come from the Music Machines web site at Hyperreal. They have been anonymized (stripped of all information about users except for their succession of accesses to the site).

New! More complete logs! Be sure to read the README.TXT file.

These data were used in our own research, and we are making them available to the research community. Please contact Mike Perkowitz with questions.

Each file contains all accesses to Music Machines for a single day. The accesses are organized into paths. Each path is the series of URLs requested from a particular machine. Note that we do not distinguish among multiple users coming from the same source. We have, however, disabled caching of pages at the site so that every page must be requested, even when revisited.

A typical path will appear as below. The first line contains the originating machine (converted to unique numbers for the sake of anonymity). Each succeeding line corresponds to one URL requested from that machine. Each request contains the originating machine (O), the time of the request (T), the URL requested (U), and the referring URL (R). Fields are separated by "||".
O:0000002560 || T:1997/09/12-22:43:00 || U:/ || R:
O:0000002560 || T:1997/09/12-22:50:27 || U:/categories/software/ || R:
O:0000002560 || T:1997/09/12-22:50:38 || U:/categories/software/Windows/ || R:
O:0000002560 || T:1997/09/12-22:50:47 || U:/categories/software/Windows/V909V03.TXT || R:
O:0000002560 || T:1997/09/12-22:51:06 || U:/categories/software/Windows/ || R:
O:0000002560 || T:1997/09/12-22:51:18 || U:/categories/software/Windows/ravemusc.txt || R:
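
The record format above can be read with a few lines of Python. The following is a minimal sketch, not part of the original distribution: the names Request, parse_request, and group_paths are hypothetical. It assumes that every non-blank line has the form shown in the sample, that "||" never occurs inside a URL, and that the lines belonging to one path are contiguous in the file.

```python
from datetime import datetime
from itertools import groupby
from typing import Iterable, List, NamedTuple


class Request(NamedTuple):
    origin: str      # anonymized originating machine (O)
    time: datetime   # time of the request (T)
    url: str         # URL requested (U)
    referrer: str    # referring URL (R); empty when none was recorded


def parse_request(line: str) -> Request:
    """Parse one 'O:... || T:... || U:... || R:...' line."""
    fields = {}
    for part in line.strip().split("||"):
        # Split on the first colon only, so colons inside the
        # timestamp or a referring URL are left untouched.
        key, _, value = part.strip().partition(":")
        fields[key] = value.strip()
    return Request(
        origin=fields["O"],
        time=datetime.strptime(fields["T"], "%Y/%m/%d-%H:%M:%S"),
        url=fields["U"],
        referrer=fields.get("R", ""),
    )


def group_paths(lines: Iterable[str]) -> List[List[Request]]:
    """Group consecutive requests from the same machine into paths.

    Assumes (not confirmed by the description) that the requests
    of one path appear on adjacent lines. Blank lines are skipped.
    """
    requests = (parse_request(line) for line in lines if line.strip())
    return [list(group) for _, group in groupby(requests, key=lambda r: r.origin)]
```

For example, passing the six sample lines above to group_paths would yield a single path of six requests, all from machine 0000002560.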

Files are named m.YYMMDD.paths, where YYMMDD represents the date. In our own experiment, we trained on one month of data and tested on another ten days. We present logs from September and October 1997, in one ZIP file per month.
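
The naming scheme makes it straightforward to select files by date. The sketch below shows one way to recover the date from a file name and to divide a set of files into a training period and a test period of the following ten days. The function names log_date and split_train_test are hypothetical, and the choice of training period is left to the caller because the exact dates used in the experiment are not stated here.

```python
import re
from datetime import date, datetime, timedelta
from typing import Iterable, List, Tuple

# Matches names of the form m.YYMMDD.paths
_NAME = re.compile(r"^m\.(\d{6})\.paths$")


def log_date(filename: str) -> date:
    """Return the date encoded in a log file name such as m.970912.paths."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"not a log file name: {filename!r}")
    # %y maps two-digit years 69-99 to 1969-1999, so 97 becomes 1997.
    return datetime.strptime(match.group(1), "%y%m%d").date()


def split_train_test(
    filenames: Iterable[str],
    train_start: date,
    train_end: date,
    test_days: int = 10,
) -> Tuple[List[str], List[str]]:
    """Split files into a training period and the test_days that follow it."""
    test_end = train_end + timedelta(days=test_days)
    train, test = [], []
    for name in sorted(filenames, key=log_date):
        day = log_date(name)
        if train_start <= day <= train_end:
            train.append(name)
        elif train_end < day <= test_end:
            test.append(name)
    return train, test
```

As an illustration only, split_train_test(names, date(1997, 9, 1), date(1997, 9, 30)) would place the September files in the training set and the files for October 1 through October 10 in the test set.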

September, 1997
October, 1997
or use FTP

Mike Perkowitz and Oren Etzioni Adaptive Web Sites