CSE 341 -- Programming Languages, Spring 1999
Department of Computer Science and Engineering, University of Washington
Steve Tanimoto (instructor) and Jeremy Baer (teaching assistant).
Assignment P1
Version 1.00 of May 16. Subject to change.
Perl Warmup
Due date and time: Wednesday, June 2, 1999, in class. (Note the change of due date.)
Turn in this assignment as a hardcopy printout. For #1, show the program and examples of input and output for 2 separate example files, the first of which uses the 5-line text file in the assignment. For #2, give printouts of (1) the HTML forms page as it looks in the browser, with the form areas filled out ready to be submitted, (2) the Perl source code, and (3) the web page that is generated by the script. If you choose option 2a, give a printout of the email message, too.
Instructions: Do both exercise 1 and exercise 2.
1. Write a Perl program that processes a file and builds an inverted index that gives, for each word in the file, all of the line numbers on which the word occurs.
For example, if the input file contains the 5 lines of text...
This is a sample file
and the word zebra occurs
on lines 2 and 5. The
numbers here are treated
just like words such as zebra.
Then the inverted index would look like the following when printed out.
2: 3
5: 3
a: 1
and: 2, 3
are: 4
as: 5
file: 1
here: 4
is: 1
just: 5
like: 5
lines: 3
numbers: 4
occurs: 2
on: 3
sample: 1
such: 5
the: 2, 3
this: 1
treated: 4
word: 2
words: 5
zebra: 2, 5
Note: All words have been converted to lower case, and punctuation has been ignored.
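The indexing step described above can be sketched as follows. This is only one possible approach, not a required design: the subroutine names build_index and format_index are made up for illustration, and the sketch assumes that a "word" is any run of letters or digits, so that numbers are indexed like words and punctuation is dropped.

```perl
use strict;
use warnings;

# Build an inverted index: word => list of line numbers (each listed once).
# $lines is a reference to an array of text lines; numbering starts at 1.
sub build_index {
    my ($lines) = @_;
    my %index;
    my $line_no = 0;
    for my $line (@$lines) {
        $line_no++;
        my %seen;    # words already recorded for this line
        # Lower-case the line and pull out runs of letters and digits;
        # this drops punctuation such as the period in "zebra."
        for my $word (lc($line) =~ /[a-z0-9]+/g) {
            next if $seen{$word}++;
            push @{ $index{$word} }, $line_no;
        }
    }
    return \%index;
}

# Format the index as "word: n, n, ..." lines, sorted by word.
sub format_index {
    my ($index) = @_;
    return map { "$_: " . join(', ', @{ $index->{$_} }) . "\n" }
           sort keys %$index;
}

# The 5-line sample text from the assignment.
my @sample = (
    'This is a sample file',
    'and the word zebra occurs',
    'on lines 2 and 5. The',
    'numbers here are treated',
    'just like words such as zebra.',
);

print format_index(build_index(\@sample));
```

To index a real file instead of the built-in sample, the lines could be read from a filehandle opened with open and passed to build_index in the same way.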
1.b. (Optional)
Perform a kind of "stemming" on the words as they are put into the index.
For example, after stemming, each of the following words becomes "jump":
jump, jumped, jumping.
Your solution to this is permitted to make mistakes. It doesn't
have to handle irregular verbs. You may be able to avoid converting
"swing" to "sw" by skipping stemming whenever the result would be
shorter than 3 characters.
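A minimal sketch of such a stemmer is shown below. It assumes that only the suffixes "ing" and "ed" are stripped, which is a choice made here for illustration (the assignment does not prescribe a suffix list), and the name stem is hypothetical. It applies the 3-character rule described above.

```perl
use strict;
use warnings;

# Naive suffix stripping: remove a trailing "ing" or "ed", unless the
# remaining stem would be shorter than 3 characters.
sub stem {
    my ($word) = @_;
    for my $suffix ('ing', 'ed') {
        if ($word =~ /^(.+)\Q$suffix\E$/ && length($1) >= 3) {
            return $1;
        }
    }
    return $word;    # no suffix stripped
}

# prints "jump jump jump swing"
print join(' ', map { stem($_) } qw(jump jumped jumping swing)), "\n";
```

As the assignment allows, this makes mistakes: "running" becomes "runn" and "speed" becomes "spe", for instance.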
2. Create a Perl script that does one of the following (your choice), and test it with a web server and browser.
a. receive the values of an HTML form containing some questions about programming languages, and then (a) email the results posted by the user to your own mail account, and (b) print a nice message that somehow "evaluates" the user's answers, telling them something like "right", "wrong", "I agree", "you have good taste", etc., according to what they answered in the form.
b. receive a URL from the user via an HTML form, and then retrieve the document from that location and run the algorithm of exercise 1 (inverted index) on it, and finally format the results as HTML that is returned to the user.
c. implement a "vote counter" that lets a web surfer vote on some set
of candidates. It should update the count for the selected candidate
and display the result. It should also refuse to accept
a second vote for the same election from the same IP address. (Use
a browser cookie that names the group of candidates, i.e., the particular
election that the user has voted in.) It should be possible for the
same web page to have multiple elections (e.g., one for president,
one for secretary, etc.).
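For option c, the bookkeeping behind the vote counter might be sketched as below. Everything named here is an assumption made for illustration: the subroutines parse_cookies and cast_vote, the cookie naming scheme "voted_<election>", the in-memory %state hash, and the sample candidates. The sketch also assumes election names are simple tokens that are valid in a cookie name. In a real CGI script the cookie string would typically come from $ENV{HTTP_COOKIE} and the address from $ENV{REMOTE_ADDR}, and the counts would have to be saved to a file between requests, which this sketch leaves out.

```perl
use strict;
use warnings;

# Parse a Cookie header value such as "voted_president=1; voted_secretary=1"
# into a hash of name => value.
sub parse_cookies {
    my ($header) = @_;
    my %cookies;
    for my $pair (split /;\s*/, $header // '') {
        my ($name, $value) = split /=/, $pair, 2;
        $cookies{$name} = $value // '' if defined $name && length $name;
    }
    return \%cookies;
}

# Record one vote. $state holds {counts}{election}{candidate} and
# {voters}{election}{ip}. Returns (accepted, Set-Cookie header line),
# with the header undefined when the vote is refused.
sub cast_vote {
    my ($state, $election, $candidate, $ip, $cookie_header) = @_;
    my $cookie_name = "voted_$election";    # hypothetical naming scheme
    my $cookies = parse_cookies($cookie_header);

    # Refuse if the browser already carries this election's cookie,
    # or if this IP address has already voted in this election.
    return (0, undef) if exists $cookies->{$cookie_name};
    return (0, undef) if $state->{voters}{$election}{$ip};

    $state->{voters}{$election}{$ip} = 1;
    $state->{counts}{$election}{$candidate}++;
    return (1, "Set-Cookie: $cookie_name=1");
}

# Two elections sharing one state hash; the second vote for
# "president" from the same address is refused.
my %state;
cast_vote(\%state, 'president', 'alice', '10.0.0.1', '');
cast_vote(\%state, 'president', 'bob',   '10.0.0.1', 'voted_president=1');
cast_vote(\%state, 'secretary', 'carol', '10.0.0.1', 'voted_president=1');

for my $election (sort keys %{ $state{counts} }) {
    for my $candidate (sort keys %{ $state{counts}{$election} }) {
        print "$election / $candidate: $state{counts}{$election}{$candidate}\n";
    }
}
```

Keeping a separate entry per election in both the cookie names and the state hash is what lets one page host several elections at once.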
Teamwork: Do your work individually on this assignment.
Resources: The recommended platform for this assignment is Fiji. Each of you should have an account on this machine.