handout #16
CSE143—Computer Programming II
Programming Assignment #5
due: Sunday,
(courtesy of Stuart Reges)
This assignment will give you practice with recursion and grammars. You will complete a program that reads an input file with a grammar in Backus-Naur Form (BNF) and will allow the user to randomly generate elements of the grammar.
You will be given a main program that does the file processing and user interaction. It is called GrammarMain.java. You are to write a class called GrammarSolver that manipulates the grammar. A grammar will be specified as a sequence of Strings, each of which represents the rules for a nonterminal symbol. Each String will be of the form:
<nonterminal symbol>:<rule>|<rule>|<rule>|…|<rule>
Notice that this is the standard BNF format of a non-terminal symbol on the left-hand-side and a series of rules separated by vertical bar characters (“|”) on the right-hand side. If there is only one rule for a particular nonterminal, then there will be no vertical bar characters. Normally BNF rules have the characters “::=” separating the symbol from the rules, so this is a slight variation where the separator is a simple colon (“:”).
There will be exactly one colon per String. The text appearing before the colon is a nonterminal symbol. You may assume it does not contain any whitespace. Often we surround nonterminal symbols with the characters “<” and “>”, but this will not always be the case. The text appearing after the colon will be a series of rules separated by vertical bar characters (“|”). Each of these rules will have a series of tokens separated and potentially surrounded by whitespace. There could be any amount of whitespace surrounding tokens. Any token that appears to the left of a colon in the grammar is considered a nonterminal. All other tokens are considered terminals.
The grammars you will be asked to process will be stored in text files with each line of the file being of the form described above. GrammarMain reads this file into an ArrayList of Strings and passes the list to the constructor of your GrammarSolver. Your solver has to be able to perform certain tasks, most notably generating random elements of the grammar.
To generate a random instantiation
of a non-terminal, you simply pick at random one of its rules and generate
whatever that rule tells you to generate.
Notice that this is a recursive process.
Generating a non-terminal involves picking one of its rules at random
and then generating each part of that rule, which might involve more
non-terminal symbols to generate for which you pick rules at random and
generate each part of those rules, and so on.
Depending upon the grammar, this process could continue indefinitely. Any grammar you will be asked to work with
will be guaranteed to converge in a finite period of time. Most often this process doesn’t go on
indefinitely because many rules involve terminals rather than
non-terminals. When you encounter a terminal,
you simply include it. This becomes the
base case of the recursive process. Your
generating method produces an array of Strings.
Each String should be compact in
the sense that there should be exactly one space between each terminal and
there should be no leading or trailing spaces.
Your class must include the following public methods.
Method |
Description |
GrammarSolver(ArrayList grammar) |
Constructs a grammar solver for the given grammar. Throws a NullPointerException if the grammar is null and throws an IllegalArgumentException if there are two or more entries in the grammar for the same non-terminal. |
boolean grammarContains(String symbol) |
Returns true if the given symbol is a non-terminal of the grammar; returns false otherwise. |
String[] generate(String symbol, int times) |
Uses the grammar to randomly generate given number of occurrences of the given symbol and returns the result as an array of Strings. For any given non-terminal symbol, each of its rules should be applied with equal probability. Throws an IllegalArgumentException if the grammar does not contain the given non-terminal symbol or if the number of times is less than 0. |
String getSymbols() |
Returns a String representation of the various non-terminal symbols from the grammar as a sorted, comma-separated list enclosed in square brackets, as in “[<np>, <s>, <vp>]” |
Case matters when comparing symbols. For example, <S> would not be considered the same as <s>.
You are allowed to use whatever constructs you want from the Java class libraries, but there are several highly convenient options described below that you are strongly advised to consider.
The regular expressions we want are fairly simple. For example, to split a string on the colon character you simply put the colon inside square brackets. The split method returns an array of strings, which means we can perform this split by saying:
String[] parts = s.split("[:]");
This square bracket notation works for any single character, even a space. In the case of whitespace, we want to include both spaces and tabs and we want to have one or more of them. This can be accomplished in a regular expression by putting both a space and a tab inside the square brackets and putting a plus sign after the brackets which indicates “1 or more of these”:
String[]
parts = s.split("[ \t]+");
As mentioned above, the various parts of a rule are guaranteed to be separated by whitespace. Otherwise you would have a difficult time separating the parts of a rule. But this means that once you’ve used the spaces to split the rule up, the spaces are gone. That means that when you generate Strings, you will have to include spaces yourself.
In terms of correctness, your class must provide
all of the functionality described above.
In terms of style, we will be grading on your use of comments, good
variable names, consistent indentation and good coding style to implement these
operations. Remember that you will lose
points if you declare variables as data fields that can instead be declared as
local variables. You should also avoid
extraneous cases (e.g., don’t make something into a special case if it doesn’t
have to be).
You MUST
name your file GrammarSolver.java and turn it in electronically from the
“assignments” link on the class web page.
A collection of files needed for the assignment is included on the web
page as ass5.zip. You will need to have GrammarMain.java
and Scanner.java in the same directory as your GrammarSolver.java in order to
run GrammarMain. The second input file
contains extraneous whitespace, including tabs.
Sample input file (sentence.txt)
<s>:<np>
<vp>
<np>:<dp>
<adjp> <n>|<pn>
<pn>:John|Jane|Sally|Spot|Fred|Elmo
<adjp>:<adj>|<adj>
<adjp>
<adj>:big|fat|green|wonderful|faulty|subliminal|pretentious
<dp>:the|a
<n>:dog|cat|man|university|father|mother|child|television
<vp>:<tv>
<np>|<iv>
<tv>:hit|honored|kissed|helped
<iv>:died|collapsed|laughed|wept
Sample input file (sentence2.txt)
E: T
| E OP T
T: x
| y |
42 | 0 | 1 | 92 | ( E ) | F1 ( E
) | - T | F2 ( E , E )
OP: +
| - |
* | %
| /
F1: sin
| cos| tan |sqrt
| abs
F2:max |min |
pow
Main program GrammarMain.java
// Stuart Reges
//
//
// GrammarMain contains a main program that prompts
a user for the name of a
// grammar file and then gives the user the
opportunity to generate random
// versions of various elements of the grammar.
import java.io.*;
import java.util.*;
public class GrammarMain {
public
static void main(String[] args) throws FileNotFoundException {
Scanner console = new Scanner(System.in);
System.out.println("Welcome to the cse143 random sentence
generator.");
System.out.println();
//
open grammar file
System.out.print("What is the name of the grammar file? ");
String
fileName = console.nextLine();
Scanner input = new Scanner(new File(fileName));
//
read the grammar file and construct the grammar solver
ArrayList grammar = new ArrayList();
while
(input.hasNextLine())
grammar.add(input.nextLine());
GrammarSolver solver = new GrammarSolver(grammar);
showResults(console, solver);
}
// pre :
console open for console reading, solver initialized
// post:
allows the user to repeatedly pick a grammar element to generate
public
static void showResults(Scanner console, GrammarSolver solver) {
for(;;) {
System.out.println();
System.out.println("Available symbols to generate are:");
System.out.println(solver.getSymbols());
System.out.print("What do you want generated (return to quit)?
");
String target = console.nextLine();
if
(target.length() == 0)
break;
if
(!solver.grammarContains(target))
System.out.println("Illegal symbol");
else {
System.out.print("How many do you want me to generate? ");
if (!console.hasNextInt())
System.out.println("that's
not an integer");
else {
int number = console.nextInt();
if (number < 0)
System.out.println("no negatives allowed");
else {
String[] answers =
solver.generate(target, number);
for (int i = 0; i <
number; i++)
System.out.println(answers[i]);
}
}
console.nextLine(); // to position to next line
}
}
}
}
Sample execution (user input underlined)
Welcome
to the cse143 random sentence generator.
What
is the name of the grammar file? sentence.txt
Available
symbols to generate are:
[<adj>,
<adjp>, <dp>, <iv>, <n>, <np>, <pn>,
<s>, <tv>, <vp>]
What
do you want generated (return to quit)? <dp>
How
many do you want me to generate? 5
the
a
a
the
a
Available
symbols to generate are:
[<adj>,
<adjp>, <dp>, <iv>, <n>, <np>, <pn>,
<s>, <tv>, <vp>]
What
do you want generated (return to quit)? <np>
How
many do you want me to generate? 5
Jane
the
wonderful pretentious fat big father
Fred
the
wonderful cat
a
subliminal pretentious dog
Available
symbols to generate are:
[<adj>,
<adjp>, <dp>, <iv>, <n>, <np>, <pn>,
<s>, <tv>, <vp>]
What
do you want generated (return to quit)? <s>
How
many do you want me to generate? 20
Jane
kissed Spot
the
wonderful wonderful pretentious television collapsed
Jane
hit Fred
Jane
helped a wonderful dog
Elmo
helped the fat fat man
a
pretentious university helped Elmo
a
wonderful green subliminal father hit Fred
Fred
kissed Spot
Spot
laughed
the
green wonderful father collapsed
the
big man helped John
a
pretentious green faulty dog collapsed
Jane
honored a green subliminal green child
Elmo
hit Elmo
a
green university died
the
pretentious child honored a faulty wonderful subliminal television
Jane
died
the
faulty dog hit John
Elmo
helped Fred
Elmo
honored the pretentious big green father
Available
symbols to generate are:
[<adj>,
<adjp>, <dp>, <iv>, <n>, <np>, <pn>,
<s>, <tv>, <vp>]
What
do you want generated (return to quit)?
Sample execution (user input underlined)
Welcome
to the cse143 random sentence generator.
What
is the name of the grammar file? sentence2.txt
Available
symbols to generate are:
[E,
F1, F2, OP, T]
What
do you want generated (return to quit)? T
How
many do you want me to generate? 5
42
- y
x
x
( (
1 ) )
Available
symbols to generate are:
[E,
F1, F2, OP, T]
What
do you want generated (return to quit)? E
How
many do you want me to generate? 10
x -
1
0
sin
( 1 + 92 + - 1 / 42 )
max
( y , 92 )
42
% 1
-
42
92
1
92
42
- sin ( 1 )
Available
symbols to generate are:
[E,
F1, F2, OP, T]
What
do you want generated (return to quit)?