Given the fact that MUSE AC-1 operates on a composite data structure, the benefits of using this algorithm can have a high payoff over individually processing CSPs. In section 2.4, we provided several examples where the payoff is obvious. To gain some insight into factors influencing the effectiveness of MUSE CSP, we have conducted an experiment in which we randomly generate MUSE CSP instances with two different graph topologies. The tree topology is characterized by two parameters: the branching factor (how many nodes follow each non-leaf node in the tree) and the path length (how many nodes there are in a path from the root node to a leaf node). The lattice topology is characteristic of a MUSE CSP which is produced by a hidden-Markov-model-based spoken language recognition system for our constraint-based parser. Lattices are also characterized by their length and their branching factor.
For this experiment, we examined trees
with a path length of four and a branching factor of two or three, and
lattices with a path length of four and a branching factor of two or three.
We initialized each variable to have either 3 or 6 labels.
We then randomly generated constraints in
the network, varying the probability that
= 1
from 0.05 to .95 in steps of 0.05. For each probability, 6 instances
were generated. The lower the probability
that
= 1, the tighter the constraints. Note
that the probability of a constraint between
two nodes should be understood as the probability of a constraint
between two nodes given that a constraint is allowed between them.
For example, nodes that are on the same level in the tree topology are
in different segments, and so constraints cannot occur between them.
The results of this experiment are displayed in Figures 26 and 27. In each of the four panels of each figure, four curves are displayed. After MUSE AC-1 appears on curves displaying the average number of labels remaining after MUSE AC-1 is applied to instances of a MUSE CSP as the probability of a constraint varies. The curves labeled Solution indicate the average number of labels remaining after MUSE AC-1 that are used in a solution. CSP AC is associated with curves that display the number of labels that remain in at least one segment when the segment is extracted from the MUSE CSP and CSP arc consistency is applied. Unused indicates the difference between the number of labels that remain after MUSE AC-1 and the number that are CSP arc consistent in at least one segment.
Figure 26: Simulation results for trees with a path length of 4, a branching factor of 2 or
3, and 3 or 6 labels per variable.
Figure 27: Simulation results for lattices with a path length of 4, a branching factor of 2 or
3, and 3 or 6 labels per variable.
For both of the topologies, if the
probability
= 1 is low (e.g., .1) or
high (e.g., .8), then MUSE AC-1 tracks the performance of arc
consistency performed on the individual instances for either topology.
However, the topology does impact the range of low and high
probabilities for which this is true.
When constraints are randomly generated, after
MUSE AC-1 is performed, the tree topology has fewer remaining values
than the lattice topology that are not CSP arc consistent.
These results suggest that MUSE
CSP AC-1 may be more effective for some topologies than for others.
However, in the tree topology the randomly generated constraints
between the values of two variables are independent of the other
probabilities generated. This is not the case for the lattice; once
a pair of variables has a set of randomly generated constraints, they
are shared by all paths through the lattice. Notice that increasing
the number of values in a domain seems to have more impact on the
tree than increasing the branching factor, probably because as the
branching factor increases, so does the number of independent nodes.
This experiment does show that if a problem is tightly constrained, MUSE AC-1 can be effectively used to eliminate values that are unsupported by the constraints. Clearly, this was the case for the parsing problems presented in section 2.4. A small set of syntactic constraints effectively eliminates values that can never be used in a parse for a sentence, even in a lattice with a branching factor of three and arbitrarily long paths.