3.4. The Correctness of MUSE AC-1

3.5. A Profile of MUSE AC-1

Given the fact that MUSE AC-1 operates on a composite data structure, the benefits of using this algorithm can have a high payoff over individually processing CSPs. In section 2.4, we provided several examples where the payoff is obvious. To gain some insight into factors influencing the effectiveness of MUSE CSP, we have conducted an experiment in which we randomly generate MUSE CSP instances with two different graph topologies. The tree topology is characterized by two parameters: the branching factor (how many nodes follow each non-leaf node in the tree) and the path length (how many nodes there are in a path from the root node to a leaf node). The lattice topology is characteristic of a MUSE CSP which is produced by a hidden-Markov-model-based spoken language recognition system for our constraint-based parser. Lattices are also characterized by their length and their branching factor.

For this experiment, we examined trees with a path length of four and a branching factor of two or three, and lattices with a path length of four and a branching factor of two or three. We initialized each variable to have either 3 or 6 labels. We then randomly generated constraints in the network, varying the probability that tex2html_wrap_inline18961 = 1 from 0.05 to .95 in steps of 0.05. For each probability, 6 instances were generated. The lower the probability that tex2html_wrap_inline18961 = 1, the tighter the constraints. Note that the probability of a constraint between two nodes should be understood as the probability of a constraint between two nodes given that a constraint is allowed between them. For example, nodes that are on the same level in the tree topology are in different segments, and so constraints cannot occur between them.

The results of this experiment are displayed in Figures 26 and 27. In each of the four panels of each figure, four curves are displayed. After MUSE AC-1 appears on curves displaying the average number of labels remaining after MUSE AC-1 is applied to instances of a MUSE CSP as the probability of a constraint varies. The curves labeled Solution indicate the average number of labels remaining after MUSE AC-1 that are used in a solution. CSP AC is associated with curves that display the number of labels that remain in at least one segment when the segment is extracted from the MUSE CSP and CSP arc consistency is applied. Unused indicates the difference between the number of labels that remain after MUSE AC-1 and the number that are CSP arc consistent in at least one segment.

   figure16395
Figure 26: Simulation results for trees with a path length of 4, a branching factor of 2 or 3, and 3 or 6 labels per variable.

   figure16406
Figure 27: Simulation results for lattices with a path length of 4, a branching factor of 2 or 3, and 3 or 6 labels per variable.

For both of the topologies, if the probability tex2html_wrap_inline18961 = 1 is low (e.g., .1) or high (e.g., .8), then MUSE AC-1 tracks the performance of arc consistency performed on the individual instances for either topology. However, the topology does impact the range of low and high probabilities for which this is true. When constraints are randomly generated, after MUSE AC-1 is performed, the tree topology has fewer remaining values than the lattice topology that are not CSP arc consistent. These results suggest that MUSE CSP AC-1 may be more effective for some topologies than for others. However, in the tree topology the randomly generated constraints between the values of two variables are independent of the other probabilities generated. This is not the case for the lattice; once a pair of variables has a set of randomly generated constraints, they are shared by all paths through the lattice. Notice that increasing the number of values in a domain seems to have more impact on the tree than increasing the branching factor, probably because as the branching factor increases, so does the number of independent nodes.

This experiment does show that if a problem is tightly constrained, MUSE AC-1 can be effectively used to eliminate values that are unsupported by the constraints. Clearly, this was the case for the parsing problems presented in section 2.4. A small set of syntactic constraints effectively eliminates values that can never be used in a parse for a sentence, even in a lattice with a branching factor of three and arbitrarily long paths.



3.6. Extracting Solutions from a MUSE CSP after MUSE AC-1