The interactive note-taking software is designed to help users capture information digitally, both to speed entry and to improve accuracy, and to support the longer-term goal of efficient retrieval. The software incorporates two distinctive features. First, it actively predicts what the user is going to write. Second, it automatically constructs a custom radio-button, check-box user interface.
This research explores the extremes of FSM learning and prediction, where the system has no explicit a priori knowledge of the note domains. We have tried to design the system so that it can learn quickly, yet adapt well to semantic and syntactic changes, all without a knowledge store from which to draw. It is clear that knowledge in the form of a domain-specific tokenizer would aid FSM learning by chunking significant phrases and relating similar notations and abbreviations. Some preliminary work has shown that, after a few notes have been written, users may create abbreviations instead of writing out whole words. A domain-specific tokenizer would be able to relate an abbreviation and a whole word as being in the same class, and therefore allow for more flexibility during note taking. For example, a domain-specific tokenizer may recognize that "Megabytes", "Meg", "MB", and "M" all represent the same token for memory sizes. One could imagine a framework that would allow for domain-specific tokenizers to be simply plugged in.
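The tokenizer idea above can be made concrete with a minimal sketch. It assumes a single hypothetical token class, `<MEGABYTES>`, built from the four surface forms named in the text; the function names and the regular expression are illustrative and are not part of the system described here.

```python
import re
from typing import List

# Surface forms taken from the example in the text; the class label is assumed.
MEMORY_UNIT_FORMS = {"megabytes", "meg", "mb", "m"}


def canonical_token(word: str) -> str:
    """Map a surface form to its token class, or return the word unchanged."""
    if word.lower() in MEMORY_UNIT_FORMS:
        return "<MEGABYTES>"
    return word


def tokenize(note: str) -> List[str]:
    """Split a note into alphabetic and numeric runs, then canonicalize each."""
    pieces = re.findall(r"[A-Za-z]+|\d+", note)
    return [canonical_token(piece) for piece in pieces]


if __name__ == "__main__":
    for note in ("8 Megabytes", "8 Meg", "8MB", "8M"):
        print(note, "->", tokenize(note))
```

Under these assumptions all four notations yield the same token sequence, so an FSM learned from one notation would also parse the others. A pluggable framework would presumably swap in a different table of forms for each note domain.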
The prototype built to demonstrate these ideas was implemented on a conventional microcomputer with keyboard input. As a consequence, it was impossible to evaluate user acceptance of the new interface or the adaptive agent. With newly available computing devices incorporating pen input and handwriting recognition, it should be possible to re-engineer the user interface and field test these ideas with actual users.
One aspect of note learning, related to tokenization and the button-box user interface display, is the difficulty of generalizing numeric strings or unique tokens. The cardinality of the range of model numbers, telephone numbers, quantities, sizes, other numeric values, and even proper names is very large in some note domains. The finite-state machine learning method presented here is incapable of generalizing over transitions from a particular state, and, as a consequence, the current system has the problem of displaying a very lengthy button-box interface list. (A button is displayed for each value encountered in the syntax of notes, and there may be many choices.) For example, a large variety of pattern numbers may be available in the fabric pattern note domain. A mechanism is needed to determine when the list of numeric choices is too large to be useful as a button-box interface. The system could then generalize the expected number and prompt the user with only the number of digits expected: ####, for example. This may help remind the user that a number is expected without presenting an overbearing list of possibilities.
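One possible form of such a mechanism is sketched below. The cutoff `max_buttons` and the function name are assumptions made for illustration; the text does not specify how the decision should be made, only that a digit mask such as #### could replace an overlong list.

```python
def button_labels(values, max_buttons=8):
    """Return one label per distinct value, or digit masks if the list is too long.

    max_buttons is a hypothetical cutoff; the right value is an open question.
    """
    distinct = sorted(set(values))
    all_numeric = all(v.isdigit() for v in distinct)
    if len(distinct) <= max_buttons or not all_numeric:
        return distinct
    # Collapse to one mask per digit count, e.g. "4021" becomes "####".
    return sorted({"#" * len(v) for v in distinct})


if __name__ == "__main__":
    few = ["4021", "4022", "4021"]
    many = [str(n) for n in range(1000, 1020)]
    print(button_labels(few))
    print(button_labels(many))
```

With a short list the individual pattern numbers remain available as buttons; once the number of distinct values exceeds the cutoff, the display collapses to a single mask per digit count.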
Another limitation of the current effort lies in the choice of finite-state machines to represent the syntax of the user's notes. The notes may not form a regular language, with the consequence that the FSMs may become too large as the learning method attempts to acquire a syntax. This may place an unreasonable demand on memory and lead to reduced prompting effectiveness.
The choice of finite-state machines also apparently constrains the custom user interface. Because FSMs branch in unpredictable ways, button-box interfaces must be rendered incrementally. After the user indicates a particular transition (by selecting a button), the system can render states reachable from that transition for the user. Ideally, the user should be able to select buttons corresponding to note fragments in any order, allowing them to write down the size before the pattern number, for example. To construct a non-modal user interface, a more flexible syntactic representation is needed.
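The modal behavior described above can be illustrated with a small sketch. The FSM below is a hypothetical one for the fabric pattern domain; its state names, tokens, and dictionary representation are invented for illustration and are not taken from the system.

```python
# Hypothetical FSM for fabric pattern notes: state -> {token: next state}.
FSM = {
    "start": {"Butterick": "brand", "Simplicity": "brand"},
    "brand": {"####": "pattern"},
    "pattern": {"size": "size_kw"},
    "size_kw": {"10": "end", "12": "end"},
    "end": {},
}


def buttons_for(state):
    """Only the transitions leaving the current state can be offered as buttons."""
    return sorted(FSM[state])


def select(state, token):
    """Follow the transition the user chose and return the new state."""
    return FSM[state][token]


if __name__ == "__main__":
    state = "start"
    print(buttons_for(state))
    state = select(state, "Butterick")
    print(buttons_for(state))
```

In this sketch the size buttons cannot be shown until the brand and pattern number have been chosen, which is exactly the ordering constraint a non-modal interface would need to avoid.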
Several of the low-level design decisions employed in this system are crude responses to technical issues. For instance, a syntax is rendered as a button-box interface only if the average number of times each state has been used to parse notes is greater than 2. This ignores the fact that some parts of the state machine have been used frequently for parsing notes while other parts have rarely been used. Similarly, the particular measure for estimating prompting confidence (and setting the saturation of the completion button) is simplistic and would benefit from a more sound statistical basis.
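The weakness of the averaging rule can be seen in a short sketch. The first function restates the rule as described in the text; the second is a hypothetical per-state check included only for contrast, and the usage counts are invented.

```python
def render_by_average(use_counts, threshold=2):
    """The rule described in the text: mean uses per state must exceed 2."""
    if not use_counts:
        return False
    return sum(use_counts.values()) / len(use_counts) > threshold


def well_used_states(use_counts, threshold=2):
    """Hypothetical alternative: report which individual states pass the cutoff."""
    return {state for state, uses in use_counts.items() if uses > threshold}


if __name__ == "__main__":
    # Invented counts: two heavily used states and two rarely used ones.
    counts = {"start": 9, "brand": 9, "pattern": 1, "size": 1}
    print(render_by_average(counts))
    print(sorted(well_used_states(counts)))
```

Here the average is 5, so the whole syntax would be rendered even though half of its states have been used only once.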
Anonymous reviewers suggested an additional example in Section 3, offered some refinements to the user interface, graciously identified some limitations of the work listed in Section 9, and pointed out some additional related work. Mike Kibler, Karl Hakimian, and the EECS staff provided a consistent and reliable computing environment. Apple Cambridge developed and supports the Macintosh Common Lisp programming environment. Allen Cypher provided the tokenizer code. This work was supported in part by the National Science Foundation under grant number 92-1290 and by a grant from Digital Equipment Corporation.