University of Washington Department of Computer Science & Engineering
 UW Mangrove Project

People

 Oren Etzioni
 Alon Halevy
 Jeff Lin
 Luke McDowell
   

Annotation Example (Or, how to add an event to the Department Calendar)

This document briefly describes how to annotate and publish an HTML document so that its content becomes semantically enabled, allowing its use by the Department Calendar and other semantic services. See also the instructions specifically for help with the Who's Who.

Annotation Basics

We assume that you already have a web-accessible text/HTML description of your event, perhaps on a course or personal home page. If not, create a simple description of your event and place it in a web-accessible file. This description can be in any format; here is a simple example.

Annotation consists of adding semantic tags to your HTML document. For example,

    <html xmlns:uw="http://www.cs.washington.edu/research/semweb/vocab#v1_0">
    <uw:event>
        <uw:topic><i>New Grad</i> Orientation Lunch </uw:topic>
        <uw:date>Sep 26, 2002</uw:date> at
        <uw:time>1:00-2:30 p.m.</uw:time> on the 
        <uw:location>HUB Lawn</uw:location>.
    </uw:event>
    </html>

The semantic tags are ignored by traditional browsers and thus will not disrupt the look and feel of your home page. Likewise, HTML formatting tags (like the <i> tag inside the <uw:topic> element above) are ignored by the semantic parser, so you can tag your data without having to make any other changes to the HTML.

You have a choice of annotating a document either:

  • "By hand" (i.e. with an editor such as emacs)
  • Using the graphical annotation tool.

    Using the annotation tool is easier and just requires a simple download. The annotation tool will not do any annoying reformatting of your document.

    Once you have the basics working, you may want to try using the regular expression syntax in order to apply one set of tags to an entire table or list of items.

    Example using a simple text editor

  1. Open an HTML file that describes the event(s) using an editor such as emacs.
  2. Change the initial <html> tag to be:
        <html xmlns:uw="http://www.cs.washington.edu/research/semweb/vocab#v1_0">
  3. If this is a course/seminar page, add <uw:course> and </uw:course> tags surrounding everything between the <body> and </body> HTML tags. Then add <uw:courseCode>, <uw:name> (for the course name), <uw:time>, and <uw:location> tags (and closing tags) as appropriate.
  4. Add <uw:event> and </uw:event> tags surrounding the HTML information for the first event. If you have a number of events listed in an HTML table or list, you may want to use the regular expression syntax in order to apply one set of tags to the entire table or list.
  5. Add <uw:date> and </uw:date> tags around the date of the event. A wide variety of formats are acceptable, for instance "Feb 1, 2003" or "2/1/2003". A year is required.
  6. Do likewise for <uw:topic> and </uw:topic>.
  7. (Optional) Add other tags as appropriate. For instance:
        <uw:time> (e.g. combinations such as "3:00 p.m.", "2-3", or even "TuTh 2:30-3:20"; the last form requires <uw:startDate>/<uw:endDate>)
        <uw:location> (e.g. "EE1-026", "Chateau Conference Room")
        <uw:presenter> (e.g. "Alon Halevy")
    The schema provides a complete list of tags. See the list of allowable tags inside a <uw:event> tag, or start from the valid top-level tags.
    For a course, events inherit the location and time from the parent <uw:course> element, so there is no need to specify these unless they are different.
  8. Repeat steps 4-7 for any other events in the same HTML document.
  9. Save the document and then publish it to make it available to applications.
  10. If the "publish" output shows no errors, check out the calendar to see the result!

    Here's a simple example: Original document and the Annotated Version.
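
    As a further illustration, the steps above might produce something like the following for a course page. This is only a sketch: the course code, course name, dates, and topics are made-up placeholder values, and the exact set of tags allowed in each position should be checked against the schema.

```html
<!-- Step 2: declare the uw namespace on the initial <html> tag -->
<html xmlns:uw="http://www.cs.washington.edu/research/semweb/vocab#v1_0">
<body>
<!-- Step 3: <uw:course> surrounds everything between <body> and </body> -->
<uw:course>
  <!-- Placeholder course code and name, for illustration only -->
  <h1><uw:courseCode>CSE 521</uw:courseCode>: <uw:name>Example Course Name</uw:name></h1>
  <p>Meets at <uw:time>2:30-3:20</uw:time> in <uw:location>EE1-026</uw:location>.</p>

  <ul>
    <!-- Steps 4-6: one <uw:event> per event, each with a date (year included) and a topic.
         This event specifies no time or location, so it inherits them from the course. -->
    <li><uw:event>
      <uw:date>Feb 1, 2003</uw:date>:
      <uw:topic>Project overview</uw:topic>
    </uw:event></li>

    <!-- Step 7: optional tags; this event gives its own location and a presenter -->
    <li><uw:event>
      <uw:date>2/8/2003</uw:date>:
      <uw:topic>The <b>Final</b> demo!</uw:topic>
      (<uw:location>Chateau Conference Room</uw:location>,
      presented by <uw:presenter>Alon Halevy</uw:presenter>)
    </uw:event></li>
  </ul>
</uw:course>
</body>
</html>
```

    The ordinary HTML (headings, list items, bold text) is left exactly as it was; only the uw: tags are new. After saving, the page still needs to be published (step 9) before the events can appear on the calendar.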

    NOTES:

    • Legal tags to use depend upon the semantic context, which is determined by the parent semantic tag. For instance,
      VALID: <uw:event> <uw:topic> Networks </uw:topic> <uw:topic> (<uw:topic> is a valid tag inside <uw:event>)
      INVALID: <uw:event> <uw:course> CSE 521 </uw:course> <uw:topic> (<uw:course> is not valid inside <uw:event>)
      The top-level tags are valid for use when there is no parent tag, such as when starting a document.
    • You may freely intermix HTML formatting tags and semantic tags. For instance,
      VALID: <uw:topic>The <b>Final</b> demo! </uw:topic>
    • However, semantic tags must be properly nested, just like with XML. For instance, do not do the following:
      INVALID: <uw:event> <uw:topic> Networks </uw:event></uw:topic>
      (instead, the <uw:topic> element must be completely nested inside the <uw:event> element)
    • If you wish to add additional semantic information (such as a value for <uw:time>) that you don't want to appear in the rendered HTML, you can inline the information:
      VALID:  <uw:time value="10:00 a.m." /> (note the trailing / to terminate the element)
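
    Putting these notes together, a correctly nested version of the event above might look like the following sketch. The date and time values are placeholders chosen for illustration; only the tag names and the value attribute come from the notes above.

```html
<uw:event>
  <!-- <uw:topic> opens and closes entirely inside <uw:event> -->
  <uw:topic>Networks</uw:topic>
  <!-- The date is visible page text and includes a year -->
  <uw:date>Feb 1, 2003</uw:date>
  <!-- Inlined value: available to semantic services but not rendered in the page -->
  <uw:time value="10:00 a.m." />
</uw:event>
```

    Each semantic tag here closes before its parent does, which is the property the parser checks for.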

    Example using the Annotation Tool

    Follow these instructions for basics on how to download and use the tagger. Then follow the instructions above for how to use tags. When finished, you can publish your document by clicking the "Publish" button in the tagger.

    Troubleshooting

  • Make sure you modified the initial <html> tag as specified above.
  • Make sure you publish your document.
  • The semantic parser is not currently very user-friendly. If you get parsing errors, check that every semantic tag has a matching end tag and that the semantic tags are properly nested.
  • Make sure your event has a <uw:date> (including a year) and a <uw:topic>.
  • When looking at the calendar, make sure you look at the right day. Also, try selecting "All Events" from the filter selection box at the top of the calendar.
  • This is a research prototype, so we are happy to help if you have any problems. Send mail to lucasm@cs.washington.edu.

    Department of Computer Science & Engineering
    University of Washington
    Box 352350
    Seattle, WA  98195-2350
    (206) 543-1695 voice, (206) 543-2969 FAX
    [comments to lucasm@cs.washington.edu]