Annotation

From OLPC
Revision as of 04:44, 3 August 2007 by Alang (talk | contribs) (→‎API Proposals: Splitting this into a new page, putting the CA proposal in beside the older one. Oh, and I renamed it back to API.)
Jump to navigation Jump to search
  english | 한국어 HowTo [ID# 57103]  +/-  

We want to support annotation of any document, in a generalized way that can be supported by a unified aggregation and sharing system (where annotations/comments are similar to other objects in the object store). Media that should support annotation include documents and images; perhaps also any webpage or item viewed through a browser. In the extreme one can imagine adding notes to any moment in time using a laptop; associated as well as possible with a specific item with its own identifier, or a specific activity, or at least a combination of timestamp and screenshot and context.

We should support elegant libraries for displaying aggregated notes; levels of publicity (and perhaps ways to change this after the fact for clusters of notes) and ways to highlight annotations and reviews as they take place.

See also content stamping for a specific kind of annotation that supports reviewing.

What's an Annotation?

An annotation is any kind of data imposed onto another page/document/object. Generally you do not need the permission of the author to add these comments or discussion. You may share your annotations with other users, or they may be private.

An annotation may be:

  • A comment that applies to a specific range of text
  • Something directed at a coordinate location in a PDF or image
  • A comment applied to a document generally
  • A comment applied to another annotation (forming a threaded discussion)
  • A rating or recommendation
  • A copyedit intended for the author
  • No comment, but simply the highlighting of a range of text or a pointer to something in a PDF (indicating a vague sense of "this is important or interesting")

As a result there are many optional aspects to an annotation -- the comment text is optional, the text range is optional, tags are optional, ratings are optional, etc.

Desired Features

Querying and Aggregation

It is useful to aggregate annotations. In the simplest case, we want to retrieve annotations from several sources.

Automatically aggregated annotations can also be useful. An aggregator may pull together annotations from many sources and either republish a selection of the annotations. For example, the aggregator may drop what it judges to be spam, or only republish what it judges to be the most interesting annotations.

Identifying Targets

We weren't able to find any existing protocols for specifying target content, so we identified the two main use cases:

  1. Commenting on a page as a whole (Digg-like).
  2. Commenting on specific sections of a page.

Threading

RFC4685 covers ATOM threading in detail.

Rating

Tagging/Categorisation

Publishing

Viewing Annotations

When annotations are separate from the underlying work, one can see a constellation of notes from many people. A few views which we want to readily support:

  • no comments
  • my own comments
  • comments from a group (myself/class/teachers)
  • all comments
  • new comments

We also want to limit the types of annotation viewed to an area of interest:

  • Point-and-click annotation associated with a spot on an image or page
  • Selection annotation associated with a string in a document or region in an image
  • Block annotation associated with a paragraph or block in a document or region in an image
  • Document-level annotation such as tags or reviews

API Proposals

Here are two proposals.

  1. [Original Annotation API Proposal]
  2. [Comment Anywhere Annotation Protocol Proposal]

Original API Proposal

(Per discussion with Ian Bicking and Joshua Gay).

A proposal for an Atom representation of an annotation based on the Atom syndication standard.

Here's an example from that document:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  
  <title>Example Feed</title> 
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author> 
    <name>John Doe</name>
  </author> 
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
  
</feed>

An annotation is a single Atom entry:

<entry
  xmlns:ann="http://wiki.laptop.org/go/Annotation">
  <title></title>
  <link href="{document being commented on}" />
  <id>urn:uuid:blahblahblah</id> 
  <updated>YYYY-MM-DDTHH:MM:DD</updated>
  <source>{uri}</source>
  <content type="html">
    Delete this term
  </content>
  <category term="copyedit" scheme="http://laptop.org" />
  <category term="{tag}" scheme="?" />
  <ann:selected-text>Some text that was highlighted</ann:selected-text>
  <ann:pointer>/html[1]/body[1]/div[2]</ann:pointer> 
  <author>
    <uri>{open ID URI}</uri>
  </author>
</entry>

Entries can be posted with the Atom Publishing Protocol another IETF standard.

This is based on a APP Collection. This is some base URI, e.g.,:

 http://localhost/APP/

You POST to this URI with this entry as the content body. The server will respond with a Location header that indicates where the entry has been placed. This value will be put into the <source> tag, indicating where the entry is now stored. Later updates to the comment are done by PUTting to this URI, with the new entry. Removing the comment is done by DELETE'ing the URI.

When you do GET /APP/ you will get an Atom feed. This is basically a set of entries enclosed in a <feed> element.

When you load a page, you would load any comment feeds in which you have indicated interest. This would happen asynchronously -- locally-hosted or cached comments will come up quickly, but potentially other comments would come up more slowly. This also involves fetching data from other domains, which is currently barred in Javascript with XMLHttpRequest; we'll have to create an exception to the permissions.

Threading Comments

Annotations can have comments; effectively annotations on the annotation. This can be done using the in-reply-to extension. Lets say you leave a comment on the previous example entry, you'd have:

 <entry
  xmlns:ann="http://wiki.laptop.org/go/Annotation"
  xmlns:thr="http://purl.org/syndication/thread/1.0">
  <title></title>
  <link href="{source uri from other entry; or to original document?}" />
  <id>urn:uuid:04f9820d03</id> 
  <thr:in-reply-to ref="urn:uuid:blahblahblah"
   href="{source uri from other entry}"
   type="application/atom+xml" />
  <updated>YYYY-MM-DDTHH:MM:DD</updated>
  <content type="html">
    This annotation is dumb.
  </content>
  <author>
    <uri>{open ID URI}</uri>
  </author>
</entry>

Redundancy and Threading

If an annotation is attached to another annotation that isn't available (maybe it hasn't been uploaded, maybe it's been deleted, maybe it's private, etc), then the tree-like threading of the comments starts to fall apart. To make this more reparable, some redundant information will be included in the Atom entry. A link to the most-parent document will be retained through all entries. In the case of an annotation that is attached to a specific piece of text, that text will also be copied. Then the UI may place the orphaned comment someplace appropriate. The orphaned comment should still look lost -- it's not in its proper context, and ideally it would be moved or deleted or edited to make it more appropriate.

hAtom

hAtom is a Microformat for putting Atom entries in HTML. We may way to represent the inline comments using this, mostly to provide a satisfying sense of symmetry in what is displayed, serialized, and created by the annotation process. But not all kinds of annotations will fit into this.

Pointers

While annotations can be made on pages (URLs), they can also be made on parts of a page. There will be a variety of ways of indicating the point at which the comment applies.

For PDF documents, a combination of page and coordinates can be used. For HTML an XPath document can be used. For images in HTML an XPath plus coordinates could be used. The format can be extended if there are other useful ways of indicating positions -- for example, a link between two locations.


XSS Security

We will be injecting other people's HTML into content. We must be sure this HTML does not contain dangerous stuff, like Javascript that itself calls XMLHttpRequests. We must be sure to scrub the HTML carefully. It is difficult to do this in Javascript, but that would be most secure (on the client when loading the comments). We could require XHTML, embedded in the Atom, to do this. Or, we could rely on server-side filtering of the HTML.


References