Talk:Content stamping

From OLPC

Ambiguities

It seems like it might be unclear whether someone is making a review of a whole website, just one web page, or some set of web pages. It's sometimes (but not always) clear to humans; less so to computers.

Examples of ambiguities:

  • An interesting article that has been split across multiple pages. All pages form the work.
  • A blog that has timely information on a subject, e.g., current events. Future pages that don't yet exist may effectively fall under the review.
  • An entire website that has useful information.
  • A web application that can only really be used interactively; the form not the content is what is interesting.

-- Ian Bicking


Valuation functions and general automation seem complex and unreliable to me. In theory they could be useful, but in practice you need a lot of ratings to get something meaningful -- not because of the quality of the individual reviews (which may all be just fine), but because of the lack of a common rating standard, or even common criteria. So group A might find lots of new, interesting content, while group B is looking for a small set of content directly related to one subject area. The ratings of group A could be very helpful to group B, as they identify potentially interesting information. But the actual selection process is something group B wants to do themselves. Mixing the two groups really just enables group B to do their own selection more easily, as they can focus on a smaller set of content that has had some basic vetting. Aggregating and weighting ratings there doesn't seem very useful. -- Ian Bicking


Latest revisions

Identifying the current revision seems a little difficult. In theory Last-Modified could be used to determine this, but many pages are aggregations of the specific content and the site content, and the site content is often updated. E.g., a sidebar is updated, which changes the Last-Modified of the entire page, even though the substance of the content does not change. Identifying the "real" content would be very useful, but there's no general way to do that. Only with specific screen scraping, some microformat (though I don't know one currently), maybe RDF, etc., could we identify a meaningful revision or modification date for an item. However, we could build up some set of patterns, with something like RDF for new content that is specifically designed to be read by this system. -- Ian Bicking
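
To make the Last-Modified problem concrete, here is a minimal sketch of the idea above: hash only the "real" content of a page, using a site-specific screen-scraping pattern, so sidebar churn doesn't look like a new revision. The regex-per-site approach and all names here are illustrative assumptions, not a general solution (the text's point is that no general solution exists).

```python
import hashlib
import re

def content_fingerprint(html, content_pattern):
    """Hash only the substantive content of a page, ignoring sidebars etc.

    content_pattern is a hypothetical per-site screen-scraping regex that
    captures the "real" content; there is no general rule for finding it.
    """
    match = re.search(content_pattern, html, re.DOTALL)
    body = match.group(1) if match else html
    # Collapse whitespace so cosmetic reflows don't register as revisions.
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Two fetches of the same article: the sidebar changed, the article didn't,
# so Last-Modified would change but the fingerprint should not.
page_v1 = "<div id='side'>Today: rain</div><div id='art'>Frogs are amphibians.</div>"
page_v2 = "<div id='side'>Today: sun</div><div id='art'>Frogs are  amphibians.</div>"
pattern = r"<div id='art'>(.*?)</div>"

same = content_fingerprint(page_v1, pattern) == content_fingerprint(page_v2, pattern)
```

Whole-page hashes would differ here; the content fingerprint does not, which is exactly the distinction Last-Modified can't make.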

Discussion from game development

As discussed on April 1 -- there is some reviewing that will be core; having an official developer program may mean announcing something is ready to be included under a basic important stamp -- don't promote a game for the XO unless it's ready.

Aside: having some libraries for creation may be part of a signed update package. Specific games may not be.

Developers program: we want as many things as possible to earn this [set of core] badge[s] out of the gate. We're not targeting the 'control' standards of regular platforms, but getting sites like Jay's Games, GamerDad, etc. to open things and run them, and come back with reviews.

Ian, from OLPC mailing list, March 2007

I was just talking to SJ about tagging and collecting content for the library and all that. My first idea was that a collection of content is just a bunch of links, and people manage that bunch of links however they want (e.g., on a wiki). Then that seemed a little too crude, and I remembered about hReview: http://microformats.org/wiki/hreview -- it's a microformat (markup embeddable in HTML) for representing reviews. I've been reading the page some more, and I'm totally digging how it could be used.

So I'm imagining a tool where you give it a URL or a few URLs, and it grabs those pages and looks for this hReview markup. You can embed comments, tags, and ratings inside the markup. It'll grab all of these, and then let you query what it's grabbed. So at that point you could say "show me what pages are rated 2 or higher", or "show me everything tagged with 'chemistry'" or whatever. Then when you are satisfied, you can create a stand-alone bundle of all that content. Or maybe just create a table of contents for your new View Of The Web (for online viewing). Or take back the information you've collected and look for some content you feel is missing.
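
The query step described here is simple once the hReview data is in hand. A minimal sketch, assuming pages have already been fetched and their markup parsed into plain dicts (the URLs and review data below are made up for illustration):

```python
# Hypothetical output of the fetch-and-parse step: one dict per hReview.
reviews = [
    {"url": "http://example.org/acids", "rating": 4, "tags": ["chemistry"]},
    {"url": "http://example.org/frogs", "rating": 5, "tags": ["biology"]},
    {"url": "http://example.org/puns",  "rating": 1, "tags": ["humor"]},
]

def query(reviews, min_rating=None, tag=None):
    """Answer queries like "rated 2 or higher" or "tagged with 'chemistry'"."""
    out = reviews
    if min_rating is not None:
        out = [r for r in out if r["rating"] >= min_rating]
    if tag is not None:
        out = [r for r in out if tag in r["tags"]]
    return out

rated = query(reviews, min_rating=2)    # pages rated 2 or higher
chem = query(reviews, tag="chemistry")  # everything tagged 'chemistry'
```

The result of a query like this is just another list of links, which is what a content bundle or table of contents would be built from.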

This markup could be added to existing apps or services, and we don't have to do any special integration with one particular service. There are some services that produce it: http://microformats.org/wiki/hreview-examples-in-wild -- I haven't really digested this list, so I don't know what is appropriate for us.

Maybe an example would be useful:

 <div class="hreview">
  <span class="item">
   <a class="url fn"
    href="http://wiki.laptop.org/go/GTK_for_OLPC">GTK for OLPC</a>
  </span>
  <div class="description">
   <p>
    Covers the differences of using GTK in an OLPC environment, and using
    GTK in more traditional environments.
   </p>
  </div>
  (<abbr class="rating" title="4">****</abbr>)
  <ul>
   <li class="rating">
    <a href="http://wiki.laptop.org/go/Introduction_to_Programming_OLPC"
     rel="tag">
     Programming <span class="value">3</span>
    </a>
   </li>
   <li class="rating">
    <a href="http://wiki.laptop.org/go/Introduction_to_Sugar"
     rel="tag">
     Sugar <span class="value">5</span>
    </a>
   </li>
  </ul>
  <p class="reviewer vcard">Review by
   <a class="url fn" href="http://ianbicking.org">Ian Bicking</a>,
    <abbr class="dtreviewed" title="2007-03">March '07</abbr>
  </p>
 </div> 

So, in this, I've said:

  • What the URL is (fn means this is the thing I'm rating)
  • A comment (description)
  • An overall rating (of 4)
  • I've tagged it with Programming (the link to the wiki page implies a specific meaning -- in this case probably a tutorial I'm trying to assemble), with a rating related to that
  • And I've tagged it with Sugar, with a different rating (it's more relevant there)
  • And I've said who I am, and when I made this review
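
Extracting these fields mechanically is straightforward with any HTML parser. A toy sketch using Python's stdlib, written only for the example above (a real microformat parser would handle nesting, multiple reviews per page, and so on):

```python
from html.parser import HTMLParser

class HReviewParser(HTMLParser):
    """Pull the item URL, overall rating, and rated tags out of one hReview."""
    def __init__(self):
        super().__init__()
        self.item_url = None      # first class="url fn" link: the thing reviewed
        self.rating = None        # overall <abbr class="rating" title="N">
        self.tags = []            # (tag-page href, value) pairs
        self._in_tag_link = None
        self._want_value = False
        self._value_text = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = attrs.get("class", "").split()
        if tag == "a" and "url" in classes and "fn" in classes and self.item_url is None:
            self.item_url = attrs.get("href")
        elif tag == "abbr" and "rating" in classes:
            self.rating = int(attrs["title"])
        elif tag == "a" and attrs.get("rel") == "tag":
            self._in_tag_link = attrs.get("href")
        elif tag == "span" and "value" in classes and self._in_tag_link:
            self._want_value = True

    def handle_data(self, data):
        if self._want_value:
            self._value_text += data

    def handle_endtag(self, tag):
        if tag == "span" and self._want_value:
            self.tags.append((self._in_tag_link, int(self._value_text.strip())))
            self._want_value = False
            self._value_text = ""
        elif tag == "a":
            self._in_tag_link = None

# A condensed copy of the example review above.
html = """
<div class="hreview">
 <span class="item"><a class="url fn" href="http://wiki.laptop.org/go/GTK_for_OLPC">GTK for OLPC</a></span>
 (<abbr class="rating" title="4">****</abbr>)
 <ul>
  <li class="rating"><a href="http://wiki.laptop.org/go/Introduction_to_Programming_OLPC" rel="tag">Programming <span class="value">3</span></a></li>
  <li class="rating"><a href="http://wiki.laptop.org/go/Introduction_to_Sugar" rel="tag">Sugar <span class="value">5</span></a></li>
 </ul>
 <p class="reviewer vcard">Review by <a class="url fn" href="http://ianbicking.org">Ian Bicking</a></p>
</div>
"""
parser = HReviewParser()
parser.feed(html)
```

Note the "first url fn wins" shortcut: it keeps the reviewer's own `url fn` link from overwriting the reviewed item, but only because the item comes first in this markup.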

They have a tool to help you make this code: http://microformats.org/code/hreview/creator -- but for some subset we could probably convert simpler formats (e.g., what can be easily entered on a wiki), and tagging sites already have structured data of their own that they can just present in this format.

What occurred to me this morning is that you can indicate the type as well, and I'm thinking about a "review" type. So you are basically creating a review of a review. The tool sees this, and then brings in all of *those* reviews, allowing you to collect information from a variety of sources. And you can bring them in with different metrics. An example I can imagine:

<div class="hreview">
 <span class="item">
  <a class="url fn"
   href="{page with reviews}">Joe's reviews</a>
 </span>
 <div class="description">
  <span class="type" title="review">
   Joe collects a lot of great links about frogs
  </span>
 </div>
 <ul>
  <li class="rating">
   <a href="http://wiki.naturalsciences.org/amphibians"
    rel="tag">
    Amphibians <span class="value">5</span>
   </a>
  </li>
 </ul>
</div>

And all the links would be collected there, the Amphibians tag would be added to every item, and maybe some adjustment of Joe's ratings would be applied (probably a weight applied to his ratings based on the rating of his reviews).
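
The recursive step can be sketched in a few lines. Everything below is illustrative: the dict representation, the weight formula (scale inner ratings by the collection's own rating out of 5), and the stand-in for fetching Joe's page are all assumptions, not a spec.

```python
def expand(review, fetch_reviews):
    """Flatten a review-of-reviews into direct item reviews.

    A review whose type is "review" pulls in the reviews found at its URL,
    stamps the outer tags onto each inner item, and scales the inner
    ratings by a weight derived from how the collection itself was rated.
    """
    if review.get("type") != "review":
        return [review]
    weight = review.get("rating", 5) / 5.0  # trust Joe in proportion to his rating
    out = []
    for inner in fetch_reviews(review["url"]):
        merged = dict(inner)
        merged["rating"] = round(inner["rating"] * weight, 1)
        merged["tags"] = sorted(set(inner.get("tags", [])) | set(review.get("tags", [])))
        out.extend(expand(merged, fetch_reviews))  # inner reviews may recurse too
    return out

# Stand-in for fetching and parsing Joe's hReview page.
pages = {
    "http://example.org/joe": [
        {"url": "http://example.org/frog-lifecycle", "rating": 5, "tags": ["frogs"]},
        {"url": "http://example.org/toads", "rating": 3, "tags": ["toads"]},
    ],
}
meta = {"type": "review", "url": "http://example.org/joe",
        "rating": 4, "tags": ["Amphibians"]}

flat = expand(meta, pages.__getitem__)
```

With Joe's collection rated 4 of 5, his 5-rated frog page comes through as 4.0 and his 3-rated toad page as 2.4, each carrying the Amphibians tag alongside its own.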

Sorry for the length of this; I just got excited when I started thinking about review aggregation as just being a recursive process. I'm also stoked about hReview.

URL / URI

It seems that the most important aspect of content stamping is making sure we can keep track of what content has been "stamped." I am curious what our method for URL/URI tracking should be. In particular, it would be nice to know that if a particular piece of content has multiple translations, they all associate to some root URI, or something similar. Or to make sure that the version of a page dated 07/05/07 was the one being stamped, on something that keeps track of such information. And so forth. Perhaps we could add to the Atom standard sketched out in Annotation --- perhaps we could have an OLPC namespace for things that become overly specific, such as location. I'm not sure; I really don't know how to approach this problem, and I'm not sure anybody has, at least from a metadata standpoint.

-- Joshua Gay
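
One way to picture the root-URI idea: normalize each URL and file every stamp under the normalized form, so translations and dated versions land on the same key. The rules below -- a `lang` query parameter and `/YYYY/MM/DD` path segments -- are made-up conventions for illustration; real sites would need per-site rules, which is exactly the open problem Joshua raises.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def root_uri(url):
    """Map URL variants (translations, dated versions) onto one root URI."""
    parts = urlsplit(url)
    # Drop a hypothetical translation marker from the query string.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "lang"]
    # Drop a hypothetical dated-version segment like /2007/05/07 from the path.
    path = re.sub(r"/\d{4}/\d{2}/\d{2}(?=/|$)", "", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query), ""))

stamps = {}
def stamp(url, review):
    """File a stamp under the root URI, not the literal URL."""
    stamps.setdefault(root_uri(url), []).append(review)

stamp("http://example.org/frogs?lang=es", "good translation")
stamp("http://example.org/frogs", "solid overview")
```

Both stamps end up under `http://example.org/frogs`, which is the "associate to some root-URI" behavior the comment asks for.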

reviving discussion

copied from email to sj, lauren

Found a thread started by Jeff on March 14, pointing to the Content Stamping page as a design sketch. It ended with an email from Chris Bailey at RatePoint suggesting a collaboration. Later, Ian emailed the olpc list with some snips and a proposal (copied to Talk:Content Stamping). What happened with that? Also, recent emails make it sound as if OpenPlans is potentially interested in getting an intern to implement Content Stamping for us. When should we talk to them?

Mchua 13:44, 12 July 2007 (EDT)

"trust" system for wikis: another approach

Same problem, different approach: work within a wiki, not on top of it. http://trust.cse.ucsc.edu/ describes a system for marking pages with the "trustability" of the text, as determined by the "trustability" of the authors who contributed it -- which in turn is computed from how long each author's edits survive on the wiki. Mildly OT (it won't help with the content stamping framework), but once that's done, this might be

  • a nice way to bootstrap the creation of content stamping groups & ratings - start by populating the content stamping system with these automated "stamps"
  • an interesting complementary system that works within the wiki in parallel with the Content Stamping working "on top of" the wiki.
  • an interesting user interface idea... although the colored text would start to annoy me after a while, it might be a cool way to interact with whatever content stamping framework there ends up being.

Anyhow, carry on. --Mchua 04:19, 5 August 2007 (EDT)
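
A toy version of the edit-longevity idea, just to make the bootstrap suggestion concrete: text that survives many subsequent revisions raises its author's trust, and those scores could seed automated "stamps." The formula below (average survival squashed into [0, 1)) is illustrative, not the UCSC model.

```python
def author_trust(edit_log):
    """edit_log: list of (author, survived_revisions) pairs.

    Returns a crude trust score in [0, 1) per author: the longer an
    author's edits survive on the wiki, the higher the score.
    """
    totals, counts = {}, {}
    for author, survived in edit_log:
        totals[author] = totals.get(author, 0) + survived
        counts[author] = counts.get(author, 0) + 1
    # Average survival, squashed into [0, 1) so scores are comparable.
    return {a: (totals[a] / counts[a]) / (totals[a] / counts[a] + 1) for a in totals}

# Alice's edits tend to survive 20-40 revisions; Bob's get reverted quickly.
log = [("alice", 40), ("alice", 20), ("bob", 1), ("bob", 3)]
trust = author_trust(log)
```

Scores like these could populate the content stamping system with initial automated ratings, as the first bullet suggests.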

shiftspace

Ian pointed me to http://www.shiftspace.org which is a layer atop Firefox (yes, another plugin). Would this work? Worth contacting? Mchua 00:47, 30 January 2008 (EST)