Projects/Wikislice

A collection of projects to gather data from articles in Wikipedia and other wikis.

Using some of the wikislices available on Wikipedia, we are testing ideas for using DITA topics and a DITA map to keep up with changes to the wikislice content.

This page can serve as an upload page for work in progress and contains the background information from SJ Klein, the director of community content at One Laptop Per Child (OLPC).

Wikislice project page on Wikipedia: http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wikislice

2007 Schools Wikipedia: http://schools-wikipedia.org/wp/index/subject.htm

Paraphrased from an email from SJ Klein, November 20, 2007:

We need to implement concrete use cases, using DITA with an editor and a publishing pipeline. For example, use the DITA Open Toolkit, a specific editor, and one of the WikiProject Wikislice slices, and try to keep up with changes to the slices.

My default for the 'final format' of the edited work would be an HTML collection, one HTML file per article, that preserves internal links to other articles in the slice and leaves out broken links (or converts them into external links to Wikipedia).
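The link handling described above could be sketched roughly as follows. This is only an illustration under stated assumptions: articles are assumed to be stored as `Title.html`, internal links are assumed to look like `<a href="Title.html">`, and the function name `rewrite_links` and the constant `WIKIPEDIA_BASE` are hypothetical. A real pipeline would more likely use an HTML parser than a regular expression.

```python
import re

# Assumed external target for links that leave the slice.
WIKIPEDIA_BASE = "http://en.wikipedia.org/wiki/"

# Matches simple relative links such as <a href="Water.html">water</a>.
# Links containing ':' or '/' (already external) are left alone.
LINK = re.compile(r'<a href="([^"#:/]+)\.html">(.*?)</a>', re.S)


def rewrite_links(html, slice_titles, drop_broken=False):
    """Keep links to articles in the slice; fix links to articles outside it.

    slice_titles: set of article titles present in the slice (hypothetical).
    drop_broken:  if True, replace a broken link with its plain text;
                  otherwise point it at Wikipedia.
    """
    def fix(match):
        title, text = match.group(1), match.group(2)
        if title in slice_titles:
            return match.group(0)  # internal link, preserved as-is
        if drop_broken:
            return text  # leave out the broken link, keep the words
        return '<a href="%s%s">%s</a>' % (WIKIPEDIA_BASE, title, text)

    return LINK.sub(fix, html)


page = '<p><a href="Water.html">water</a> and <a href="Ice.html">ice</a></p>'
print(rewrite_links(page, {"Water"}))
print(rewrite_links(page, {"Water"}, drop_broken=True))
```

Both behaviours from the paragraph above are covered by the single `drop_broken` switch, so the choice between omitting broken links and redirecting them can be made per build.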

Identify the added value that can come from giving the data this structure, other than identifying red/blue links.

Then it would help to contrast using DITA as an interchange with some of the simplest options:

- using no metadata at all and having a named URL that always contains the latest version of a wikislice
- using a static set of metadata that get updated when things change, along with a version number that is incremented
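As a point of comparison, the second of these simple options (static metadata plus an incremented version number) might look something like the sketch below. The manifest layout, the use of SHA-1 digests, and the names `build_manifest` and `changed_titles` are all assumptions made for illustration, not part of any existing wikislice tooling.

```python
import hashlib


def build_manifest(articles, previous=None):
    """Build a static metadata record for a slice.

    articles: dict mapping article title -> article text (hypothetical input).
    previous: the manifest from the last build, if any.
    The version number is incremented only when some article changed.
    """
    digests = {
        title: hashlib.sha1(text.encode("utf-8")).hexdigest()
        for title, text in articles.items()
    }
    if previous is None:
        return {"version": 1, "articles": digests}
    if digests == previous["articles"]:
        return previous  # nothing changed, keep the same version
    return {"version": previous["version"] + 1, "articles": digests}


def changed_titles(old, new):
    """List titles that were added, removed, or edited between manifests."""
    titles = set(old["articles"]) | set(new["articles"])
    return sorted(
        t for t in titles if old["articles"].get(t) != new["articles"].get(t)
    )


first = build_manifest({"Water": "H2O"})
second = build_manifest({"Water": "H2O.", "Ice": "solid water"}, first)
print(second["version"], changed_titles(first, second))
```

A consumer holding an older manifest only needs to compare version numbers to know whether to fetch anything, and can then use the digests to fetch just the changed articles.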

Michael and SJ discussed a few core use cases: maintaining and updating a stream that many people may be contributing to; and maintaining the latest version of a package that many people may be committing to, and which has some sort of 'latest' version.

Some distinctions to be made follow. It would be interesting to see specific implementations of the following notions in DITA format.

0) distinguish map updates from content updates

1) distinguish streams from patch updates to a static object that 'improves' over time

2) provide for pre- and post-filters; a pre-filter might be by keyword; a post-filter might be a transparent redirect for links that don't exist locally; keys that are resolved locally or externally.

3) other dimensions for feeds/streams:

- read-only v. read-write interaction with both streams and packages. Think of the latter as "two-way streams" and "distributed patch updates for packages"
- linear v. distributed orderings (a single time-ordering, or a distributed patch-ordering)
- whether there are constraints that need to be applied when things are edited for read-write collections. Michael: "export v feed"? perhaps the wrong term.
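Two of the distinctions above, map updates v. content updates (0) and a keyword pre-filter (2), can be illustrated independently of any DITA tooling. In this sketch a slice snapshot is assumed to be a plain dict with a "map" (an ordered list of topic ids, standing in for a DITA map) and "topics" (topic id to text); the names `classify_update` and `keyword_prefilter` are hypothetical.

```python
def classify_update(old, new):
    """Report whether the map, the content, or both changed between snapshots.

    Each snapshot is a dict: {"map": [topic ids], "topics": {id: text}}.
    Returns a set containing "map" and/or "content" (empty if nothing changed).
    """
    kinds = set()
    if old["map"] != new["map"]:
        kinds.add("map")  # topics added, removed, or reordered
    shared = set(old["topics"]) & set(new["topics"])
    if any(old["topics"][t] != new["topics"][t] for t in shared):
        kinds.add("content")  # an existing topic's text was edited
    return kinds


def keyword_prefilter(topics, keyword):
    """Keep only topics whose text mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return {tid: text for tid, text in topics.items() if needle in text.lower()}


old = {"map": ["water", "ice"],
       "topics": {"water": "H2O", "ice": "frozen water"}}
reordered = {"map": ["ice", "water"], "topics": dict(old["topics"])}
edited = {"map": ["water", "ice"],
          "topics": {"water": "H2O, a liquid", "ice": "frozen water"}}

print(classify_update(old, reordered))
print(classify_update(old, edited))
print(keyword_prefilter(old["topics"], "frozen"))
```

Keeping the two kinds of change separate means a subscriber could choose to follow only structural changes to the slice, or only edits to articles it already has.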
