Content repositories

From OLPC
Revision as of 17:41, 11 January 2007 by Ian Bicking (Added a read-only use case)

By distributing laptops and school servers with learning materials on them, and a global index of content that can be used with no modification on the laptops, OLPC is developing a network of digital libraries in a number of languages.

We are gathering a list of those materials. To add materials you know of that would be free for laptop users to access, see content ideas. If you are a publisher, author or collection holder and want more information on sharing your content, see Sharing your content with OLPC.

Subsets of large archives

There are some large archives available for inclusion in the content repository. Flickr and Wikimedia Commons; Project Gutenberg and Google Scholar; Wikipedia; WiktionaryZ / OmegaWiki and Dicologos; the Humanity Development Library and like collections; the list goes on.

Tools are being developed for identifying and culling subsets of large repositories. Small subsets will be needed for pre-installation of the choicest content on the laptops themselves. Larger subsets will still need curating: picking out material suitable for the laptop's audiences, classifying and categorizing it, and checking it for imbalance or repetition. All content (aside from images and media that are naturally alingual) will need some processing to make localization easier.

Use cases

Conceptually, a content repository could be used in a variety of ways: to publish and share new material, to collaborate on material development over time (synchronizing online and offline contributions to a shared document or project), to search for and download material, and to distribute and cache material from large collections that can't be contained on one machine or at one school.


Extended use case #1

A teacher proposes a collaborative writing project for her class, and later tries to integrate the students' work with Wikipedia. Note the general issues of merging collaborations, and of integration of revision histories from the repository with outside repositories.

  • Alia writes a document about daisies -- about how she thinks they are pretty and she likes picking them for her mother. She shares this on the school repository. It has no general interest.
  • Bea (Alia's friend) gets the document and adds to it a picture she draws of a daisy. She shares her version with the school, without overwriting Alia's.
  • Alia corrects her spelling and uploads a new version of her document, overwriting the old one.
  • The teacher asks each student to write a report on a plant. Carlos chooses daisies, and starts with Bea's document. He ends up deleting all the text but keeping the picture. Carlos is shy and doesn't share his report generally; he gives it directly to the teacher.
  • The teacher likes Carlos's report and publishes it to the school.
  • Later, the teacher notices that the localized Wikipedia (Portuguese, say) doesn't have a page on daisies. She decides on a class project that starts from Carlos's report and turns it into a page.
  • She sets up a project space for children to contribute content. She wants everyone to put all their ideas and comments together.
  • Some students take pictures of daisies.
  • Delores writes an article on making a daisychain garland.
  • Estacio writes about where daisies grow in Brazil.
  • Fidelio translates some of the Spanish Wikipedia page on daisies.
  • Gia finds some pictures of daisies online. Some are CC licensed, some aren't licensed at all, some are non-free stock photography images. She copies them all into the project space.
  • In class they try to bring everyone's material together for a single article.
  • The teacher doesn't worry about licensing and some non-free images go into the main document.
  • The teacher likes Delores' article on daisychains, but it isn't really about daisies. She doesn't want it forgotten, but doesn't include it. She moves it out of the project and makes a link from the main article. The link points to the school space.
  • She includes Estacio's and Fidelio's work as separate sections in the main document.
  • The teacher doesn't actually "do" all of these things, but directs the students to do them together. Communication takes place (via voice) in the classroom, but the content production happens on the laptops. They are all directly and well connected to the school server during this process.
  • After class the teacher gets the document into the right place on the Portuguese Wikipedia. She shares the link with all the students. She is proud that they put this together and would like to make sure all the parents see it... (how can she remind children to tell their parents?)
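
The sharing steps above distinguish two operations: overwriting a document with a new version (Alia's spelling fix) and sharing a derived copy without touching the original (Bea's and Carlos's versions). The following is a minimal sketch of how a repository might record that history so attribution can be traced later. All names here (Revision, Repository, publish, fork, lineage) are hypothetical and chosen for illustration; this is not an OLPC design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Revision:
    rev_id: int
    doc_id: str
    author: str
    text: str
    parent: Optional[int] = None  # revision this one was derived from, if any


@dataclass
class Repository:
    revisions: Dict[int, Revision] = field(default_factory=dict)
    heads: Dict[str, int] = field(default_factory=dict)  # doc_id -> latest rev_id

    def _add(self, doc_id: str, author: str, text: str, parent: Optional[int]) -> int:
        rev_id = len(self.revisions) + 1
        self.revisions[rev_id] = Revision(rev_id, doc_id, author, text, parent)
        self.heads[doc_id] = rev_id
        return rev_id

    def publish(self, doc_id: str, author: str, text: str) -> int:
        """Share a new document, or overwrite an existing one with a new revision."""
        return self._add(doc_id, author, text, self.heads.get(doc_id))

    def fork(self, source_rev: int, new_doc_id: str, author: str, text: str) -> int:
        """Share a derived document under a new name; the source stays untouched."""
        return self._add(new_doc_id, author, text, source_rev)

    def lineage(self, rev_id: int) -> List[str]:
        """Authors from this revision back to the original, newest first."""
        authors = []
        current: Optional[int] = rev_id
        while current is not None:
            rev = self.revisions[current]
            authors.append(rev.author)
            current = rev.parent
        return authors


# Hypothetical replay of the first steps of the use case.
repo = Repository()
alia_v1 = repo.publish("daisies", "Alia", "Daisies are prety.")
bea_v1 = repo.fork(alia_v1, "daisies-with-picture", "Bea", "Daisies are prety. [picture]")
alia_v2 = repo.publish("daisies", "Alia", "Daisies are pretty.")
carlos_v1 = repo.fork(bea_v1, "daisy-report", "Carlos", "[picture] A report on daisies.")

print(repo.lineage(carlos_v1))  # ['Carlos', 'Bea', 'Alia']
```

Because every revision keeps a pointer to what it was derived from, Carlos's report still credits Bea's picture and Alia's original even after Alia overwrites her own document. How such a history would map onto an outside repository like Wikipedia remains the open question noted above.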

Asides

  1. Many parents don't know what Wikipedia is. The teacher would like to explain a little about it for the parents. She'd like to leave parent-directed comments [on the Wikipedia page?]. They aren't relevant to a larger audience... she wants to make sure parents know that it's something she wrote for them to read (as opposed to all the general stuff on Wikipedia).
  2. What happened to the daisychain article? (The teacher may have forgotten about it by now. Is it on the school repository? What happened to the link in the article?)
  3. A Wikipedia member (not a child, not using the laptop) removes all the unlicensed or non-free images in the article. That person would like to explain the reason to the contributor (who is attributed in the history tracked by the content repository, but not on Wikipedia, and has no WP page).
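
Aside 3 could be softened if the repository flagged licensing problems before material leaves the school. Below is a minimal sketch, assuming each image carries a license tag and a contributor name. The allow-list FREE_LICENSES, the Image record and split_by_license are all hypothetical, and which licenses an outside wiki actually accepts is a policy question this sketch does not answer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed allow-list for illustration only; a real deployment would need
# to follow the target site's actual licensing policy.
FREE_LICENSES = {"CC-BY", "CC-BY-SA", "public-domain"}


@dataclass
class Image:
    name: str
    contributor: str
    license: Optional[str]  # None means no license was recorded


def split_by_license(images: List[Image]) -> Tuple[List[Image], List[Tuple[Image, str]]]:
    """Separate images that can be published from those held back, with a reason."""
    publishable: List[Image] = []
    held_back: List[Tuple[Image, str]] = []
    for image in images:
        if image.license in FREE_LICENSES:
            publishable.append(image)
        elif image.license is None:
            held_back.append((image, "no license recorded"))
        else:
            held_back.append((image, f"non-free license: {image.license}"))
    return publishable, held_back


# Hypothetical project space after Gia copies in her pictures.
project_images = [
    Image("daisy-field.jpg", "Gia", "CC-BY-SA"),
    Image("daisy-closeup.jpg", "Gia", None),
    Image("daisy-stock.jpg", "Gia", "stock-photo"),
]

ok, held = split_by_license(project_images)
for image, reason in held:
    # The contributor is known here, so the explanation can reach them.
    print(f"{image.name} (from {image.contributor}) held back: {reason}")
```

Since the check runs where the contributor is still known, the explanation can be delivered to Gia and the teacher directly, rather than by an outside editor who has no way to reach them.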

Read-only Use Case

  • Curriculum is developed. The curriculum consists of a set of documents, with outlines, goals, etc. There is no interactive component to the curriculum as far as the laptop is concerned -- the curriculum may include exercises, etc., but these are not guided programmatically in any way.
  • Curriculum consists of a series of documents, including documents that may be "asides", that is, documents that don't fit into any linearization of the curriculum.
  • Curriculum links to an indefinite number of non-core items. Some of these might be large, highly interlinked sets of data, e.g., bibliographic links, links to Wikipedia, links to library category listings. These sets of data are, in a practical sense, infinite -- there is no way to enumerate or fetch the full set of documents, and no clear boundaries that would define a cohesive or consistent set of documents.
  • Curriculum is viewable in the browser. Non-HTML and non-image content is typically either viewed through plugins or as linked documents that are launched in some non-browser viewer.
  • The teacher or school identifies the curriculum and schedules it for use in the classroom.
  • The teacher indicates this in some way to the school server and/or individual laptops (perhaps by instructing students to follow some set of procedures). If the school server goes offline, the core content for the curriculum should still be available so classes can continue as scheduled. If laptops can connect to the school server only intermittently (e.g., when students are at home), the laptop should have the content or a strategy for getting the content.
  • The teacher has additional content not intended for the students (e.g., further background material).
  • Some strategy should be available to browse the non-core content that is not pre-fetched. If a student is interested in this content while they are not connected, there should be some way that they can revisit the content later when it is available.
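
One possible reading of the last few points: the laptop pre-fetches the core curriculum while it is connected, and keeps a "revisit later" list for non-core links a student tries to open while offline. The sketch below illustrates that strategy under stated assumptions. LaptopCache, its methods and the fake_fetch stand-in for the network are all hypothetical; a real fetch would go to the school server or the wider network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class LaptopCache:
    # fetch returns the content, or None when the source cannot be reached.
    fetch: Callable[[str], Optional[str]]
    stored: Dict[str, str] = field(default_factory=dict)
    wanted_later: List[str] = field(default_factory=list)

    def prefetch(self, core_urls: List[str]) -> None:
        """Store the scheduled core content while a connection is available."""
        for url in core_urls:
            content = self.fetch(url)
            if content is not None:
                self.stored[url] = content

    def view(self, url: str) -> Optional[str]:
        """Return content if possible; otherwise remember the request for later."""
        if url in self.stored:
            return self.stored[url]
        content = self.fetch(url)
        if content is None:
            if url not in self.wanted_later:
                self.wanted_later.append(url)
            return None
        self.stored[url] = content
        return content

    def retry_wanted(self) -> List[str]:
        """Try the remembered requests again; return the ones now available."""
        fetched, still_wanted = [], []
        for url in self.wanted_later:
            content = self.fetch(url)
            if content is None:
                still_wanted.append(url)
            else:
                self.stored[url] = content
                fetched.append(url)
        self.wanted_later = still_wanted
        return fetched


# Hypothetical stand-in for the network, switchable between online and offline.
network = {"up": True}


def fake_fetch(url: str) -> Optional[str]:
    return f"<content of {url}>" if network["up"] else None


cache = LaptopCache(fetch=fake_fetch)
cache.prefetch(["curriculum/unit-1", "curriculum/unit-2"])  # at school

network["up"] = False  # at home, no connection
core_page = cache.view("curriculum/unit-1")       # served from the laptop
missing_page = cache.view("wikipedia/Daisy")      # not available, remembered

network["up"] = True  # back at school
now_available = cache.retry_wanted()
```

This covers only the core-versus-non-core split. Because the non-core set is practically unbounded, the sketch deliberately fetches non-core items on demand rather than trying to enumerate them.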