Content repositories: Difference between revisions

Line 1: Line 1:
{{content-nav}}
{{content-nav}}
By distributing laptops and school servers with learning materials on them, and a global index of content that can be used with no modification on the laptops, OLPC is developing a network of digital libraries in a number of languages.
By distributing laptops and school servers with learning materials on them, and a global index of content that can be used with no modification on the laptops, OLPC is developing a network of digital libraries and collections in a number of languages.


(see also: [[Educational content ideas|content ideas]], [[sharing your content with OLPC]], and [[#Content rating|content rating]]).
We are gathering a list of those materials. To add to this any materials you know of which would be free for laptop users to access, see [[Educational content ideas|content ideas]].


* To share content if you are a publisher, author, or library, see [[Sharing your content with OLPC]].
* For more on rating or reviewing content for various audiences, see [[#Content rating|Content rating]].


== Subsets of large archives ==
== Large archives ==
* Free media collections : [http://www.flickr.com/ Flickr], [http://commons.wikimedia.org/ Wikimedia Commons]
There are some large archives available for inclusion in the content repository. [http://www.flickr.com/ Flickr] and [http://commons.wikimedia.org/ Wikimedia Commons]; [[Project Gutenberg]] and [http://scholar.google.com/ Google Scholar]; [http://www.wikipedia.org/ Wikipedia]; [http://www.wiktionaryz.org/ WiktionaryZ / OmegaWiki], Dicologos and [http://www.dicts.info/ Universal dictionary system]; the [http://humaninfo.org/ Humanity Development Library] and like collections; the list goes on.
* Music : [http://www.freesound.org Freesound], [http://reemusic.freeculture.org/ Free Music Project]
* Texts: [[Project Gutenberg]], [http://scholar.google.com/ Google Scholar], [http://www.wikipedia.org/ Wikipedia];
* Stories: [http://www.childrenslibrary.org ICDL], [http://schoollibrary.com/OLPC_Collection.htm other children's pdfs]
* Language: [http://www.wiktionaryz.org/ WiktionaryZ / OmegaWiki], Dicologos and [http://www.dicts.info/ Universal dictionary system]
* Reference collections: the [http://humaninfo.org/ Humanity Development Library], [http://www.widernet.org/digitallibrary/ eGranary]


There are tools being developed for identifying and culling subsets of large repositories. Small subsets will be needed for pre-installation of the choicest content on the laptops themselves. Larger subsets will still need curating to pick out material suitable for the laptop's audiences; classification and categorization; and checks to avoid unbalance or repetition. All content (aside from images and media that are naturally alingual) will need some processing to make localization easier.
Tools are being developed for identifying and culling subsets of large repositories. Small subsets will be needed for pre-installation of choice material on the laptops themselves; larger subsets will still need curating to pick out material suitable for the laptop's audiences; classification and categorization; and
checks to avoid unbalance or repetition. Most content will need internationalization.


== Specific projects and collections ==
== Use cases ==
* The [[World Digital Library]] project
Conceptually, a content repository could be used in a variety of ways: to publish and share new material, to collaborate on material development over time (synchronizing online and offline contributions to a shared document or project), to search for and download material, to distribute and cache from large collections that can't be contained on one machine or at one school.
* [http://en.wikibooks.org/wiki/Wikijunior Wikijunior], [http://www.wikihow.com/Select-wikiHow-Articles-for-the-One-Laptop-Per-Child-Association WikiHow]
* OurStories project, with Story Corps, UNICEF, and Google - capturing local stories
* Book scanning and digitization:
*: Children's picturebooks, with support from ICDL
*: Public domain materials, with archival support from the Internet Archive
*: Other local cultural materials, with support from the World Digital Library


For an extended use case, see '''the [[Talk:Content repository#Extended use cases|talk page]]'''


== Proposed implementation for browsing ==
== Proposed implementation ==


The head page for a curriculum or cohesive set of content is just an HTML page. It can have any text interspersed in it. The document should have a tag in the header: <tt>&lt;link rel="olpc.content_bundle" href="sitemap.xml"&gt;.</tt>
The head page for a curriculum or cohesive set of content is just an HTML page. It can have any text interspersed in it. The document should have a tag in the header: <tt>&lt;link rel="olpc.content_bundle" href="sitemap.xml"&gt;.</tt>
Line 33: Line 42:
The content may link to other documents not enumerated in the sitemap. These may not be available, since they have not been prefetched. At that time the browser should offer to fetch the content when the laptop is able to find that content, and optionally notify the student of the availability of the content. The laptop may seek that content on other nearby laptops, the school server, or the wider internet. The content will be pre-fetched at that time. An option may be provided to do deeper pre-fetching (e.g., fetching down one link, or down two links into the content).
The content may link to other documents not enumerated in the sitemap. These may not be available, since they have not been prefetched. At that time the browser should offer to fetch the content when the laptop is able to find that content, and optionally notify the student of the availability of the content. The laptop may seek that content on other nearby laptops, the school server, or the wider internet. The content will be pre-fetched at that time. An option may be provided to do deeper pre-fetching (e.g., fetching down one link, or down two links into the content).


== Content rating ==
=== Use cases ===
Conceptually, a content repository could be used in a variety of ways: to publish and share new material, to collaborate on material development over time (synchronizing online and offline contributions to a shared document or project), to search for and download material, to distribute and cache from large collections that can't be contained on one machine or at one school. For more, see the '''[[Talk:Content repository#Extended use cases|talk page]]'''.
Some ideas for content rating:

* Develop a matrix of subject areas and reading levels, and rate available content in each matrix element
=== Content rating ===
* Set up a system where anyone can affiliate with a rating group and apply their shared style guidelines and ratings to existing pieces of material
Simple ratings can be done
* via a matrix of subject areas and reading levels,
* via a [[content stamping|group rating system]] where anyone can affiliate with a rating group and apply their shared guidelines and ratings to materials


[[Category:Developers]]
[[Category:Developers]]

Revision as of 21:46, 5 June 2007


By distributing laptops and school servers with learning materials on them, and a global index of content that can be used with no modification on the laptops, OLPC is developing a network of digital libraries and collections in a number of languages.

(see also: content ideas, sharing your content with OLPC, and content rating).


Large archives

Tools are being developed for identifying and culling subsets of large repositories. Small subsets will be needed for pre-installing choice material on the laptops themselves. Larger subsets will still need curating to pick out material suitable for the laptop's audiences, classification and categorization, and checks to avoid imbalance or repetition. Most content will need internationalization.

Specific projects and collections

  • The World Digital Library project
  • Wikijunior, WikiHow
  • OurStories project, with Story Corps, UNICEF, and Google - capturing local stories
  • Book scanning and digitization:
    Children's picturebooks, with support from ICDL
    Public domain materials, with archival support from the Internet Archive
    Other local cultural materials, with support from the World Digital Library


Proposed implementation

The head page for a curriculum or cohesive set of content is just an HTML page. It can have any text interspersed in it. The document should have a tag in the header: <link rel="olpc.content_bundle" href="sitemap.xml">.
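As an illustration only, the following Python sketch shows one way a client could detect the link tag in a head page. The class and function names are hypothetical and are not part of any OLPC software; only the `rel` value `olpc.content_bundle` and the `href` attribute come from the proposal above.

```python
from html.parser import HTMLParser


class BundleLinkFinder(HTMLParser):
    """Collects href values of <link rel="olpc.content_bundle"> tags.

    Hypothetical helper for illustration; not an OLPC API.
    """

    def __init__(self):
        super().__init__()
        self.sitemap_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens.
        rel_tokens = (attr_map.get("rel") or "").split()
        href = attr_map.get("href")
        if "olpc.content_bundle" in rel_tokens and href:
            self.sitemap_hrefs.append(href)


def find_bundle_sitemaps(html_text):
    """Return the sitemap hrefs declared by a head page, in document order."""
    finder = BundleLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.sitemap_hrefs


if __name__ == "__main__":
    page = (
        "<html><head><title>Sample curriculum</title>"
        '<link rel="olpc.content_bundle" href="sitemap.xml">'
        "</head><body>Any text can go here.</body></html>"
    )
    print(find_bundle_sitemaps(page))
```

A page without the tag simply yields an empty list, so ordinary HTML pages need no special handling.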

The sitemap is a Google Sitemap XML file, a simple enumeration of a set of URLs. In contrast to Google's restrictions, the URLs do not have to live "on" the site where the sitemap is located -- they may cross domains. Embedded content (like images) does not have to be enumerated, but any linked content should be enumerated (for instance, if you link to a movie file from one of the documents).
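A minimal sketch of reading such a sitemap, assuming it follows the usual `urlset`/`url`/`loc` layout of sitemap XML files. The function name and the example URLs are hypothetical. The sketch matches `loc` elements by local name, so it does not depend on which sitemap namespace a file declares, and it accepts URLs on any domain.

```python
import xml.etree.ElementTree as ET

# Sample data for illustration; the URLs are made up and cross domains.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.org/lesson1.html</loc></url>
  <url><loc>http://media.example.net/movie.ogg</loc></url>
</urlset>"""


def sitemap_urls(xml_text):
    """Return every <loc> URL in a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Namespaced tags look like "{namespace}loc"; keep the local part.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE_SITEMAP):
        print(url)
```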

Note that this head document can be constructed by anyone, and need not be hosted where the original material is located. Multiple head documents can refer to the same content, representing multiple versions of the curriculum, different target audiences, etc.

A document may contain multiple <link> tags, representing an aggregation of curricula. For instance, a teacher version of a curriculum would include the student version (the sitemap from that version) plus another sitemap enumerating all the documents intended just for the teacher.
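The aggregation described here can be pictured as an ordered union of URL lists. The sketch below is an assumption about how a client might combine several sitemaps; the function name and the example URLs are hypothetical.

```python
def merge_sitemaps(*url_lists):
    """Combine URL lists from several sitemaps, keeping first-seen order
    and dropping URLs that appear more than once."""
    seen = set()
    merged = []
    for urls in url_lists:
        for url in urls:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


if __name__ == "__main__":
    # Hypothetical student and teacher-only sitemaps.
    student = ["http://example.org/unit1.html", "http://example.org/unit2.html"]
    teacher_only = ["http://example.org/answers.html", "http://example.org/unit1.html"]
    # The teacher version is the student version plus the teacher-only documents.
    print(merge_sitemaps(student, teacher_only))
```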

The head page represents the collection. It may contain any text, and no special restriction or interpretation is applied to that text. The browser will detect this link tag and, when the student visits the page, will offer to pre-fetch the entirety of the content. The pre-fetched content will appear to be at the same URL as it was originally, but will be served from the local cache. Additionally, the school server may use this to cache data.
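One way to picture "same URL, served locally" is a cache keyed by the original URL. The following is a sketch under stated assumptions, not a description of the actual browser or school server: the class name, the on-disk layout, and the use of a SHA-256 digest as the file name are all hypothetical choices.

```python
import hashlib
import tempfile
from pathlib import Path


class PrefetchCache:
    """Hypothetical on-disk cache keyed by the original URL."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path_for(self, url):
        # Hash the URL so that any URL maps to a safe file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.cache_dir / digest

    def store(self, url, body):
        """Save pre-fetched bytes under the original URL."""
        self._path_for(url).write_bytes(body)

    def lookup(self, url):
        """Return cached bytes for the URL, or None if it was not pre-fetched."""
        path = self._path_for(url)
        return path.read_bytes() if path.exists() else None

    def purge(self, url):
        """Free local space by dropping one cached document."""
        self._path_for(url).unlink(missing_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = PrefetchCache(tmp)
        cache.store("http://example.org/lesson1.html", b"<html>lesson</html>")
        print(cache.lookup("http://example.org/lesson1.html"))
        print(cache.lookup("http://example.org/missing.html"))
```

Because lookups use the unchanged URL, documents that link to each other keep working offline without any rewriting of their links.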

The student may manage their pre-fetched content, which takes up local space and may need to be purged. The head page and the head page's title represent the content in these situations.

The content may link to other documents not enumerated in the sitemap. These may not be available, since they have not been prefetched. In that case the browser should offer to fetch the content once the laptop is able to find it, and optionally notify the student when the content becomes available. The laptop may seek that content on other nearby laptops, the school server, or the wider internet. The content will be pre-fetched at that time. An option may be provided to do deeper pre-fetching (e.g., fetching down one link, or down two links into the content).
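The depth option can be sketched as a breadth-first walk with a depth limit. This is only an illustration: the function name is hypothetical, and `get_links` stands in for whatever mechanism would extract the links from a fetched document.

```python
from collections import deque


def urls_to_prefetch(start_urls, get_links, max_depth):
    """Return URLs reachable within max_depth links of the start URLs.

    get_links(url) is an assumed callable returning the URLs a document
    links to. max_depth=0 fetches only the start URLs.
    """
    start = list(dict.fromkeys(start_urls))  # drop duplicates, keep order
    seen = set(start)
    queue = deque((url, 0) for url in start)
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the limit
        for linked in get_links(url):
            if linked not in seen:
                seen.add(linked)
                queue.append((linked, depth + 1))
    return order


if __name__ == "__main__":
    # Hypothetical link graph standing in for real documents.
    graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["e"]}
    print(urls_to_prefetch(["a"], lambda url: graph.get(url, []), 2))
```

Tracking visited URLs keeps the walk from looping when documents link back to each other.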

Use cases

Conceptually, a content repository could be used in a variety of ways: to publish and share new material, to collaborate on material development over time (synchronizing online and offline contributions to a shared document or project), to search for and download material, to distribute and cache from large collections that can't be contained on one machine or at one school. For more, see the talk page.

Content rating

Simple ratings can be done

  • via a matrix of subject areas and reading levels,
  • via a group rating system where anyone can affiliate with a rating group and apply their shared guidelines and ratings to materials