Content repositories: Difference between revisions

From OLPC
(47 intermediate revisions by 22 users not shown)
{{Translations}}
{{content-nav}}
By distributing laptops and school servers with learning materials on them, and a global index of content that can be used with no modification on the laptops, OLPC is developing a network of digital libraries and collections in a number of languages.
(see also: [[Educational content ideas|content ideas]], [[sharing your content with OLPC]], and [[#Content rating|content rating]]).


== Potential materials ==
''for a breakdown with size estimates, see [[Content repository/sizes]]''

{|
|width="50%" valign="top"|
=== TV & Radio ===
:Educational shows (streamed): science, history, quiz, art/music, book review, story shows
:Bookreading / chapter a day audio
:News program (streamed)
:Audio streams (ogg)

=== Audio and videocasts ===
:Podcast streams
:Cache requests

=== News ===
:Local video
:BBC, M6 video
:UN news audio (en, es)
:International news
:International papers: IHT, LeMonde, El Mercurio
:Weather videos (short, regular)

=== Books and texts ===
:e-books, full format
:plain text ebooks
:reports and papers

|width="50%" valign="top"|
Premier Programming has a grant for non-profits. The Talking Word Processor is an amazing literacy tool for all students.
http://www.readingmadeez.com/education/grant.html
http://www.readingmadeez.com/products/talkingwordprocessor.html
Julia

=== Interactive courseware ===
:OCW collections, updates
:Science: biology, physics, chemistry
:Mathematics
:Music, Literature, History, Health
:Training: Sustainable ag, technology, repair, teaching, journalism

=== Wikipedia & sister projects ===
:Snapshots: simple, science, geography, languages, color, sound, environment, art, music
:Regular updates
:Wikibooks: wikijunior, fhsst [updated]

*[http://www.howtopedia.org www.howtopedia.org] a collaborative library for practical knowledge and simple technologies. [[howtopedia]] has a simple interface and clear content in English, French, Spanish.
<center> [[Image:Logo howtopedia.gif|300px]] </center>

=== Web and blogs ===
:Newsfeeds: rss from blogs, services, sites
:Web cache: individual and project-level requests

=== Mail, chat, file transfers ===
:Sharing mail and chat messages
:File uploads and repositories, local / school / global


|}

=== Daily updates ===

== Repository mirrors ==
We will have regional mirrors in every country and region where OLPC is distributed en masse.

The mirrors will include:
* a full Fedora repository
** development tools used by OLPC subteams in constructing builds
** cross-compilation toolchains used by various contributors
* a Free content repository
** Wikipedia snapshots [2-20GB], archive.org & gutenberg texts [2-100GB]
** Wikimedia Commons [1M resources, ~1TB]
** Other media resources (Jamendo, archive.org; size TBD)
* Collections made specifically for OLPC
** A collection sized for the XO
** Local collections sized for the XO and for school servers, developed in-country
** Collections sized for school servers (larger sets of data, maps, video, texts, and software)
* Software suitable for OLPC schools
** Packages for FC7 school servers
** Extensions and additions for existing tools and services in the OLPC network

=== Mirror sites ===
In general, the Internet Archive is considering launching local and regional mirrors of its collections. We will want a larger mirror for each region that can host a dozen TB of data and is connected to multiple peers, plus a number of smaller mirrors in each country.
Details on potential mirror hosts:

'''South America''':
* Brasil -- Google, AMD. See also Rodrigo M.
* Uruguay -- Google and GMail

'''Africa'''
* Nigeria --
* Ethiopia --

'''Asia'''
* Nepal -- see [[OLPC Nepal]]
* Taiwan?


== Large archives ==
* Free media collections : [http://www.flickr.com/ Flickr], [http://commons.wikimedia.org/ Wikimedia Commons]
* Music : [http://www.freesound.org Freesound], [http://freemusic.freeculture.org/ Free Music Project]
* Texts: [[Project Gutenberg]], [http://scholar.google.com/ Google Scholar], [http://www.wikipedia.org/ Wikipedia];
* Stories: [http://www.childrenslibrary.org ICDL], [http://schoollibrary.com/OLPC_Collection.htm other children's pdfs]
* Language: [http://www.wiktionaryz.org/ WiktionaryZ / OmegaWiki], Dicologos and [http://www.dicts.info/ Universal dictionary system]
* Reference collections: the [http://humaninfo.org/ Humanity Development Library], [http://www.widernet.org/digitallibrary/ eGranary]


Small subsets will need to be culled for pre-installation of choice material on the laptops themselves; larger subsets will need curating to pick out material suitable for the laptops' audiences, classification and categorization, and checks to avoid imbalance or repetition. Most content needs internationalization.


== Specific projects and collections ==
* [[Avallain]] literacy and basic skills learning
* The [[World Digital Library]] project
* A [http://www.worlddigitallibrary.org/project/english/index.html World Digital Library] portal
* [http://en.wikibooks.org/wiki/Wikijunior Wikijunior], [http://www.wikihow.com/Select-wikiHow-Articles-for-the-One-Laptop-Per-Child-Association WikiHow]
* [[Our Stories]] project, with Story Corps, UNICEF, and Google - capturing local stories
* Book scanning and digitization:
*: Children's picturebooks, with support from ICDL
*: Public domain materials, with archival support from the Internet Archive
*: Other local cultural materials, with support from the World Digital Library
* [http://wikieducator.org Wikieducator] tutorials
* OER subcollections from [http://curriki.org Curriki] and [http://oercommons.org OER Commons]
* [[Health_Content]] collection
* The [[Open Training Platform]], a hub of free learning resources for development, powered by UNESCO, that advocates open non-formal educational content [http://www.opentrainingplatform.org]
* [http://owl.english.purdue.edu/ Purdue University Online Writing Lab (OWL)]
* [http://www.earthlearningidea.com/ Earth Learning Idea] Earth-related teaching ideas - all free to download. These require minimal equipment and resources, suitable for all ages and for teachers of science and geography.
* [http://www.literacycenter.net Literacy Center] - letter and number recognition in Flash and print formats
* [http://www.rechenrad.de Mathematics for little ones] - German-language Flash for preschool


==Copyright-restricted content==


To list a content collection that would be nice to have but is not currently licensed appropriately for use on OLPC, go to [[Licensing petitions]].
== Proposed implementation ==

The head page for a curriculum or cohesive set of content is just an HTML page. It can have any text interspersed in it. The document should have a tag in its header: <tt>&lt;link rel="olpc.content_bundle" href="sitemap.xml"&gt;</tt>.
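
As a rough illustration of how a client might detect such a tag, the following sketch scans a head page with Python's standard <tt>html.parser</tt> and resolves each bundle link against the page URL. The class and function names and the example.org URLs are hypothetical; nothing here is an existing OLPC API.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BundleLinkParser(HTMLParser):
    """Collects href values of <link rel="olpc.content_bundle"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may carry several space-separated tokens.
        rels = (attr.get("rel") or "").lower().split()
        if "olpc.content_bundle" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_bundle_sitemaps(html_text, page_url):
    """Return absolute sitemap URLs declared by the head page (hypothetical helper)."""
    parser = BundleLinkParser()
    parser.feed(html_text)
    # A relative href such as "sitemap.xml" is resolved against the page URL.
    return [urljoin(page_url, href) for href in parser.hrefs]
```

Because the function returns a list, a head page that declares several bundle links is handled the same way as one that declares a single link.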

The sitemap is a [https://www.google.com/webmasters/tools/docs/en/protocol.html Google Sitemap XML file], a simple enumeration of a set of URLs. Unlike under Google's restrictions, the URLs do not have to live "on" the site where the sitemap is located; they may cross domains. Embedded content (such as images) does not have to be enumerated, but any linked content should be (for instance, a movie file linked from one of the documents).
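
A minimal sketch of reading such a sitemap, assuming the usual <tt>urlset</tt>/<tt>url</tt>/<tt>loc</tt> element layout. It matches <tt>loc</tt> elements by local name so that it does not depend on which sitemap namespace version the file declares. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def sitemap_urls(xml_text):
    """Return the URLs listed in <loc> elements, in document order."""
    root = ET.fromstring(xml_text)
    urls = []
    for elem in root.iter():
        # Tags look like "{namespace}loc"; compare only the local name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and elem.text and elem.text.strip():
            urls.append(elem.text.strip())
    return urls
```

No same-domain check is applied, which mirrors the rule above that enumerated URLs may cross domains.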

Note that this head document can be constructed by anyone, and need not be hosted where the original material is located. Multiple head documents can refer to the same content, representing multiple versions of the curriculum, different target audiences, etc.

A document may contain multiple <tt>&lt;link&gt;</tt> tags, representing an aggregation of curricula. For instance, a teacher version of a curriculum would include the student version (the sitemap from that version) plus another sitemap enumerating all the documents intended just for the teacher.

The head page represents the collection. It may contain any text, and no special restriction or interpretation is applied to that text. The browser will detect this link tag and, when the student visits the page, will offer to pre-fetch the entirety of the content. The pre-fetched content will appear to be at the same URL as it was originally, but will be served from the local cache. Additionally, the school server may use this to cache data.
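
The "same URL, served locally" behaviour described above can be pictured as a cache keyed by the original URL. The sketch below is only an illustration under that assumption: <tt>BundleCache</tt> and the <tt>fetch</tt> callable (any function that takes a URL and returns bytes) are hypothetical names, and a real browser cache would also need persistence and expiry.

```python
class BundleCache:
    """In-memory stand-in for a local cache keyed by original URL."""

    def __init__(self):
        self._bodies = {}  # original URL -> cached bytes

    def prefetch(self, urls, fetch):
        """Fetch and store every URL that is not cached yet."""
        for url in urls:
            if url not in self._bodies:
                self._bodies[url] = fetch(url)

    def get(self, url, fetch):
        """Return (body, source); source is "cache" or "network"."""
        if url in self._bodies:
            return self._bodies[url], "cache"
        # Not pre-fetched: fall back to the network without storing.
        return fetch(url), "network"
```

Keying on the unmodified URL means documents in the bundle need no link rewriting: a link that worked online resolves to the same key offline.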

The student may manage their pre-fetched content, which takes up local space and may need to be purged. The head page and its title represent the content in these situations.

The content may link to other documents not enumerated in the sitemap. These may not be available, since they have not been pre-fetched. In that case the browser should offer to fetch the content once the laptop is able to find it, and optionally notify the student when it becomes available. The laptop may seek that content on other nearby laptops, the school server, or the wider internet. The content will be pre-fetched at that time. An option may be provided to do deeper pre-fetching (e.g., fetching down one link, or down two links, into the content).
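
The deeper pre-fetching option can be sketched as a breadth-first walk that stops after a given number of link hops from the URLs in the sitemap. This is an assumption about how such an option might work, not a described implementation; <tt>prefetch_plan</tt> and the <tt>links_of</tt> callable (which returns the URLs a document links to) are hypothetical.

```python
from collections import deque


def prefetch_plan(seed_urls, links_of, depth):
    """List URLs to fetch: the seeds plus anything within `depth` link hops."""
    order = list(dict.fromkeys(seed_urls))  # de-duplicate, keep order
    seen = set(order)
    queue = deque((url, 0) for url in order)
    while queue:
        url, hops = queue.popleft()
        if hops >= depth:
            continue  # do not follow links beyond the requested depth
        for target in links_of(url):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, hops + 1))
    return order
```

With <tt>depth=0</tt> only the sitemap's own URLs are returned; <tt>depth=1</tt> and <tt>depth=2</tt> correspond to fetching one or two links further into the content.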


== Use cases ==
Conceptually, a content repository could be used in a variety of ways: to publish and share new material, to collaborate on material development over time (synchronizing online and offline contributions to a shared document or project), to search for and download material, to distribute and cache from large collections that can't be contained on one machine or at one school. For more, see the [[Talk:Content repository#Extended use cases|talk page]].


== Content rating ==
Simple ratings can be done
* via a matrix of subject areas and reading levels,
* via a [[content stamping|group rating system]] where anyone can affiliate with a rating group and apply their shared guidelines and ratings to materials


[[Category:Developers]]
[[Category:Resources]]
[[Category:Pedagogical ideas]]
[[Category:Content Repository]]

Latest revision as of 04:37, 23 September 2009


By distributing laptops and school servers with learning materials on them, and a global index of content that can be used with no modification on the laptops, OLPC is developing a network of digital libraries and collections in a number of languages.

(see also: content ideas, sharing your content with OLPC, and content rating).

Potential materials

for a breakdown with size estimates, see Content repository/sizes

TV & Radio

Educational shows (streamed): science, history, quiz, art/music, book review, story shows
Bookreading / chapter a day audio
News program (streamed)
Audio streams (ogg)

Audio and videocasts

Podcast streams
Cache requests

News

Local video
BBC, M6 video
UN news audio (en, es)
International news
International papers: IHT, LeMonde, El Mercurio
Weather videos (short, regular)

Books and texts

e-books, full format
plain text ebooks
reports and papers

Premier Programming has a grant for non-profits. The Talking Word Processor is an amazing literacy tool for all students. http://www.readingmadeez.com/education/grant.html http://www.readingmadeez.com/products/talkingwordprocessor.html Julia

Interactive courseware

OCW collections, updates
Science: biology, physics, chemistry
Mathematics
Music, Literature, History, Health
Training: Sustainable ag, technology, repair, teaching, journalism

Wikipedia & sister projects

Snapshots: simple, science, geography, languages, color, sound, environment, art, music
Regular updates
Wikibooks: wikijunior, fhsst [updated]
  • www.howtopedia.org a collaborative library for practical knowledge and simple technologies. howtopedia has a simple interface and clear content in English, French, Spanish.

Web and blogs

Newsfeeds: rss from blogs, services, sites
Web cache: individual and project-level requests

Mail, chat, file transfers

Sharing mail and chat messages
File uploads and repositories, local / school / global


Daily updates

Repository mirrors

We will have regional mirrors in every country and region where OLPC is distributed en masse.

The mirrors will include:

  • a full Fedora repository
    • development tools used by OLPC subteams in constructing builds
    • cross-compilation toolchains used by various contributors
  • a Free content repository
    • Wikipedia snapshots [2-20GB], archive.org & gutenberg texts [2-100GB]
    • Wikimedia Commons [1M resources, ~1TB]
    • Other media resources (Jamendo, archive.org; size TBD)
  • Collections made specifically for OLPC
    • A collection sized for the XO
    • Local collections sized for the XO and for school servers, developed in-country
    • Collections sized for school servers (larger sets of data, maps, video, texts, and software)
  • Software suitable for OLPC schools
    • Packages for FC7 school servers
    • Extensions and additions for existing tools and services in the OLPC network

Mirror sites

In general, the Internet Archive is considering launching local and regional mirrors of its collections. We will want a larger mirror for each region that can host a dozen TB of data and is connected to multiple peers, plus a number of smaller mirrors in each country. Details on potential mirror hosts:

South America:

  • Brasil -- Google, AMD. See also Rodrigo M.
  • Uruguay -- Google and GMail

Africa

  • Nigeria --
  • Ethiopia --

Asia

Large archives

Small subsets will need to be culled for pre-installation of choice material on the laptops themselves; larger subsets will need curating to pick out material suitable for the laptops' audiences, classification and categorization, and checks to avoid imbalance or repetition. Most content needs internationalization.

Specific projects and collections

Copyright-restricted content

To list a content collection that would be nice to have but is not currently licensed appropriately for use on OLPC, go to Licensing petitions.

Use cases

Conceptually, a content repository could be used in a variety of ways: to publish and share new material, to collaborate on material development over time (synchronizing online and offline contributions to a shared document or project), to search for and download material, to distribute and cache from large collections that can't be contained on one machine or at one school. For more, see the talk page.

Content rating

Simple ratings can be done

  • via a matrix of subject areas and reading levels,
  • via a group rating system where anyone can affiliate with a rating group and apply their shared guidelines and ratings to materials