Content repositories: Difference between revisions
(50 intermediate revisions by 23 users not shown) | |||
{{Translations}} |
|||
{{content-nav}} |
{{content-nav}} |
||
By distributing laptops and school servers with learning materials on them, and a global index of content that can be used with no modification on the laptops, OLPC is developing a network of digital libraries in a number of languages. |
By distributing laptops and school servers with learning materials on them, and a global index of content that can be used with no modification on the laptops, OLPC is developing a network of digital libraries and collections in a number of languages. |
||
(see also: [[Educational content ideas|content ideas]], [[sharing your content with OLPC]], and [[#Content rating|content rating]]). |
|||
We are gathering a list of those materials. To add materials you know of that would be free for laptop users to access, see [[Educational content ideas|content ideas]]. |
|||
== Potential materials == |
|||
* To share content if you are a publisher, author, or library, see [[Sharing your content with OLPC]]. |
|||
''for a breakdown with size estimates, see [[Content repository/sizes]]'' |
|||
* For more on rating or reviewing content for various audiences, see [[#Content rating|Content rating]]. |
|||
{| |
|||
== Subsets of large archives == |
|||
|width="50%" valign="top"| |
|||
There are some large archives available for inclusion in the content repository. [http://www.flickr.com/ Flickr] and [http://commons.wikimedia.org/ Wikimedia Commons]; [[Project Gutenberg]] and [http://scholar.google.com/ Google Scholar]; [http://www.wikipedia.org/ Wikipedia]; [http://www.wiktionaryz.org/ WiktionaryZ / OmegaWiki] and Dicologos; the [http://humaninfo.org/ Humanity Development Library] and like collections; the list goes on. |
|||
=== TV & Radio === |
|||
:Educational shows (streamed): science, history, quiz, art/music, book review, story shows |
|||
:Bookreading / chapter a day audio |
|||
:News program (streamed) |
|||
:Audio streams (ogg) |
|||
=== Audio and videocasts === |
|||
Tools are being developed for identifying and culling subsets of large repositories. Small subsets will be needed for pre-installation of the choicest content on the laptops themselves. Larger subsets will still need curating to pick out material suitable for the laptop's audiences, classification and categorization, and checks to avoid imbalance or repetition. All content (aside from images and media that are naturally alingual) will need some processing to make localization easier. |
|||
:Podcast streams |
|||
:Cache requests |
|||
== |
=== News === |
||
:Local video |
|||
Conceptually, a content repository could be used in a variety of ways: to publish and share new material, to collaborate on material development over time (synchronizing online and offline contributions to a shared document or project), to search for and download material, to distribute and cache from large collections that can't be contained on one machine or at one school. |
|||
:BBC, M6 video |
|||
:UN news audio (en, es) |
|||
:International news |
|||
:International papers: IHT, LeMonde, El Mercurio |
|||
:Weather videos (short, regular) |
|||
=== Books and texts === |
|||
:e-books, full format |
|||
:plain text ebooks |
|||
:reports and papers |
|||
|width="50%" valign="top"| |
|||
=== Extended use case #1 === |
|||
Premier Programming has a grant for non-profits. The Talking Word Processor is an amazing literacy tool for all students. |
|||
A teacher proposes a collaborative writing project for her class, and later tries to integrate the students' work with Wikipedia. Note the general issues of merging collaborations, and of integration of revision histories from the repository with outside repositories. |
|||
http://www.readingmadeez.com/education/grant.html |
|||
http://www.readingmadeez.com/products/talkingwordprocessor.html |
|||
Julia |
|||
=== Interactive courseware === |
|||
* Alia writes a document about daisies -- about how she thinks they are pretty and she likes picking them for her mother. She shares this on the school repository. It has no general interest. |
|||
:OCW collections, updates |
|||
:Science: biology, physics, chemistry |
|||
:Mathematics |
|||
:Music, Literature, History, Health |
|||
:Training: Sustainable ag, technology, repair, teaching, journalism |
|||
=== Wikipedia & sister projects === |
|||
* Bea (Alia's friend) gets the document and adds to it a picture she draws of a daisy. She shares it with her school, but not overwriting Alia's. |
|||
:Snapshots: simple, science, geography, languages, color, sound, environment, art, music |
|||
:Regular updates |
|||
:Wikibooks: wikijunior, fhsst [updated] |
|||
*[http://www.howtopedia.org www.howtopedia.org] a collaborative library for practical knowledge and simple technologies. [[howtopedia]] has a simple interface and clear content in English, French, Spanish. |
|||
* Alia corrects her spelling and uploads a new version of her document, overwriting the old one. |
|||
<center> [[Image:Logo howtopedia.gif|300px]] </center> |
|||
=== Web and blogs === |
|||
* The teacher asks each student to write a report on a plant. Carlos chooses daisies, and starts with Bea's document. He ends up deleting all the text but keeping the picture. Carlos is shy and doesn't share his report generally; he gives it directly to the teacher. |
|||
:Newsfeeds: rss from blogs, services, sites |
|||
:Web cache: individual and project-level requests |
|||
=== Mail, chat, file transfers === |
|||
* The teacher likes Carlos's report and publishes it to the school. |
|||
:Sharing mail and chat messages |
|||
:File uploads and repositories, local / school / global |
|||
* Later, the teacher notices that the localized wiki/wikipedia (Portuguese, say) doesn't have a page on daisies. She decides on a class project to start with Carlos's report to make a page. |
|||
|} |
|||
* She sets up a project space for children to contribute content. She wants everyone to put all their ideas and comments together. |
|||
=== Daily updates === |
|||
* Some students take pictures of daisies. |
|||
== Repository mirrors == |
|||
* Delores writes an article on making a daisychain garland. |
|||
We will have regional mirrors in every country and region where OLPC is distributed en masse. |
|||
Materials to be included in the mirrors include: |
|||
* Estacio writes about where daisies grow in Brazil. |
|||
* a full Fedora repository |
|||
** development tools used by OLPC subteams in constructing builds |
|||
** cross-compilation toolchains used by various contributors |
|||
* a Free content repository |
|||
** Wikipedia snapshots [2-20GB], archive.org & gutenberg texts [2-100GB] |
|||
** Wikimedia Commons [1M resources, ~1TB] |
|||
** Other media resources (Jamendo, archive.org; size TBD) |
|||
* Collections made specifically for OLPC |
|||
** A collection sized for the XO |
|||
** Local collections sized for the XO and for school servers, developed in-country |
|||
** Collections sized for school servers (larger sets of data, maps, video, texts, and software) |
|||
* Software suitable for OLPC schools |
|||
** Packages for FC7 school servers |
|||
** Extensions and additions for existing tools and services in the OLPC network |
|||
=== Mirror sites === |
|||
* Fidelio translates some of the Spanish wikipedia page on daisies. |
|||
In general, the Internet Archive is considering launching local and regional mirrors of their collections. We will want a larger mirror that can host a dozen TB of data for each region, connected to multiple peers; and a number of smaller mirrors in each country. |
|||
Details on potential mirror hosts: |
|||
'''South America''': |
|||
* Gia finds some pictures of daisies online. Some are CC licensed, some aren't licensed at all, some are non-free stock photography images. She copies them all into the project space. |
|||
* Brasil -- Google, AMD. See also Rodrigo M. |
|||
* Uruguay -- Google and GMail |
|||
'''Africa''' |
|||
* In class they try to bring everyone's material together for a single article. |
|||
* Nigeria -- |
|||
* Ethiopia -- |
|||
'''Asia''' |
|||
* The teacher doesn't worry about licensing and some non-free images go into the main document. |
|||
* Nepal -- see [[OLPC Nepal]] |
|||
* Taiwan? |
|||
== Large archives == |
|||
* The teacher likes Delores' article on daisychains, but it isn't really about daisies. She doesn't want it forgotten, but doesn't include it. She moves it out of the project and makes a link from the main article. The link points to the school space. |
|||
* Free media collections : [http://www.flickr.com/ Flickr], [http://commons.wikimedia.org/ Wikimedia Commons] |
|||
* Music : [http://www.freesound.org Freesound], [http://freemusic.freeculture.org/ Free Music Project] |
|||
* Texts: [[Project Gutenberg]], [http://scholar.google.com/ Google Scholar], [http://www.wikipedia.org/ Wikipedia]; |
|||
* Stories: [http://www.childrenslibrary.org ICDL], [http://schoollibrary.com/OLPC_Collection.htm other children's pdfs] |
|||
* Language: [http://www.wiktionaryz.org/ WiktionaryZ / OmegaWiki], Dicologos and [http://www.dicts.info/ Universal dictionary system] |
|||
* Reference collections: the [http://humaninfo.org/ Humanity Development Library], [http://www.widernet.org/digitallibrary/ eGranary] |
|||
Small subsets will need to be culled for pre-installation of choice material on the laptops themselves. Larger subsets will need curating to pick out material suitable for the laptop's audiences, classification and categorization, and checks to avoid imbalance or repetition. Most content needs internationalization. |
|||
* She includes Estacio's and Fidelio's work as separate sections in the main document. |
|||
== Specific projects and collections == |
|||
* The teacher doesn't actually "do" all of these things, but directs the students to do them together. Communication takes place (via voice) in the classroom, but the content production happens on the laptops. They are all directly and well connected to the school server during this process. |
|||
* [[Avallain]] literacy and basic skills learning |
|||
* A [http://www.worlddigitallibrary.org/project/english/index.html World Digital Library] portal |
|||
* [http://en.wikibooks.org/wiki/Wikijunior Wikijunior], [http://www.wikihow.com/Select-wikiHow-Articles-for-the-One-Laptop-Per-Child-Association WikiHow] |
|||
* [[Our Stories]] project, with Story Corps, UNICEF, and Google - capturing local stories |
|||
* Book scanning and digitization: |
|||
*: Children's picturebooks, with support from ICDL |
|||
*: Public domain materials, with archival support from the Internet Archive |
|||
*: Other local cultural materials, with support from the World Digital Library |
|||
* [http://wikieducator.org Wikieducator] tutorials |
|||
* OER subcollections from [http://curriki.org Curriki] and [http://oercommons.org OER Commons] |
|||
* [[Health_Content]] collection |
|||
* The [[Open Training Platform]], a UNESCO-powered hub of free learning resources for development that advocates for open non-formal educational content [http://www.opentrainingplatform.org] |
|||
* [http://owl.english.purdue.edu/ Purdue University Online Writing Lab (OWL)] |
|||
* [http://www.earthlearningidea.com/ Earth Learning Idea] Earth-related teaching ideas - all free to download. These require minimal equipment and resources, suitable for all ages and for teachers of science and geography. |
|||
* [http://www.literacycenter.net Literacy Center] - letter and number recognition in Flash and print formats |
|||
* [http://www.rechenrad.de mathematics for little ones] German-language Flash for preschool |
|||
==Copyright-restricted content== |
|||
* After class the teacher gets the document into the right place on the Portuguese Wikipedia. She shares the link with all the students. She is proud that they put this together and would like to make sure all the parents see it... (how can she remind children to tell their parents?) |
|||
To list a content collection that would be nice to have, but that is not currently under appropriate licensing for use on OLPC, go to [[Licensing petitions]]. |
|||
'''Asides''' |
|||
# Many parents don't know what Wikipedia is. The teacher would like to explain a little about it for the parents. She'd like to leave parent-directed comments [on the Wikipedia page?]. They aren't relevant to a larger audience... she wants to make sure parents know that it's something she wrote for them to read (as opposed to all the general stuff on Wikipedia). |
|||
# What happened to the daisychain article? (The teacher may have forgotten about it by now. Is it on the school repository? What happened to the link in the article?) |
|||
# A Wikipedia member (not a child, not using the laptop) removes all the unlicensed or non-free images in the article. That person would like to explain the reason to the contributor (who is attributed in the history tracked by the content repository, but not on Wikipedia, and has no WP page). |
|||
== Use cases == |
|||
Conceptually, a content repository could be used in a variety of ways: to publish and share new material, to collaborate on material development over time (synchronizing online and offline contributions to a shared document or project), to search for and download material, to distribute and cache from large collections that can't be contained on one machine or at one school. For more, see the [[Talk:Content repository#Extended use cases|talk page]]. |
|||
* Curriculum is developed. The curriculum consists of a set of documents, with outlines, goals, etc. There is no interactive component to the curriculum as far as the laptop is concerned (the curriculum may include exercises, etc., but this is not guided programmatically in any way). |
|||
* Curriculum consists of a series of documents, including documents that may be "asides", that is they don't fit into any linearization of the curriculum. |
|||
* Curriculum links to an indefinite number of non-core items. Some of these might be highly interlinked and large sets of data. E.g., bibliographic links, links to Wikipedia, links to library category listings. These sets of data are from a practical sense infinite -- there is no way to enumerate or fetch the full set of documents, and no clear boundaries that would define a cohesive or consistent set of documents. |
|||
* Curriculum is viewable in the browser. Non-HTML and non-Image content is typically either viewed through plugins or as linked documents that are launched in some non-browser viewer. |
|||
* The teacher or school identifies the curriculum and schedules it for use in the classroom. |
|||
* The teacher indicates this in some way to the school server and/or individual laptops (perhaps by instructing students to follow some set of procedures). If the school server goes offline the core content for the curriculum should be available so classes can continue as scheduled. If laptops are not able to connect to the school server intermittently (e.g., when students are at home) the laptop should have the content or a strategy for getting the content. |
|||
* The teacher has additional content not intended for the students (e.g., further background material). |
|||
* Some strategy should be available to browse the non-core content that is not pre-fetched. If a student is interested in this content while they are not connected, there should be some way that they can revisit the content later when it is available. |
|||
=== Pre-fetching and Search Use Case === |
|||
* A student (or class or teacher) is interested in some particular topic. For instance, content related to meerkats. |
|||
* An online search form provides a way to find existing content around that topic. For instance, [http://images.google.com/images?q=meerkat&hl=en&lr=&safe=off&sa=X&oi=images&ct=title meerkat images on Google]. |
|||
* The student is also interested in content yet to be created about the topic. They would like to find out about new search results. |
|||
* The student has poor internet connectivity. It is intermittent and also fairly slow. |
|||
* Issue: how can they move forward on their interest while they are not connected? Can they formulate and save that search in some way? |
|||
* Issue: connectivity may occur when the student is not attending to this interest, for instance while in school studying math. We want to use this potential connectivity without requiring any attention from the student. |
|||
* Issue: some content may be very large, like a movie about meerkats. The student may not be connected to the internet long enough at any time to fetch the movie (especially if the internet connection is slow). The school server may be connected long enough, and the student may be connected long enough to the school server to get the content. |
|||
* Issue: contention for space and bandwidth. Is the student's interest a priority? How can the student manage their limited laptop space? Some content may be too large to ever host in its entirety on the laptop. How can the school server fulfill its potential as an intermediary, both when dealing with widely popular content (e.g., something used specifically for classroom group activities) and for specialized content (one child with a specialized interest). |
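The issues above can be sketched in code. The following is a minimal illustration, not a proposed design: <tt>SavedSearch</tt>, <tt>plan_fetch</tt>, and the three strategy labels are hypothetical names invented here. It shows a search that is saved while offline and reports only unseen results each time connectivity returns, and a simple rule for deciding whether the laptop or the school server should fetch a large item.

```python
from dataclasses import dataclass, field


@dataclass
class SavedSearch:
    """A query the student formulated offline; it remembers what was already shown."""
    query: str
    seen_results: set = field(default_factory=set)

    def new_results(self, current_results):
        """Return only results not reported before, and remember them."""
        fresh = [r for r in current_results if r not in self.seen_results]
        self.seen_results.update(fresh)
        return fresh


def plan_fetch(size_bytes, laptop_free_bytes, bytes_per_second, expected_online_seconds):
    """Pick a (hypothetical) fetch strategy for one item.

    The labels are illustrative assumptions, not part of any OLPC protocol.
    """
    if size_bytes > laptop_free_bytes:
        # Too large to ever hold on the laptop: leave it on the school server.
        return "stream-from-school-server"
    if size_bytes > bytes_per_second * expected_online_seconds:
        # The laptop will not stay online long enough: let the server fetch first.
        return "school-server-fetches-first"
    return "laptop-fetches-directly"


search = SavedSearch("meerkat")
first_batch = search.new_results(["meerkat-1.jpg", "meerkat-2.jpg"])
second_batch = search.new_results(["meerkat-2.jpg", "meerkat-movie.ogg"])
```

Running the saved search whenever a connection appears, without asking the student, addresses the "connectivity while studying math" issue; the priority question (whose requests go first) is left open here, as it is in the list above.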
|||
== Proposed implementation for browsing == |
|||
The head page for a curriculum or cohesive set of content is just an HTML page. It can have any text interspersed in it. The document should have a tag in the header: <tt><link rel="olpc.content_bundle" href="sitemap.xml"></tt>. |
|||
The sitemap is a [https://www.google.com/webmasters/tools/docs/en/protocol.html Google Sitemap XML file], a simple enumeration of a set of URLs. Unlike Google's restrictions, the URLs do not have to live "on" the site where the sitemap is located -- they may cross domains. Embedded content (like images) does not have to be enumerated, but any linked content should be enumerated (for instance, if you link to a movie file from one of the documents). |
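As a rough illustration of how a client might read such a sitemap, here is a minimal sketch using Python's standard library. The sample document, its URLs, and the function name <tt>sitemap_urls</tt> are assumptions made for the example; the code matches <tt>loc</tt> elements by local name so that it does not depend on a particular sitemap namespace version.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap: note the two URLs live on different domains,
# which this proposal allows.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.org/daisies/index.html</loc></url>
  <url><loc>http://media.example.net/daisies/movie.ogg</loc></url>
</urlset>
"""


def sitemap_urls(xml_text):
    """Return the URLs listed in <loc> elements, in document order."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Strip any "{namespace}" prefix and compare the local tag name only.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


bundle = sitemap_urls(SAMPLE_SITEMAP)
```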
|||
Note that this head document can be constructed by anyone, and need not be hosted where the original material is located. Multiple head documents can refer to the same content, representing multiple versions of the curriculum, different target audiences, etc. |
|||
A document may contain multiple <tt><link></tt> tags, representing an aggregation of curricula. For instance, a teacher version of a curriculum would include the student version (the sitemap from that version) plus another sitemap enumerating all the documents intended just for the teacher. |
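One way a browser or school server might detect these tags is sketched below with Python's standard <tt>html.parser</tt>. The class and function names, the sample page, and its URLs are hypothetical; only the <tt>rel="olpc.content_bundle"</tt> value comes from the proposal above. Relative <tt>href</tt> values are resolved against the head page's own URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BundleLinkFinder(HTMLParser):
    """Collect the sitemap URLs named by olpc.content_bundle link tags."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.sitemaps = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        if attributes.get("rel") == "olpc.content_bundle" and attributes.get("href"):
            # Resolve relative references against the head page's URL.
            self.sitemaps.append(urljoin(self.page_url, attributes["href"]))


def find_bundle_sitemaps(page_url, html_text):
    """Return every bundle sitemap URL referenced by the head page."""
    finder = BundleLinkFinder(page_url)
    finder.feed(html_text)
    finder.close()
    return finder.sitemaps


# Hypothetical teacher version: the student sitemap plus a teacher-only one.
TEACHER_PAGE = (
    '<html><head><title>Daisies (teacher)</title>'
    '<link rel="olpc.content_bundle" href="sitemap.xml">'
    '<link rel="olpc.content_bundle" href="http://other.example.org/teacher/sitemap.xml">'
    '<link rel="stylesheet" href="style.css">'
    '</head><body>Notes for the teacher.</body></html>'
)

sitemaps = find_bundle_sitemaps("http://example.org/daisies/index.html", TEACHER_PAGE)
```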
|||
The head page represents the collection. It may contain any text, and no special restrictions or interpretation is made of that text. The browser will detect this link tag, and when the student visits the page will offer to pre-fetch the entirety of the content. The pre-fetched content will appear to be at the same URL as it was originally, but will be served from the local cache. Additionally the school server may use this to cache data. |
|||
The student may manage their pre-fetched content, which takes up local space and may need to be purged. The head page and the head page's title represent the content in these situations. |
|||
The content may link to other documents not enumerated in the sitemap. These may not be available, since they have not been prefetched. At that time the browser should offer to fetch the content when the laptop is able to find that content, and optionally notify the student of the availability of the content. The laptop may seek that content on other nearby laptops, the school server, or the wider internet. The content will be pre-fetched at that time. An option may be provided to do deeper pre-fetching (e.g., fetching down one link, or down two links into the content). |
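The "deeper pre-fetching" option can be read as a depth-limited breadth-first walk outward from the sitemap. The sketch below assumes the links have already been extracted into a plain dictionary (<tt>links_from</tt>, a stand-in for real page parsing); the function name and the sample URLs are illustrative only.

```python
from collections import deque


def urls_to_prefetch(sitemap_urls, links_from, extra_depth=0):
    """List URLs to fetch: the sitemap itself plus links up to extra_depth away.

    links_from maps a URL to the URLs it links to (hypothetical, pre-extracted).
    """
    depth_of = {url: 0 for url in sitemap_urls}
    queue = deque(sitemap_urls)
    while queue:
        url = queue.popleft()
        if depth_of[url] >= extra_depth:
            continue  # do not follow links beyond the requested depth
        for target in links_from.get(url, []):
            if target not in depth_of:
                depth_of[target] = depth_of[url] + 1
                queue.append(target)
    # Nearest content first, then alphabetical for a stable order.
    return sorted(depth_of, key=lambda u: (depth_of[u], u))


links = {
    "daisies.html": ["pictures.html", "wikipedia-daisy.html"],
    "wikipedia-daisy.html": ["wikipedia-flower.html"],
    "wikipedia-flower.html": ["wikipedia-plant.html"],
}
core = ["daisies.html", "pictures.html"]
one_link_down = urls_to_prefetch(core, links, extra_depth=1)
```

Bounding the depth matters because, as noted in the use cases, linked sets such as Wikipedia are for practical purposes infinite.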
|||
== Content rating == |
== Content rating == |
||
Simple ratings can be done |
|||
Some ideas for content rating: |
|||
* |
* via a matrix of subject areas and reading levels, |
||
* |
* via a [[content stamping|group rating system]] where anyone can affiliate with a rating group and apply their shared guidelines and ratings to materials |
||
[[Category:Developers]] |
|||
[[Category:Resources]] |
[[Category:Resources]] |
||
[[Category:Pedagogical ideas]] |
[[Category:Pedagogical ideas]] |
||
[[Category:Content Repository]] |
Latest revision as of 04:37, 23 September 2009
By distributing laptops and school servers with learning materials on them, and a global index of content that can be used with no modification on the laptops, OLPC is developing a network of digital libraries and collections in a number of languages.
(see also: content ideas, sharing your content with OLPC, and content rating).
Potential materials
for a breakdown with size estimates, see Content repository/sizes
TV & Radio
Audio and videocasts
News
Books and texts
Premier Programming has a grant for non-profits. The Talking Word Processor is an amazing literacy tool for all students.
http://www.readingmadeez.com/education/grant.html
http://www.readingmadeez.com/products/talkingwordprocessor.html
Julia
Interactive courseware
Wikipedia & sister projects
Web and blogs
Mail, chat, file transfers
Daily updates
Repository mirrors
We will have regional mirrors in every country and region where OLPC is distributed en masse.
Materials to be included in the mirrors include:
- a full Fedora repository
- development tools used by OLPC subteams in constructing builds
- cross-compilation toolchains used by various contributors
- a Free content repository
- Wikipedia snapshots [2-20GB], archive.org & gutenberg texts [2-100GB]
- Wikimedia Commons [1M resources, ~1TB]
- Other media resources (Jamendo, archive.org; size TBD)
- Collections made specifically for OLPC
- A collection sized for the XO
- Local collections sized for the XO and for school servers, developed in-country
- Collections sized for school servers (larger sets of data, maps, video, texts, and software)
- Software suitable for OLPC schools
- Packages for FC7 school servers
- Extensions and additions for existing tools and services in the OLPC network
Mirror sites
In general, the Internet Archive is considering launching local and regional mirrors of their collections. We will want a larger mirror that can host a dozen TB of data for each region, connected to multiple peers; and a number of smaller mirrors in each country. Details on potential mirror hosts:
South America:
- Brasil -- Google, AMD. See also Rodrigo M.
- Uruguay -- Google and GMail
Africa
- Nigeria --
- Ethiopia --
Asia
- Nepal -- see OLPC Nepal
- Taiwan?
Large archives
- Free media collections : Flickr, Wikimedia Commons
- Music : Freesound, Free Music Project
- Texts: Project Gutenberg, Google Scholar, Wikipedia;
- Stories: ICDL, other children's pdfs
- Language: WiktionaryZ / OmegaWiki, Dicologos and Universal dictionary system
- Reference collections: the Humanity Development Library, eGranary
Small subsets will need to be culled for pre-installation of choice material on the laptops themselves. Larger subsets will need curating to pick out material suitable for the laptop's audiences, classification and categorization, and checks to avoid imbalance or repetition. Most content needs internationalization.
Specific projects and collections
- Avallain literacy and basic skills learning
- A World Digital Library portal
- Wikijunior, WikiHow
- Our Stories project, with Story Corps, UNICEF, and Google - capturing local stories
- Book scanning and digitization:
- Children's picturebooks, with support from ICDL
- Public domain materials, with archival support from the Internet Archive
- Other local cultural materials, with support from the World Digital Library
- Wikieducator tutorials
- OER subcollections from Curriki and OER Commons
- Health_Content collection
- The Open Training Platform, a UNESCO-powered hub of free learning resources for development that advocates for open non-formal educational content [1]
- Purdue University Online Writing Lab (OWL)
- Earth Learning Idea Earth-related teaching ideas - all free to download. These require minimal equipment and resources, suitable for all ages and for teachers of science and geography.
- Literacy Center - letter and number recognition in Flash and print formats
- mathematics for little ones German-language Flash for preschool
Copyright-restricted content
To list a content collection that would be nice to have, but that is not currently under appropriate licensing for use on OLPC, go to Licensing petitions.
Use cases
Conceptually, a content repository could be used in a variety of ways: to publish and share new material, to collaborate on material development over time (synchronizing online and offline contributions to a shared document or project), to search for and download material, to distribute and cache from large collections that can't be contained on one machine or at one school. For more, see the talk page.
Content rating
Simple ratings can be done
- via a matrix of subject areas and reading levels,
- via a group rating system where anyone can affiliate with a rating group and apply their shared guidelines and ratings to materials
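The first idea, a matrix of subject areas and reading levels, could be represented as a simple lookup keyed by both dimensions. The sketch below is only one possible shape; the class name, the level labels, and the sample titles are assumptions made for illustration.

```python
from collections import defaultdict


class RatingMatrix:
    """Index materials by (subject area, reading level)."""

    def __init__(self):
        self._cells = defaultdict(list)

    def add(self, title, subject, reading_level):
        """File a title under one cell of the matrix."""
        self._cells[(subject, reading_level)].append(title)

    def lookup(self, subject, reading_level):
        """Return the titles in one cell (empty list if nothing is filed there)."""
        return list(self._cells.get((subject, reading_level), []))

    def subjects(self):
        """Return the subject areas that have at least one rated title."""
        return sorted({subject for subject, _level in self._cells})


matrix = RatingMatrix()
matrix.add("Where daisies grow", "biology", "beginner")
matrix.add("Making a daisychain garland", "art", "beginner")
matrix.add("Plant taxonomy overview", "biology", "advanced")
beginner_biology = matrix.lookup("biology", "beginner")
```

A group rating system could layer on top of this by keeping one such matrix per rating group, so that each group's guidelines produce their own view of the same materials.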