Content workflow
NOTE: The contents of this page are not set in stone, and are subject to change! This page is a draft in active flux ...
Content Workflow
Hundreds of thousands of logical collections of media exist that could be made into bundles for the XO, and OLPC has hundreds of volunteers who want to help. So why have relatively few bundles been created?
The answer is roughly that the process for creating a bundle is not clear. Also, not everyone is capable of completing all of the technical or organizational tasks needed to make a bundle. It makes sense to document the steps of this process completely and to flesh out tools, processes and even groups to facilitate it. Making a bundle is not, and has never been, one hard task; it is a series of small ones.
Content Workflow Diagram
Found Content------License Vetting | | |----Translate---- Search for Content | | | | | | | | | Add Content to Wiki Initial Review/Prioritize-------Transcode--------Content Bundling--Content Stamping | | | | | Wanted Content | | | | | | |----Edit--------- Created Content----License Release
Steps
The above diagram is a little rough, so let's walk through the steps in order.
Raw content
The process starts with raw content from a source. This could have been created specifically for OLPC or re-purposed from any one of thousands of existing content collections. There are a number of steps that have to be done to this content before it can be made into a library bundle.
Document the content
The first step is to add the content to the wiki. This is important for several reasons:
- It makes it unlikely that two people will duplicate effort
- If you (the finder or creator) stop working on the project, the work can be picked up by someone else
This initial documentation should contain some basic information:
Name of Collection | Filetype(s) | Language | URL to Content | Submitted by | License |
---|---|---|---|---|---|
Diveintopython | html, pdf, xml | it, fr, es, cz, kr, ru | http://diveintopython.org/ | Seth | GFDL |
This entry for Dive into Python is a reasonable example of what this data should look like. It doesn't include all of the information that is needed, but it is a start.
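As a rough illustration of the record described above, the sketch below models one wiki entry and renders it as a table row in the same column order. The class name `CollectionEntry` and its methods are hypothetical, not part of any OLPC tool, and the sample values are an abbreviated version of the Dive into Python entry.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class CollectionEntry:
    """One row of the content table (hypothetical helper, not an OLPC API)."""
    name: str
    filetypes: List[str]
    languages: List[str]
    url: str
    submitted_by: str
    license: str

    def to_row(self) -> str:
        """Render the entry as a pipe-separated row, matching the table columns."""
        cells = [
            self.name,
            ", ".join(self.filetypes),
            ", ".join(self.languages),
            self.url,
            self.submitted_by,
            self.license,
        ]
        return " | ".join(cells) + " |"

    def missing_fields(self) -> List[str]:
        """Return the names of fields that were left empty by the submitter."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


entry = CollectionEntry(
    name="Diveintopython",
    filetypes=["pdf", "xml"],        # abbreviated for the example
    languages=["it", "fr", "es"],    # abbreviated for the example
    url="http://diveintopython.org/",
    submitted_by="Seth",
    license="GFDL",
)
print(entry.to_row())
print(entry.missing_fields())
```

A check like `missing_fields()` is one way a reviewer could spot entries that still need information before they move on to license vetting.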
Check the License
One of the most important pieces of information is the current license status of the work. Content for OLPC must be under an acceptable open-content license such as Creative Commons or the GFDL. Content that has an unknown status, or an unacceptable license, should get a special tag and/or place on the wiki.
A group of users needs to watch this page, track down copyright information on content, manage correct attribution, and in some cases open discussions with the copyright holder about releasing the content under an acceptable license.
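The triage described above could be sketched roughly as follows. The function name, the three status labels, and the list of accepted prefixes are all assumptions made for illustration; which licenses actually qualify is a decision for the people vetting the content, and a string match is only a first pass before a human looks at the entry.

```python
# Assumption: license names that start with these strings are treated as
# acceptable. This list is illustrative, not an official OLPC policy.
ACCEPTABLE_PREFIXES = ("CC", "CREATIVE COMMONS", "GFDL", "PUBLIC DOMAIN")


def triage_license(license_text: str) -> str:
    """Sort a license string into 'acceptable', 'unknown' or 'needs-review'."""
    normalized = (license_text or "").strip().upper()
    if not normalized or normalized == "UNKNOWN":
        # Nothing recorded yet: someone has to track down the copyright status.
        return "unknown"
    if normalized.startswith(ACCEPTABLE_PREFIXES):
        return "acceptable"
    # Anything else gets flagged so the copyright holder can be contacted.
    return "needs-review"


for text in ["GFDL", "cc-by-sa 3.0", "", "All rights reserved"]:
    print(repr(text), "->", triage_license(text))
```

Entries that come back as `unknown` or `needs-review` are the ones that would get the special tag or place on the wiki.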
Format
At this point the material needs to be cleaned up for loading on the XO. Not all of these steps are required, but ideally we send kids the best and most useful content possible for its size.
Edit
Simple enough: the material must be edited and shaped to fit on the XO. This includes resizing images for the XO's screen, fixing spelling errors, etc.
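For the image resizing mentioned above, a minimal sketch of the size calculation might look like this. It assumes a target of 1200 by 900 pixels (the XO-1 display resolution, which this page does not itself specify), and the function name `fit_within` is hypothetical. It only computes the new dimensions; the actual resampling would be done with whatever image tool is at hand.

```python
# Assumption: the target is the XO-1 display, 1200 x 900 pixels.
XO_SCREEN = (1200, 900)


def fit_within(width: int, height: int, max_size=XO_SCREEN):
    """Return (width, height) scaled down to fit max_size, keeping aspect ratio.

    Images that already fit are returned unchanged (never scaled up).
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    max_w, max_h = max_size
    # Use the tighter of the two constraints, and cap at 1.0 to avoid upscaling.
    scale = min(max_w / width, max_h / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(2400, 1800))
print(fit_within(3000, 1000))
print(fit_within(800, 600))
```

Shrinking oversized images before bundling also reduces file size, which matters given the goal of sending the most useful content for its size.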
Transcode
A lot of material needs to be transcoded. Transcoding is the conversion from one file format to another. For instance, media in MP3 needs to be transcoded to Ogg, and whenever possible Flash video needs to be transcoded to Ogg video.
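One possible way to script this step is to build an `ffmpeg` command line per file. This is a sketch under stated assumptions: it assumes an `ffmpeg` build with the `libvorbis` and `libtheora` encoders is installed, the quality values are arbitrary starting points, and the helper name `transcode_command` is hypothetical. The page does not prescribe a particular tool.

```python
from pathlib import Path
from typing import List


def transcode_command(source: str, video: bool = False) -> List[str]:
    """Build an ffmpeg command that converts `source` to Ogg Vorbis or Ogg Theora.

    Assumes ffmpeg was built with libvorbis and libtheora support.
    """
    src = Path(source)
    if video:
        # Video: Theora video plus Vorbis audio in an .ogv file.
        target = src.with_suffix(".ogv")
        codec_args = ["-c:v", "libtheora", "-q:v", "6",
                      "-c:a", "libvorbis", "-q:a", "4"]
    else:
        # Audio only: drop any video stream (-vn) and encode as Vorbis.
        target = src.with_suffix(".ogg")
        codec_args = ["-vn", "-c:a", "libvorbis", "-q:a", "4"]
    return ["ffmpeg", "-i", str(src), *codec_args, str(target)]


print(" ".join(transcode_command("song.mp3")))
print(" ".join(transcode_command("clip.flv", video=True)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to perform the conversion. Building the command separately from running it makes it easy to review a whole batch before any files are touched.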
Translate and Localize
Here comes the hard part. Most, if not all, of our content needs to be translated if it is going to be useful in our deployment countries. See Translate for details.
Bundling
Examples
Photos
- I am a non-profit with a large digital stock of scanned historical maps from Africa
- I have a series of photos of puppets from around the world
Internet Archive
www.archive.org has hundreds of thousands of scanned-in books -- 50,000 live concerts -- hundreds of thousands of movies, animations and videos -- all freely licensed or in the public domain. They are actively interested in transcoding all their videos into Ogg Theora so they can be played on the XO. Almost all of the music is in lossless formats, so it would transcode well into Ogg Vorbis or into Speex.
nasaimages.com has thousands of public-domain photos taken in space or of space, rockets and related subjects, plus some movies. It's managed by the Internet Archive in collaboration with NASA.
Vetting of Content
Depending on how much information was recorded by the content's finder and initial logger to the wiki, additional information will likely need to be recorded.
Cross Posting
The last step is cross-posting the material to any other websites where it belongs. For instance, an audiobook could go to archive.org or to LibriVox, and an illustration could be uploaded to Wikimedia Commons.