Bundle metadata: Difference between revisions

From OLPC
("collections" seems to be the preferred term over "content bundles")
 
(5 intermediate revisions by 4 users not shown)
Line 1: Line 1:
OLPC bundles (currently [[activity bundles]] and [[content bundles]], though [[bundles|others]] have been discussed from time to time) contain a set of metadata in a <tt>.info</tt> file. Currently a slightly different set of data is gathered for [[activity bundles]] and [[sample library.info file|library bundles]].
OLPC bundles (currently [[activity bundles]] and [[collections]], though [[bundles|others]] have been discussed from time to time) contain a set of metadata in a <tt>.info</tt> file. Currently a slightly different set of data is gathered for each type (see a [[sample library.info file]] for a collection).


A separate set of metadata again is gathered via specfiles for rpms that are included in latest builds. So to include your latest version of an activity, the maintainer would need to
A separate set of metadata again is gathered via specfiles for rpms that are included in latest builds. So to include your latest version of an activity, the maintainer would need to
Line 26: Line 26:
* <font color=blue>name</font> - string
* <font color=blue>name</font> - string
* <font color=blue>long_name</font> - or better "pretty_name", for display (could be shorter than <tt>name</tt>
* <font color=blue>long_name</font> - or better "pretty_name", for display (could be shorter than <tt>name</tt>
* <font color=red>version</font> - int.int
* <font color=blue>mime_types</font> - semicolon? separated list of strings
* <font color=red>sugar_version</font> - int.int
* <font color=blue>locale</font> - comma? separated list of strings
* <font color=green>category</font> - string
* <font color=purple>bundle_id</font> (change name for clarity, standard style?)
* <font color=purple>bundle_id</font> (change name for clarity, standard style?)
* <font color=purple>icon</font> - filename (change name for clarity; add path?)
* <font color=purple>icon</font> - filename (change name for clarity; add path?)
* <font color=purple>exec</font> - shell invocation (change name for clarity?)
* <font color=purple>exec</font> - shell invocation (change name for clarity?)
* <font color=blue>mime_types</font> - semicolon? separated list of strings
* <font color=red>version</font> - int.int
* <font color=blue>locale</font> - comma? separated list of strings
* <font color=red>sugar_version</font> - int.int
* <font color=green>category</font> - string
* <font color=red>author</font> - string (creator?)
* <font color=red>author</font> - string (creator?)
* <font color=red>maintainer</font> - string, rfc822 email header
* <font color=red>maintainer</font> - string, rfc822 email header
Line 39: Line 39:
* <font color=red>last_update</font> - date
* <font color=red>last_update</font> - date
* <font color=red>url</font> - url (or 'project url')
* <font color=red>url</font> - url (or 'project url')
* <font color=red>collection_type</font> - type (for client-mediated display)


== files ==
== files ==
=== acknowldgements - required ===
=== acknowledgments - required ===

=== changelog - required ===
=== changelog - required ===
** person/author
** person/author
Line 49: Line 51:
=== [[bib_info]] -- optional ===
=== [[bib_info]] -- optional ===


The point of the [[bib_info|bib_info.xml]] is to make it easy for large-scale learning object repositories like DSpace, Fedora Repository Server, GLOBE, and ARIADNE to serve up .xo bundles. For this reason we chose a simple standard -- Dublin Core encoded in XML -- that plays well with others rather than create our own.
This is an optional file for those that want to add detailed bibliographic information to their .xo bundle. I have modified a set of elements originally submitted to OLPC by Christine Madsen, now of the Oxford Internet Institute. We don't expect the average person to input this information but academics, educators, and thinkers might take the time to record this information. If they do take the time, here is a standard they can use.

These elements are taken from the [http://dublincore.org/documents/usageguide/elements.shtml Basic Dublin Core metadata elements].

# TITLE
# AUTHOR_NAME
# CREATION_DATE YYYY-MM-DD, YYYY-MM, or YYYY
# COUNTRY (?Is this really necessary? Isn't Language enough?) [http://www.iso.org/iso/country_codes/iso_3166_code_lists/english_country_names_and_code_elements.htm ISO 3166 Country codes]
# LANGUAGE [http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes ISO 639-1 codes]
# SUBJECT This is the same as Category, Should be one of top-level Categories from the [[Sample_library.info_file| Sample Library Info file]]
# KEYWORDS (?Necessary? when we might use separate tags?)
# OBJECT_TYPE
# FORMAT [http://www.iana.org/assignments/media-types/ Basically a MIME Type]
# NATIVE_ID ID from external data provider, like a Course ID
# CONTRIBUTOR_NAME
# COLLECTION_NAME
# COLLECTION_URL
# SUBMISSION_DATE YYYY-MM-DD, YYYY-MM, or YYYY
# URL
# LEARNING_LEVEL primary, secondary, university
# COPYRIGHT_TYPE e.g. CC-By 3.0, MIT License, GPL v2 but is identified by URL


Here is a sample bib_info file in XML format (coming). If [[Canonical_JSON]] is a better way to do this, I would love to learn how to use it. --[[User:Berrybw|Bryan Berry]]

Notes:
* Need to set up Controlled Vocabulary for OBJECT_TYPE
* The COPYRIGHT_TYPE is designated by a URL to the appropriate license
http://creativecommons.org/licenses/by/3.0/ -- Default
http://creativecommons.org/licenses/by-nc/3.0/
http://creativecommons.org/licenses/by-nc-sa/3.0/
http://creativecommons.org/licenses/by-sa/3.0/

Here are some good links on Metadata and Controlled Vocabularies
* Wikipedia [http://en.wikipedia.org/wiki/Controlled_vocabulary Controlled Vocabularies]
* [http://www.boxesandarrows.com/view/what_is_a_controlled_vocabulary_ What is a Controlled Vocabulary?]
* Cory Doctorow's [http://www.well.com/~doctorow/metacrap.htm MetaCrap]


Basically, a Controlled Vocabulary is a set of standard values. This is pretty important for the Copyright Type, language, and country. Less so for the other elements.


=== manifest ===
=== manifest ===

Latest revision as of 09:08, 20 October 2008

OLPC bundles (currently activity bundles and collections, though others have been discussed from time to time) contain a set of metadata in a .info file. Currently a slightly different set of data is gathered for each type (see a sample library.info file for a collection).

A further, separate set of metadata is gathered via specfiles for the rpms that are included in the latest builds. So to include the latest version of an activity, its maintainer would need to:

  • update the .info file inside its bundle, and repackage the .xo[l] file
  • update the .spec file inside the rpm defining the bundle
  • put the new spec into joyride
  • update the wiki or other source repository with the latest version of the bundle file.
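
The repackaging part of the first step can be sketched in Python, since .xo and .xol files are zip archives. This is only an illustration: the function name repackage_bundle and the example paths are hypothetical, and the sketch does not touch the .spec file or joyride.

```python
import os
import zipfile


def repackage_bundle(bundle_dir, output_path):
    """Zip a bundle directory into a .xo/.xol archive.

    Assumption: the archive should contain the bundle directory itself as
    its top-level entry (e.g. Example.activity/...). output_path should
    live outside bundle_dir so the archive does not include itself.
    """
    bundle_dir = os.path.abspath(bundle_dir)
    parent = os.path.dirname(bundle_dir)
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(bundle_dir):
            for filename in sorted(files):
                full_path = os.path.join(root, filename)
                # Store paths relative to the parent, so the bundle
                # directory name is kept inside the archive.
                archive.write(full_path, os.path.relpath(full_path, parent))
    return output_path
```

A call such as repackage_bundle("Example.activity", "Example-2.xo") would be made after editing the .info file; both names here are placeholders.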

As of late 2007, there is some agreement that the metadata tracked in bundle info files needs to be expanded.

Draft proposal for updating bundle metadata requirements

Below is an early draft of a suggested set of fields for a unified .info file that can be used for all bundles (including generic bundles of a few files put together in order to share them). Depending on the context, some of these fields will not be used or useful.

Included is a set of suggested required files that all bundles should contain. Currently only the .info file and a Manifest are required; the latter, and some of the fields of the former, are not in use, while other files, such as a uniformly named file for acknowledgements and copyright/licensing details, have not been named.

Please add to the discussion, add or remove suggested fields, or alter the lines drawn between required files in a config directory and required fields in the .info file itself. For instance, changelog data that is kept directly within .spec files is here relegated to a required changelog file.


suggested unified info file

blue : current field, same meaning
purple : current field, some change in meaning; or name change to make it clearer
green : current field for one of the .info/.spec formats we use
red : new desired field; either replacing existing but unused field or adding new information
  • name - string
  • long_name - or better "pretty_name", for display (could be shorter than name)
  • mime_types - semicolon? separated list of strings
  • locale - comma? separated list of strings
  • category - string
  • bundle_id (change name for clarity, standard style?)
  • icon - filename (change name for clarity; add path?)
  • exec - shell invocation (change name for clarity?)
  • version - int.int
  • sugar_version - int.int
  • author - string (creator?)
  • maintainer - string, rfc822 email header
  • license - fedora license fields
  • last_update - date
  • url - url (or 'project url')
  • collection_type - type (for client-mediated display)
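
To make the draft more concrete, here is a minimal Python sketch of what a unified .info file with these fields might look like and how it could be read. Several things are assumptions, not part of the proposal: the [Bundle] section name, all sample values, and the separators (the draft itself marks the semicolon and comma as open questions). The helper name parse_info is hypothetical.

```python
import configparser

# Hypothetical sample; the section name and every value are illustrative.
SAMPLE_INFO = """\
[Bundle]
name = Example
long_name = Example Activity
mime_types = text/plain;text/html
locale = en,es
category = Games
bundle_id = org.example.ExampleActivity
icon = activity-example
exec = example-activity
version = 1.2
sugar_version = 0.82
author = A. Author
maintainer = A. Maintainer <maintainer@example.org>
license = GPLv2+
last_update = 2008-10-20
url = http://example.org/example
"""


def split_list(value, separator):
    """Split a separated list of strings, dropping empty items."""
    return [item.strip() for item in value.split(separator) if item.strip()]


def parse_int_int(value):
    """Parse an 'int.int' field into a (major, minor) tuple."""
    major, minor = value.split(".")
    return int(major), int(minor)


def parse_info(text, section="Bundle"):
    """Read the draft fields from .info text into a dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    fields = dict(parser[section])
    fields["mime_types"] = split_list(fields.get("mime_types", ""), ";")
    fields["locale"] = split_list(fields.get("locale", ""), ",")
    for key in ("version", "sugar_version"):
        if key in fields:
            fields[key] = parse_int_int(fields[key])
    return fields
```

Keeping the list and version parsing in small helpers means the separators can be changed in one place once the open questions above are settled.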

files

acknowledgments - required

changelog - required

    • person/author
    • merge information : specific versions of projects being branched/borrowed from
    • date/time

bib_info -- optional

The point of the bib_info.xml is to make it easy for large-scale learning object repositories like DSpace, Fedora Repository Server, GLOBE, and ARIADNE to serve up .xo bundles. For this reason we chose a simple standard -- Dublin Core encoded in XML -- that plays well with others rather than create our own.
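
As a rough illustration of "Dublin Core encoded in XML", the following Python sketch writes a few Dublin Core elements using the standard dc namespace. The root element name metadata, the function name build_bib_info, and the sample values are assumptions; no schema for bib_info.xml has been fixed on this page.

```python
import xml.etree.ElementTree as ET

# Namespace of the basic Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_bib_info(fields):
    """Return an XML string with one dc:* element per entry in fields.

    Assumption: a plain <metadata> root; the real bib_info.xml layout
    is not defined here.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
        child.text = value
    return ET.tostring(root, encoding="unicode")


sample_xml = build_bib_info({
    "title": "Example Collection",
    "creator": "A. Author",
    "date": "2008-10-20",
    "language": "en",
    "format": "text/html",
    "rights": "http://creativecommons.org/licenses/by/3.0/",
})
```

Because the output uses the common dc namespace, a repository that already understands Dublin Core should be able to read it without OLPC-specific code, which is the motivation given above.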


manifest

tags

There was debate about how to handle this. A section in bib_info, including file sections for the collection as a whole or named subcollections? A separate file with a set of all relevant tags for the whole bundle, one per line? An entry in the .info file?
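
For comparison, the "separate file, one tag per line" option would be very simple to read. The sketch below assumes only that layout; the function name parse_tags is hypothetical and nothing here reflects a decision.

```python
def parse_tags(text):
    """Return the tags in text, one per line, skipping blanks and repeats."""
    tags = []
    seen = set()
    for line in text.splitlines():
        tag = line.strip()
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags
```

The other two options (a section in bib_info, or an entry in the .info file) would reuse whatever parser those files already need, at the cost of a less obvious place to edit tags by hand.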