Bundle metadata
OLPC bundles (currently activity bundles and content bundles, though others have been discussed from time to time) contain a set of metadata in a .info file. Currently a slightly different set of data is gathered for activity bundles and library bundles.
Yet another set of metadata is gathered via spec files for the RPMs that are included in the latest builds. To get the latest version of an activity into a build, its maintainer would therefore need to:
- update the .info file inside its bundle, and repackage the .xo[l] file
- update the .spec file inside the rpm defining the bundle
- put the new spec into joyride
- update the wiki or other source repository with the latest version of the bundle file.
As of late 2007, there is some agreement that the metadata tracked in bundle info files needs to be expanded.
Draft proposal for updating bundle metadata requirements
Below is an early draft of a suggested set of fields for a unified .info file that can be used for all bundles (including generic bundles of a few files put together in order to share them). Depending on the context, some of these fields will not be used or useful.
Also included is a suggested set of required files that all bundles should contain. Currently only the .info file and a Manifest are required; the Manifest and some fields of the .info file are not in use, while other files, such as a uniformly named file for acknowledgements and copyright/licensing details, have not yet been given names.
Please add to the discussion, add or remove suggested fields, or alter the lines drawn between required files in a config directory and required fields in the .info file itself. For instance, changelog data that is kept directly within .spec files is here relegated to a required changelog file.
suggested unified info file
- blue : current field, same meaning
- purple : current field, some change in meaning; or name change to make it clearer
- green : current field for one of the .info/.spec formats we use
- red : new desired field; either replacing existing but unused field or adding new information
- name - string
- long_name - or better "pretty_name", for display (could be shorter than name)
- version - int.int
- sugar_version - int.int
- bundle_id (change name for clarity, standard style?)
- icon - filename (change name for clarity; add path?)
- exec - shell invocation (change name for clarity?)
- mime_types - semicolon? separated list of strings
- locale - comma? separated list of strings
- category - string
- author - string (creator?)
- maintainer - string, rfc822 email header
- license - fedora license fields
- last_update - date
- url - url (or 'project url')
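As a rough illustration of how the draft fields above might be read and checked, here is a minimal Python sketch. It assumes the unified .info file keeps an INI-style layout; the `[Bundle]` section name, the `parse_info` helper, and every sample value are hypothetical. It also assumes the tentative separators from the list (semicolons for mime_types, commas for locale) and uses "pretty_name" in place of long_name.

```python
import configparser
import re

# Hypothetical sample of a unified .info file; the [Bundle] section name and
# all values are assumptions made for illustration only.
SAMPLE_INFO = """
[Bundle]
name = Example
pretty_name = Example Activity
version = 1.2
sugar_version = 0.82
bundle_id = org.example.ExampleActivity
icon = activity-example
exec = example-launcher
mime_types = text/plain;image/png
locale = en,es
category = Tools
author = A. Author
maintainer = A. Maintainer <maintainer@example.org>
license = GPLv2+
last_update = 2007-12-01
url = http://example.org/
"""

VERSION_RE = re.compile(r"\d+\.\d+")  # the draft's "int.int" form


def split_list(value, separator):
    """Split a separated string into a list, dropping empty items."""
    return [item.strip() for item in value.split(separator) if item.strip()]


def parse_info(text, section="Bundle"):
    """Parse .info text and return a dict with the list fields split out."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    fields = dict(parser[section])
    for key in ("version", "sugar_version"):
        if key in fields and not VERSION_RE.fullmatch(fields[key]):
            raise ValueError(f"{key} must look like int.int, got {fields[key]!r}")
    # Separators follow the draft's tentative choices (semicolon, comma).
    fields["mime_types"] = split_list(fields.get("mime_types", ""), ";")
    fields["locale"] = split_list(fields.get("locale", ""), ",")
    return fields


info = parse_info(SAMPLE_INFO)
print(info["name"], info["version"], info["mime_types"], info["locale"])
```

If the separators or field names change as the draft is discussed, only `split_list` calls and the key names in `parse_info` would need to follow.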
files
acknowledgements - required
changelog - required
- person/author
- merge information : specific versions of projects being branched/borrowed from
- date/time
bib_info -- optional
This is an optional file for those that want to add detailed bibliographic information to their .xo bundle. I have modified a set of elements originally submitted to OLPC by Christine Madsen, now of the Oxford Internet Institute. We don't expect the average person to input this information but academics, educators, and thinkers might take the time to record this information. If they do take the time, here is a standard they can use.
These elements are taken from the Basic Dublin Core metadata elements.
- TITLE
- AUTHOR_NAME
- CREATION_DATE YYYY-MM-DD, YYYY-MM, or YYYY
- COUNTRY (?Is this really necessary? Isn't Language enough?) ISO 3166 Country codes
- LANGUAGE ISO 639-1 codes
- SUBJECT Same as Category; should be one of the top-level categories from the Sample Library Info file
- KEYWORDS (?Necessary? when we might use separate tags?)
- OBJECT_TYPE
- FORMAT Basically a MIME Type
- NATIVE_ID ID from external data provider, like a Course ID
- CONTRIBUTOR_NAME
- COLLECTION_NAME
- COLLECTION_URL
- SUBMISSION_DATE YYYY-MM-DD, YYYY-MM, or YYYY
- URL
- LEARNING_LEVEL primary, secondary, university
- COPYRIGHT_TYPE e.g. CC-By 3.0, MIT License, GPL v2; identified by the URL of the license
Here is a sample bib_info file in XML format (coming). If Canonical_JSON is a better way to do this, I would love to learn how to use it.
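In the meantime, one possible shape for such a file can be sketched in Python. This is only an illustration under stated assumptions: the `<bib_info>` root element, the one-child-element-per-field layout, and the `build_bib_info` helper are hypothetical, and all values are invented placeholders. The date check covers the three forms listed for CREATION_DATE and SUBMISSION_DATE.

```python
import re
import xml.etree.ElementTree as ET

# Accepts YYYY, YYYY-MM, or YYYY-MM-DD, as listed for the date elements.
DATE_RE = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")
DATE_FIELDS = ("CREATION_DATE", "SUBMISSION_DATE")


def build_bib_info(elements):
    """Return bib_info XML text; the <bib_info> root name is hypothetical."""
    root = ET.Element("bib_info")
    for name, value in elements.items():
        if name in DATE_FIELDS and not DATE_RE.fullmatch(value):
            raise ValueError(f"{name} has an unsupported date form: {value!r}")
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


# All values below are invented placeholders.
sample = {
    "TITLE": "Example Collection",
    "AUTHOR_NAME": "A. Author",
    "CREATION_DATE": "2007-11",
    "LANGUAGE": "en",
    "FORMAT": "text/html",
    "LEARNING_LEVEL": "primary",
    "COPYRIGHT_TYPE": "http://creativecommons.org/licenses/by/3.0/",
}
xml_text = build_bib_info(sample)
print(xml_text)
```

Optional elements are simply left out of the dictionary, so a contributor can record as much or as little as they have.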
Notes:
- Need to set up Controlled Vocabulary for OBJECT_TYPE
- The COPYRIGHT_TYPE is designated by a URL to the appropriate license
- http://creativecommons.org/licenses/by/3.0/ (default)
- http://creativecommons.org/licenses/by-nc/3.0/
- http://creativecommons.org/licenses/by-nc-sa/3.0/
- http://creativecommons.org/licenses/by-sa/3.0/
Here are some good links on Metadata and Controlled Vocabularies
- Wikipedia Controlled Vocabularies
- What is a Controlled Vocabulary?
- Cory Doctorow's MetaCrap
Basically, a Controlled Vocabulary is a set of standard values. This is pretty important for the copyright type, language, and country, and less so for the other elements.
manifest
tags
There was debate about how to handle this. A section in bib_info, including file sections for the collection as a whole or named subcollections? A separate file with a set of all relevant tags for the whole bundle, one per line? An entry in the .info file?
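To make one of these options concrete, here is a minimal Python sketch of the "separate file, one tag per line" variant. The `parse_tags` helper and the sample tags are hypothetical; nothing here has been agreed on, and the other options would need a different reader.

```python
def parse_tags(text):
    """Return unique tags in first-seen order, skipping blank lines."""
    seen = []
    for line in text.splitlines():
        tag = line.strip()
        if tag and tag not in seen:
            seen.append(tag)
    return seen


# Invented sample content for a whole-bundle tags file.
sample_tags = "math\ngeometry\n\nmath\n"
print(parse_tags(sample_tags))
```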