Bundle metadata

From OLPC
Revision as of 01:04, 31 January 2008 by Annegentle (talk | contribs) (acknowldgements - required)

OLPC bundles (currently activity bundles and content bundles, though others have been discussed from time to time) contain a set of metadata in a .info file. Currently a slightly different set of data is gathered for activity bundles and library bundles.

Yet another set of metadata is gathered via spec files for the RPMs that are included in the latest builds. So to get the latest version of an activity included, its maintainer would need to:

update the .info file inside its bundle, and repackage the .xo[l] file
update the .spec file inside the rpm defining the bundle
put the new spec into joyride
update the wiki or other source repository with the latest version of the bundle file.

As of late 2007, there is some agreement that the metadata tracked in bundle info files needs to be expanded.

Draft proposal for updating bundle metadata requirements

Below is an early draft of a suggested set of fields for a unified .info file that can be used for all bundles (including generic bundles of a few files put together in order to share them). Depending on the context, some of these fields will not be used or useful.

Also included is a suggested set of required files that all bundles should contain. Currently only the .info file and a Manifest are required; the Manifest and some fields of the .info file are not in use, while other files, such as a uniformly named file for acknowledgements and copyright/licensing details, have not been named.

Please add to the discussion, add or remove suggested fields, or alter the lines drawn between required files in a config directory and required fields in the .info file itself. For instance, changelog data that is kept directly within .spec files is here relegated to a required changelog file.


suggested unified info file

blue : current field, same meaning
purple : current field, some change in meaning; or name change to make it clearer
green : current field for one of the .info/.spec formats we use
red : new desired field; either replacing existing but unused field or adding new information
  • name - string
  • long_name - or better "pretty_name", for display (could be shorter than name)
  • version - int.int
  • sugar_version - int.int
  • bundle_id (change name for clarity, standard style?)
  • icon - filename (change name for clarity; add path?)
  • exec - shell invocation (change name for clarity?)
  • mime_types - semicolon? separated list of strings
  • locale - comma? separated list of strings
  • category - string
  • author - string (creator?)
  • maintainer - string, rfc822 email header
  • license - fedora license fields
  • last_update - date
  • url - url (or 'project url')
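
As a rough illustration of the field list above, the following Python sketch parses a hypothetical unified .info file and checks a few of the proposed value formats. The `[Bundle]` section name, the sample values, the `parse_info` helper, and the choice of separators (semicolons for mime_types, commas for locale) are all assumptions; the draft has not settled any of them.

```python
import configparser
import re

# Hypothetical sample; the [Bundle] section name and every value are assumptions.
SAMPLE_INFO = """
[Bundle]
name = ExampleActivity
pretty_name = Example
version = 1.2
sugar_version = 0.8
bundle_id = org.example.ExampleActivity
icon = activity-example
exec = example-activity
mime_types = text/plain;text/html
locale = en,es
category = Examples
author = A. Author
maintainer = A. Maintainer <maintainer@example.org>
license = GPLv2+
last_update = 2008-01-31
url = http://example.org/
"""

VERSION_RE = re.compile(r"^\d+\.\d+$")  # the draft's "int.int"


def parse_info(text):
    """Return the proposed fields as a dict, splitting the list-valued ones."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    fields = dict(parser["Bundle"])
    # Both version fields are proposed as int.int.
    for key in ("version", "sugar_version"):
        if not VERSION_RE.match(fields.get(key, "")):
            raise ValueError(f"{key} must look like int.int")
    # Separators are assumed; the draft marks both with a question mark.
    fields["mime_types"] = [
        m.strip() for m in fields.get("mime_types", "").split(";") if m.strip()
    ]
    fields["locale"] = [
        loc.strip() for loc in fields.get("locale", "").split(",") if loc.strip()
    ]
    return fields


info = parse_info(SAMPLE_INFO)
print(info["version"], info["mime_types"], info["locale"])
```

Fields that a given bundle type does not use could simply be left out; the sketch only insists on the two version fields.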

files

acknowledgments - required

changelog - required

    • person/author
    • merge information : specific versions of projects being branched/borrowed from
    • date/time

bib_info -- optional

This is an optional file for those that want to add detailed bibliographic information to their .xo bundle. I have modified a set of elements originally submitted to OLPC by Christine Madsen, now of the Oxford Internet Institute. We don't expect the average person to input this information but academics, educators, and thinkers might take the time to record this information. If they do take the time, here is a standard they can use.

These elements are taken from the Basic Dublin Core metadata elements.

  1. TITLE
  2. AUTHOR_NAME
  3. CREATION_DATE YYYY-MM-DD, YYYY-MM, or YYYY
  4. COUNTRY (?Is this really necessary? Isn't Language enough?) ISO 3166 Country codes
  5. LANGUAGE ISO 639-1 codes
  6. SUBJECT This is the same as Category; it should be one of the top-level Categories from the Sample Library Info file
  7. KEYWORDS (?Necessary? when we might use separate tags?)
  8. OBJECT_TYPE
  9. FORMAT Basically a MIME Type
  10. NATIVE_ID ID from external data provider, like a Course ID
  11. CONTRIBUTOR_NAME
  12. COLLECTION_NAME
  13. COLLECTION_URL
  14. SUBMISSION_DATE YYYY-MM-DD, YYYY-MM, or YYYY
  15. URL
  16. LEARNING_LEVEL primary, secondary, university
  17. COPYRIGHT_TYPE e.g. CC-By 3.0, MIT License, GPL v2 but is identified by URL
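
A few of the elements above carry format constraints that are easy to check mechanically. The sketch below, in Python, validates the two date elements and LEARNING_LEVEL. It assumes a bib_info record has already been loaded into a plain dict keyed by element name; the file format itself is still undecided, and the `is_valid_date` and `check_bib_info` helpers are hypothetical names.

```python
import re
from datetime import datetime

# The three date shapes the element list allows: YYYY-MM-DD, YYYY-MM, YYYY.
DATE_SHAPES = (
    (re.compile(r"^\d{4}-\d{2}-\d{2}$"), "%Y-%m-%d"),
    (re.compile(r"^\d{4}-\d{2}$"), "%Y-%m"),
    (re.compile(r"^\d{4}$"), "%Y"),
)
LEARNING_LEVELS = {"primary", "secondary", "university"}


def is_valid_date(value):
    """True if value has one of the allowed shapes and is a real date."""
    for shape, fmt in DATE_SHAPES:
        if shape.match(value):
            try:
                datetime.strptime(value, fmt)
                return True
            except ValueError:
                return False  # right shape, impossible date (e.g. month 13)
    return False


def check_bib_info(record):
    """Return a list of problems found in a bib_info record (a dict)."""
    problems = []
    for key in ("CREATION_DATE", "SUBMISSION_DATE"):
        if key in record and not is_valid_date(record[key]):
            problems.append(f"{key}: expected YYYY-MM-DD, YYYY-MM, or YYYY")
    level = record.get("LEARNING_LEVEL")
    if level is not None and level not in LEARNING_LEVELS:
        problems.append("LEARNING_LEVEL: expected primary, secondary, or university")
    return problems


print(check_bib_info({"CREATION_DATE": "2008-02-30", "LEARNING_LEVEL": "primary"}))
```

Since the whole file is optional, the sketch treats every element as optional too and only complains about values that are present but malformed.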


Here is a sample bib_info file in XML format (coming). If Canonical_JSON is a better way to do this, I would love to learn how to use it. --Bryan Berry

Notes:

  • Need to set up Controlled Vocabulary for OBJECT_TYPE
  • The COPYRIGHT_TYPE is designated by a URL to the appropriate license

http://creativecommons.org/licenses/by/3.0/ -- Default
http://creativecommons.org/licenses/by-nc/3.0/
http://creativecommons.org/licenses/by-nc-sa/3.0/
http://creativecommons.org/licenses/by-sa/3.0/

Here are some good links on Metadata and Controlled Vocabularies

Basically, a Controlled Vocabulary is a set of standard values. This is pretty important for the Copyright Type, language, and country, and less so for the other elements.
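
In code, a controlled vocabulary amounts to a membership check against a fixed set. The Python sketch below applies that idea to COPYRIGHT_TYPE. It assumes the vocabulary is limited to the four Creative Commons URLs listed in the notes, which the draft does not actually state (other licenses such as MIT or GPL v2 would need their own URLs added); the `normalize_copyright_type` helper is a hypothetical name.

```python
# Default taken from the notes above; the closed set itself is an assumption.
DEFAULT_LICENSE = "http://creativecommons.org/licenses/by/3.0/"
LICENSE_VOCABULARY = {
    DEFAULT_LICENSE,
    "http://creativecommons.org/licenses/by-nc/3.0/",
    "http://creativecommons.org/licenses/by-nc-sa/3.0/",
    "http://creativecommons.org/licenses/by-sa/3.0/",
}


def normalize_copyright_type(value=None):
    """Return a license URL from the vocabulary, or the default if none is given."""
    if value is None or not value.strip():
        return DEFAULT_LICENSE
    url = value.strip()
    if url not in LICENSE_VOCABULARY:
        raise ValueError(f"unknown COPYRIGHT_TYPE: {url}")
    return url


print(normalize_copyright_type())
print(normalize_copyright_type(" http://creativecommons.org/licenses/by-sa/3.0/ "))
```

The same pattern would work for LANGUAGE and COUNTRY, with the sets drawn from the ISO 639-1 and ISO 3166 code lists instead.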

manifest

tags

There was debate about how to handle this. A section in bib_info, including file sections for the collection as a whole or named subcollections? A separate file with a set of all relevant tags for the whole bundle, one per line? An entry in the .info file?