Bundle metadata: Difference between revisions

From OLPC
Revision as of 19:16, 29 January 2008

OLPC bundles (currently activity bundles and content bundles, though others have been discussed from time to time) contain a set of metadata in a .info file. Currently a slightly different set of data is gathered for activity bundles and library bundles.

Yet another set of metadata is gathered via spec files for the RPMs included in the latest builds. So to publish the latest version of an activity, its maintainer needs to:

  • update the .info file inside the bundle, and repackage the .xo[l] file
  • update the .spec file inside the rpm defining the bundle
  • put the new spec into joyride
  • update the wiki or other source repository with the latest version of the bundle file.
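
Because the same version number has to be maintained by hand in both the .info file and the .spec file, the two can drift apart. The following is a minimal sketch of a consistency check, not an existing OLPC tool. It assumes the .info file is INI-style with an `[Activity]` section and an `activity_version` key, and that the spec file carries a standard `Version:` tag; the function names and sample data are hypothetical.

```python
import configparser
import re


def info_version(info_text, section="Activity", key="activity_version"):
    """Return the version string stored in a .info file's text.

    The section and key names are assumptions; adjust them to the
    bundle type being checked.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(info_text)
    return parser.get(section, key).strip()


def spec_version(spec_text):
    """Return the value of the 'Version:' tag in an RPM spec file's text."""
    match = re.search(r"^Version:\s*(\S+)", spec_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no Version: tag found in spec text")
    return match.group(1)


def versions_agree(info_text, spec_text):
    """True when the .info file and the .spec file name the same version."""
    return info_version(info_text) == spec_version(spec_text)


# Illustrative data only, not taken from a real bundle.
SAMPLE_INFO = "[Activity]\nname = Example\nactivity_version = 7\n"
SAMPLE_SPEC = "Name: example-activity\nVersion: 7\nRelease: 1\n"

if __name__ == "__main__":
    print(versions_agree(SAMPLE_INFO, SAMPLE_SPEC))
```

A check like this could run before the spec is put into joyride, so that a forgotten update is caught early.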

As of late 2007, there is some agreement that the metadata tracked in bundle info files needs to be expanded.

Draft proposal for updating bundle metadata requirements

Below is an early draft of a suggested set of fields for a unified .info file that can be used for all bundles (including generic bundles of a few files put together in order to share them). Depending on the context, some of these fields will not be used or useful.

Also included is a set of suggested required files that all bundles should contain. Currently only the .info file and a Manifest are required; the Manifest and some fields of the .info file are not in use, while other files, such as a uniformly named file for acknowledgements and copyright/licensing details, have not yet been named.

Please add to the discussion, add or remove suggested fields, or alter the lines drawn between required files in a config directory and required fields in the .info file itself. For instance, changelog data that is kept directly within .spec files is here relegated to a required changelog file.


suggested unified info file

blue : current field, same meaning
purple : current field, some change in meaning; or name change to make it clearer
green : current field for one of the .info/.spec formats we use
red : new desired field; either replacing existing but unused field or adding new information
  • name - string
  • long_name - or better "pretty_name", for display (could be shorter than name)
  • version - int.int
  • sugar_version - int.int
  • bundle_id (change name for clarity, standard style?)
  • icon - filename (change name for clarity; add path?)
  • exec - shell invocation (change name for clarity?)
  • mime_types - semicolon? separated list of strings
  • locale - comma? separated list of strings
  • category - string
  • author - string (creator?)
  • maintainer - string, rfc822 email header
  • license - fedora license fields
  • last_update - date
  • url - url (or 'project url')
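
To make the draft field list concrete, here is a minimal sketch of how a unified .info file might be read and checked. It is an illustration under stated assumptions, not an agreed format: the `[Bundle]` section name, the choice of which fields count as required, and the separators (semicolon for mime_types, comma for locale, both still marked with a question mark above) are all hypothetical.

```python
import configparser
import re

# Assumption: the draft does not yet say which fields are mandatory.
REQUIRED_FIELDS = ("name", "version")

# The draft gives "int.int" for version and sugar_version.
VERSION_PATTERN = re.compile(r"\d+\.\d+")


def split_list(value, separator):
    """Split a separated string into a list of non-empty, stripped items."""
    return [item.strip() for item in value.split(separator) if item.strip()]


def parse_unified_info(text, section="Bundle"):
    """Parse and validate the text of a hypothetical unified .info file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(section):
        raise ValueError("missing [%s] section" % section)
    fields = dict(parser.items(section))

    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    for key in ("version", "sugar_version"):
        value = fields.get(key)
        if value is not None and not VERSION_PATTERN.fullmatch(value):
            raise ValueError("%s must look like int.int, got %r" % (key, value))

    fields["mime_types"] = split_list(fields.get("mime_types", ""), ";")
    fields["locale"] = split_list(fields.get("locale", ""), ",")
    return fields


# Illustrative values only.
SAMPLE_INFO = """\
[Bundle]
name = example
long_name = Example Activity
version = 1.2
sugar_version = 0.82
bundle_id = org.example.ExampleActivity
mime_types = text/plain;text/html
locale = en,es
license = GPLv2+
"""

if __name__ == "__main__":
    print(parse_unified_info(SAMPLE_INFO))
```

Fields that are absent simply do not appear in the result, which matches the note that some fields will not be used in every context.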

files

acknowledgements - required

changelog - required

    • person/author
    • merge information : specific versions of projects being branched/borrowed from
    • date/time

bib_info

    • name/title/attrib/filename &c. for each leaf in a collection (use existing bibliographic info standard)

This is an optional file for those that want to add detailed bibliographic information to their .xo bundle. I have modified a set of elements originally submitted to OLPC by Christine Madsen, now of the Oxford Internet Institute.

These elements are taken from the Basic Dublin Core metadata elements.

  1. TITLE
  2. AUTHOR_NAME
  3. CREATION_DATE
  4. COUNTRY (?Is this really necessary? Isn't Language enough?) ISO 3166 Country codes
  5. LANGUAGE ISO 639-1 codes
  6. SUBJECT This is the same as Category; it should be one of the top-level Categories from the Sample Library Info file
  7. KEYWORDS (?Necessary? when we might use separate tags?)
  8. OBJECT_TYPE
  9. FORMAT Basically a MIME Type
  10. NATIVE_ID ID from external data provider, like a Course ID
  11. CONTRIBUTOR_NAME
  12. COLLECTION_NAME
  13. COLLECTION_URL
  14. SUBMISSION_DATE
  15. URL
  16. LEARNING_LEVEL primary, secondary, university
  17. COPYRIGHT_TYPE
  18. COPYRIGHT_URL

Here is a sample bib_info file in XML format (coming)
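
Until that sample is available, one possible shape can be sketched with Python's standard `xml.etree.ElementTree` module. Everything about the XML layout here is an assumption: the `bib_info` root element, the lowercase element names derived from the field names above, and the sample values are placeholders, not a decided schema.

```python
import xml.etree.ElementTree as ET

# The eighteen elements listed above, in the same order.
BIB_FIELDS = (
    "TITLE", "AUTHOR_NAME", "CREATION_DATE", "COUNTRY", "LANGUAGE",
    "SUBJECT", "KEYWORDS", "OBJECT_TYPE", "FORMAT", "NATIVE_ID",
    "CONTRIBUTOR_NAME", "COLLECTION_NAME", "COLLECTION_URL",
    "SUBMISSION_DATE", "URL", "LEARNING_LEVEL", "COPYRIGHT_TYPE",
    "COPYRIGHT_URL",
)


def build_bib_info(values):
    """Build a hypothetical <bib_info> element from a dict of field values.

    Unknown field names are rejected; fields left out of the dict are
    simply omitted from the output.
    """
    unknown = sorted(set(values) - set(BIB_FIELDS))
    if unknown:
        raise ValueError("unknown fields: " + ", ".join(unknown))
    root = ET.Element("bib_info")
    for field in BIB_FIELDS:
        if field in values:
            ET.SubElement(root, field.lower()).text = str(values[field])
    return root


# Illustrative values only.
sample = build_bib_info({
    "TITLE": "Example Collection",
    "COUNTRY": "US",          # ISO 3166 country code
    "LANGUAGE": "en",         # ISO 639-1 language code
    "FORMAT": "text/html",    # basically a MIME type
    "LEARNING_LEVEL": "primary",
})
xml_text = ET.tostring(sample, encoding="unicode")

if __name__ == "__main__":
    print(xml_text)
```

Emitting elements in the listed order keeps files easy to compare by eye; a real schema might instead reuse Dublin Core element names directly.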

Note: Controlled Vocabularies need to be set up for OBJECT_TYPE, FORMAT, LEARNING_LEVEL, and COPYRIGHT_TYPE.


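Here are some good links on Metadata and Controlled Vocabularies:

  • Wikipedia: Controlled Vocabularies (http://en.wikipedia.org/wiki/Controlled_vocabulary)
  • What is a Controlled Vocabulary? (http://www.boxesandarrows.com/view/what_is_a_controlled_vocabulary_)
  • Cory Doctorow's MetaCrap (http://www.well.com/~doctorow/metacrap.htm)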

manifest

tags

There was debate about how to handle this. A section in bib_info, including file sections for the collection as a whole or named subcollections? A separate file with a set of all relevant tags for the whole bundle, one per line? An entry in the .info file?
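
As an illustration of the second option only (a separate file with one tag per line), a reader for such a file could be as small as the sketch below. It does not settle the debate; the function name, the sample tags, and the choice to drop duplicates case-insensitively are assumptions made for the example.

```python
def parse_tags(text):
    """Return the tags from a one-tag-per-line file, in first-seen order.

    Blank lines are skipped and surrounding whitespace is stripped.
    Duplicates are dropped case-insensitively (a design choice made for
    this sketch, not something the proposal specifies).
    """
    seen = set()
    tags = []
    for line in text.splitlines():
        tag = line.strip()
        if not tag or tag.lower() in seen:
            continue
        seen.add(tag.lower())
        tags.append(tag)
    return tags


# Illustrative content of a hypothetical tags file.
SAMPLE_TAGS = "math\nGeometry\n\nmath\n  primary  \n"

if __name__ == "__main__":
    print(parse_tags(SAMPLE_TAGS))
```

The other two options (a section in bib_info, or an entry in the .info file) would need the same normalisation step, just with a different container.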