OurStoriesXML

From OLPC
Revision as of 19:28, 7 April 2008 by Sj (talk | contribs) (..)

This page documents the markup language used to store metadata for interviews recorded by the Our_Stories activity.

Some fields will be generated automatically during interview recording, i.e., the date and file reference (and probably the language and country of origin / geo-data, as those could be appended automatically by the school servers). Other fields, such as descriptions and keyword tags, are optionally entered by the interviewers, or filled in afterwards by reviewers or teachers going over the stories.

Quick Example (v0.1)

<?xml version="1.0" ?>
<ourstories version="0.1">
  <story date="2007-10-29 07:13:04" 
        filename="story_filename.ogg" 
        title="story_title"
        country="country"
        city="city"
        source="UNICEF"
        speaker="Ioanna P."/>
</ourstories>

Example (v0.2)

<?xml version="1.0" ?>
<ourstories version="0.2">
  <filesystem directory="/home/olpc/ourstories/" />
  <story metalanguage="en"
         source="St. Noah Girls school"
         title="swimming lessons, by Marie"
         thumbnailfile="marie.jpg"
         thumbnailtitle="marie with microphone"
         language="lg"
         date="2007-10-29 12:10:07 UTC">
     <description>
         Joba interviews Marie about her experiences with swimming.
     </description>
     <location country="Uganda"
               city="Entebbe"
               latlong="00.04N, 32.28E" />
     <media mediatype="audio" format="ogg"
            language="lg"
            recorddate="2007-10-29 07:13:04 UTC" 
            filename="marie.ogg" 
            filetitle="joba interviewing marie"
            people="Joba Buturo, Marie Frey"
            title="swimming lessons" />
     <media mediatype="image"  
            recorddate="2007-10-29 09:11:50 UTC" 
            filename="marie2.jpg" 
            filetitle="joba and marie"
            people="Joba Buturo, Marie Frey"
            title="photo together" /> 
  </story>
</ourstories>
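
As an illustration of how the date and latlong values in the example above might be read, here is a minimal Python sketch. The function names parse_story_date and split_latlong are hypothetical and not part of the activity. The sketch only attaches a timezone for "UTC" (the only zone name used on this page), and it leaves the latlong numbers as strings, because this page does not say whether they are decimal degrees or degrees.minutes.

```python
import re
from datetime import datetime, timezone

# Hypothetical helpers for illustration; not part of the Our_Stories code.


def parse_story_date(value):
    """Parse "yyyy-mm-dd hh:mm:ss [TZN]" into a datetime."""
    parts = value.split()
    stamp = datetime.strptime(" ".join(parts[:2]), "%Y-%m-%d %H:%M:%S")
    if len(parts) > 2 and parts[2] == "UTC":
        return stamp.replace(tzinfo=timezone.utc)
    # Assumption: a missing zone (as in v0.1) or an unrecognized zone name
    # yields a naive datetime; the caller decides how to treat it.
    return stamp


# Number, hemisphere letter, comma, number, hemisphere letter.
LATLONG = re.compile(r"\s*([0-9.]+)\s*([NS])\s*,\s*([0-9.]+)\s*([EW])\s*")


def split_latlong(value):
    """Split "00.04N, 32.28E" into ((lat, "N"/"S"), (lon, "E"/"W")) strings."""
    match = LATLONG.fullmatch(value)
    if match is None:
        raise ValueError("unrecognized latlong: %r" % value)
    lat, ns, lon, ew = match.groups()
    return (lat, ns), (lon, ew)
```

For instance, parse_story_date can be called on the story's date attribute and split_latlong on the location's latlong attribute; a v0.1 date without a zone name is returned without timezone information.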


Desired tags and fields as of 2/2008:

  • tag: ourstories
    • version - the version of the XML format being used (see Versioning below); currently 0.2

  • tag: filesystem
    • directory - the directory in which files are stored

  • tag: story (usually just one, with data in English; could have one per localized language)
    • metalanguage - optional, default "en"; the language in which the metadata in this XML file is recorded
    • source - name of the school/organization which collected the recording, e.g., "UNICEF", "Khairat School"
    • title - the title for link text on landing pages; if no title is available, we use "Story By <first name of speaker>"; if no name is available, we use "UNTITLED"
    • thumbnailfile - the name of the image file to be used as a photo or thumbnail. Please avoid using non-alphanumeric characters in filenames (e.g., punctuation and spaces; hyphens and underscores are ok).
    • thumbnailtitle - alternate text for the thumbnail
    • language - the language used to sort and display the story in searches; an ISO-639 code
    • date - yyyy-mm-dd hh:mm:ss TZN - the date used to sort and display the story; usually the last-recording date or the upload date
  • description - a short description
  • keywords - optional, keyword tags
  • tag: location (usually just one per story; the first location given is used to sort onto the world map)
    • country
    • city
    • latlong - latitude, longitude of the location of recording; as precise as anonymity allows, or can be later matched to the latlong of the server used to upload.
  • tag: media (often just one recording per story, audio or video)
    • mediatype - one of (audio, video, image, text)
    • format - optional; the file format used, e.g., ogg, wav, or mpeg
    • language - the primary language used for the interview, by ISO-639 language code. (TODO: how to handle multiple langs? comma-separated?)
    • recorddate - yyyy-mm-dd hh:mm:ss TZN
    • people - people involved in / recorded in the story, names separated by commas. If none are given, we use "Anonymous"
    • filename - the name of the media file (alphanumeric)
    • filetitle - the title of the file, a brief description
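
The fallbacks named in the list above (the title, the people field, the default metalanguage) and the filename guideline could be applied roughly as in the following Python sketch. All function names are hypothetical. The sketch also assumes that the speaker's name is passed in separately, since the v0.2 fields do not say where it comes from, and that a filename may carry one dot-separated extension, as in "marie.jpg".

```python
import re

# Hypothetical helpers for illustration; not part of the Our_Stories code.


def display_title(title, speaker=None):
    """Return the link text for a story, using the documented fallbacks."""
    if title and title.strip():
        return title.strip()
    if speaker and speaker.strip():
        # Assumption: the first word of the name is the first name.
        return "Story By " + speaker.split()[0]
    return "UNTITLED"


def people_list(people):
    """Split the comma-separated people field; fall back to "Anonymous"."""
    names = [name.strip() for name in (people or "").split(",") if name.strip()]
    return names or ["Anonymous"]


def metalanguage(story_attrs):
    """Return the story's metalanguage, defaulting to "en"."""
    return story_attrs.get("metalanguage", "en")


# Letters, digits, hyphens and underscores, plus an optional extension.
SAFE_FILENAME = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?")


def is_safe_filename(name):
    """Check a filename against the alphanumeric guideline."""
    return SAFE_FILENAME.fullmatch(name) is not None
```

With these helpers, a story without a title whose speaker is "Marie Frey" would be listed as "Story By Marie", and a filename such as "my story.ogg" would be flagged because of the space.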


All fields should contain UTF-8 text strings.
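
One way to produce a UTF-8 encoded file in this format is sketched below, using xml.etree.ElementTree from the Python standard library. The function name build_ourstories and its parameters are hypothetical. The xml_declaration argument of ET.tostring requires Python 3.8 or later, and the declaration it writes includes an explicit encoding, unlike the examples on this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical helper for illustration; not part of the Our_Stories code.


def build_ourstories(directory, story_attrs, description, media_attrs):
    """Build a v0.2 document with one story and one media entry.

    Returns the serialized document as UTF-8 encoded bytes.
    """
    root = ET.Element("ourstories", version="0.2")
    ET.SubElement(root, "filesystem", directory=directory)
    story = ET.SubElement(root, "story", story_attrs)
    ET.SubElement(story, "description").text = description
    ET.SubElement(story, "media", media_attrs)
    # encoding="utf-8" makes tostring return bytes, with non-ASCII
    # characters written as UTF-8 rather than as character references.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


data = build_ourstories(
    "/home/olpc/ourstories/",
    {"title": "swimming lessons, by Marie", "language": "lg"},
    "Joba interviews Marie about her experiences with swimming.",
    {"mediatype": "audio", "format": "ogg", "filename": "marie.ogg"},
)

# Write the bytes unchanged so the file on disk stays UTF-8:
# with open("story.xml", "wb") as handle:
#     handle.write(data)
```

Opening the output file in binary mode avoids a second, platform-dependent encoding step.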

Other fields to be added:

  • Name of server used to upload
  • multiple authors
    • Transcribers
    • Translators
    • Dubbing/translation notes

Other data to gather on OurStories

  • Narrative flow, unstructured groupings
  • External tags / additions to the above (cf. how Connexions does it)

Parsing
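
A v0.2 file can be read with any standard XML parser. The following is a minimal sketch in Python using xml.etree.ElementTree; the function name parse_ourstories and the shape of the returned dictionary are assumptions made for illustration, not an existing API of the activity. The sample is a shortened copy of the v0.2 example above.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" ?>
<ourstories version="0.2">
  <filesystem directory="/home/olpc/ourstories/" />
  <story metalanguage="en"
         source="St. Noah Girls school"
         title="swimming lessons, by Marie"
         language="lg"
         date="2007-10-29 12:10:07 UTC">
     <description>
         Joba interviews Marie about her experiences with swimming.
     </description>
     <location country="Uganda" city="Entebbe" latlong="00.04N, 32.28E" />
     <media mediatype="audio" format="ogg" filename="marie.ogg"
            people="Joba Buturo, Marie Frey" title="swimming lessons" />
     <media mediatype="image" filename="marie2.jpg"
            people="Joba Buturo, Marie Frey" title="photo together" />
  </story>
</ourstories>
"""


def parse_ourstories(xml_text):
    """Return the version, directory and stories of a v0.2 document."""
    root = ET.fromstring(xml_text)
    filesystem = root.find("filesystem")
    result = {
        "version": root.get("version"),
        "directory": filesystem.get("directory") if filesystem is not None else None,
        "stories": [],
    }
    for story in root.findall("story"):
        result["stories"].append({
            # Attributes of the story tag itself (title, language, date, ...).
            "attributes": dict(story.attrib),
            # Text of the description child tag, without surrounding whitespace.
            "description": story.findtext("description", default="").strip(),
            # One dictionary of attributes per location and media tag.
            "locations": [dict(loc.attrib) for loc in story.findall("location")],
            "media": [dict(media.attrib) for media in story.findall("media")],
        })
    return result


document = parse_ourstories(SAMPLE)
```

Keeping locations and media as lists reflects the format's allowance for several of each per story; the first location is the one used for the world map.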

Versioning

version 0.2 - 2/25/2008

  • Updated/unified XML fields.
  • Separated fields into subtags based on the possibility of multiple values.
  • Added metalocalization of XML data (the metalanguage attribute).

version 0.1 - 10/29/2007

  • Initial attempt to define an XML-style metadata markup language.
  • Fields available: date, oggfile, title as attributes of a story

Interested participants

  • Allan Doyle
  • John Huang
  • ...