User:Homunq/Activity bundles v2

From OLPC

Design goals

Italics show options for implementation of a goal. Bolded phrases in quotes are for later use as shorthand for the italicized passage. Thus, "development versions" is just one of various possible ways to provide development versions.

  • UI
    • activity user: The format should support associating and versioning different versions of the "same" activity, through a private key in some way associated with that activity. Each activity has a "master key" which remains unchanged over activity versions. This is a fundamental identifier of the activity, used in many searches of the activity bundle registry, and activities with the same master key are grouped (in submenus?) in most cases in the UI.
    • activity developer:
      • When used on the XO, the format should, insofar as possible, lessen the burden of key management. In other words, it should not be necessary to open two separate files (a bundle and a private key) simply in order to make and sign changes in a bundle. "signature metadata": The master key is used to sign a user-specific key, granting it privileges to sign the bundle. The user key is managed by Sugar, and signatures are available to any activity on request. To contain abuse of these privileges by malicious activities, these user signatures include inseparably-signed metadata: the hash and public master key of the requesting activity. Likewise, the master key signature on the user public key must include inseparably-signed metadata stating which activity hash and/or master key may request valid signatures from that user.
      • It should be possible to make and test development versions of an activity without replacing or un-defaulting the stable version. Activities without a valid signature are considered "development versions."
      • Creating activity bundles should be practical, whatever tools they are developed with. Ideally, the tools are usable from outside Sugar. Bundlebuilder.py looks in more than one place for anything it imports, directly or indirectly, from sugar, and "conditional imports" make it fail gracefully.
  • security
    • All algorithms should be reasonably cryptographically secure in the context of 2008. Excessive paranoia and extreme key lengths are unnecessary, but, all other things being equal, algorithms with known weaknesses should be avoided.
    • The various likely developer workflows should be smooth enough that developers (and users) are not excessively tempted to take shortcuts with security. Desirable in this regard:
      • Possible to change translations without invalidating a signature.
      • Possible for a user to change a whole set of journal entries with a given (mime type, creating activity) to default to a new handling activity. The UI for this should be somewhat more involved if the new activity is signed with a different key, but it should be possible.
      • Possible for an activity author to sign other people's patched versions in three ways: by signing them herself, by giving the other person non-delegatable authority to sign them, or by giving them delegatable authority (i.e., the private key).
      • other things...
  • efficiency
    • The format should transfer well (supports compression).
    • The format should interact well with delta-based storage.
    • Optional: the format should support selective transfer of subfiles (see XO_updater for an example of how this might work).

These three efficiency goals (1: compression for transfer, 2: delta-based storage, 3: selective transfer of subfiles) are somewhat at odds. For instance, zip satisfies 1 and 3 but, in some cases, not 2; tar is the best possible solution to 2, but fails 1 and 3 badly; whereas tgz with the --rsyncable option on gzip satisfies 1 and (mostly) 2, but not 3 (at least, not with standard transfer protocols, as the tar format has no central index of offsets to the subfiles). I think ".tgz --rsyncable" is the best available option, but we should do some testing with rdiff and real activity bundles. Zip may win.

After further discussion and searching the available activities, I found few examples where zip would be pathological on point 2 - that is, activities with large (>50K) files likely to be subject to frequent, small changes. Thus I think that the current format, "zip", is adequate.
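Returning to the developer goal of tools that are usable outside Sugar: the "conditional imports" idea can be sketched roughly as follows. This is only an illustration, not the actual bundlebuilder.py code. The helper names `optional_import` and `default_bundle_dir`, the module name `sugar.env`, the attribute `get_user_activities_path`, and the `dist` fallback directory are all assumptions made for the example.

```python
import importlib
import os


def optional_import(module_name):
    """Return the named module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Outside Sugar the module is simply missing; fail gracefully.
        return None


# Assumption: "sugar.env" stands in for any Sugar-only dependency.
sugar_env = optional_import("sugar.env")


def default_bundle_dir():
    """Pick an output directory, with or without Sugar available."""
    # Assumption: the attribute name below is illustrative only.
    getter = getattr(sugar_env, "get_user_activities_path", None)
    if callable(getter):
        return getter()
    # Fallback used when running on a plain development machine.
    return os.path.join(os.getcwd(), "dist")
```

With this pattern the same script runs on a developer workstation and on an XO; only the Sugar-specific conveniences are lost when the import fails.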

Signatures

An activity bundle consists of a file whose nominal extension is .xo, but whose real structure corresponds to .zip, OR .tgz (interchangeable). The difference between these two structures is auto-detected.

Zip is kept for backward compatibility. The .tgz format is better for differential version control because, while it is compressed for sending over a network connection, it is also easily expanded to a form that is friendly to binary diffs.
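One way the auto-detection might work is to look at the first bytes of the file rather than its extension. The sketch below is an assumption about how a tool could do this, not a description of existing code; the function name `detect_bundle_format` is hypothetical. It relies only on the well-known zip and gzip magic numbers.

```python
ZIP_MAGICS = (b"PK\x03\x04", b"PK\x05\x06")  # normal zip, empty zip
GZIP_MAGIC = b"\x1f\x8b"                      # gzip stream (assumed to wrap a tar)


def detect_bundle_format(path):
    """Return "zip" or "tgz" for a .xo file, based on its leading bytes."""
    with open(path, "rb") as bundle:
        head = bundle.read(4)
    if head in ZIP_MAGICS:
        return "zip"
    if head[:2] == GZIP_MAGIC:
        return "tgz"
    raise ValueError("unrecognized bundle format: %r" % (path,))
```

Note that a gzip header only shows that the file is gzip-compressed; a real tool would still need to confirm that the decompressed stream is a tar archive.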

The internal structure looks something like the following:

Helloworld.activity/
    helloworld.py
    locale/
        de_DE/
            activity.linfo
        zh_CN/
            activity.linfo
    activity/
        activity.info
    icons/
        helloworld.svg
    STRUCTURE/
        MANIFEST
        TRANSLATABLES
        HASHES
        TRANSLATABLE_HASHES/
            zh_de
            de
            es
        SIGNATURES/
            deadbeef/
                deadbeef.pub
                cafeb0ef.pub.sig
                12345678.pub.sig
            12345678/
                12345678.pub
            A1A1A1A1/
                a1a1a1a1.pub
                de.sig
            cafeb0ef/
                cafeb0ef.sig
                HASHES.sig


All special files are assumed to be in UTF-8. All linefeeds are unix-style.

  1. MANIFEST
    1. Mandatory
    2. list of relative paths of all files and directories (with trailing slash) in bundle, except:
      1. non-empty directories (only empty directories should be in MANIFEST)
      2. MANIFEST
      3. HASHES
      4. any files in SIGNATURES
      5. any files matched by a pattern in TRANSLATABLES (see below)
      6. all files matched by TRANSLATABLES are optional. If not included, the bundle builder may decide in any manner it wishes whether to include them in the bundle or not.
    3. Ends with a newline
    4. Order is significant, and should be maintained
      1. When renaming a file, the new name should appear on the same line that held the old name
      2. When deleting a file, leaving a blank line is OPTIONAL and ENCOURAGED. Blank lines are DISCOURAGED but ALLOWED in other cases.
    5. Other whitespace not part of file names is NOT ALLOWED

Maintaining order makes merges, delta-based storage, and import to source control tools easier.

For the example above, MANIFEST would be exactly:

helloworld.py
icons/helloworld.svg
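
The ordering rules for MANIFEST (renames stay on the same line, deletions may leave a blank line) could be applied by a bundle builder along the following lines. This is a minimal sketch under stated assumptions: the function name `update_manifest` and its `renames` argument are hypothetical, new files are appended at the end (the text above does not say where they go), and the exclusion rules for HASHES, SIGNATURES and TRANSLATABLES are assumed to have been applied to `current_paths` already.

```python
def update_manifest(old_entries, current_paths, renames=None):
    """Return new MANIFEST lines that keep the order of the old ones.

    old_entries:   lines of the previous MANIFEST ("" for blank lines)
    current_paths: paths that belong in the new MANIFEST
    renames:       optional {old_path: new_path} mapping
    """
    renames = renames or {}
    current = set(current_paths)
    seen = set()
    result = []
    for entry in old_entries:
        entry = renames.get(entry, entry)   # a rename keeps its line
        if entry and entry in current and entry not in seen:
            result.append(entry)
            seen.add(entry)
        else:
            result.append("")               # deleted file: leave a blank line
    for path in current_paths:              # assumption: new files go last
        if path not in seen:
            result.append(path)
            seen.add(path)
    return result


def serialize_manifest(entries):
    """Join entries with unix newlines; the file ends with a newline."""
    return "\n".join(entries) + "\n"
```

For example, renaming helloworld.py to hello.py and replacing the icon yields the lines `hello.py`, a blank line, and the new icon path, so a line-based diff against the old MANIFEST stays small.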
  1. TRANSLATABLES
    1. optional file, but MUST be included in MANIFEST if it exists.
    2. uses the same format as .gitignore
    3. indicates what should NOT be included in MANIFEST or HASHES
    4. bundle is INVALID and should be rejected if TRANSLATABLES matches TRANSLATABLES, HASHES, or MANIFEST
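
The validity rule above could be checked roughly as follows. This sketch is deliberately simplified: it uses Python's `fnmatch` module, which covers only a subset of the .gitignore format (no negation with `!`, no `**`, no directory-only patterns), so it is an approximation rather than a conforming matcher. The function names and the `STRUCTURE/` path prefix for the special files are assumptions taken from the example tree.

```python
import fnmatch
import posixpath

# Assumption: the special files live under STRUCTURE/, as in the example tree.
PROTECTED = (
    "STRUCTURE/TRANSLATABLES",
    "STRUCTURE/HASHES",
    "STRUCTURE/MANIFEST",
)


def parse_patterns(text):
    """Return the patterns in a TRANSLATABLES file, skipping blanks and comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_translatable(path, patterns):
    """True if the path (or its basename) matches any pattern."""
    name = posixpath.basename(path)
    return any(
        fnmatch.fnmatchcase(path, pattern) or fnmatch.fnmatchcase(name, pattern)
        for pattern in patterns
    )


def translatables_is_valid(patterns):
    """A bundle is INVALID if the patterns match any protected special file."""
    return not any(is_translatable(path, patterns) for path in PROTECTED)
```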

Obviously, this does introduce a security risk, as an unsigned file matched by TRANSLATABLES could theoretically cause a buffer overflow (or, indeed, be deliberately run by malicious signed code). The first case is considered avoidable with good tools, and the second is considered to be no additional risk (malicious code is malicious code).

L10n should not require bugging the original activity author. Imagine having to sign multiple versions of dozens of languages for every activity version. Besides the inconvenience, this is a security risk, because the underlying code could change between "pure-l10n" versions without the author realizing.

  1. HASHES
    1. auto-generated file
    2. unix-style line endings
    3. first line '#HASH-VERSION: 1.0; HASH-FUNCTION:sha256'.
    4. No additional whitespace
    5. The further lines of HASHES alternate
      1. one line with a path as in MANIFEST
      2. on the line which follows, the hash of the binary contents of that file, computed with the hash function named on the first line (sha256)
    6. There is no limit to line length.
    7. order same as MANIFEST
      1. The rest of HASHES follows MANIFEST, in the same order, excluding blank lines or any lines that match patterns in TRANSLATABLES.
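
A generator for HASHES might look like the sketch below. It uses the SHA-256 function named in the header line. Several details are assumptions, since the text above does not settle them: the hash is written as lowercase hexadecimal, the file ends with a newline, and every MANIFEST entry passed in is a regular file. The names `build_hashes`, `read_bytes` and `is_translatable` are hypothetical.

```python
import hashlib

HEADER = "#HASH-VERSION: 1.0; HASH-FUNCTION:sha256"


def build_hashes(manifest_lines, read_bytes, is_translatable=lambda path: False):
    """Return the text of HASHES for the given MANIFEST lines.

    manifest_lines:  MANIFEST entries in order ("" for blank lines)
    read_bytes:      callable mapping a path to the file's binary contents
    is_translatable: callable returning True for paths matched by TRANSLATABLES
    """
    lines = [HEADER]
    for path in manifest_lines:
        if not path or is_translatable(path):
            continue                      # blank and translatable lines are skipped
        lines.append(path)                # one line with the path
        digest = hashlib.sha256(read_bytes(path)).hexdigest()
        lines.append(digest)              # next line: hash (assumed hex-encoded)
    return "\n".join(lines) + "\n"        # unix-style line endings only
```

Passing `read_bytes` as a callable keeps the sketch independent of whether the contents come from an unpacked directory, a zip member, or a tar member.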


  1. SIGNATURES/
    1. optional directory
  2. Contents
  3. ...