User:Homunq/Activity bundles v2
Design goals
Italics show options for implementation of a goal. Bolded phrases in quotes are for later use as shorthand for the italicized passage. Thus, "development versions" is just one of various possible ways to provide development versions.
- UI
- activity user: The format should support associating and versioning different versions of the "same" activity, through a private key in some way associated with that activity. Each activity has a "master key" which remains unchanged across activity versions. This is a fundamental identifier of the activity, used in many searches of the activity bundle registry, and activities with the same master key are grouped (in submenus?) in most cases in the UI.
- activity developer:
- When used on the XO, the format should, insofar as possible, lessen the burden of key management. In other words, it should not be necessary to open two separate files (a bundle and a private key) simply in order to make and sign changes in a bundle. "signature metadata": The master key is used to sign a user-specific key, granting it privileges to sign the bundle. The user key is managed by Sugar, and signatures are available to any activity on request. To contain abuse of these privileges by malactivities, these user signatures include inseparably-signed metadata including the hash and public master key of the requesting activity. Thus, the master key signature on the user public key must also include (inseparably-signed) metadata: which activity hash and/or master key are allowed to give valid signatures for that user.
- It should be possible to make and test development versions of an activity, without replacing or un-defaulting the stable version. Activities without a valid signature are considered "development versions."
- Creating activity bundles is practical, whatever they are developed with. Ideally, the tools are usable from outside Sugar. Bundlebuilder.py looks in more than one place for anything it imports, directly or indirectly, from sugar, and "conditional imports" make it fail gracefully.
- security
- all algorithms should be reasonably cryptographically secure in the context of 2008. Excessive paranoia and extreme key lengths are unnecessary, but, all other things being equal, algorithms with known weaknesses should be avoided.
- The various likely developer workflows should be smooth enough that developers (and users) are not excessively tempted to take shortcuts with security. Desirable in this regard:
- possible to change translations without invalidating a signature
- possible for a user to change a whole set of journal entries with a given (mime type, creating activity) to default to a new handling activity. The UI for this should be somewhat more involved if the new activity is signed with a different key, but it should be possible.
- possible for an activity author to sign other people's patched versions in three ways: by signing them herself, by giving the other person non-delegatable authority to sign them, or by giving them delegatable authority (i.e., the private key).
- The signatures can be validated in real-use environments, and not only on install. That is, the presence or absence of extraneous files such as *.pyc or .git/* does not break validation. Yet no file is considered "extraneous" in this sense without an explicit reason; there are no implicit ways to include an arbitrary file without breaking the signature. Certain files are considered "local", that is, excluded from the signature. All activities should work without these files, and there should be a command to delete all such files.
- efficiency
- the format should transfer well (supports compression)
- the format should interact well with delta-based storage.
- optional: the format should support selective transfer of subfiles (see XO_updater for an example of how this might work)
These three goals are somewhat at odds. For instance, zip satisfies 1 and 3 but, in some cases, not 2; tar is the best possible solution to 2, but fails 1 and 3 badly; whereas tgz with the --rsyncable option on gzip satisfies 1 and (mostly) 2, but not 3 (at least, not with standard transfer protocols, as tar format has no central index of offsets to the subfiles). I think ".tgz --rsyncable" is the best available option, but we should do some testing with rdiff and real activity bundles. Zip may win.
- After further discussion and searching the available activities, I found few examples where zip would be pathological on point 2 - that is, activities with large (>50K) files likely to be subject to frequent, small changes. Thus I think that the current format, "zip", is adequate.
Signatures
An activity bundle consists of a file whose nominal extension is .xo, but whose real structure corresponds to .zip, OR .tgz (interchangeable). The difference between these two structures is auto-detected.
Zip is for backward compatibility; .tgz is a better format for differential version control because, while it is compressed for sending over a network connection, it is also easily expanded to a form that is friendly to binary diffs.
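The auto-detection mentioned above could be sketched as follows. This is only an illustration, assuming detection is done by probing the file contents rather than trusting the .xo extension; the function name `detect_bundle_format` is hypothetical and not part of any existing Sugar tool.

```python
import io
import tarfile
import zipfile


def detect_bundle_format(data: bytes) -> str:
    """Return 'zip' or 'tgz' for the raw bytes of a .xo file.

    Hypothetical helper: raises ValueError if the data is neither format.
    """
    # zipfile.is_zipfile looks for the zip end-of-central-directory record.
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip"
    try:
        # mode "r:gz" only succeeds for a gzip-compressed tar archive.
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz"):
            return "tgz"
    except (tarfile.TarError, OSError, EOFError):
        pass
    raise ValueError("not a zip or gzip-compressed tar bundle")
```

Probing zip first is an arbitrary choice here; the page does not say which check should take precedence.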
The internal structure looks something like the following:
Helloworld.activity/
    MANIFEST
    helloworld.py
    locale/
        de_DE/
            activity.linfo
        zh_CN/
            activity.linfo
    activity/
        activity.info
    icons/
        helloworld.svg
    STRUCTURE/
        ALLOWED_EXTRA_FILES
        HASHES
        SIGNATURES/
            deadbeef/
                deadbeef.pub
                cafeb0ef.pub.sig
                12345678.pub.sig
                12345678/
                    12345678.pub
                    A1A1A1A1/
                        a1a1a1a1.pub
                        de.sig
            cafeb0ef/
                cafeb0ef.sig
            HASHES.sig
All special files are assumed to be in UTF-8. All linefeeds are unix-style.
MANIFEST
All bundles must have a MANIFEST file. It consists of one entry per line, with \n at the end of every line including the last. A given entry is one of the following:
- the relative path of a signed bundle file.
No path may start with a character in the set [!@#$%*-+=<>]. Paths that end with the string '.py' cause the same path with a trailing 'c' (that is, X.pyc) to be considered a local file.
- the character '-' followed by the relative path of a local file
- the character '-' followed by the relative path of a local directory, ending in '/'. All files inside this directory or subdirectories thereof are considered local files.
- an empty line. This maintains the absolute line number of following entries, so that entries on the same line number in different versions of the bundle can be considered different versions of the same file.
When renaming a file, the renamed file should appear on the same line. When deleting a file, leaving a blank line is OPTIONAL and ENCOURAGED. Blank lines are DISCOURAGED but ALLOWED in other cases. Other whitespace not part of file names is NOT ALLOWED.
Maintaining order makes merges, delta-based storage, and import to source control tools easier.
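To illustrate why stable line numbers help, here is a rough sketch that compares two MANIFEST versions line by line and classifies each differing line as a rename, deletion, or addition. The function name `diff_manifests` and the tuple layout are assumptions made for this example only.

```python
from itertools import zip_longest


def diff_manifests(old_text: str, new_text: str) -> list:
    """Compare two MANIFEST texts by line number (hypothetical helper).

    Returns tuples of (line_number, kind, old_entry, new_entry).
    """
    # Every line ends with "\n", so drop the empty piece after the last one.
    old = old_text.split("\n")[:-1]
    new = new_text.split("\n")[:-1]
    changes = []
    pairs = zip_longest(old, new, fillvalue="")
    for number, (before, after) in enumerate(pairs, start=1):
        if before == after:
            continue
        if before and after:
            # Same line number, different path: treated as a rename.
            changes.append((number, "renamed", before, after))
        elif before:
            # Entry replaced by a blank line (or dropped from the end).
            changes.append((number, "deleted", before, ""))
        else:
            changes.append((number, "added", "", after))
    return changes
```

For example, if line 2 becomes blank and line 3 changes from `c.svg` to `icon.svg`, the sketch reports one deletion and one rename, without any content-based guessing.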
For the example above, MANIFEST would be exactly:
activity/activity.info
helloworld.py
icons/helloworld.svg
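The MANIFEST rules above could be parsed along these lines. This is a minimal sketch, not an existing Sugar API: the `Manifest` class and `parse_manifest` function are hypothetical names, and the sketch does not try to enforce the rule about stray whitespace, since it cannot tell whitespace that belongs to a file name from whitespace that does not.

```python
from dataclasses import dataclass, field

# Characters that no signed path may start with, per the rule above.
FORBIDDEN_FIRST_CHARS = set("!@#$%*-+=<>")


@dataclass
class Manifest:
    signed: dict = field(default_factory=dict)      # line number -> path
    local_files: set = field(default_factory=set)
    local_dirs: set = field(default_factory=set)    # each ends with "/"

    def is_local(self, path: str) -> bool:
        """True if the path is excluded from the signature."""
        if path in self.local_files:
            return True
        return any(path.startswith(d) for d in self.local_dirs)


def parse_manifest(text: str) -> Manifest:
    if text and not text.endswith("\n"):
        raise ValueError("every line, including the last, must end with \\n")
    manifest = Manifest()
    for number, line in enumerate(text.split("\n")[:-1], start=1):
        if line == "":
            continue  # blank line: keeps later line numbers stable
        if line.startswith("-"):
            path = line[1:]
            if path.endswith("/"):
                manifest.local_dirs.add(path)
            else:
                manifest.local_files.add(path)
            continue
        if line[0] in FORBIDDEN_FIRST_CHARS:
            raise ValueError(f"line {number}: illegal first character")
        manifest.signed[number] = line
        if line.endswith(".py"):
            # X.py implies that X.pyc is a local file.
            manifest.local_files.add(line + "c")
    return manifest
```

Keeping the line number as the key of `signed` preserves the information that lets entries be matched across bundle versions.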
ALLOWED_EXTRA_FILES
This is a list of files or subdirectories which can be present without invalidating the signature. Only files not in the MANIFEST and not in the STRUCTURE directory will be checked against the patterns here. If there is any file which does NOT match a pattern listed here, then the signature is invalid. An example:
#Compiled python
helloworld.pyc
#Git
.gitignore
.git/*
#Temporary
justfoolingaround.tmp
#Translations
po/*
locale/*/*.mo
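The check described above might look like the following sketch. The pattern syntax is not specified on this page, so shell-style globbing via Python's `fnmatch` is an assumption (note that `*` then also matches `/`), as is treating lines that start with `#` as comments. The names `load_patterns` and `unexpected_files` are hypothetical.

```python
from fnmatch import fnmatchcase


def load_patterns(text: str) -> list:
    """Read ALLOWED_EXTRA_FILES, skipping blank lines and '#' comments."""
    return [
        line for line in text.splitlines()
        if line and not line.startswith("#")
    ]


def unexpected_files(present, manifest_paths, patterns,
                     structure_dir="STRUCTURE/"):
    """Return the files that would invalidate the signature.

    present: iterable of relative paths found in the installed bundle.
    manifest_paths: set of signed paths listed in MANIFEST.
    """
    bad = []
    for path in sorted(present):
        # Files in MANIFEST or under STRUCTURE/ are not checked here.
        if path in manifest_paths or path.startswith(structure_dir):
            continue
        if not any(fnmatchcase(path, p) for p in patterns):
            bad.append(path)
    return bad
```

A validator would treat a non-empty result as an invalid signature, since at least one file on disk is neither signed nor explicitly allowed.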
HASHES
- auto-generated file
- unix-style line endings
- first line '#HASH-VERSION: 1.0; HASH-FUNCTION:sha256'.
- No additional whitespace
- The further lines of HASHES alternate:
- one line with a path as in MANIFEST
- one line with the hash of the binary contents of that file, computed with the HASH-FUNCTION named on the first line (sha256 here).
- There is no limit to line length.
- order same as MANIFEST
- The rest of HASHES follows MANIFEST, in the same order, excluding blank lines or any lines that match patterns in TRANSLATABLES.
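Generating HASHES from MANIFEST could be sketched as below, using the sha256 function named in the header line. Two points are assumptions rather than part of the description above: local entries (lines starting with '-') are skipped, and TRANSLATABLES filtering is left out because its pattern format is not described here. The name `build_hashes` and the `read_bytes` callback are hypothetical.

```python
import hashlib

HEADER = "#HASH-VERSION: 1.0; HASH-FUNCTION:sha256"


def build_hashes(manifest_text: str, read_bytes) -> str:
    """Return the text of HASHES for the given MANIFEST text.

    read_bytes: callable mapping a relative path to the file's bytes.
    """
    lines = [HEADER]
    for entry in manifest_text.split("\n"):
        # Skip blank lines; skipping '-' (local) entries is an assumption.
        if not entry or entry.startswith("-"):
            continue
        digest = hashlib.sha256(read_bytes(entry)).hexdigest()
        # One line with the path, then one line with its hash.
        lines.extend([entry, digest])
    # Unix-style line endings, including after the last line.
    return "\n".join(lines) + "\n"
```

Passing a callback instead of a directory path keeps the sketch usable with files read from a zip or tgz bundle as well as from an unpacked directory.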
SIGNATURES/
- optional directory
- Contents
- ...