Union File Systems

From OLPC
Revision as of 20:35, 31 May 2007 by Mcfletch (talk | contribs) (Initial draft of Union File System explorations)

The plan in BitFrost is to use a Union File System (AUFS) to provide recovery and similar operations. This is a straw-man proposal describing a usage pattern for AUFS within BitFrost.

In all cases, unless otherwise stated, the trees would be stored on the Flash drive (as regular directories). I'm allowing here for scenarios with multiple users per machine, in the hope of getting researchers outside of the project interested in working on the system.

Assumptions during general operation:

  • We will have a "core system" tree. This is the operating system up to the level of Sugar and application launching, security maintenance and so on. It will be composed of something like the following:
    • r/o base image (P_SF_CORE protected images)
    • r/o system COW-generated update images (P_SF_CORE protected images)
      • These would likely be a chain of updates
        • e.g. system security/functionality updates for the laptop could be rolled back in the event of failure
        • Over time we would want to be able to migrate these changes into the r/o base image e.g. system updates > 2 months old would merge down to the base image
      • In other words, update process "SECURITY_UPDATE_2007_05_20" would create a new COW file system branch directly on the core r/o file system, do the updates using RPMs/scripts or whatever, and then remount the COW as a read-only branch
    • r/o "user" (root) system update images (P_SF_RUN protected images)
      • I'd assume we'd want these to use date-based backups (See the user data activity area below for an idea on that)
      • Unclear how this works on a multi-user machine, but I guess that's something for external operations to figure out
      • Unclear whether we can apply these on top of an updated core system; it should be technically possible, but there might be conflicts in the semantics of the result
    • Not sure whether we can have a "sensitive data" overlay for things that should only be readable by the system (the idea being that we simply do *not* provide access to that plane to an application's chroot)
      • Thinking of password files, any machine-level encryption keys and the like
  • We will have a temporary data file system in RAM rather than on Flash RAM, probably not unioned, possibly sharable?
  • For each installed activity (shared among users if there are multiple users?)
    • r/o system installation image
      • Generated during installation using a r/w COW on top of the core system image
      • Allow for new versions of dependencies or other system updates
    • r/o update/patch/plugin images
      • Not sure about utility of these, I think most platforms use full replacement for application updates these days
      • Plugin installation could be handled this way, but it doesn't really address any issues of plugin safety other than being able to remove plugins cleanly, and the registration process would be constrained somewhat
    • Installation should be *without* access to system's sensitive data overlay if possible
  • For each running application/activity (chroot image for a given user)
    • Tree:
      • Software: r/o core system (base, security, user overlays), r/o installation image (+ possible update images)
      • Software-as-Data: r/w user customisation image (probably only for "develop"-active activities and likely only per-user)
      • Data: r/o previous state images (data directories), r/w current state image (data directories)
    • r/o previous-state images (versioned)
      • monthly, weekly and daily backup snapshots
      • migration/coalescing over time so that after a week daily snapshots move into a weekly snapshot and after a month the weekly snapshots move into the core image for the activity
    • r/w current-working image
      • preferably instrumented to allow for monitoring changes and registering created/altered files in the journal
        • the file version registered in the journal would, however, only exist until coalescing, so the backup mechanism would need to have encrypted and uploaded the file to the backup service before coalescing occurred
        • in the simple case, when you close an activity, examine the file-system overlay for the session: any files in the r/w overlay mean something has changed and should be added to the journal. The problem is that activities left open for months would never reach the journal this way, so the "per-day" version overlay replacement would need to do the journaling as well
      • it would be elegant if we could allow sharing among applications by having the journal-based file selection make a file visible within the current-working image by hard-linking it into the appropriate "previous-state" image
        • That is, when you select a 120MB recording from the "video tutorial activity" in the journal that you want to present to your teacher in the "presentation activity", the journal would find the file-on-disk (or in backups, or wherever) and would hard-link it into the appropriate day's overlay for the *current* application. The current application would then have COW access to the file. Naming conflicts in the current directory are a problem (particularly where you've already deleted a file of that name in a later session); we might need a check before copying in, with automatic renaming, which could be confusing. It almost wants to be a special r/o layer just below the r/w layer to work reliably.
    • r/w unversioned image (sub-tree, e.g. "~/volatile" and "/var/volatile")
      • for storage of things like databases where legacy applications are storing history and the like internally in a big file that contains all of the user's work
  • For each user's "profile" (personal data such as general encryption key, backup/log encryption keys, identifying photo and the like)
    • Versioned storage, as for an activity, but only available (directly) to the system's blessed user-profile-editor activity
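To make the per-activity stack above concrete, here is a minimal sketch of how the branch list for one activity's chroot might be assembled. Everything named here is an assumption for illustration: the `Branch` type, both functions and the `/planes/...` paths are hypothetical, and folding software and data into a single union is a simplification of the tree described above. The option string follows the `br:path=perm` form (topmost branch first) as I understand it from the aufs manual, and should be checked against the aufs version actually in use.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Branch:
    """One directory in the union; hypothetical helper type."""
    path: str
    writable: bool = False


def activity_branches(core: List[str], install: str,
                      previous_states: List[str],
                      current_state: str) -> List[Branch]:
    """Return the branches for one activity, topmost first.

    `core` and `previous_states` are given oldest first, so they are
    reversed here: newer planes must shadow older ones.
    """
    top = [Branch(current_state, writable=True)]       # r/w current state
    snapshots = [Branch(p) for p in reversed(previous_states)]
    software = [Branch(install)] + [Branch(p) for p in reversed(core)]
    return top + snapshots + software


def aufs_options(branches: List[Branch]) -> str:
    """Format the branch list as an aufs-style `br:` mount option."""
    parts = [f"{b.path}={'rw' if b.writable else 'ro'}" for b in branches]
    return "br:" + ":".join(parts)


if __name__ == "__main__":
    stack = activity_branches(
        core=["/planes/core/base", "/planes/core/update-1"],
        install="/planes/act/install",
        previous_states=["/planes/act/week-1", "/planes/act/day-1"],
        current_state="/planes/act/current",
    )
    # The resulting string would be passed to something like
    # `mount -t aufs -o <options> none <chroot dir>`.
    print(aufs_options(stack))
```

Note that the sensitive-data overlay is simply never added to the list, which is the "do not provide access to that plane" idea from above.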

What that would require:

  • During boot (before mounting root), the user would need to be able to escape normal boot in order to enable/disable file system planes in the core system tree
    • To roll back to a previous day's state after playing (and breaking), un-check your last day's change plane
    • To undo a failed official system update, un-check the system update (would have to automatically un-check the system updates *since* then as well to maintain consistency)
      • We would want a way to configure permanent disabling/removal of a file-system plane, if someone leaves an overlay disabled until it would normally be integrated
      • What do we do with the user's customisations if there's an intermediate layer removed? Can we reliably detect conflicts?
  • During system software updates
    • Mechanism for creating new COW file system branch and adding to the registered sequence of "core system" overlays
      • Should automatically assign a user-friendly descriptive name for the selection box
      • Should automatically assign a directory name
      • Should automatically record date applied
      • Would likely be some form of simple standardised format for describing the overlays along with standard locations for the various overlays to be stored
  • Versioned storage (user's system overlay, user's profile, user's per-activity directories)
    • The r/w COW system should version on write: the "latest revisions" layer is available at all times, and a new COW layer is created once the latest-revisions layer holds changes older than some period X
      • System-level mechanism to replace current-day overlay with next-day overlay (if there is anything in the previous-day overlay, otherwise it just becomes the today-overlay)
    • Mechanism to coalesce older overlays into the core image, likely triggered by the laptop's backup scripts so that we don't lose information when coalescing (only coalesce if the versions that would be lost are already backed up)
      • Likely require ability to do user-triggered coalescing as well, in order to allow for resolving "out-of-space" conditions
    • For activity storage
      • Mechanism to access older overlays for restoration/information purposes
      • The journal should give you access to the versions of user files
        • A direct mechanism in file-system browsing views would be nice too so that you could compare versions of files easily
      • Service in security module would have to allow an activity to request stripping of a given r/o layer within the activity's versioned file plane sets (not the system sets)
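Two of the requirements above are easy to get subtly wrong, so here is a rough sketch of them: un-checking a plane has to cascade to every newer plane, and coalescing should only touch planes that are old enough and already backed up. The `Plane` record and both function names are hypothetical, not an existing BitFrost or AUFS interface, and treating a disabled plane as a barrier to coalescing is my own assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Plane:
    """Registry entry for one overlay; fields mirror the list above."""
    directory: str           # automatically assigned directory name
    label: str               # user-friendly name for the selection box
    applied: date            # date the overlay was applied
    enabled: bool = True
    backed_up: bool = False


def disable_from(planes: List[Plane], directory: str) -> List[str]:
    """Un-check one plane and every plane applied after it.

    `planes` is ordered oldest first. Returns the directories that
    remain enabled, in the same order.
    """
    found = False
    for plane in planes:
        if plane.directory == directory:
            found = True
        if found:
            plane.enabled = False
    if not found:
        raise KeyError(directory)
    return [p.directory for p in planes if p.enabled]


def coalescable(planes: List[Plane], today: date,
                min_age_days: int) -> List[Plane]:
    """Return the oldest contiguous run of planes safe to merge down.

    The run stops at the first plane that is too young, not yet backed
    up, or disabled, because a newer plane cannot be merged underneath
    an older one that is still separate.
    """
    ready = []
    for plane in planes:
        age = (today - plane.applied).days
        if age < min_age_days or not plane.backed_up or not plane.enabled:
            break
        ready.append(plane)
    return ready
```

A user-triggered coalesce for "out-of-space" conditions could call the same function with a smaller `min_age_days`, while still refusing to drop versions that have not been backed up.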

Anyway, this is just a straw man to try to figure out if I've understood the approach. Obvious issues include performance (IIUC we would wind up with dozens of extra "stats" per normal access to a file) and issues related to versioning/semantic conflicts between overlays when intervening overlays are removed. The question of "library overlays" also pops up (e.g. a numpy overlay that would be shared among dozens of activities, but isn't necessarily a part of the "core" system).
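The performance worry can be illustrated with a deliberately naive model of lookup through a stack of planes. This is not how AUFS is implemented (a real union file system caches lookups, and whiteouts are stored differently); the `Layer` type and `lookup` function are hypothetical and only count how many planes would have to be probed in the worst case.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Layer:
    """In-memory stand-in for one plane; hypothetical model only."""
    name: str
    files: Set[str] = field(default_factory=set)
    whiteouts: Set[str] = field(default_factory=set)  # deleted names


def lookup(layers: List[Layer], path: str) -> Tuple[Optional[str], int]:
    """Search the planes topmost first.

    Returns (name of the plane that supplies the file or None,
    number of planes probed).
    """
    probes = 0
    for layer in layers:
        probes += 1
        if path in layer.whiteouts:
            return None, probes      # deleted in this plane: stop here
        if path in layer.files:
            return layer.name, probes
    return None, probes


if __name__ == "__main__":
    # One r/w plane, thirty empty daily planes, then the base image.
    stack = ([Layer("current")]
             + [Layer(f"day-{i}") for i in range(30)]
             + [Layer("base", files={"/bin/sh"})])
    print(lookup(stack, "/bin/sh"))
```

In this model a file that lives only in the base image costs one probe per plane above it, which is one more argument for coalescing old planes regularly.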