Union File Systems

From OLPC

The plan in BitFrost is to use a Union File System (AUFS) in order to provide recovery and similar operations. This is a straw-man proposal describing a use-pattern for AUFS within BitFrost. See also Journal and Overlays for discussion of using the Union File System to support Journal operations for activities.

In all cases, unless otherwise stated, the trees would be stored on the Flash drive (as regular directories). I am allowing here for scenarios with multiple users per machine, in the hope of getting researchers outside the project interested in working on the system.

Core System Software Tree

System software in this sense includes everything required to launch a new application, from the initrd loader on the boot partition up through the Sugar Shell (application launching user interface).

The primary goal here is to allow for stable "rollback" to previous versions of the system. The normal system-updating code would have to initiate updates using a protected API for creating new system overlays. The system-updating service will need to be protected by running solely from the root file-system (no update images involved). As such the system-updating service will need to be very stable.

  • r/o base image (P_SF_CORE protected images)
  • r/o system COW-generated update images (P_SF_CORE protected images)
    • These would likely be a chain of updates
      • e.g. system security/functionality updates for the laptop could be rolled back in the event of failure
      • Over time we would want to be able to migrate these changes into the r/o base image, e.g. system updates more than 2 months old would merge down to the base image
    • In other words, update process "SECURITY_UPDATE_2007_05_20" would create a new COW file system branch directly on the core r/o file system, do the updates using RPMs/scripts or whatever, and then remount the COW as a read-only branch
  • r/o "user" (root) system update images (P_SF_RUN protected images)
    • I'd assume we'd want these to use date-based backups (see the user data activity area below for an idea on that)
    • Unclear how this works on a multi-user machine, but I guess that's something for external operations to figure out
    • Unclear whether we can apply these on top of an updated core system; it should be technically possible, but there might be conflicts in the semantics of the result
  • Not sure whether we can have a "sensitive data" overlay for things that should only be readable by the system (the idea being that we simply do *not* provide access to that plane in an application's chroot)
    • Thinking of password files, any machine-level encryption keys and the like
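
To make the layering above concrete, here is a minimal sketch of how a chain of branches could be turned into an AUFS-style branch option (the "br:" form, topmost branch first). The Branch class, the function name and the /overlays/... paths are hypothetical and only for illustration; the update name is the one used in the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Branch:
    path: str               # directory on Flash holding this layer (hypothetical path)
    writable: bool = False  # only a COW branch that is still being built is r/w


def aufs_branch_option(branches_top_first: List[Branch]) -> str:
    """Build an aufs-style 'br:' mount option; the first branch is the topmost."""
    if not branches_top_first:
        raise ValueError("at least one branch is required")
    parts = [f"{b.path}={'rw' if b.writable else 'ro'}" for b in branches_top_first]
    return "br:" + ":".join(parts)


# Hypothetical layout: newest update first, base image last.
core_stack = [
    Branch("/overlays/core/SECURITY_UPDATE_2007_05_20"),
    Branch("/overlays/core/base"),
]
print(aufs_branch_option(core_stack))
```

While an update is being applied, its branch would be marked writable; once the update finishes, the same function yields the all-read-only stack.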

Temporary Storage Area

We will have a temporary-data file system held in RAM rather than on Flash. It would probably not be unioned; possibly sharable?

Application Installation Image

As per BitFrost, the application will be installed to a read-only file-system overlay on top of the core file-system (minus any "sensitive-data overlay"). This will allow the application to update shared libraries or otherwise perform "invasive" installations without affecting other applications or the system itself.

Will need to address how this applies in multiple-user scenarios.

  • r/o system image
  • r/o installation image
    • Generated during installation using a r/w COW on top of the core system image
      • Allow for new versions of dependencies or other system updates
    • r/o update/patch/plugin images
      • Not sure about utility of these, I think most platforms use full replacement for application updates these days
      • Plugin installation could be handled this way, but it doesn't really address any issues of plugin safety other than being able to remove plugins cleanly, and the registration process would be constrained somewhat...
    • Installation should be *without* access to system's sensitive data overlay if possible
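
The installation flow above could be sketched roughly as follows: a r/w COW layer sits on top of the core layers while the installer runs (with the sensitive-data overlay left out), and the whole stack is sealed read-only afterwards. All names and paths here are hypothetical; this models only the ordering of layers, not the actual mounting.

```python
from typing import List, Tuple

Layer = Tuple[str, str]  # (directory, "ro" or "rw")


def install_time_stack(core_layers: List[str], install_dir: str,
                       sensitive_dir: str) -> List[Layer]:
    """Stack used while the installer runs: r/w COW on top of the core layers."""
    # The sensitive-data overlay is simply never included in the stack.
    visible = [d for d in core_layers if d != sensitive_dir]
    return [(install_dir, "rw")] + [(d, "ro") for d in visible]


def sealed_stack(stack: List[Layer]) -> List[Layer]:
    """The same stack after installation: every layer becomes read-only."""
    return [(directory, "ro") for directory, _mode in stack]


core = ["/overlays/core/sensitive", "/overlays/core/update1", "/overlays/core/base"]
during = install_time_stack(core, "/overlays/apps/paint", "/overlays/core/sensitive")
after = sealed_stack(during)
```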

Application Session Images

This approach is somewhat too complex; see Journal and Overlays until I get these notes updated.

  • For each running application/activity (chroot image for a given user)
    • Tree:
      • Software: r/o core system (base, security, user overlays), r/o installation image (+ possible update images)
      • Software-as-Data: r/w user customisation image (probably only for "develop"-active activities and likely only per-user)
      • Data: r/o previous state images (data directories), r/w current state image (data directories)
    • r/o previous-state images (versioned)
      • monthly, weekly and daily backup snapshots
      • migration/coalescing over time, so that after a week the daily snapshots move into a weekly snapshot, and after a month the weekly snapshots move into the core image for the activity
    • r/w current-working image
      • preferably instrumented to allow for monitoring changes and registering created/altered files in the journal
        • the file version registered in the journal would, however, only exist until coalescing, so the backup mechanism would need to have encrypted and uploaded the file to the backup service before coalescing occurred
        • in the simple case, when you close an activity, examine the file-system overlay for the session: if there are files in the r/w overlay, something has changed, so add it to the journal. The problem is that activities left open for months won't get into the journal in that case, so the "per-day" version overlay replacement would need to do the journaling as well
      • it would be elegant if we could allow sharing among applications by having the journal-based file selection make a file visible within the current-working image by hard-linking it into the appropriate "previous-state" image
        • That is, when you select a 120MB recording from the "video tutorial activity" in the journal that you want to present to your teacher in the "presentation activity", the journal would find the file on disk (or in backups, or wherever) and would hard-link it into the appropriate day's overlay for the *current* application. The current application would then have COW access to the file. Problems occur with naming conflicts in the current directory, particularly where you've already deleted a file of that name in a later session. We might need a check before copying in, plus automatic renaming, which could be confusing. It almost wants to be a special r/o layer just below the r/w layer to work reliably.
    • r/w unversioned image (sub-tree, e.g. "~/volatile" and "/var/volatile")
      • for storage of things like databases where legacy applications are storing history and the like internally in a big file that contains all of the user's work
  • For each user's "profile" (personal data such as general encryption key, backup/log encryption keys, identifying photo and the like)
    • Versioned storage, as for an activity, but only available (directly) to the system's blessed user-profile-editor activity

Services Required

Overlay Manager

About the only software that should run outside the core chroot. This is the software that knows how to create and tear down overlay file systems.

  • Update software from trusted repositories (system images)
    • Generate and register new system overlays for the core system images automatically
      • Should automatically assign a user-friendly descriptive name for the selection box
      • Should automatically assign a directory name
      • Should automatically record date applied
      • Would likely be some form of simple standardised format for describing the overlays along with standard locations for the various overlays to be stored
    • Migrate overlays into core over time
  • Provide "play-spaces" for user overlays
    • Generate and register new user overlays for the core system images
    • Allow user to trigger merges down into base user overlay (or automatically merge if using a time-based version system)
  • Provide introspection/listing tools for finding overlays
    • Provide way to take given list of overlays and request subset of them as a mounted filesystem (to abstract away issues such as using unionfs or aufs)
    • Register user's last-selected set of system/user filesystem overlays (to make them the default for the next time)
  • Activity Installation
    • Register activity (permissions and the like)
    • Create activity's overlay filesystem
    • Install the activity inside the chroot filesystem with permission restrictions
    • Switch the filesystem to read only mode
  • Activity Instantiation
    • Look up the activity, create the unioned filesystem
      • core system image (including user overlays)
      • r/o referenced file overlay
      • r/w created/altered file overlay (COW)
    • chroot the activity
    • Provide a file-open-request (dbus) for loading journal-based file(s) into namespace
      • Provide linking of files into the activity's r/o referenced file plane
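
As one possible shape for the "simple standardised format" mentioned above, the sketch below records a friendly name, a directory and the date applied for each overlay, and serialises the registry as JSON. The field names, the label wording and the /overlays/core root are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date
from typing import List


@dataclass
class OverlayRecord:
    label: str       # user-friendly name shown in the selection box
    directory: str   # where the overlay's tree is stored
    applied: str     # ISO date on which the overlay was applied
    kind: str        # e.g. "core-update" or "user" (hypothetical categories)
    enabled: bool = True


def register_update(registry: List[OverlayRecord], update_id: str, applied: date,
                    root: str = "/overlays/core") -> OverlayRecord:
    """Create and record the description of a new core system overlay."""
    record = OverlayRecord(
        label=f"System update of {applied.isoformat()}",
        directory=f"{root}/{update_id}",
        applied=applied.isoformat(),
        kind="core-update",
    )
    registry.append(record)
    return record


def dump_registry(registry: List[OverlayRecord]) -> str:
    """Serialise the registry so that boot-time tools can list the overlays."""
    return json.dumps([asdict(r) for r in registry], indent=2)


registry: List[OverlayRecord] = []
register_update(registry, "SECURITY_UPDATE_2007_05_20", date(2007, 5, 20))
```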

A user interface is required as well. During boot, the user needs to be able to escape the normal boot sequence to respecify the set of file-planes to enable in the image. UI use cases:

  • To roll back to a previous day's state after playing (and breaking), un-check your last day's change plane
  • To undo a failed official system update, un-check the system update
    • Likely have to automatically un-check the system updates *since* then as well to maintain consistency
    • Some way to configure permanent disabling/removal of a file-system plane, if someone leaves an overlay disabled until it would normally be integrated
    • What do we do with the user's customisations if there's an intermediate layer removed? Can we reliably detect conflicts?
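
The consistency rule in the use cases above (un-checking an update also un-checks every update applied since) can be sketched as a small function. The names are hypothetical, and the update list is assumed to be ordered oldest first.

```python
from typing import Dict, List


def uncheck_update(ordered_updates: List[str], enabled: Dict[str, bool],
                   failed: str) -> Dict[str, bool]:
    """Disable `failed` and every update applied after it (list is oldest first)."""
    first_disabled = ordered_updates.index(failed)  # raises ValueError if unknown
    result = dict(enabled)  # leave the caller's selection untouched
    for name in ordered_updates[first_disabled:]:
        result[name] = False
    return result


updates = ["update_a", "update_b", "update_c"]
selection = {name: True for name in updates}
after_rollback = uncheck_update(updates, selection, "update_b")
```

This leaves open the harder question raised above: what to do with user customisations that were made on top of the layers now disabled.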

Time-versioned Storage

In cases where we would like a time-versioned storage system, it is possible to construct the image using a series of r/o "previous data" images and a r/w current image. Note, however, that time-versioned storage is probably sufficiently less fine-grained and easy-to-understand than "session-versioned" storage that we probably want to avoid time-versioned as much as possible.

  • The r/w COW system should version on write: that is, the "latest revisions" should be available at all times, and a new COW should be created if the latest-revisions overlay holds changes older than some period X
    • System-level mechanism to replace current-day overlay with next-day overlay (if there is anything in the previous-day overlay, otherwise it just becomes the today-overlay)
  • Mechanism to coalesce older overlays into the core image. This would likely be triggered by the backup scripts for the laptop, so that we don't lose information when coalescing: only coalesce if the versions that would be lost are already backed up
    • Likely require ability to do user-triggered coalescing as well, in order to allow for resolving "out-of-space" conditions
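
A minimal sketch of the two decisions above: when to start a new day's overlay, and which older overlays may be coalesced. The Snapshot class, the 7-day default and the force flag (standing in for user-triggered coalescing when space runs out, which would accept losing versions that are not backed up) are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Snapshot:
    day: date
    backed_up: bool
    empty: bool = False  # True if nothing was written to this overlay


def needs_new_overlay(current: Snapshot, today: date) -> bool:
    """Roll over only if the current overlay is from an earlier day and holds changes."""
    # An empty overlay from an earlier day can simply become today's overlay.
    return current.day < today and not current.empty


def coalescable(snapshots: List[Snapshot], today: date,
                keep_days: int = 7, force: bool = False) -> List[Snapshot]:
    """Snapshots older than keep_days that may be merged down.

    Without force, only snapshots that are already backed up qualify.
    """
    cutoff = today - timedelta(days=keep_days)
    return [s for s in snapshots if s.day < cutoff and (s.backed_up or force)]
```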

See Journal and Overlays for further exploration...

Issues

  • Performance (IIUC we would wind up with dozens of extra stat() calls per normal access to a file), and issues related to versioning/semantic conflicts between overlays when intervening overlays are removed
  • The question of "library overlays" also comes up (e.g. a numpy overlay that would be shared among dozens of activities, but isn't necessarily a part of the "core" system)
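
To illustrate the performance concern, here is a toy model of a top-down union lookup that counts how many branches are probed before a path is found. It is a simplification that ignores any caching a real union file system performs; the function name and the example path are hypothetical.

```python
from typing import List, Optional, Set, Tuple


def union_lookup(branches_top_first: List[Set[str]],
                 path: str) -> Tuple[Optional[int], int]:
    """Return (index of the branch holding path, number of branches probed)."""
    probes = 0
    for index, branch in enumerate(branches_top_first):
        probes += 1  # stands in for one stat() against this branch
        if path in branch:
            return index, probes
    return None, probes


# Eleven empty overlays stacked on a base image that holds the file.
stack = [set() for _ in range(11)] + [{"/usr/lib/libexample.so"}]
hit = union_lookup(stack, "/usr/lib/libexample.so")
miss = union_lookup(stack, "/no/such/file")
```

In this model a file that lives only in the base image costs one probe per layer above it, and a missing file costs one probe per layer, which is why long chains of daily and update overlays add up.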