Union File Systems: Difference between revisions
(Initial draft of Union File System explorations) |
m (spelling) |
||
(16 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
{{translations}} |
|||
The plan in [[BitFrost]] is to use a Union File System (AUFS) in order to provide recovery and similar operations. This is a straw-man proposal describing a use-patter for AUFS within BitFrost. |
|||
The plan in [[Bitfrost]] is to use a Union File System (likely [http://aufs.sourceforge.net/ AUFS]) in order to provide recovery and similar operations. This is a straw-man proposal describing a use-pattern for AUFS within Bitfrost. See also [[Journal and Overlays]] for discussion of using a Union File System to support Journal operations for activities. |
|||
In all cases, unless otherwise stated the trees would be stored on the Flash drive (as regular directories). I'm here allowing for multiple-machine-user scenarios to hopefully get researchers outside of the project interested in working on the system. |
||
Experience would suggest that libraries, services and applications are often requiring updates to patch holes in their security (and to extend functionality). Further, those updates occasionally fail and cause damage themselves. The approach here proposes usage of the union file system to allow for reasonably frequent updates even to core systems and to allow for recovery from a botched update. |
|||
Assumptions during general operation: |
|||
* We will have a "core system" tree. This is the operating system up to the level of Sugar and application launching, security maintenance etceteras. It will be composed of something like the following: |
|||
== Core System Software Tree == |
|||
** r/o base image (P_SF_CORE protected images) |
|||
**r/o system COW-generated update images (P_SF_CORE protected images) |
|||
System software in this sense includes everything required to launch a new application, from the initrd loader on the boot partition up through the Sugar Shell (application launching user interface). |
|||
***These would likely be a chain of updates |
|||
****e.g. system security/functionality updates for the laptop could be rolled back in the event of failure |
|||
The primary goal here is to allow for stable "rollback" to previous versions of the system. The normal system-updating code would have to initiate updates using a protected API for creating new system overlays. |
|||
****Over time we would want to be able to migrate these changes into the r/o base image e.g. system updates > 2 months old would merge down to the base image |
|||
***In other words, update process "SECURITY_UPDATE_2007_05_20" would create a new COW file system branch directly on the core r/o file system, do the updates using RPMs/scripts or whatever, and then remount the COW as a read-only branch |
|||
Note: The system-updating service will need to be protected by running solely from the root file-system (no update images involved). As such the system-updating service will need to be very stable. |
|||
**r/o "user" (root) system update images (P_SF_RUN protected images) |
|||
***I'd assume we'd want these to use date-based backups (See the user data activity area below for an idea on that) |
|||
* r/o base image ([[OLPC Bitfrost#P_SF_CORE|P_SF_CORE]] protected images) |
|||
***Unclear how this works on a multi-user machine, but I guess that's something for external operations to figure out |
|||
*r/o system COW-generated update images ([[OLPC Bitfrost#P_SF_CORE|P_SF_CORE]] protected images) |
|||
***Unclear whether we can apply these on top of an updated core system; it should be technically possible, but there might be conflicts in the semantics of the result |
|||
**These would likely be a chain of updates |
|||
**Not sure whether we can have a "sensitive data" overlay for things that should only be readable by the system (the idea being that we simply *not* provide access to that plane to applications' chroot) |
|||
***e.g. system security/functionality updates for the laptop could be rolled back in the event of failure |
|||
***Thinking of password files, any machine-level encryption keys and the like |
|||
***Over time we would want to be able to migrate these changes into the r/o base image e.g. system updates > 2 months old would merge down to the base image |
|||
* We will have a temporary data file system in RAM rather than on Flash RAM, probably not unioned, possibly sharable? |
|||
**In other words, update process "SECURITY_UPDATE_2007_05_20" would create a new COW file system branch directly on the core r/o file system, do the updates using RPMs/scripts or whatever, and then remount the COW as a read-only branch |
|||
* For each installed activity (shared among users if there are multiple users?) |
|||
**Note: the overlay management service must run from the P_SF_CORE area, they should not be run with user overlays to prevent users from overriding the rules built into it until they know the implications |
|||
**r/o system installation image |
|||
*r/o "user" (root) system update images ([[OLPC Bitfrost#P_SF_RUN|P_SF_RUN]] protected images) |
|||
***Generated during installation using a r/w COW on top of the core system image |
|||
**I'd assume we'd want these to use date-based backups (See the user data activity area below for an idea on that) |
|||
**Unclear how this works on a multi-user machine, but I guess that's something for external operations to figure out |
|||
**Unclear whether we can apply these on top of an updated core system; it should be technically possible, but there might be conflicts in the semantics of the result |
|||
*Not sure whether we can have a "sensitive data" overlay for things that should only be readable by the system (the idea being that we simply *not* provide access to that plane to applications' chroot) |
|||
**Thinking of password files, any machine-level encryption keys and the like |
|||
**Bitfrost suggests that we'll only copy in individual libraries that the application claims it needs, but that requires altering the installation mechanisms of the application, where it would be faster and less porting work to simply "wrap" the installers of standard packages. i.e. a standard RPM-based GUI installer (click-and-run) could be wrapped so that it requests creation of the overlay, and runs the standard installer in the chroot |
|||
== Temporary Storage Area == |
|||
We will have a temporary data file system in RAM rather than on Flash RAM, probably not unioned, possibly sharable? |
|||
== Application Installation Image == |
|||
As per [[Bitfrost]], the application will be installed to a read-only file-system overlay on top of the core file-system (minus any "sensitive-data overlay"). This will allow the application to update shared libraries or otherwise perform "invasive" installations without affecting other applications or the system itself. |
|||
Will need to address how this applies in multiple-user scenarios. If user K installs an "OpenOffice" which is malicious how does the other user know it is not the OpenOffice they expected? |
|||
*r/o system image |
|||
*r/o installation image |
|||
**Generated during installation using a r/w COW on top of the core system image |
|||
***Allow for new versions of dependencies or other system updates |
||
**r/o update/patch/plugin images |
||
Line 26: | Line 49: | ||
***Plugins installation could be handled this way, but it doesn't really address any issues of plugin safety other than being able to remove them cleanly and the registration process would be constrained somewhat... |
||
**Installation should be *without* access to system's sensitive data overlay if possible |
||
* For each running application/activity (chroot image for a given user) |
|||
== Application Session Images == |
|||
**Tree: |
|||
***Software: r/o core system (base, security, user overlays), r/o installation image (+ possible update images) |
|||
This section discusses how to produce the run-time chroot images in which a particular session of a particular activity will run. The chroot is intended to implement many of the Bitfrost requirements for data-safety while still allowing for legacy applications to be run with a simple recompile for the platform. |
|||
***Software-as-Data: r/w user customisation image (probably only for "develop"-active activities and likely only per-user) |
|||
***Data: r/o previous state images (data directories), r/w current state image (data directories) |
|||
=== Journal Approach === |
|||
**r/o previous-state images (versioned) |
|||
***monthly, weekly and daily backup snapshots |
|||
This is the "pure" OLPC approach outlined in [[Journal and Overlays]], it assumes that we can overwrite the standard File Open dialogues in every activity running on the laptop to trigger a Journal browser that will run out-of-process and allow us to select a file for linking into the system. |
|||
***migration/coalescing over time so that after a week daily snapshots move into a weekly snapshot and after a month the weekly snapshots move into the core image for the activity |
|||
**r/w current-working image |
|||
* System Software |
|||
***preferably instrumented to allow for monitoring changes and registering created/altered files in the journal |
|||
* Application Installation Image |
|||
****the file version registered in the journal would, however, only exist until coalescing, so the backup mechanism would need to have encrypted and uploaded the file to the backup service before coalescing occurred |
|||
** User Customisation Layer (For IDEs and the like) |
|||
****in the simple case, when you close an activity, examine the file-system overlay for the session and see if there are files in the r/w overlay, if there are, something has changed, add it to the journal (the problem being activities that are left open for months won't get into the journal in that case, so the "per-day" version overlay replacement would need to do the journaling as well) |
|||
* Read-only referenced-file layer |
|||
***it would be elegant if we could allow sharing among applications by having the journal-based file selection make a file visible within the current-working image by hard-linking it into the appropriate "previous-state" image |
|||
** Files/journal environments link referenced values into this layer |
|||
****That is, when you select a 120MB recording from the "video tutorial activity" in the journal that you want to present to your teacher in the "presentation activity" the journal would find the file-on-disk (or in backups, or whatever) and would hard-link it into the appropriate day's overlay for the *current* application. The current application would then have COW access to the file. Problems occur with naming conflicts in the current directory (particularly where you've already deleted a file of that name in a later session) might need to have a check before copying in and automatic renaming, which could be confusing. Almost want it to be a special r/o layer just below the r/w layer to make it work reliably there. |
|||
* Read-write created-file layer |
|||
**r/w unversioned image (sub-tree, e.g. "~/volatile and /var/volatile") |
|||
** All file access queries link/generate files in this layer |
|||
***for storage of things like databases where legacy applications are storing history and the like internally in a big file that contains all of the user's work |
|||
** Journal queries might include such things as "all images" then add all of those images to the read-only layer |
|||
* Possible "volatile" layer mounted in a particular area for applications/users to use for volatile files (e.g. databases or similar files that should be backed up only in a single version) |
|||
=== Legacy (Versioned) Approach === |
|||
This is a less pure approach that does not assume the ability to override all File Open dialogues or the like. Instead of having a "clean" environment for each run of an activity, the activity keeps a collection of all files created by the activity over time in a versioned file-system tree. |
|||
This has the distinct disadvantage that the environment tends to get cluttered with things you don't need, while making importing of files into the environment from another application a bit of a pain (non-standard operation). |
|||
*Software: r/o core system (base, security, user overlays), r/o installation image (+ possible update images) |
|||
*Software-as-Data: r/w user customisation image (probably only for "develop"-active activities and likely only per-user) |
|||
*Data: r/o previous state images (data directories), r/w current state image (data directories) |
|||
*r/o previous-state images (versioned) |
|||
**monthly, weekly and daily backup snapshots |
|||
**migration/coalescing over time so that after a week daily snapshots move into a weekly snapshot and after a month the weekly snapshots move into the core image for the activity |
|||
*r/w current-working image |
|||
**preferably instrumented to allow for monitoring changes and registering created/altered files in the journal |
|||
***the file version registered in the journal would, however, only exist until coalescing, so the backup mechanism would need to have encrypted and uploaded the file to the backup service before coalescing occurred |
|||
***in the simple case, when you close an activity, examine the file-system overlay for the session and see if there are files in the r/w overlay, if there are, something has changed, add it to the journal (the problem being activities that are left open for months won't get into the journal in that case, so the "per-day" version overlay replacement would need to do the journaling as well) |
|||
**it would be elegant if we could allow sharing among applications by having the journal-based file selection make a file visible within the current-working image by hard-linking it into the appropriate "previous-state" image |
|||
**That is, when you select a 120MB recording from the "video tutorial activity" in the journal that you want to present to your teacher in the "presentation activity" the journal would find the file-on-disk (or in backups, or whatever) and would hard-link it into the appropriate day's overlay for the *current* application. The current application would then have COW access to the file. Problems occur with naming conflicts in the current directory (particularly where you've already deleted a file of that name in a later session) might need to have a check before copying in and automatic renaming, which could be confusing. Almost want it to be a special r/o layer just below the r/w layer to make it work reliably there. |
|||
*r/w unversioned image (sub-tree, e.g. "~/volatile and /var/volatile") |
|||
**for storage of things like databases where legacy applications are storing history and the like internally in a big file that contains all of the user's work |
|||
* For each user's "profile" (personal data such as general encryption key, backup/log encryption keys, identifying photo and the like) |
||
**Versioned storage, as for an activity, but only available (directly) to the system's blessed user-profile-editor activity |
||
== Services Required == |
|||
What that would require: |
|||
== Overlay Manager == |
|||
* During boot (before mounting root), the user would need to be able to escape normal boot in order to enable/disable file system planes in the core system tree |
|||
**To roll back to a previous day's state after playing (and breaking), un-check your last day's change plane |
|||
About the only software that should run outside the core chroot. This is the software that knows how to create and tear down overlay file systems. |
|||
**To undo a failed official system update, un-check the system update (would have to automatically un-check the system updates *since* then as well to maintain consistency) |
|||
***We would want a way to configure permanent disabling/removal of a file-system plane, if someone leaves an overlay disabled until it would normally be integrated |
|||
* Update software from trusted repositories (system images) |
|||
***What do we do with the user's customisations if there's an intermediate layer removed? Can we reliably detect conflicts? |
|||
** Generate and register new system overlays for the core system images automatically |
|||
* During system software updates |
|||
**Mechanism for creating new COW file system branch and adding to the registered sequence of "core system" overlays |
|||
***Should automatically assign a user-friendly descriptive name for the selection box |
||
***Should automatically assign a directory name |
||
***Should automatically record date applied |
||
***Would likely be some form of simple standardised format for describing the overlays along with standard locations for the various overlays to be stored |
||
** Migrate overlays into core over time |
|||
* Versioned storage (user's system overlay, user's profile, user's per-activity directories) |
|||
* Provide "play-spaces" for user overlays |
|||
**R/W COW system should version on write, that is, be available at all times "latest revisions" and create a new COW if there were changes older than X period on the latest revisions |
|||
** Generate and register new user overlays for the core system images |
|||
***System-level mechanism to replace current-day overlay with next-day overlay (if there is anything in the previous-day overlay, otherwise it just becomes the today-overlay) |
|||
** Allow user to trigger merges down into base user overlay (or automatically merge if using a time-based version system) |
|||
**Mechanism to coalesce older overlays into the core image (likely triggered by the backup scripts for the laptop so that we don't lose information when doing the coalescing (only coalesce if the versions that would be lost are already backed up)) |
|||
* Provide introspection/listing tools for finding overlays |
|||
***Likely require ability to do user-triggered coalescing as well, in order to allow for resolving "out-of-space" conditions |
|||
** Provide way to take given list of overlays and request subset of them as a mounted filesystem (to abstract away issues such as using unionfs or aufs) |
|||
**For activity storage |
|||
** Register user's last-selected set of system/user filesystem overlays (to make them the default for the next time) |
|||
***Mechanism to access older overlays for restoration/information purposes |
|||
* Activity Installation |
|||
***The journal should give you access to the versions of user files |
|||
** Register activity (permissions and the like) |
|||
****A direct mechanism in file-system browsing views would be nice too so that you could compare versions of files easily |
|||
** Create activity's overlay filesystem |
|||
***Service in security module would have to allow an activity to request stripping of a given r/o layer within the activity's versioned file plane sets (not the system sets) |
|||
** Install the activity inside the chroot filesystem with permission restrictions |
|||
** Switch the filesystem to read only mode |
|||
* Activity Instantiation |
|||
** Lookup the activity, create the unioned filesystem |
|||
*** core system image (including user overlays) |
|||
*** r/o referenced file overlay |
|||
*** r/w created/altered file overlay (COW) |
|||
** chroot the activity |
|||
** Provide a file-open-request (dbus) for loading journal-based file(s) into namespace |
|||
*** Provide linking of files into the activity's r/o referenced file plane |
|||
User interface is required as well. During boot, the user needs to be able to escape normal boot sequence to respecify the set of file-planes to enable in the image. UI use cases: |
|||
*To roll back to a previous day's state after playing (and breaking), un-check your last day's change plane |
|||
*To undo a failed official system update, un-check the system update |
|||
** Likely have to automatically un-check the system updates *since* then as well to maintain consistency |
|||
** Some way to configure permanent disabling/removal of a file-system plane, if someone leaves an overlay disabled until it would normally be integrated |
|||
** What do we do with the user's customisations if there's an intermediate layer removed? Can we reliably detect conflicts? |
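The consistency rule in the use cases above (un-checking one system update also un-checks every update applied since) can be sketched as a small function. This is only an illustration: the function name and the update names are hypothetical, and it assumes the boot-time UI holds the updates as a list ordered oldest first.

```python
def enabled_after_rollback(update_names, failed):
    """Return the updates that stay enabled when `failed` is un-checked.

    `update_names` is ordered oldest first. Every update applied after the
    failed one is disabled as well, since it was built on top of it.
    """
    if failed not in update_names:
        raise ValueError(f"unknown update: {failed}")
    return update_names[: update_names.index(failed)]


# Hypothetical update names, oldest first.
updates = ["UPDATE_A", "UPDATE_B", "UPDATE_C"]
print(enabled_after_rollback(updates, "UPDATE_B"))
```

User customisation layers sit above the whole chain, so a sketch like this says nothing about whether they still make sense once an intermediate layer is gone; that question stays open.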
|||
== Time-versioned Storage == |
|||
In cases where we would like a time-versioned storage system, it is possible to construct the image using a series of r/o "previous data" images and a r/w current image. Note, however, that time-versioned storage is probably sufficiently less fine-grained and easy-to-understand than "session-versioned" storage that we probably want to avoid time-versioned as much as possible. |
|||
*R/W COW system should version on write, that is, be available at all times "latest revisions" and create a new COW if there were changes older than X period on the latest revisions |
|||
**System-level mechanism to replace current-day overlay with next-day overlay (if there is anything in the previous-day overlay, otherwise it just becomes the today-overlay) |
|||
*Mechanism to coalesce older overlays into the core image (likely triggered by the backup scripts for the laptop so that we don't lose information when doing the coalescing (only coalesce if the versions that would be lost are already backed up)) |
|||
**Likely require ability to do user-triggered coalescing as well, in order to allow for resolving "out-of-space" conditions |
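The backed-up-before-coalescing rule above can be sketched as a planning step. The function name, the 30-day threshold (standing in for the unspecified "X period") and the boolean backed-up flag are all assumptions made for illustration. The sketch stops at the first snapshot that cannot be merged, on the assumption that merging a newer layer into the core image underneath an older surviving layer would reorder the stack.

```python
from datetime import date, timedelta


def coalescing_plan(snapshots, today, max_age=timedelta(days=30)):
    """Split snapshots into (merge, keep), both ordered oldest first.

    `snapshots` maps a snapshot date to True when the backup service
    already holds its contents. Merging proceeds from the oldest snapshot
    and stops at the first one that is too recent or not yet backed up.
    """
    ordered = sorted(snapshots)
    merge = []
    for day in ordered:
        if today - day <= max_age or not snapshots[day]:
            break
        merge.append(day)
    return merge, ordered[len(merge):]


# Hypothetical snapshot history: the second snapshot is not backed up yet.
history = {
    date(2007, 4, 1): True,
    date(2007, 4, 8): False,
    date(2007, 4, 15): True,
    date(2007, 5, 19): True,
}
merge, keep = coalescing_plan(history, today=date(2007, 5, 20))
print(merge, keep)
```

A user-triggered coalesce for an out-of-space condition could reuse the same plan with a shorter `max_age`, at the cost of losing more history.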
|||
See [[Journal and Overlays]] for further exploration... |
|||
== Issues == |
|||
* Performance (IIUC we would wind up with dozens of extra "stats" per normal access to a file) and issues related to versioning/semantic conflicts between overlays when intervening overlays are removed |
|||
* "Library Overlays" also pops up (e.g. a numpy overlay that would be shared among dozens of activities, but isn't necessarily a part of the "core" system) |
|||
* Heuristics for cleanup and backup prioritisation (see [[Journal and Overlays]] for some discussion of this). |
|||
* Providing a way for applications such as email clients or browsers to have a persistent cross-instance data-store (e.g. the mail repository, or the cookie-set of the browser). Although that introduces a major attack vector, legacy applications will require the functionality. |
|||
[[Category:Security]] |
|||
Anyway, this is just a straw man to try to figure out if I've understood the approach. Obvious issues include performance (IIUC we would wind up with dozens of extra "stats" per normal access to a file) and issues related to versioning/semantic conflicts between overlays when intervening overlays are removed. The question of "library overlays" also pops up (e.g. a numpy overlay that would be shared among dozens of activities, but isn't necessarily a part of the "core" system). |
|||
[[Category:Software]] |
Latest revision as of 15:28, 1 June 2012
The plan in Bitfrost is to use a Union File System (likely AUFS) in order to provide recovery and similar operations. This is a straw-man proposal describing a use-pattern for AUFS within Bitfrost. See also Journal and Overlays for discussion of using a Union File System to support Journal operations for activities.
In all cases, unless otherwise stated, the trees would be stored on the Flash drive (as regular directories). I am allowing for scenarios with multiple users per machine here, in the hope of getting researchers outside the project interested in working on the system.
Experience suggests that libraries, services and applications often require updates to patch security holes (and to extend functionality). Further, those updates occasionally fail and cause damage themselves. The approach here uses the union file system to allow reasonably frequent updates, even to core systems, and to allow recovery from a botched update.
Core System Software Tree
System software in this sense includes everything required to launch a new application, from the initrd loader on the boot partition up through the Sugar Shell (application launching user interface).
The primary goal here is to allow for stable "rollback" to previous versions of the system. The normal system-updating code would have to initiate updates using a protected API for creating new system overlays.
Note: The system-updating service will need to be protected by running solely from the root file-system (no update images involved). As such the system-updating service will need to be very stable.
- r/o base image (P_SF_CORE protected images)
- r/o system COW-generated update images (P_SF_CORE protected images)
  - These would likely be a chain of updates
    - e.g. system security/functionality updates for the laptop could be rolled back in the event of failure
    - Over time we would want to be able to migrate these changes into the r/o base image, e.g. system updates more than 2 months old would merge down to the base image
  - In other words, update process "SECURITY_UPDATE_2007_05_20" would create a new COW file system branch directly on the core r/o file system, do the updates using RPMs/scripts or whatever, and then remount the COW as a read-only branch
  - Note: the overlay management service must run from the P_SF_CORE area; it should not be run with user overlays, so that users cannot override the rules built into it before they know the implications
- r/o "user" (root) system update images (P_SF_RUN protected images)
  - I'd assume we'd want these to use date-based backups (see the user data activity area below for an idea on that)
  - Unclear how this works on a multi-user machine, but I guess that's something for external operations to figure out
  - Unclear whether we can apply these on top of an updated core system; it should be technically possible, but there might be conflicts in the semantics of the result
- Not sure whether we can have a "sensitive data" overlay for things that should only be readable by the system (the idea being that we simply do *not* provide access to that plane to applications' chroot)
  - Thinking of password files, any machine-level encryption keys and the like
  - Bitfrost suggests that we'll only copy in individual libraries that the application claims it needs, but that requires altering the installation mechanisms of the application, whereas it would be faster and less porting work to simply "wrap" the installers of standard packages, i.e. a standard RPM-based GUI installer (click-and-run) could be wrapped so that it requests creation of the overlay and runs the standard installer in the chroot
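As a rough sketch of how the chain above might be handed to AUFS, the following builds a `br=` mount option from a base-first list of overlays. The `Overlay` record, the function name and the `/overlays/...` paths are hypothetical; the only thing taken from AUFS itself is the option shape, a colon-separated branch list with `=rw` or `=ro` per branch and the topmost branch listed first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overlay:
    """One directory-backed layer in the core system chain (hypothetical record)."""
    name: str
    path: str
    enabled: bool = True


def aufs_branch_option(chain, writable=None):
    """Build an aufs 'br=' mount option from a base-first overlay chain.

    aufs lists the topmost branch first, so the chain is reversed.
    `writable` is an optional directory for a new COW branch on top.
    Disabled overlays are left out, which is how a rollback would look.
    """
    branches = [f"{o.path}=ro" for o in reversed(chain) if o.enabled]
    if writable is not None:
        branches.insert(0, f"{writable}=rw")
    return "br=" + ":".join(branches)


# Hypothetical on-disk layout for the core system tree.
chain = [
    Overlay("base", "/overlays/core/base"),
    Overlay("SECURITY_UPDATE_2007_05_20", "/overlays/core/update-2007-05-20"),
]
print(aufs_branch_option(chain, writable="/overlays/core/update-pending"))
```

The resulting string would be passed as the `-o` argument of a `mount -t aufs` call by whatever privileged service owns the core tree; once the update finishes, the same function without `writable` describes the read-only remount.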
Temporary Storage Area
We will have a temporary data file system in RAM rather than on Flash, probably not unioned, possibly shareable?
Application Installation Image
As per Bitfrost, the application will be installed to a read-only file-system overlay on top of the core file-system (minus any "sensitive-data overlay"). This will allow the application to update shared libraries or otherwise perform "invasive" installations without affecting other applications or the system itself.
We will need to address how this applies in multiple-user scenarios. If user K installs a malicious "OpenOffice", how does another user know that it is not the OpenOffice they expected?
- r/o system image
- r/o installation image
  - Generated during installation using a r/w COW on top of the core system image
    - Allow for new versions of dependencies or other system updates
  - r/o update/patch/plugin images
    - Not sure about utility of these; I think most platforms use full replacement for application updates these days
    - Plugin installation could be handled this way, but it doesn't really address any issues of plugin safety other than being able to remove them cleanly, and the registration process would be constrained somewhat...
  - Installation should be *without* access to system's sensitive data overlay if possible
Application Session Images
This section discusses how to produce the run-time chroot images in which a particular session of a particular activity will run. The chroot is intended to implement many of the Bitfrost requirements for data-safety while still allowing for legacy applications to be run with a simple recompile for the platform.
Journal Approach
This is the "pure" OLPC approach outlined in Journal and Overlays. It assumes that we can override the standard File Open dialogues in every activity running on the laptop to trigger a Journal browser that will run out-of-process and allow us to select a file for linking into the system.
- System Software
- Application Installation Image
  - User Customisation Layer (for IDEs and the like)
- Read-only referenced-file layer
  - Files/journal environments link referenced values into this layer
- Read-write created-file layer
  - All file access queries link/generate files in this layer
  - Journal queries might include such things as "all images", then add all of those images to the read-only layer
- Possible "volatile" layer mounted in a particular area for applications/users to use for volatile files (e.g. databases or similar files that should be backed up only in a single version)
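Linking a Journal-selected file into the read-only referenced-file layer could look roughly like the sketch below. It is a minimal illustration with hypothetical directory names, and it assumes the Journal store and the session layer live on the same file system, since hard links cannot cross file systems.

```python
import os
import tempfile


def link_into_referenced_layer(source, layer_dir, name=None):
    """Hard-link `source` into the read-only referenced-file layer directory.

    Returns the path of the link. Raises FileExistsError rather than
    silently replacing a file of the same name.
    """
    target = os.path.join(layer_dir, name or os.path.basename(source))
    os.link(source, target)  # same inode, so no data is copied
    return target


# Demonstration in a throwaway directory; the layout is hypothetical.
with tempfile.TemporaryDirectory() as root:
    journal_store = os.path.join(root, "journal")
    referenced = os.path.join(root, "session", "referenced")
    os.makedirs(journal_store)
    os.makedirs(referenced)
    recording = os.path.join(journal_store, "recording.ogg")
    with open(recording, "wb") as fh:
        fh.write(b"demo")
    linked = link_into_referenced_layer(recording, referenced)
    same_file = os.path.samefile(recording, linked)
    print(same_file)
```

Because the link shares the original's inode, the activity must only ever see this directory as a read-only branch beneath its r/w layer; writing through the link directly would alter the Journal's copy.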
Legacy (Versioned) Approach
This is a less pure approach that does not assume the ability to override all File Open dialogues or the like. Instead of having a "clean" environment for each run of an activity, the activity keeps a collection of all files created by the activity over time in a versioned file-system tree.
This has the distinct disadvantage that the environment tends to get cluttered with things you don't need, while making importing of files into the environment from another application a bit of a pain (non-standard operation).
- Software: r/o core system (base, security, user overlays), r/o installation image (+ possible update images)
- Software-as-Data: r/w user customisation image (probably only for "develop"-active activities and likely only per-user)
- Data: r/o previous state images (data directories), r/w current state image (data directories)
- r/o previous-state images (versioned)
- monthly, weekly and daily backup snapshots
- migration/coalescing over time so that after a week daily snapshots move into a weekly snapshot and after a month the weekly snapshots move into the core image for the activity
- r/w current-working image
- preferably instrumented to allow for monitoring changes and registering created/altered files in the journal
- the file version registered in the journal would, however, only exist until coalescing, so the backup mechanism would need to have encrypted and uploaded the file to the backup service before coalescing occurred
- in the simple case, when you close an activity, examine the file-system overlay for the session: if there are files in the r/w overlay, something has changed and should be added to the journal. The problem is that activities left open for months would never get into the journal that way, so the "per-day" version overlay replacement would need to do the journaling as well
- it would be elegant if we could allow sharing among applications by having the journal-based file selection make a file visible within the current-working image by hard-linking it into the appropriate "previous-state" image
- That is, when you select a 120MB recording from the "video tutorial activity" in the journal that you want to present to your teacher in the "presentation activity", the journal would find the file on disk (or in backups, or wherever) and hard-link it into the appropriate day's overlay for the *current* application. The current application would then have COW access to the file. Naming conflicts in the current directory are a problem, particularly where you've already deleted a file of that name in a later session. We might need a check before bringing the file in, with automatic renaming, which could be confusing. It almost wants to be a special r/o layer just below the r/w layer to work reliably.
- r/w unversioned image (sub-tree, e.g. "~/volatile and /var/volatile")
- for storage of things like databases where legacy applications are storing history and the like internally in a big file that contains all of the user's work
- For each user's "profile" (personal data such as general encryption key, backup/log encryption keys, identifying photo and the like)
- Versioned storage, as for an activity, but only available (directly) to the system's blessed user-profile-editor activity
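The "simple case" journaling check described above (look in the session's r/w overlay when the activity closes) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `scan_session_overlay` is hypothetical, and it assumes AUFS-style whiteouts, where a deletion shows up in the r/w branch as a file named `.wh.<name>` and union-internal entries start with `.wh..wh.`. Another union file system would need different markers.

```python
import os

# Assumption: AUFS-style naming. A deleted file "foo" appears in the r/w
# branch as ".wh.foo"; union bookkeeping entries start with ".wh..wh.".
WHITEOUT_PREFIX = ".wh."
INTERNAL_PREFIX = ".wh..wh."


def scan_session_overlay(rw_root):
    """Return (changed, deleted) relative paths found in a session's r/w overlay."""
    changed, deleted = [], []
    for dirpath, dirnames, filenames in os.walk(rw_root):
        # Do not descend into union-internal bookkeeping directories.
        dirnames[:] = [d for d in dirnames if not d.startswith(INTERNAL_PREFIX)]
        rel_dir = os.path.relpath(dirpath, rw_root)
        for name in filenames:
            if name.startswith(INTERNAL_PREFIX):
                continue  # bookkeeping file, not user data
            if name.startswith(WHITEOUT_PREFIX):
                original = name[len(WHITEOUT_PREFIX):]
                deleted.append(os.path.normpath(os.path.join(rel_dir, original)))
            else:
                changed.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(changed), sorted(deleted)
```

If both lists come back empty, nothing changed during the session and there is nothing to register in the journal. The same scan could be run by the per-day overlay replacement for activities that are never closed.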
Services Required
Overlay Manager
The Overlay Manager is about the only software that should run outside the core chroot. It is the software that knows how to create and tear down overlay file systems.
- Update software from trusted repositories (system images)
- Generate and register new system overlays for the core system images automatically
- Should automatically assign a user-friendly descriptive name for the selection box
- Should automatically assign a directory name
- Should automatically record date applied
- Would likely need some form of simple standardised format for describing the overlays, along with standard locations for the various overlays to be stored
- Migrate overlays into core over time
- Provide "play-spaces" for user overlays
- Generate and register new user overlays for the core system images
- Allow user to trigger merges down into base user overlay (or automatically merge if using a time-based version system)
- Provide introspection/listing tools for finding overlays
- Provide way to take given list of overlays and request subset of them as a mounted filesystem (to abstract away issues such as using unionfs or aufs)
- Register user's last-selected set of system/user filesystem overlays (to make them the default for the next time)
- Activity Installation
- Register activity (permissions and the like)
- Create activity's overlay filesystem
- Install the activity inside the chroot filesystem with permission restrictions
- Switch the filesystem to read only mode
- Activity Instantiation
- Lookup the activity, create the unioned filesystem
- core system image (including user overlays)
- r/o referenced file overlay
- r/w created/altered file overlay (COW)
- chroot the activity
- Provide a file-open-request (dbus) for loading journal-based file(s) into namespace
- Provide linking of files into the activity's r/o referenced file plane
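The "take a list of overlays and request them as a mounted filesystem" service could hide the unionfs/aufs difference behind one small function. The sketch below only builds the `mount` argument list and does not run it. The function name and all paths are hypothetical, and the option syntax (`br:` for aufs, `dirs=` for unionfs, with `=rw`/`=ro` branch suffixes) is written from memory of those projects' documentation, so it should be checked against the version actually shipped.

```python
def union_mount_command(backend, branches, mountpoint):
    """Build a mount argv for a stack of (directory, mode) branches, topmost first."""
    if not branches:
        raise ValueError("at least one branch is required")
    for _path, mode in branches:
        if mode not in ("rw", "ro"):
            raise ValueError("unknown branch mode: %r" % (mode,))
    spec = ":".join("%s=%s" % (path, mode) for path, mode in branches)
    if backend == "aufs":
        options = "br:" + spec      # assumed aufs branch syntax
    elif backend == "unionfs":
        options = "dirs=" + spec    # assumed unionfs branch syntax
    else:
        raise ValueError("unsupported union backend: %r" % (backend,))
    return ["mount", "-t", backend, "-o", options, "none", mountpoint]


# Activity instantiation stack from the list above, topmost layer first.
# Every path here is a made-up placeholder.
activity_stack = [
    ("/overlays/activity/created", "rw"),     # r/w created/altered files (COW)
    ("/overlays/activity/referenced", "ro"),  # r/o referenced files
    ("/overlays/core", "ro"),                 # core system image
]
command = union_mount_command("aufs", activity_stack, "/run/activity/root")
```

The Overlay Manager would then execute the command, chroot the activity into the mount point, and keep the branch list around so the same stack can be rebuilt on the next launch.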
A user interface is required as well. During boot, the user needs to be able to escape the normal boot sequence to re-specify the set of file-planes to enable in the image. UI use cases:
- To roll back to a previous day's state after playing (and breaking), un-check your last day's change plane
- To undo a failed official system update, un-check the system update
- Likely have to automatically un-check the system updates *since* then as well to maintain consistency
- Some way to configure permanent disabling/removal of a file-system plane, if someone leaves an overlay disabled until it would normally be integrated
- What do we do with the user's customisations if there's an intermediate layer removed? Can we reliably detect conflicts?
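The un-check rules in the use cases above can be expressed as a small selection function. This is a sketch only: the plane names, the "system"/"user" kinds and the function name are invented for illustration. It implements just the rule stated above (un-checking a system update also un-checks the system updates applied since then) and leaves the open question about user customisations untouched.

```python
def uncheck_plane(planes, enabled, name):
    """Return the new set of enabled planes after the user un-checks `name`.

    planes  -- list of (name, kind) tuples, oldest first; kind is "system" or "user"
    enabled -- set of currently enabled plane names
    """
    names = [plane_name for plane_name, _kind in planes]
    if name not in names:
        raise KeyError(name)
    index = names.index(name)
    result = set(enabled)
    result.discard(name)
    if planes[index][1] == "system":
        # Later system updates were built on top of this one, so disable them too.
        for later_name, kind in planes[index + 1:]:
            if kind == "system":
                result.discard(later_name)
    return result


planes = [
    ("base", "system"),
    ("update-1", "system"),
    ("update-2", "system"),
    ("user-day-1", "user"),
    ("user-day-2", "user"),
]
everything = {plane_name for plane_name, _kind in planes}
after_rollback = uncheck_plane(planes, everything, "update-1")
```

Here `after_rollback` keeps the base image and both user planes but drops `update-1` and `update-2`. Whether the user planes are still consistent with the reduced system stack is exactly the conflict-detection question raised above.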
Time-versioned Storage
In cases where we would like a time-versioned storage system, it is possible to construct the image using a series of r/o "previous data" images and a r/w current image. Note, however, that time-versioned storage is probably coarser-grained and harder to understand than "session-versioned" storage, so we probably want to avoid it as much as possible.
- R/W COW system should version on write: the "latest revisions" are available at all times, and a new COW layer is created if the latest revisions contain changes older than X period
- System-level mechanism to replace current-day overlay with next-day overlay (if there is anything in the previous-day overlay, otherwise it just becomes the today-overlay)
- Mechanism to coalesce older overlays into the core image, likely triggered by the laptop's backup scripts so that we don't lose information when coalescing (only coalesce if the versions that would be lost are already backed up)
- Likely require ability to do user-triggered coalescing as well, in order to allow for resolving "out-of-space" conditions
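The day-rollover and backup-gated coalescing rules above can be sketched as two pure functions over overlay names. All names are hypothetical, and the `keep_recent` parameter is an assumption standing in for whatever retention window is eventually chosen. Coalescing is restricted to the oldest contiguous run of backed-up overlays, on the assumption that an older overlay cannot be skipped when merging into the core image.

```python
def roll_over(previous, current, current_has_changes, new_name):
    """Apply a day boundary; return (previous overlays, current overlay name)."""
    if not current_has_changes:
        # Nothing was written, so the empty overlay simply becomes today's.
        return list(previous), current
    return list(previous) + [current], new_name


def coalescable(previous, backed_up, keep_recent):
    """Return the oldest-first run of overlays that may be merged into the core image.

    previous    -- overlay names, oldest first
    backed_up   -- set of overlay names whose contents are already backed up
    keep_recent -- number of newest overlays that must stay separate (assumed policy)
    """
    candidates = previous[:max(0, len(previous) - keep_recent)]
    mergeable = []
    for name in candidates:
        if name not in backed_up:
            break  # merging past this point would lose an un-backed-up version
        mergeable.append(name)
    return mergeable
```

A backup script could call `coalescable` after each successful upload and merge whatever it returns. A user-triggered run for "out-of-space" conditions could reuse the same function with a smaller `keep_recent`.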
See Journal and Overlays for further exploration...
Issues
- Performance (IIUC we would wind up with dozens of extra stat() calls per normal access to a file), and versioning/semantic conflicts between overlays when intervening overlays are removed
- "Library Overlays" also come up (e.g. a numpy overlay that would be shared among dozens of activities, but isn't necessarily a part of the "core" system)
- Heuristics for cleanup and backup prioritisation (see Journal and Overlays for some discussion of this).
- Providing a way for applications such as email clients or browsers to have a persistent cross-instance data-store (e.g. the mail repository, or the cookie-set of the browser). Although that introduces a major attack vector, legacy applications will require the functionality.