XO updater/lang-ru: Difference between revisions

{{OLPC}}{{ Translation
| lang = ru
| source = XO_updater
| version = 0000}}
{{Ongoing Translation}}
{{draft}}


<big>Software update for XO children's laptops</big>

; Problem statement and scope : The goal of this document is to establish a mechanism for updating software on the XO-1 laptop. In this discussion, software means both system software, such as the operating system and the various system services developed under OLPC's control and required to provide the laptop's basic functionality, and any application software ("activities"), supplied either by OLPC or by third parties.

{{anchor |System updater}}
== System updater ==

=== Core goals ===

The three core goals of a software update tool (hereafter "updater")
for the XO are as follows:

; Security : Given the initial age group of our users, it is the only reasonable solution to default to automatic detection and installation of updates, both to be able to apply security patches in a timely fashion, and to enable users to benefit from rapid development and improvements in the software they're using. Automatic updates, however, are a security issue unto themselves: compromising the update system in any way can provide an attacker with the ability to wreak havoc across entire installed bases of laptops while bypassing -- by design -- all the security measures on the machine.
: Therefore, the security of the updater is paramount and must be its first design goal.

; Uncompromising emphasis on fault-tolerance : Given the scale of our deployment, the relatively high complexity of our network stack when compared to currently-common deployments, the unreliability of Internet connectivity even when available, and perhaps most importantly our desire for participating countries to soon begin customizing the official OLPC OS images to best suit them, it is clear that our updater must be fault-tolerant. This is both in the simple sense -- cryptographic checksums need to be used to ensure updates were received correctly -- and in the more complex sense that the likelihood of a human error with regard to update preparation goes up proportionally to the number of different base OS images at play. A fault-tolerant updater will therefore allow _unconditional_ rollback of the most recently applied update. "Unconditional" here means that, barring the failure of other parts of the system which are dependencies of the updater (e.g. the filesystem), the updater must always know how to correctly unapply an applied update, even if the update was malformed.

; Low bandwidth : For much the same reasons (project scale, Internet access scarcity and unreliability) that require fault-tolerance from the updater, the tool must take maximum care to minimize data transfer requirements. This means, concretely, that a delta-based approach must be utilized by the updater, with a "keyframe" or "heavy" update being strictly a fallback in the unlikely case an update path cannot be constructed from the available or reachable delta sets.
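
To illustrate the "update path" idea from the low-bandwidth goal, here is a minimal Python sketch. It assumes, purely for illustration, that available deltas are known as (from_version, to_version) pairs; the function name and data layout are hypothetical and not part of this design. A breadth-first search finds the shortest chain of deltas, and a result of None corresponds to falling back to a "keyframe" update.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def find_delta_path(
    deltas: List[Tuple[str, str]], start: str, target: str
) -> Optional[List[Tuple[str, str]]]:
    """Return the shortest chain of deltas from start to target, or None.

    None means no path exists, i.e. a full ("keyframe") update is needed.
    """
    if start == target:
        return []
    graph: Dict[str, List[str]] = {}
    for src, dst in deltas:
        graph.setdefault(src, []).append(dst)

    parent: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        version = queue.popleft()
        for nxt in graph.get(version, []):
            if nxt in parent:
                continue  # already reached by a path at least as short
            parent[nxt] = version
            if nxt == target:
                # Walk back through the parents to rebuild the chain.
                path = []
                node = target
                while parent[node] is not None:
                    path.append((parent[node], node))
                    node = parent[node]
                return path[::-1]
            queue.append(nxt)
    return None
```

Breadth-first search minimizes the number of deltas applied, not the number of bytes transferred; a real implementation would more likely weight each edge by delta size.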

=== Design ===

It is given, due to requirements imposed by the [[Bitfrost]] security
platform, that a laptop will attempt to make daily contact with the
OLPC anti-theft servers. During that interaction, the laptop will post
its system software version, and the response provided by the
anti-theft service will optionally contain a relative URL of a more
recent OS image.

If such a pointer has been received and the laptop is behind a known
school server, it will probe the school server via rsync at the provided
relative URL to determine whether the server has cached the update
locally. If the update is not available locally, the laptop will wait up
to 24 hours, checking approximately hourly whether the school server has
obtained the update. If at the end of this wait period the school server
still does not have a local copy of the update, it is assumed to be
malfunctioning, and the laptop will contact an upstream master server
directly by using the URL provided originally by the anti-theft service.

In any of these three cases (school server has update immediately,
school server has update after delay, upstream master has update), we
say the laptop has 'found an update source'.
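
The source-selection logic above can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual updater: the function name, the `probe` callback (standing in for the rsync probe) and the way the school server URL is assembled are all hypothetical.

```python
import time
from typing import Callable, Optional


def find_update_source(
    relative_url: str,
    school_server: Optional[str],
    upstream_url: str,
    probe: Callable[[str], bool],
    max_wait_hours: int = 24,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the URL the laptop should update from.

    probe(url) is assumed to report whether the server at url already
    holds the update (for example via an rsync listing).
    """
    if school_server is not None:
        candidate = f"rsync://{school_server}/{relative_url.lstrip('/')}"
        # One immediate probe, then roughly hourly retries for up to 24 hours.
        for attempt in range(max_wait_hours + 1):
            if probe(candidate):
                return candidate
            if attempt < max_wait_hours:
                sleep(3600)
    # No known school server, or it never obtained the update:
    # treat it as malfunctioning and go to the upstream master.
    return upstream_url
```

Passing `sleep` as a parameter keeps the retry loop testable without actually waiting for hours.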

Once an update source has been found, the laptop will invoke the
standard rsync tool over a plaintext (unsecured) connection via the
rsync protocol -- not piped through a shell of any kind -- to bring
its own files up to date with the more recent version of the
system. rsync uses a network-efficient binary diff algorithm, which
satisfies goal 3 (low bandwidth).
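
As a rough sketch of what such an invocation might look like, the Python helper below builds an rsync command line that talks to an rsync daemon directly (an `rsync://` URL, so no remote shell is involved). The helper name and the particular option set are assumptions made for illustration; the design does not specify which options the updater passes.

```python
from typing import List


def build_rsync_command(source_url: str, target_dir: str) -> List[str]:
    """Build an rsync command that syncs target_dir to the image at source_url."""
    if not source_url.startswith("rsync://"):
        # rsync:// selects the plaintext daemon protocol rather than a remote shell.
        raise ValueError("expected an rsync:// URL")
    return [
        "rsync",
        "--archive",  # recurse and preserve permissions, times and symlinks
        "--delete",   # remove local files that are absent from the newer image
        "--partial",  # keep partial files so an interrupted transfer can resume
        source_url.rstrip("/") + "/",
        target_dir.rstrip("/") + "/",
    ]
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a string keeps a local shell out of the picture as well.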

=== Design note: peer-to-peer updates ===

It is desirable to provide "viral update" functionality at a later date,
such that two laptops with different software versions (and without any
notion of trust) can engage in an update to bring the laptop with the
older software fully up to date.

However, determining how to provide this functionality securely,
efficiently and elegantly is not feasible on the Gen1 FRS
timeline. Therefore, laptop-to-laptop updates will NOT be a part of the
updater that ships with the FRS image, and are a candidate for release
2-3 months after FRS.

=== Design note: rsync scalability ===

rsync is a known CPU hog on the server side. It would be absolutely
infeasible to support a very large number of users from a single rsync
server. This is far less of a problem in our scenario for three reasons:

; High branching factor : In all normal circumstances, the vast majority of the rsync traffic to our upstream servers will come from school servers, not individual laptops. If school servers are unavailable or malfunctioning, it is not the case that there will be a flood of requests from individual laptops, because it's likely that the school servers are those laptops' only gateway to the Internet.

; Element of randomness in anti-theft requests : Instead of hitting the update servers every hour on the hour, the laptops are already including an element of randomness in choosing when to contact the anti-theft service. This random delay propagates to the rsync requests, as well.

; In-depth stagger abilities on the server side : Because notification of new updates is performed by the anti-theft service which is aware of a laptop's locale, updates can be staggered over several days by country, region, or any other metric such as server load.

Additionally, some optimizations can be added to rsync proper to aid
with our use case, but such engineering will need to wait until after
FRS.
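
The randomness mentioned above could be as simple as jittering the daily contact interval. The sketch below is an assumption about how that might be done; the 25% jitter fraction and the function name are illustrative and not taken from Bitfrost.

```python
import random

DAY_SECONDS = 24 * 60 * 60


def next_contact_delay(rng: random.Random, jitter_fraction: float = 0.25) -> float:
    """Seconds until the next anti-theft contact: one day plus or minus jitter."""
    jitter = rng.uniform(-jitter_fraction, jitter_fraction)
    return DAY_SECONDS * (1.0 + jitter)
```

Because each laptop draws its own delay, requests that would otherwise line up on the hour are spread over a window of several hours.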

=== Implementation ===

In order to implement runtime file protection, Bitfrost relies on the
COW functionality of the Linux-VServer patchset. The functionality
imbues immutable hardlinks within a designated context with special
meaning: when broken by some destructive file operation, VServer will
replace these hardlinks with the content of the file they were pointing
to and apply the desired operation on the resulting copy.

The XO updater will run in a special context to which the security
service has exposed the entire underlying filesystem as a COW copy. The
updater will update this COW copy in-place with rsync. This COW
mechanism simply ensures no excess authority lies with the updater; any
failures or vulnerabilities in it do not propagate to the rest of the
system.

One file contained within each OS image will be its cryptographically
signed manifest; at the end of the rsync operation, the laptop will have
obtained that file. At this point, the updater will request that the
security service apply the update. Note that due to the nature of
rsync, the network phase of a single update can be stopped and restarted
several times as connectivity comes and goes, until the complete update
has been received.

The security service will terminate the updater and then analyze the
manifest and confirm the modified files in the updater's context exactly
match the expected OS image end-state. If any discrepancy is discovered,
the updater context will be discarded and the update operation aborted.
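
A minimal Python sketch of the "exactly match" check follows. It assumes the manifest has already had its signature verified and has been parsed into a mapping from relative path to SHA-256 hex digest; the manifest format, the hash choice and the function names are assumptions made for illustration.

```python
import hashlib
import os
from typing import Dict


def hash_file(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_matches_manifest(root: str, manifest: Dict[str, str]) -> bool:
    """True only if the files under root are exactly those in the manifest."""
    seen = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            # Rejects both unexpected files and files with the wrong content.
            if manifest.get(rel) != hash_file(full):
                return False
            seen.add(rel)
    # Rejects files that are listed in the manifest but missing on disk.
    return seen == set(manifest)
```

The check is deliberately strict in both directions: an extra file is as much a discrepancy as a missing or altered one.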

If the update is verified to be complete and correct, the security
service will mark it as such, and designate the files within it to be
the files exported into all newly-created containers. System service
containers will be restarted gracefully. If the image manifest did
not contain a header identifying that image as a high-priority update,
the update process ends here. Restartable services have been restarted,
and the rest of the system will be initialized from the update on
reboot.

If the update has been marked as high-priority, the user will be asked
to close applications and reboot his machine immediately. A timer will
run that will reboot the machine in 60 minutes if the user does not do
so. The high-priority timer can be disabled in the security center; its
purpose is merely to provide some extra protection to the youngest users
who cannot necessarily be expected to understand or comply with the
reboot request.

On boot, the first initialization script to run will perform a
pivot_root operation to the directory that currently holds the OS image
marked bootable by the security service. With the example above, it
would be the directory that belonged to the updater's context. If a key
is held down during boot, however, the pivot_root is performed to the
_old_ bootable context, and the user is presented with a dialog asking
whether to make the rollback permanent.

The kernel is the only special case to this handling: in the event that
a verified update contains an updated kernel, that kernel will be placed
into a predetermined place in the underlying filesystem by the security
service. OpenFirmware will preferentially boot this newer kernel unless
the rollback key combination is depressed during boot.

Notice that the update operation has been reduced to a simple state
toggle between (any) two OS images. In so doing, we have satisfied goals
1 and 2 (security and fault tolerance).
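
The "state toggle" can be modeled in a few lines of Python. This is a toy model under assumed names (`BootState` and its methods are hypothetical); it only tracks which of two image directories is marked bootable and says nothing about how the security service stores that mark.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BootState:
    current: str                    # image marked bootable
    previous: Optional[str] = None  # image kept for rollback

    def apply_verified_update(self, new_image: str) -> None:
        """Mark a verified image bootable and keep the old one for rollback."""
        self.previous, self.current = self.current, new_image

    def image_to_boot(self, rollback_key_held: bool) -> str:
        """Pick the pivot_root target for this boot."""
        if rollback_key_held and self.previous is not None:
            return self.previous
        return self.current

    def make_rollback_permanent(self) -> None:
        """Swap the two images so the old one is booted by default."""
        if self.previous is not None:
            self.current, self.previous = self.previous, self.current
```

Since applying and rolling back are both plain swaps, rollback never depends on understanding the contents of the update, which is what makes it unconditional.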

== Application updater ==

=== Design ===

The XO eschews traditional dependency-based approaches to package
management, making application upgrades somewhat difficult. The problem
is compounded by the fact that Bitfrost does not permit applications to
update themselves in-place, which is a common update method on platforms
such as Mac OS X and Windows.

When it comes to application updates, we wish to stay true to our goals
of security and low-bandwidth updates, but are willing to settle for
less fault tolerance as necessitated by the fact that most activities
won't be OLPC-written or maintained.

The design should make it possible to have a single tool that can
ascertain the existence of updated versions of any currently installed
activities, and then fetch and install those updates. It should do so
bandwidth-efficiently, such that files that are unchanged between
activity versions aren't downloaded as part of the update, and also such
that identical resource files packaged by multiple activities are never
downloaded more than once, and not at all if they already exist on the
system.

=== Implementation ===

A manifest file is added to the bundle format specification. The
manifest consists of the filename and strong cryptographic hash of every
file in the bundle. Another file is added, called 'origin', that
specifies a URL where updated activity bundles may be found, and a
public key which will be used to sign such updated bundles.
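
For illustration, a manifest of this kind could be derived from a bundle with Python's standard `zipfile` and `hashlib` modules. The choice of SHA-256 and the dictionary representation are assumptions; the text only calls for a "strong cryptographic hash" and does not fix an on-disk format.

```python
import hashlib
import zipfile
from typing import Dict


def build_manifest(bundle) -> Dict[str, str]:
    """Map each file name in a ZIP bundle to the SHA-256 of its contents.

    bundle may be a path or a seekable binary file object.
    """
    manifest: Dict[str, str] = {}
    with zipfile.ZipFile(bundle) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directories carry no content to hash
            digest = hashlib.sha256()
            with archive.open(info) as member:
                for chunk in iter(lambda: member.read(65536), b""):
                    digest.update(chunk)
            manifest[info.filename] = digest.hexdigest()
    return manifest
```

Hashing the uncompressed contents means two bundles that compress the same file differently still produce the same manifest entry.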

When a global activity update is initiated, the updater enumerates the
origins for all installed activities, then probes each one in turn to
determine which activities have available updates. The resulting
activity list is the 'available update set'.

The most up-to-date bundle for each activity in the set is accessed, and
only its final several kilobytes are downloaded. Since bundles are simple
ZIP files, and a ZIP file keeps its index (the central directory) at the
end of the archive, the downloaded data will contain the index, which
stores byte offsets for the constituent compressed files. The updater then
locates the bundle manifest in each index and makes an HTTP request with
the respective byte range to each bundle origin. At the end of this
process, the updater has cheaply obtained a set of manifests of the
files in all available activity updates.
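
The index lookup can be sketched in Python with the `struct` module. Note that the ZIP format stores its index (the central directory) at the end of the archive, so this sketch parses a downloaded tail of the file instead of its head. The function name and the `tail_offset` convention are assumptions; ZIP64 archives and archive comments that happen to contain the end-record signature are not handled.

```python
import struct
from typing import Dict, Tuple

# Fixed-size parts of the ZIP end-of-central-directory record and of
# a central directory file header, as laid out in the ZIP specification.
EOCD = struct.Struct("<4s4H2LH")
CDH = struct.Struct("<4s6H3L5H2L")


def parse_central_directory(tail: bytes, tail_offset: int) -> Dict[str, Tuple[int, int]]:
    """Map member name -> (local header offset, compressed size).

    tail is the last part of the archive; tail_offset is the absolute
    position of tail[0] within the full file.
    """
    eocd_pos = tail.rfind(b"PK\x05\x06")
    if eocd_pos < 0 or eocd_pos + EOCD.size > len(tail):
        raise ValueError("end-of-central-directory record not found")
    fields = EOCD.unpack_from(tail, eocd_pos)
    total_entries, cd_offset = fields[4], fields[6]

    pos = cd_offset - tail_offset
    if pos < 0:
        raise ValueError("tail too short: fetch more of the archive")

    entries: Dict[str, Tuple[int, int]] = {}
    for _ in range(total_entries):
        header = CDH.unpack_from(tail, pos)
        if header[0] != b"PK\x01\x02":
            raise ValueError("malformed central directory entry")
        compressed_size = header[8]
        name_len, extra_len, comment_len = header[10], header[11], header[12]
        local_offset = header[16]
        name_start = pos + CDH.size
        name = tail[name_start:name_start + name_len].decode("utf-8", "replace")
        entries[name] = (local_offset, compressed_size)
        pos = name_start + name_len + extra_len + comment_len
    return entries
```

With the manifest's entry in hand, the updater could issue an HTTP `Range` request starting at the local header offset; the range has to cover the local file header (30 bytes plus the name and extra fields) as well as the compressed data.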

A local database of manifests of all installed activities is kept,
pruned only to records for files larger than a set size, e.g. 50
KB. The updater cross-references each manifest from the available
update set with the installed database, and then with other manifests
in the set. Files which exist locally and are also present in the
available update set aren't downloaded; the updater simply "plants"
the files in the right places. The same happens for identical files
present in multiple bundles in the available update set; they are only
downloaded once.
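
The cross-referencing step amounts to de-duplicating by content hash. The Python sketch below assumes manifests are dictionaries from file path to hash and that the local database can be reduced to a set of known hashes; those representations and the function name are illustrative only.

```python
from typing import Dict, List, Set, Tuple

Manifest = Dict[str, str]  # path inside the bundle -> content hash
FileRef = Tuple[str, str]  # (activity name, path inside the bundle)


def plan_downloads(
    local_hashes: Set[str], updates: Dict[str, Manifest]
) -> Tuple[List[FileRef], List[FileRef]]:
    """Split the files of all pending updates into (to_download, to_plant)."""
    to_download: List[FileRef] = []
    to_plant: List[FileRef] = []
    scheduled: Set[str] = set()  # hashes already queued for download
    for activity in sorted(updates):
        for path, digest in sorted(updates[activity].items()):
            if digest in local_hashes or digest in scheduled:
                # Content is, or soon will be, on the laptop: copy it into place.
                to_plant.append((activity, path))
            else:
                to_download.append((activity, path))
                scheduled.add(digest)
    return to_download, to_plant
```

Iterating in sorted order keeps the plan deterministic, so the same inputs always yield the same download list.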

After a bundle (minus any redundant files) has been downloaded, it is
unpacked and reassembled (if it needs any of the files that haven't been
downloaded because they already exist). Cryptographic signature
verification is performed. If remaining disk space is larger than a
particular margin, e.g. 20%, then the context containing the older
version of the activity bundle is kept around, and the user given the
ability to perform rollback on the activity update. Otherwise, the old
version bundle is destroyed.
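
The disk-space rule could be expressed as below. The 20% default mirrors the example margin in the text; the function names are hypothetical, and `shutil.disk_usage` is simply one convenient way to obtain the numbers.

```python
import shutil


def has_rollback_margin(free_bytes: int, total_bytes: int, margin: float = 0.20) -> bool:
    """True if the free fraction of the disk exceeds the margin."""
    return total_bytes > 0 and free_bytes / total_bytes > margin


def should_keep_old_bundle(path: str, margin: float = 0.20) -> bool:
    """Decide whether the previous activity version can be kept for rollback."""
    usage = shutil.disk_usage(path)
    return has_rollback_margin(usage.free, usage.total, margin)
```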


----


; Author :
Ivan Krstić
ivan AT laptop.org
One Laptop per Child
http://laptop.org

; Metadata :
Revision: Draft-14
Timestamp: Tue Jun 26 17:51:45 UTC 2007


END

----
* Wikified by [[user:xavi]] from the devel list [http://lists.laptop.org/pipermail/devel/2007-June/005506.html mail].

[[Category:Software]]
[[Category:Developers]]

Revision as of 02:32, 1 April 2008

