User:DanielDrake/Yum

OLPC software releases are based on Fedora, which uses RPMs and the yum package manager as a cornerstone of the project. In OLPC's case, however, yum drops out of the software vision as soon as the early stages of the software build process have run. This page attempts to explain:

  • Why yum doesn't fit into OLPC's software vision
  • Why yum doesn't fit into the OLPC deployment model as a system update tool, resulting in OLPC developing and maintaining its own update system
  • Why individual users might face occasional problems with yum, even though it is a useful individual/developer tool and works fine most of the time
  • The downsides of the current OLPC approach

Yum vs the build system

Yum is extensively used early in the software build system. We acquire packages from official Fedora yum repositories and yum repositories of OLPC's own packages, and then install them.

After yum has been used to install software packages within the under-construction software image root, our build system applies a whole series of modifications (some hacky, some clean) to the files that were installed by the packages. These modifications result from unique requirements of our software (our users and deployment environments differ greatly from the larger Fedora userbase). It would be preferable to have no modifications at all, and while our set of modifications is shrinking, this is a very difficult goal to achieve.

So, if you install a software image generated by the build system on your XO, and then use yum to update or install software, there is a good chance you are going to be overwriting some of the late modifications that were applied by the OLPC build system. Sometimes these modifications are only of minor importance (e.g. small optimizations, space saving techniques) but in other cases they are critical for the functionality of some part of the system.

So, while yum is a useful utility for developers and individual users to install software of their choice or update existing packages, installing software through yum will occasionally cause loss of functionality. This is also one reason why yum is not pushed to deployments.

Yum's inherent problems as a software updater

OLPC continues to focus on its goals to produce software which will scale massively. We're talking millions of laptops per country, many hundreds of laptops within small spaces. Our users are 6 through 12 years old. Environments are tough, electricity is unreliable. Connectivity is low, slow, high latency, and unreliable.

Perhaps the biggest showstopper for OLPC to adopt yum/RPM as a software update tool is the fact that it is inherently non-atomic. If you lose power in the middle of an update, you are likely to end up with problems. Sometimes these problems can be solved by Fedora's repair tools, which try to replay RPM transactions, but sometimes they can't (or they need user intervention to fix, which we can't ask of a young child). Sometimes the problems are grave enough that you can't boot to run such repair tools.

Yum is inefficient when performing software updates. When the original decision was made not to use yum, yum would always download the full package contents on every update, even if only 1% of the data had changed. More recently, yum has started using deltas, but these only apply to certain packages on certain upgrade paths (see this article for some discussion).

Additionally, after installing a new RPM, the post-upgrade tasks are often very inefficient. For example, after updating a package that ships a plugin, entire caches are regenerated, including the parts that the update did not change.

Yum does not directly offer a mechanism to update from one Fedora release to another (it is possible, but heavily discouraged by Fedora developers). Instead, you must use a separate tool (Preupgrade/anaconda). When using preupgrade, deltas aren't used: you are forced to download the entirety of every package, even those which have barely changed from one release to the next. Anyone who has used preupgrade will know that it takes hours to complete, even on powerful machines, and you cannot use the system at all during the upgrade process. If you lose power during the final stage of the preupgrade process, you're in trouble (this once happened to me, and after a couple of hours of trying to recover I gave up and did a fresh reinstall).

Yum can't remove software as part of an update. You may have shipped some software that you now want removed from deployed systems, but yum doesn't directly offer functionality for this.

Yum downloads databases consisting of every single package you could possibly install, even when there is just 1 update pending. It also downloads multiple databases, where much of the content of some databases (e.g. the Fedora release db) is deprecated by the contents of others (e.g. Fedora updates db).

For more depth, see the update mechanism for new releases discussion (June 2009).

olpc-update design and features

Recognising the need for a system update tool despite yum not being suitable, OLPC designed and implemented olpc-update which avoids the problems described above as follows:

  • olpc-update-query queries for new updates very efficiently. It uses a hash (a representation of version number) to query an update server in a small request, and the update server returns a short message informing the system if an update is available (this is part of the theft deterrence protocol). The overall communications stay well below 1kb/day when no update is available.
  • olpc-update is interruptible. Interrupt it while it is downloading updates or verifying updates and there is no harm done - guaranteed. Next time it runs it will continue where it left off.
  • olpc-update is atomic. The final step of the update process is an atomic change to the destination of a symbolic link. No matter at which point power is lost, you either get the working old system or the working new one.
  • Through the use of rsync and comparison of the contents manifest files, olpc-update is truly incremental, only updating what has changed with the new updates.
  • Post-upgrade tasks which run slowly under yum/preupgrade are done by the system that built the update (exploiting the fact that all laptops in a deployment have the same software); that inefficiency is not carried over. olpc-update just copies in the new files and is done.
  • Updates can delete software; this is a natural feature of the implementation.
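
The atomic switch described in the list above can be sketched in Python. This is a minimal illustration and not olpc-update's actual code: the function name switch_current, the link name "current" and the temporary ".tmp" suffix are all hypothetical. It relies on POSIX rename semantics, where replacing one name with another happens in a single step, so a reader of the link sees either the old target or the new one.

```python
import os


def switch_current(versions_dir, new_version, link_name="current"):
    """Atomically repoint versions_dir/link_name at new_version.

    All names here are hypothetical; the real on-disk layout used by
    olpc-update is not shown in this page.
    """
    link_path = os.path.join(versions_dir, link_name)
    tmp_path = link_path + ".tmp"

    # Clean up a temporary link left behind by an interrupted earlier run.
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)

    # Build the new link under a temporary name. If power is lost here,
    # the existing link is untouched and still points at the old system.
    os.symlink(new_version, tmp_path)

    # Rename the temporary link over the real one. On POSIX filesystems
    # this replaces the old link in one step, with no moment at which
    # link_path is missing.
    os.replace(tmp_path, link_path)
    return os.readlink(link_path)
```

Calling switch_current("/some/dir", "os-2") after an earlier switch_current("/some/dir", "os-1") leaves the link pointing at "os-2". The sketch does not flush anything to disk; a real updater would also need to make sure the new files are safely written before switching.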

Furthermore, olpc-update brings in some new features:

  • olpc-update-query is automated out of the box. The deployment controls which servers are queried by olpc-update-query (this can be a mixture of internet servers and the local school server). The school server runs an update server out of the box too; updates can be loaded simply by plugging in a USB disk with the system update files.
  • Updates, big or small, happen in the background and the system remains fully functional during this process.
  • olpc-update-query is linked into the security system. The system will only accept updates signed with the deployment's security keys. Yum also has key-based authentication but a lot of work would be needed on both client-side and server-side to get it to use OLPC's well-established key infrastructure and distribution systems.
  • Through marking your currently booted system as sticky, you can keep 2 or more versions of the operating system on the same disk, switching between them at boot-time with a cheat code.

Yum vs olpc-update

This section explains what happens if you use yum on an XO laptop where olpc-update may be used in the future (either by the user, or automatically and in the background via olpc-update-query):

  • "Pristineness" will be lost, as the package database gets modified (and your actions may result in other changes too). olpc-update will now take more than twice as long: it will download the parts that changed in the new version, but realise that the end result does not exactly match what was published by the update server, so it will then regenerate a local contents manifest and use that as a basis of pulling in updates.
  • Any packages you added will get lost on update, unless the update includes those packages. Perhaps not such a big deal for individual users who run olpc-update manually, but if yum usage were to become common practice in a deployment where olpc-update-query is used to silently and automatically push out OS updates, users and teachers might become quite confused as their software mysteriously disappears every now and then.
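
Both effects follow from comparing contents manifests, which can be sketched in Python. This is an illustration under stated assumptions, not the real olpc-update logic: the manifest is assumed to be a simple mapping from relative file path to SHA-256 digest, and the function names build_manifest, plan_update and is_pristine are hypothetical.

```python
import hashlib
import os


def build_manifest(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files in this simplified sketch.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            with open(full, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            manifest[os.path.relpath(full, root)] = digest
    return manifest


def plan_update(local, published):
    """Return (to_fetch, to_delete) needed to turn local into published."""
    # Fetch anything that is new or whose contents differ.
    to_fetch = sorted(p for p, h in published.items() if local.get(p) != h)
    # Delete anything the published version does not contain.
    to_delete = sorted(p for p in local if p not in published)
    return to_fetch, to_delete


def is_pristine(root, expected_manifest):
    """True if the tree on disk exactly matches the expected manifest."""
    return build_manifest(root) == expected_manifest
```

On a pristine system, the manifest published for the installed version can stand in for the local one, so only plan_update needs to run. Once yum has changed files, is_pristine fails and the local manifest has to be rebuilt by hashing the tree, which is the slow path described above. Files from packages added with yum are absent from the published manifest, so they end up in to_delete.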

Disadvantages

Of course, the choice of olpc-update over yum and the design that it entails brings some disadvantages:

  • It is obviously desirable to allow users to install their own software, and yum is an obvious choice for this, especially under GNOME, but yum usage is incompatible with olpc-update. (Sugar solves this by installing activities outside of the package manager in /home/olpc, but this simplistic approach is now causing growing pains for SugarLabs.)
  • olpc-update brings along a filesystem layout where everything lives under /versions, and some clever initramfs bind-mount and chroot tricks make this somewhat invisible. But it can be confusing for new developers to realise this and then understand it.
  • The /versions filesystem layout brings about 20 MB of overhead through the way it functions (masses of hardlinks).
  • olpc-update requires that you have enough disk space to store the current OS and the changes brought in by the new OS (for approximately 1 minute; after that the older version is automatically erased). So if you don't have much disk space, olpc-update won't work (but your system won't be adversely affected). Disk space requirements are therefore quite high when a change between base Fedora versions is involved. Yum/preupgrade is worse off in this respect, needing to temporarily store an entire copy of the new system, including duplication of the bits that don't change.
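
The hardlink approach behind the last two points can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the actual olpc-update procedure: the function name clone_tree_with_hardlinks is hypothetical, and source and destination must be on the same filesystem. The idea is that a second copy of the OS tree shares file data with the first, so extra space is needed only for directory entries and for files that later change.

```python
import os


def clone_tree_with_hardlinks(src, dst):
    """Recreate src's directory tree at dst, hard-linking every regular file.

    Hypothetical helper: file data is shared between the two trees, so
    the copy costs only new directories and directory entries.
    """
    for dirpath, dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)

        # os.walk does not descend into symlinked directories by default,
        # so recreate those links here instead.
        for name in dirnames:
            source = os.path.join(dirpath, name)
            if os.path.islink(source):
                os.symlink(os.readlink(source), os.path.join(target_dir, name))

        for name in filenames:
            source = os.path.join(dirpath, name)
            target = os.path.join(target_dir, name)
            if os.path.islink(source):
                # Copy the link itself rather than what it points to.
                os.symlink(os.readlink(source), target)
            else:
                # Both names now refer to the same data on disk.
                os.link(source, target)
```

To apply an update to such a clone without touching the old version, a changed file has to be written as a new file and renamed into place rather than modified in place, since an in-place write would change both trees.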