User:DanielDrake/NewMockProposal

Revision as of 16:49, 18 February 2011 by DanielDrake

In February 2011, this proposal was rejected. Making release candidates harder to reproduce was judged too great a disadvantage, and we agreed on an alternative solution to the ugly, fat git repo: split each branch into an individual repo. New disks were installed, these changes were made, and the system is now documented at Frozen repositories.

WARNING: The content of this section is considered DEPRECATED and OBSOLETE. It is preserved for historical or documenting reasons.

mock.laptop.org is our "frozen repository" host.

The normal OLPC OS development process starts with RPMs coming in from Fedora's core, updates, and updates-testing repositories. There is no control over which packages go in and out, and this is only loosely monitored. Each development build simply takes the latest that Fedora is offering.

Similarly, OLPC's "public" dev.laptop.org server allows trusted contributors to place RPMs in their home directory, which then get aggregated at (e.g.) http://xs-dev.laptop.org/~dsd/repos/f11/. These packages are not vetted and changes are only loosely monitored.

Later in the development cycle, OLPC decides to freeze the packages going into the build, to provide a more suitable environment for testing. At this point, an OLPC developer clones the packages being used onto mock.laptop.org in a manner which is mostly undocumented, and then all development builds use mock.laptop.org rather than Fedora (and the xs-dev OLPC repository) as their source of packages.

During the last few weeks of development, new RPMs are added to these repositories by hand via other undocumented processes, but the rate of change is typically low.

mock.laptop.org retains these frozen RPM trees forever, providing a means for the release to be exactly rebuilt from the same set of packages, and providing a means for our customers to retrieve the source code on which the system is built.


Existing setup and problems

mock.laptop.org has a single git tree containing various distinct branches, including entire copies of Fedora, Fedora updates, and OLPC-added RPMs.

Many of these branches have very little common history, because every time the Fedora version changes, close to 100% of RPMs are changed.

This git tree is huge and quite difficult to work with: git is not good with binary files, there's a lot of history, and I've lost entire afternoons to this system. Given the lack of common history between branches, it seems like an odd model too.

The git tree is made available over gitweb, and some clever lighttpd configuration makes regular http:// requests be passed through gitweb's "blob" view.

We are now out of disk space on the existing system. The git repo is more than 100GB in size.

Proposed system

This proposal may seem a bit long, but the content is not complicated. The system is a bit more complex than the previous one in that it is not as unified (different repositories are maintained in different ways), but its components remain quite simple.

Core Fedora repositories

There is no reason to have the core Fedora repos (the ones made at point of Fedora release) under version control, because they are fixed by definition. They don't change, and we don't modify them.

So, Fedora repositories will be stored on disk outside of revision control, in the following structure:

  • RPMS/ (copy of all the RPMs)
  • SRPMS/ (copy of all the SRPMs)
  • repodata/ (generated by createrepo)
  • comps.xml (copied from Fedora)

Delta RPMs and debuginfo packages are not included.

There are sometimes a handful of packages that don't change between Fedora releases, so these repositories should be constructed using hard links based on the previous Fedora release present on this server, before using rsync to download the new Fedora release.

Names of future repositories will include the architecture (with possible future ARM builds in mind), e.g. koji.dist-f14-i686.

So the process for Fedora 14 will be:

# cp -a -l koji.dist-f11 koji.dist-f14-i686
# cd koji.dist-f14-i686
# rsync -av --progress --delete rsync://mirrors.rit.edu/fedora-enchilada/linux/releases/14/Everything/i386/os/Packages/ RPMS/
# rsync -av --progress --delete --exclude repodata rsync://mirrors.rit.edu/fedora-enchilada/linux/releases/14/Everything/source/SRPMS/ SRPMS/
# rm -f comps.xml
(now download new comps.xml manually from http://mirrors.rit.edu/fedora/linux/releases/14/Everything/i386/os/repodata/ )
# createrepo -g comps.xml -d .
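
After the rsync it can be useful to check how much of the new tree is still shared with the previous release. Unless it is run with --inplace, rsync writes an updated file under a temporary name and renames it, so changed packages get fresh inodes and the older tree is left untouched. The following is a minimal sketch, assuming both trees use the same relative layout; `count_shared` is a hypothetical helper name, not existing OLPC tooling.

```shell
# count_shared OLD NEW
# Print how many regular files under NEW are still hard links to the file
# at the same relative path under OLD (and so cost no extra disk space).
count_shared() {
    old=$1
    new=$2
    find "$new" -type f | while IFS= read -r file; do
        rel=${file#"$new"/}
        # -ef is true when both paths refer to the same inode
        if [ -f "$old/$rel" ] && [ "$old/$rel" -ef "$file" ]; then
            echo "$rel"
        fi
    done | wc -l | tr -d ' '
}
```

For the example above this would be run as `count_shared koji.dist-f11 koji.dist-f14-i686` (without trailing slashes on the directory names).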

Fedora-updates repositories

Fedora updates are also included in OLPC OS releases, generally taken as a snapshot at a particular point in time (quite late in the release process). OLPC OS development is based on updates-testing, so when a snapshot of fedora-updates is taken, a few packages must be brought in manually from updates-testing to keep the frozen build consistent with the development builds that came before it.

As these trees are largely static, they will not be kept in revision control. The process to create such a snapshot is once again based on a hardlinked tree of RPMs from the previous fedora-updates repository, for purposes of disk space saving.

The structure is as follows:

  • RPMS/ (copy of all the RPMs)
  • SRPMS/ (copy of all the SRPMs)
  • repodata/ (generated by createrepo)
  • comps.xml (copied from Fedora)
  • ChangeLog (hand written by OLPC developer)

Delta RPMs and debuginfo packages are not included.

For example, for 11.2.0's snapshot of fedora-updates:

# cp -a -l koji.dist-f11-updates-10.1.2 koji.dist-f14-i686-updates-11.2.0
# cd koji.dist-f14-i686-updates-11.2.0
# rsync -av --progress --delete --exclude debug --exclude repodata --exclude drpms rsync://mirrors.rit.edu/fedora-enchilada/linux/updates/14/i386/ RPMS/
# rsync -av --progress --delete --exclude repodata rsync://mirrors.rit.edu/fedora-enchilada/linux/updates/14/SRPMS/ SRPMS/
# rm -f comps.xml
(now download new comps.xml manually from http://mirrors.rit.edu/fedora/linux/updates/14/i386/repodata/ )
# createrepo -g comps.xml -d .
# rm -f ChangeLog

Now write an initial ChangeLog entry, including your name, the time and date, and the fact that you pulled in updates.


Now we must account for any packages from updates-testing which are being shipped in the latest development release before the switch to frozen repos. The easiest way to do this is to produce a local OS build based on the frozen repos, and then do a diff of the packages.txt output against the one from the latest development build. Look for differences where packages are being downgraded in your local build. You then know which packages need syncing in from updates-testing.
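
The comparison step could be sketched roughly as follows. This assumes, purely for illustration, that each line of packages.txt has the form "name version"; the real file format may differ, and `list_changed` is a hypothetical helper. It only reports packages whose versions differ between the two builds, so deciding which of those are downgrades is still left to the person running it.

```shell
# list_changed DEV_PACKAGES LOCAL_PACKAGES
# Print "name dev_version local_version" for every package that appears in
# both files with a different version. Assumes "name version" lines, which
# is an assumption about packages.txt and not taken from the build system.
list_changed() {
    awk 'NR == FNR { dev[$1] = $2; next }
         ($1 in dev) && dev[$1] != $2 { print $1, dev[$1], $2 }' "$1" "$2"
}
```

The first argument is the packages.txt from the latest development build and the second is the one from the local build based on the frozen repos.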

This is a manual process but usually not many packages fall into this category (I think it was about 10 last time). After examining the diff to see which packages need pulling, the process for each package is:

(from inside the koji.dist-f14-i686-updates-11.2.0 directory created above)

  1. If there is an existing (but older) update for the package, remove it from RPMS and SRPMS
  2. Download the desired version from http://koji.fedoraproject.org - put the .rpm in RPMS and the .srpm in SRPMS (for packages with subpackages, all subpackages must be synced like this)

Finally:

# createrepo -g comps.xml -d .

Now edit the ChangeLog, add a new entry with your name, the time and date, and a list of packages that you manually pulled in.

As development continues, new Fedora updates will occasionally need to be brought in to fix specific bugs. The above process (of syncing in packages from updates-testing) can be followed. The ChangeLog entry that you add should include a reference to the ticket that is being fixed.

OLPC-local repositories

Unlike Fedora, the packages that OLPC adds are not built by a centralized build system which archives old versions. And unlike the Fedora repository clones, it is quite likely that there will be a fair degree of change in these repositories during the final weeks of development. So, revision control is called for.

As before, there will be one repository for "generic" OLPC packages, and then one repository per targeted laptop model for the laptop-specific packages. However, unlike before, each of these repositories will be a separate git tree (after all, they do not share history). These repositories will be created on the server by the release manager.

The layout of each repository will be:

  • RPMS/
  • SRPMS/
  • repodata/ (created by createrepo)

These repositories will be available over git:// and ssh://. Contributors will have push access to the relevant repositories (but this task will usually be handled exclusively by the release manager).

It is slightly painful to work with large files in git, but these repositories will be short lived (in that development on each one will span only a few months; they will however remain available forever). Unlike the old system, each individual RPM repository gets its own git repository.

Adding, updating and removing packages is a simple matter of maintaining the files in the RPMS directory, while carefully keeping SRPMS synchronized as well. "createrepo ." must be run before every commit - as before, the machine-generated repodata is kept in git (a possible area of improvement for the future).
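
One way to catch a forgotten source package before committing is sketched below. `missing_srpms` is a hypothetical helper: it reads source RPM file names on standard input and prints any that are absent from the given directory. The names themselves can be obtained from the binary packages with rpm's SOURCERPM query tag.

```shell
# missing_srpms SRPMS_DIR
# Read source RPM file names (one per line) on stdin and print those that
# do not exist in SRPMS_DIR. Duplicate names are reported only once.
missing_srpms() {
    srpms_dir=$1
    sort -u | while IFS= read -r name; do
        [ -e "$srpms_dir/$name" ] || printf '%s\n' "$name"
    done
}

# Possible use from the top of a repository checkout (requires rpm):
#   rpm -qp --qf '%{SOURCERPM}\n' RPMS/*.rpm | missing_srpms SRPMS
```

Empty output means every binary package has its source package in place.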

No ChangeLog file is maintained; instead, each commit message should include sufficient justification for the package change. Usually just a descriptive summary line plus a reference to a ticket on OLPC's trac is enough.

Publication of repositories

The "koji." repositories will be kept in an area of the filesystem made available over http via apache, and additionally over public rsync://.

The "local." git repositories will be kept in a dedicated directory, with a cgit install (over apache) allowing web-view. Rewrite/redirect rules will be put in place to make these appear as fully checked-out repositories over http. git-daemon will make them available over git://.

For reference, the existing system uses the following lighttpd configuration in order to make the git repository contents available over http. This would be used as a base for the new rules:

url.rewrite = ( 
"^/repos/([^/]+)/(.+)$" => "/gitweb/gitweb.cgi?p=repos;a=blob_plain;hb=$1;f=$2",
)
url.redirect = (
"^/repos/([^/]+)/?" => "/gitweb/gitweb.cgi?p=repos;a=tree;h=$1;hb=$1",
"^/repos/?" => "/gitweb/gitweb.cgi?p=repos",
"^/$" => "/gitweb/",
)
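
To see what the first rewrite rule does to a request path, the same regular expression can be tried out with sed. This only illustrates the mapping and is not part of the server configuration; the function name `rewrite_blob` and the example path are made up.

```shell
# rewrite_blob PATH
# Apply the "/repos/<branch>/<file>" rewrite from the lighttpd rule above
# and print the resulting gitweb URL path.
rewrite_blob() {
    printf '%s\n' "$1" |
        sed -E 's#^/repos/([^/]+)/(.+)$#/gitweb/gitweb.cgi?p=repos;a=blob_plain;hb=\1;f=\2#'
}
```

For instance, `rewrite_blob /repos/koji.dist-f11/RPMS/example.rpm` prints `/gitweb/gitweb.cgi?p=repos;a=blob_plain;hb=koji.dist-f11;f=RPMS/example.rpm`: the first path component selects the branch and the remainder selects the file within it.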

Documentation

The above processes, along with a little more procedural information, will be publicly documented on the OLPC wiki.

Migration of old system

mock.laptop.org exists to enable perfect reconstruction of OS builds, so we shouldn't lose old repositories in the process. My proposal is that we move them over to the new system, and set up any necessary DNS or mod_rewrite tricks so that old URLs continue working.

A backup of the old repos.git should be kept for some time, just in case.

"koji." branches

The structure of these branches is already as described above, so we simply need to check out the files from each branch and generate a changelog, e.g.:

# cd /path/to/old/repos.git
# git checkout koji.dist-f11
# mkdir /var/www/repos/koji.dist-f11
# cp -a * /var/www/repos/koji.dist-f11
# git log > /var/www/repos/koji.dist-f11/ChangeLog

After checking out every branch in that fashion, there may be some duplication of files. This can be resolved by running the "hardlink" tool as a one-off (available via yum), "a program which consolidates duplicate files in one or more directories using hardlinks."
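
Before running the tool, a rough estimate of how much duplication exists can be obtained with standard utilities. The sketch below is an assumption-laden approximation and not how "hardlink" itself works: it treats files with the same size and CRC as duplicates, and `count_duplicates` is a hypothetical name.

```shell
# count_duplicates DIR...
# Print how many regular files have the same size and CRC as another file
# already seen. Files that are already hard links of each other are counted
# too, and CRC collisions are possible, so treat the result as an estimate.
count_duplicates() {
    find "$@" -type f -exec cksum {} + |
        awk '{ seen[$1 " " $2]++ }
             END { n = 0; for (k in seen) n += seen[k] - 1; print n }'
}
```

It could be run over the checked-out trees, for example `count_duplicates /var/www/repos`.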

"local." branches

We must now convert each of the "local." branches into individual repositories. This can be done quickly by creating blank git repositories, then pushing each branch to its own blank repository as that repository's master branch. For example:

# mkdir -p /var/git/local.10.1.3
# cd /var/git/local.10.1.3
# git --bare init
# cd /path/to/old/repos.git
# git push /var/git/local.10.1.3 local.10.1.3:master

Disadvantages of new system

Previously, it was possible to obtain a single git repository including all of the repositories and source that you would need to reproduce an image. Under the new system, you must obtain various static repositories and git repositories to do this.

This seems to be an opportunity that deployments use infrequently, so I don't think that this is a major loss of functionality. The inverse can also be argued: it's easier for deployments to clone a number of small repos than to clone a 100GB beast including a lot of irrelevant data.

Previously, it was possible to reconstruct release candidate images, because each release candidate was tagged in git on each branch with a build number. More recently, the tagging has not taken place, but nevertheless it would not be overly difficult to identify the point on each branch where each release candidate was built.

This is a loss of functionality in the new system, but I think it is acceptable. I don't see demand to reconstruct release candidate images. And it would still be technically possible: each release candidate is published along with a packages.txt file that identifies all packages in the build. If exact reconstruction were necessary, that should provide a suitable base.

Logistics

Someone (Chris Ball?) will coordinate the availability and installation of a new server to run these services. I suggest that at least 300GB of disk space is available (enough to accommodate the old data, and provide room for another few years of development).

Daniel Drake will configure the system as above, perform the migration from the old repos.git, and produce documentation. As a test, 10.1.3 for XO-1.5 will be reconstructed, and a full diff of the resultant trees should confirm that the builds are as identical as ever.
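
The tree comparison in that test could be as simple as a recursive diff. A minimal sketch, assuming the two build outputs are available as directories; `trees_match` is a hypothetical name, and this checks file presence and contents only, not permissions or timestamps.

```shell
# trees_match DIR_A DIR_B
# Exit 0 only if both trees contain the same files with the same contents.
# Any differences are listed on stdout, one line per differing path.
trees_match() {
    diff -r -q "$1" "$2"
}
```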

When that is done, the new system will be made active.