Software ECO process

For current information on OLPC software release processes, see Release Process Home

WARNING: The content of this section is considered DEPRECATED and OBSOLETE. It is preserved for historical or documenting reasons.

  This page is monitored by the OLPC team.

The software ECO process is a process for producing software releases meeting an established quality standard. ECO, or engineering change order, is a term from the hardware world describing the order needed to place a new revision of a product into production.

From time to time there may be critical bug fixes that must be released before the next scheduled release. To ensure that the quality of our released software continually increases, the following procedure should be strictly followed when proposing, creating, and publishing an unscheduled software release (USR).

For process documentation, a copy of the USR Checklist should be made and filled in as each step is completed. Be sure to use the 'Protect' tab on your checklist page to ensure that edits are made by logged-in users only.

Process Overview

  1. PROPOSAL: Anyone who sees a problem they believe to be critical can start the discussion by sending an email to the software-eco list at laptop.org (when appropriate, please CC the devel list). This ensures notification of anyone who may need to be involved in this process. Please fill in as much as possible of the information requested below. The request should include a reference to a USR Proposal wiki page (see OLPC SW-ECO 2 for an example).
  2. REVIEW PROPOSAL: When the discussion has resulted in a concrete proposal, the Triage team will review the proposal, possibly making changes to broaden or limit the scope of the USR, approve specific strategies for addressing the issues necessitating the USR, etc. When the Triage team has approved the USR, they will communicate the approval via email to software-eco, referencing the specific version of the wiki page approved (again, please CC devel when appropriate). Triage team approval commits both development and testing resources to work on the USR.
  3. CANDIDATE BUILD: When the proposal has been approved, the build master will prepare a build or series of builds as candidates for the USR, updating the USR wiki page with the changed package(s) included in the USR builds, and the rationale for their inclusion. The build master will announce the availability of each candidate build.
  4. TEST: When candidate build(s) are made available, bug-specific tests as well as the 1 Hour Smoke test for the release should be conducted. The results of these tests should be included as part of the USR wiki page. Test results for each build should be announced. Test results may necessitate additional builds, which will require restarting the process from CANDIDATE BUILD. If the results of testing necessitate changes to the USR proposal that affect its scope, the process should be restarted from the proposal review step. The TEST step is complete when the QA lead signs off on the candidate build.
  5. SIGNATURE: When the candidate build has completed testing, a security and appropriateness review of the candidate build is performed. If approvals of each component are obtained, the candidate build is then signed. If the review uncovers problems, the process should be restarted from CANDIDATE BUILD with appropriate changes to security-sensitive components. The availability of the signed build is announced.
  6. FINAL TEST: When the signed build is made available, final testing is performed to ensure that the signature process was successful and that operation and upgrade on secured machines is correct. The QA lead signs off on the results of final testing. The USR wiki page is reviewed and verified to accurately describe the final build. The Release Team signs off on the final USR.
  7. RELEASE: When the Release Team and QA lead have signed off on the FINAL TEST step, the build engineer uploads the signed build to download.laptop.org and updates the appropriate wiki pages to list the new build. An appropriate subset of the USR wiki page should be converted into release notes for the USR. The release engineer then notifies all appropriate/affected parties with information on what they need to do to distribute and apply the release. Note that the targets of a USR may be a small subset of our users.
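
The seven steps and their restart rules can be read as a small state machine. The following is only an illustrative sketch: the step names come from the overview above, but the outcome labels ("needs_new_build", "scope_changed", "review_failed") and the function name are hypothetical and not part of the OLPC process.

```python
from enum import Enum


class Step(Enum):
    PROPOSAL = 1
    REVIEW_PROPOSAL = 2
    CANDIDATE_BUILD = 3
    TEST = 4
    SIGNATURE = 5
    FINAL_TEST = 6
    RELEASE = 7


# Restart rules taken from the overview; the outcome labels are hypothetical.
RESTARTS = {
    (Step.TEST, "needs_new_build"): Step.CANDIDATE_BUILD,
    (Step.TEST, "scope_changed"): Step.REVIEW_PROPOSAL,
    (Step.SIGNATURE, "review_failed"): Step.CANDIDATE_BUILD,
}


def next_step(current, outcome="ok"):
    """Return the step that follows `current`, or None once RELEASE is done."""
    if outcome == "ok":
        if current is Step.RELEASE:
            return None  # the process is finished
        return Step(current.value + 1)
    try:
        return RESTARTS[(current, outcome)]
    except KeyError:
        raise ValueError(f"{current.name} has no outcome {outcome!r}") from None
```

For example, `next_step(Step.TEST, "scope_changed")` returns `Step.REVIEW_PROPOSAL`, matching the rule that a change in scope sends the USR back to proposal review.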

Process step details

PROPOSAL

Anyone who sees a problem they believe to be critical can start the discussion by sending an email to the software-eco list at laptop.org (when appropriate, please CC the devel list). This ensures notification of anyone who may need to be involved in this process. The proposal should meet the criteria below, and adhere to the format described in the next section.

Proposal Criteria

  • Security issue that threatens anti-theft or child safety; or a known exploit that threatens a large number of laptops;
  • Datastore, Journal, or file-system problems resulting in loss of data;
  • Bug resulting in laptop crashes more frequently than 1/day with typical use;
  • Bug resulting in serious disruptions to other services (non-XO);
  • A core feature or functionality that is not working or fails (e.g., Browse, Write, Read, Chat, power management, telepathy);
  • Updates to support new versions of hardware (e.g. keyboards, revisions to major components) that cannot wait for the next major software release.

(There is a discussion of these criteria here.)

How to Propose

Proposals for unscheduled releases should be submitted to the "software-eco" mailing list, with a "collector" or summary Trac bug that links to any and all other Trac bugs and a link to a wiki page for collecting as much of the following information as is available (See OLPC SW-ECO 2 for an example):

  • Title of the Release: descriptive of the major driving force for this release;
  • Trac items: detailed description of the issue;
  • Priority: the believed urgency of the issue, including any deadlines;
  • Root Cause: why did this occur?
  • Effect from the user perspective: How many users are affected? What does the user see? How does it affect them? Is there a work-around? What are the consequences of not fixing it?
  • Proposed Fix: the complete source-code and package-level diffs for review, if not clearly stated in referenced Trac bugs;
  • Reviewers: who has reviewed the proposed changes for correctness? (Preferably at least three people competent in the area affected other than the authors of the changes.)
  • Proposed Testing: developer testing, QA testing, multi-language, boot up, upgrade testing, etc. (beyond the usual 1 Hour Smoke test);
  • Proposed Rollout: Mfg, Support group, G1G1 users, country deployment teams, etc.

At the time of the initial proposal, a fix (or the "best" fix) may not be known.
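
As a rough illustration, the proposal fields listed above could be tracked as a simple record that reports which items are still empty. This is a hypothetical sketch: the class name, field names, and helper are inventions for the example and do not correspond to any OLPC tool. Empty fields are allowed, since a fix may not yet be known when the proposal is first made.

```python
from dataclasses import dataclass, field, fields


@dataclass
class UsrProposal:
    # Field names mirror the bullet list above; all are hypothetical identifiers.
    title: str = ""
    trac_items: list = field(default_factory=list)
    priority: str = ""
    root_cause: str = ""
    user_effect: str = ""
    proposed_fix: str = ""
    reviewers: list = field(default_factory=list)
    proposed_testing: str = ""
    proposed_rollout: str = ""


def missing_fields(proposal):
    """Return the names of proposal fields that have not been filled in yet."""
    return [f.name for f in fields(proposal) if not getattr(proposal, f.name)]
```

Calling `missing_fields` on a partly completed proposal gives the list of items that still need to be collected on the wiki page.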

REVIEW PROPOSAL

To ensure that a proposal addresses problems of a truly urgent nature, and to properly schedule the resources required to fix them, the Triage Team will carefully and (if possible) publicly review the submitted proposal before committing resources to it.

The Triage Team will consider the following review criteria as they make their decision:

  • TODO

The Triage team will review the proposal, possibly making changes to broaden or limit the scope of the USR, approve specific strategies for addressing the issues necessitating the USR, etc. A member of the Triage team will be named as the Champion of the USR, responsible for shepherding the proposal through the release process. Initial approval of the USR, communicated via email from a member of the Triage team, commits both development and testing resources to work on the USR. Fixes to address the issues necessitating the USR will be developed, and documented on the USR wiki page. When appropriate fixes are approved, the Triage team will communicate the approval via email to software-eco, referencing the specific version of the wiki page approved (again, please CC devel when appropriate). This will start the next step in the process.

CANDIDATE BUILD

Upon notification from the triage team, the build master will prepare a build or series of builds as candidates for the USR, updating the USR wiki page with the changed package(s) included in the USR builds, and the rationale for their inclusion. The build master will announce the availability of each candidate build.

TEST

When candidate build(s) are made available, bug-specific tests as well as the 1 Hour Smoke test should be conducted. The results of these tests should be included as part of the USR wiki page. Test results for each build should be announced. Test results may necessitate additional builds, which will require restarting the process from CANDIDATE BUILD. If the results of testing necessitate changes to the USR proposal that affect its scope, the process should be restarted from the proposal review step. The TEST step is complete when the QA lead signs off on the candidate build.

The testing for this build will include both specific and general tests. Once the release master has created a candidate build (unsigned) for testing, testers can go through the test plan.

Specific Testing

  1. The build will be tested using the information contained in the Trac bug(s) to ensure specific fixes.

General Testing

  1. The versions of OFW and the wireless firmware should be checked.
  2. The build must install successfully on MP and B4 systems.
  3. Upgrades from the previous release must be successful (See Things to test after updates).
  4. Downgrades to the previous release must be successful.
  5. A fresh install (as in manufacturing) must be successful on both MP and B4 systems.
  6. The Smoke test must be performed, using both a fresh installation and an upgraded installation, looking specifically for regressions from the release reports and thinking about possible interactions a fix might cause. Fixes to core technologies may require much more extensive testing and soaking, as some failures only occur after time or use.
  7. More than one SKU and keyboard type must be used during this testing (both MP and B4, due to differences in manufacturing data) in order to catch regressions in keyboard identification.
  8. Wireless should be tested against open, WEP, and WPA access points (See Wireless testing).
  9. Any new hardware support must be tested explicitly (e.g. new keyboard type, new revision of a component).
  10. When time permits, test builds should be used for testing by developers in the field to confirm the fixes.
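
One way to keep track of the combinations implied by the list above is to enumerate them explicitly. The sketch below is an assumption about how a tester might organize the runs, not a description of an actual OLPC test plan; in particular, crossing every hardware type with every installation path is an interpretation of items 5 to 7.

```python
from itertools import product

HARDWARE = ("MP", "B4")
INSTALL_PATHS = ("fresh install", "upgrade")
ACCESS_POINTS = ("open", "WEP", "WPA")


def smoke_test_runs():
    """Every hardware type paired with every installation path."""
    return list(product(HARDWARE, INSTALL_PATHS))


def wireless_runs():
    """One wireless check per access point type."""
    return list(ACCESS_POINTS)
```

Under these assumptions there are four Smoke test runs and three wireless checks per candidate build.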

If new problems (not previously known from testing or bug reports) are discovered during the test process, they must be analyzed to root cause, the software-eco mailing list must be informed of the findings, and a new decision must be made on whether to proceed. These would normally be latent bugs. The criterion for proceeding (that is, waiving the bug) should be a judgment that the candidate build is, at a minimum, no worse than the previous build.

SIGNATURE

When the candidate build has completed testing, a security and appropriateness review of the candidate build is performed. If approvals of each component are obtained, the candidate build is then signed. If the review uncovers problems, the process should be restarted from CANDIDATE BUILD with appropriate changes to security-sensitive components. The availability of the signed build is announced.

Key in any signing decision should be the question, "Does this build, or could this build, compromise antitheft/activation security?" Changes in the firmware, kernel, or olpcrd could potentially compromise security; in general, other changes are "safer".

Signature Checklist

  1. Are there security sign-offs on olpcrd, kernel, OFW, and EC? These are central to our security system. FUTURE: an additional audit will be required by the security team, by different individuals than the ones who made the changes.
  2. Do the versions of olpcrd, kernel, OFW, and EC match those signed-off?
  3. Was the testing of the candidate release successful, or was a newly discovered problem waived (per above)? (Whenever practical, the build will have also been tested by a significant number of users in the field to confirm the fixes are correct.) The build to be signed should have a sign-off from the QA lead.
  4. FUTURE: Are all source packages present and accounted for?
  5. FUTURE: Were all packages built on the correct OLPC controlled build system(s)?
  6. Are only packages fixing the referenced bug(s) changed by the build that is to be signed?

Only after this checklist is complete should the build be signed and then made available in the 'candidate' directory on download.laptop.org for release candidates.
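
Item 2 of the checklist, comparing the versions in the build against those that were signed off, lends itself to a mechanical check. The following is a minimal sketch under the assumption that both sets of versions are available as simple name-to-version mappings; the function name and data layout are hypothetical.

```python
SECURITY_COMPONENTS = ("olpcrd", "kernel", "OFW", "EC")


def version_mismatches(signed_off, in_build):
    """Return the security components whose build version was not signed off.

    A component with no recorded sign-off is treated as a mismatch.
    """
    return [
        name
        for name in SECURITY_COMPONENTS
        if signed_off.get(name) is None or signed_off[name] != in_build.get(name)
    ]
```

An empty result means every security-sensitive component in the build matches its sign-off; any name returned should block signing until it is resolved.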

The build master is responsible for ensuring that the checklist is complete and that the process has been followed.

Note that signing the build for release is not the end of the process: the signed release must pass a final verification step on write-protected systems.

FINAL TEST

When the signed build is made available, final testing is performed to ensure that the signature process was successful and that operation and upgrade on secured machines is correct. The QA lead signs off on the results of final testing. The USR wiki page is reviewed and verified to accurately describe the final build. (In particular, the checklist contains spaces for a final sanity-check of firmware versions.) The Release Team signs off on the final USR.

  1. The signed build must be installed on a write-protected laptop via USB stick, both as a fresh install and as an upgrade.
  2. An upgrade from the previous stable build to the signed build via network update must succeed.
  3. An automatic upgrade to the signed build from a central server must succeed.
  4. A downgrade from the signed build to the previous signed build must succeed.
  5. If a candidate build fails the testing, the candidate must be removed from the candidate directory.

RELEASE

When the Release Team and QA lead have signed off on the FINAL TEST step, the build engineer uploads the signed build to download.laptop.org and updates the appropriate wiki pages to list the new build. An appropriate subset of the USR wiki page should be converted into release notes for the USR. The release engineer then notifies all appropriate/affected parties with information on what they need to do to distribute and apply the release. Note that the targets of a USR may be a small subset of our users.

The final steps are to be performed after FINAL TEST has been signed off:

  1. Move the build from 'candidate' to 'official' on download.laptop.org (removing the candidate entirely);
  2. Notify the Quanta ECO mailing list, if indicated, preferably using signed email. The notification must contain checksums of the build, the URL at which the official bits can be found, and if/when the build should be phased into production. An explicit judgment as to whether the build should immediately go into production is required;
    • Concrete example: if the ECO were to fix problems with a keyboard not currently being produced, disturbing production would be very unwise;
  3. Notify the software-eco and devel mailing lists that this new build is available, providing the link to the wiki page as release notes (or create a release notes page).
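
The notification in step 2 must include checksums of the build. The page does not say which checksum algorithm to use, so the sketch below assumes SHA-256 as a default; the function name and parameters are hypothetical.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A recipient can run the same function on the downloaded build and compare the result with the value in the announcement.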

Dramatis Personae

Triage Team

Kim Quirk, Jim Gettys, John Watlington, Greg Smith

Release Team

Michael Stone, C. Scott Ananian, Kim Quirk, Jim Gettys, Greg Smith