Software ECO process: Difference between revisions
DanielDrake (talk | contribs) No edit summary |
|||
(114 intermediate revisions by 15 users not shown) | |||
<b><big><font color=red>For current information on OLPC software release processes, see [[Release Process Home]]</font></big></b> |
|||
{{Deprecated}} |
|||
{{OLPC}} |
{{Developers}} |
|||
{{Procedures}} |
|||
The '''software ECO process''' is a process for producing software releases meeting an established quality standard. ECO, or engineering change order, is a term from the hardware world describing the order needed to place a new revision of a product into production. |
|||
=DRAFT Operating System Release Process= |
|||
==Patch Release Process== |
|||
From time to time, there may be critical bug fixes that must be released before the next normal, scheduled releases. These can occur due to security issues, from unexpected hardware problems, or the discovery of latent bugs that affect large numbers of users. |
|||
From time to time there may be critical bug fixes that must be released before the next normal, scheduled releases. In order to ensure that the quality of our released software continually increases, the following procedure should be strictly followed when proposing, creating, and publishing an unscheduled software release. |
|||
===Proposals for Patches=== |
|||
Proposals for patches will be submitted to the "eco-software" mailing list, with a bug number in the trac system. This bug must describe: |
|||
For process documentation, a copy of the [[USR Checklist]] should be made and filled in as each step is completed. Be sure to use the 'Protect' tab on your checklist page to ensure that edits are done by logged-in users only. |
|||
(This stuff should be a link to a template) |
|||
* Title of Patch (should be a short description of the major driving force for this patch) |
|||
* Trac items: description of the issue |
|||
* Priority: the believed urgency of the fix, including any deadlines |
|||
* Root Cause: why did this occur |
|||
* Effect, user perspective: How many are affected? What does the user see, and how does it affect them? Is there a work-around? What are the consequences of not fixing it? |
|||
* Proposed Fix: the patch(es) for review, if not clearly stated in referenced bugs |
|||
* Reviewers: who has reviewed the patch(es) for correctness (preferably at least three people competent in the area affected) |
|||
* Proposed Testing: developer testing, QA testing, multi-language, boot up, upgrade testing, etc. |
|||
* Proposed Rollout: Mfg, G1G1 users, country deployment teams, etc. |
|||
===Process Overview=== |
|||
Discussion can then ensue in trac. If there is no consensus that the patch should be deployed, a decision will be made by Jim Gettys, Kim Quirk, and Walter Bender. |
|||
# [[#PROPOSAL|PROPOSAL]]: Anyone who sees a problem they believe to be critical can send an email to the software-eco list at laptop.org to start this discussion (when appropriate, please CC the devel list). This ensures notification of anyone who may need to be involved in this process. Please fill in as much as possible of the information requested [[#PROPOSAL|below]]. The request should include a reference to a USR Proposal wiki page (see [[OLPC SW-ECO 2]] for an example). |
|||
Security patches cannot, unfortunately, follow this path due to disclosure rules, but will occur in a similar fashion on a closed security mailing list and additionally include Ivan Krstic in the decision. |
|||
# [[#REVIEW PROPOSAL|REVIEW PROPOSAL]]: When the discussion has resulted in a concrete proposal, the [[#Triage Team|Triage team]] will review the proposal, possibly making changes to broaden or limit the scope of the USR, approve specific strategies for addressing the issues necessitating the USR, etc. When the Triage team has approved the USR, they will communicate the approval via email to software-eco, referencing the specific version of the wiki page approved (again, please CC devel when appropriate). Triage team approval commits both development and testing resources to work on the USR. |
|||
# [[#CANDIDATE BUILD|CANDIDATE BUILD]]: When the proposal has been approved, the build master will prepare a build or series of builds as candidates for the USR, updating the USR wiki page with the changed package(s) included in the USR builds, and the rationale for their inclusion. The build master will announce the availability of each candidate build. |
|||
# [[#TEST|TEST]]: When candidate build(s) are made available, bug-specific tests as well as the 1 hour [[Smoke test]] for the release should be conducted. The results of these tests should be included as part of the USR wiki page. Test results for each build should be announced. Test results may necessitate additional builds, which will require restarting the process from [[#CANDIDATE BUILD|CANDIDATE BUILD]]. If the results of testing necessitate changes to the USR proposal that affect its scope, the process should be restarted from the [[#REVIEW PROPOSAL|proposal review]] step. The [[#TEST|TEST]] step is complete when the QA lead signs off on the candidate build. |
|||
# [[#SIGNATURE|SIGNATURE]]: When the candidate build has completed testing, a security and appropriateness review of the candidate build is performed. If approvals of each component are obtained, the candidate build is then signed. If the review uncovers problems, the process should be restarted from [[#CANDIDATE BUILD|CANDIDATE BUILD]] with appropriate changes to security-sensitive components. The availability of the signed build is announced. |
|||
# [[#FINAL TEST|FINAL TEST]]: When the signed build is made available, final testing is performed to ensure that the signature process was successful and that operation and upgrade on secured machines is correct. The QA lead signs off on the results of final testing. The USR wiki page is reviewed and verified to accurately describe the final build. The [[#Release Team|Release Team]] signs off on the final USR. |
|||
# [[#RELEASE|RELEASE]]: When Release Team and QA lead have signed off on the [[#FINAL TEST|FINAL TEST]] step, the build engineer uploads the signed build to download.laptop.org and updates the appropriate wiki pages to list the new build. An appropriate subset of the USR wiki page should be converted into release notes for the USR. The release engineer then notifies all appropriate/affected parties with information on what they need to do to distribute and apply the release. Note that the targets of a USR may be a small subset of our users. |
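The seven steps above, together with their restart rules, can be read as a small state machine. The following is only a sketch of that reading, not tooling used by OLPC: the outcome labels (<code>NEEDS_NEW_BUILD</code>, <code>SCOPE_CHANGED</code>, <code>REVIEW_FAILED</code>) and the function name <code>next_step</code> are hypothetical names introduced for illustration.

```python
from enum import Enum


class Step(Enum):
    """Process steps, in the order given in the overview."""
    PROPOSAL = 1
    REVIEW_PROPOSAL = 2
    CANDIDATE_BUILD = 3
    TEST = 4
    SIGNATURE = 5
    FINAL_TEST = 6
    RELEASE = 7


# Hypothetical outcome labels; the page describes these situations
# in prose but does not name them.
OK = "ok"
NEEDS_NEW_BUILD = "needs_new_build"   # testing requires another build
SCOPE_CHANGED = "scope_changed"       # testing changes the USR's scope
REVIEW_FAILED = "review_failed"       # signature review uncovered problems

# Restart rules stated in the TEST and SIGNATURE steps.
RESTARTS = {
    (Step.TEST, NEEDS_NEW_BUILD): Step.CANDIDATE_BUILD,
    (Step.TEST, SCOPE_CHANGED): Step.REVIEW_PROPOSAL,
    (Step.SIGNATURE, REVIEW_FAILED): Step.CANDIDATE_BUILD,
}


def next_step(current, outcome=OK):
    """Return the step that follows `current`, or None after RELEASE."""
    if outcome == OK:
        if current is Step.RELEASE:
            return None  # the process is finished
        return Step(current.value + 1)
    try:
        return RESTARTS[(current, outcome)]
    except KeyError:
        raise ValueError(f"{outcome!r} is not defined for {current.name}")
```

For example, <code>next_step(Step.TEST, SCOPE_CHANGED)</code> returns <code>Step.REVIEW_PROPOSAL</code>, matching the rule that a change of scope sends the USR back to proposal review.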
|||
==Process step details== |
|||
===Proposals for a Patch Release=== |
|||
Proposals for a patch release will also be made to the "eco-software" mailing list, enumerating exactly which bug(s) will be closed. When approved (what should constitute approval?), patch builds will begin. |
|||
===PROPOSAL=== |
|||
===Testing of the Patch Release=== |
|||
Anyone who sees a problem they believe to be critical can initiate the email to the software-eco list at laptop.org to start this discussion (when appropriate, please CC the devel list). This ensures notification of anyone who may need to be involved in this process. The proposal should meet the criteria below, and adhere to the format described in the next section. |
|||
====Proposal Criteria==== |
|||
# The release master will then put together a candidate build |
|||
# The problem the fix ostensibly fixes will be tested using the information contained in the trac bug(s) |
|||
# Upgrades at least from the previous signed release (and previous signed release?) are successful. (should we explicitly compare the result of the upgrade against a fresh installation?) |
|||
# The [[1 Hour Smoke Test]] must be performed, both using a fresh installation and an upgraded installation, looking specifically for regressions from the release reports, and thinking about possible interactions a fix might cause. |
|||
# More than one SKU and keyboard type are to be used during this testing, to catch regressions in keyboard identification |
|||
# Any new hardware support must be tested explicitly (e.g. new keyboard type, new revision of a component). |
|||
# When time permits, test builds should be used for testing by developers in the field to confirm the fixes. |
|||
# If <i>new</i> problems (not previously known as part of testing or bug reports) are discovered during the test process, they <b>must</b> be analyzed to root cause and the eco-software mailing list informed of the findings, and a new decision made on proceeding. These would normally be latent bugs, and the criterion for proceeding (waiving) the bug should be based on a judgment that the patch build is at a minimum no worse than the previous build |
|||
* Security issue that threatens anti-theft or child safety; or a known exploit that threatens a large number of laptops; |
|||
===Signing of Builds for Release=== |
|||
* Datastore, Journal, or file-system problems resulting in loss of data; |
|||
Key in any signing decision should be, "does this build, |
|||
* Bug resulting in laptop crashes more frequently than 1/day with typical use; |
|||
or could this build, compromise antitheft/activation security"? |
|||
* Bug resulting in serious disruptions to other services (non-XO); |
|||
Changes in the firmware, kernel, olpcrd could potentially compromise |
|||
* A core feature or functionality that is not working or fails (e.g., Browse, Write, Read, Chat, power management, telepathy); |
|||
security; in general other changes are "safer". |
|||
* Updates to support new versions of hardware (e.g. keyboards, revisions to major components) that cannot wait for the next major software release |
|||
(There is a discussion of these criteria [[Talk:Unscheduled_software_release_process|here]].) |
|||
====Release Checklist==== |
|||
# Have the olpcrd, kernel, or firmware changed? These are central to our security system, and an additional audit is required by the security team, by individuals other than those who wrote the patch |
|||
# The testing of the patch release succeeded or a newly discovered problem waived (per above) |
|||
# All source packages are present and accounted for |
|||
# All packages were built on the correct OLPC-controlled build system(s) |
|||
# Only packages fixing the referenced bug(s) are changed by the build that is to be signed |
|||
# The patch candidate has been installed on MP, CTest and B4 systems successfully |
|||
# Was the correct version of OFW included in the build? |
|||
# Whenever practical, the build will have also been tested by a significant number of users in the field to confirm the fixes are correct. |
|||
====How to Propose==== |
|||
The build master is responsible for ensuring that the checklist has been completed and that the process has been followed. |
|||
Proposals for unscheduled releases should be submitted to the [http://lists.laptop.org/listinfo/software-eco "software-eco" mailing list], with a "collector" or summary Trac bug that links to any and all other Trac bugs and a link to a wiki page for collecting '''as much of the following information as is available''' (See [[OLPC SW-ECO 2]] for an example): |
|||
Signing a release requires a majority quorum of: Jim Gettys, Walter Bender, Dennis Gilmore and Ivan Krstic. |
|||
* Title of the Release: descriptive of the major driving force for this release; |
|||
* Trac items: detailed description of the issue; |
|||
* Priority: the believed urgency of the issue, including any deadlines; |
|||
* Root Cause: why did this occur? |
|||
* Effect from the user perspective: How many are affected? What does the user see? How does it affect them? Is there a work-around? What are the consequences of not fixing it? |
|||
* Proposed Fix: the complete source-code and package-level diffs for review, if not clearly stated in referenced Trac bugs; |
|||
* Reviewers: who has reviewed the proposed changes for correctness? (Preferably at least three people competent in the area affected other than the authors of the changes.) |
|||
* Proposed Testing: developer testing, QA testing, multi-language, boot up, upgrade testing, etc. (beyond the usual 1 Hour [[Smoke test]]); |
|||
* Proposed Rollout: Mfg, Support group, G1G1 users, country deployment teams, etc. |
|||
At the time of the initial proposal, a fix (or the "best" fix) may not be known. |
|||
===REVIEW PROPOSAL=== |
|||
In order to ensure that the proposal solves problems of a truly urgent nature and in order to properly schedule the resources required to fix the stated problems, the [[#Triage Team|Triage Team]] will carefully and publicly (if possible) review the submitted proposal before committing resources toward its realization. |
|||
The Triage Team will consider the following review criteria as they make their decision: |
|||
* '''TODO''' |
|||
The Triage team will review the proposal, possibly making changes to broaden or limit the scope of the USR, approve specific strategies for addressing the issues necessitating the USR, etc. A member of the Triage team will be named as the Champion of the USR, responsible for shepherding the proposal through the release process. Initial approval of the USR, communicated via email from a member of the Triage team, commits both development and testing resources to work on the USR. Fixes to address the issues necessitating the USR will be developed, and documented on the USR wiki page. When appropriate fixes are approved, the Triage team will communicate the approval via email to software-eco, referencing the specific version of the wiki page approved (again, please CC devel when appropriate). This will start the next step in the process. |
|||
===CANDIDATE BUILD=== |
|||
Upon notification from the triage team, the build master will prepare a build or series of builds as candidates for the USR, updating the USR wiki page with the changed package(s) included in the USR builds, and the rationale for their inclusion. The build master will announce the availability of each candidate build. |
|||
===TEST=== |
|||
When candidate build(s) are made available, bug-specific tests as well as the 1 Hour [[Smoke test]] should be conducted. The results of these tests should be included as part of the USR wiki page. Test results for each build should be announced. Test results may necessitate additional builds, which will require restarting the process from [[#CANDIDATE BUILD|CANDIDATE BUILD]]. If the results of testing necessitate changes to the USR proposal that affect its scope, the process should be restarted from the [[#REVIEW PROPOSAL|proposal review]] step. The [[#TEST|TEST]] step is complete when the QA lead signs off on the candidate build. |
|||
The testing for this build will include both specific and general tests. Once the release master has created a candidate build (unsigned) for testing, testers can go through the test plan. |
|||
====Specific Testing==== |
|||
# The build will be tested using the information contained in the Trac bug(s) to ensure specific fixes. |
|||
====General Testing==== |
|||
# The versions of OFW and the wireless firmware should be checked. |
|||
# The build must be installed on MP and B4 systems successfully. |
|||
# Upgrades from the previous release are successful (See [[Things to test after updates]]). |
|||
# Downgrades to the previous release are successful. |
|||
# A fresh install (as in manufacturing) is successful on both MP and B4 systems; |
|||
# The [[Smoke test]] must be performed, both using a fresh installation and an upgraded installation, looking specifically for regressions from the release reports, and thinking about possible interactions a fix might cause. Fixes to core technologies may require much more extensive testing and soaking, as some failures only occur after time or use; |
|||
# More than one SKU and keyboard type must be used during this testing (both MP and B4, due to differences in manufacturing data) in order to catch regressions in keyboard identification |
|||
# Wireless should be tested against open, WEP, and WPA access points (See [[Wireless testing]]) |
|||
# Any new hardware support must be tested explicitly (e.g. new keyboard type, new revision of a component); |
|||
# When time permits, test builds should be used for testing by developers in the field to confirm the fixes; |
|||
If ''new'' problems (not previously known as part of testing or bug reports) are discovered during the test process, they '''must''' be analyzed to root cause and the software-eco mailing list informed of the findings, and a new decision made on proceeding. These would normally be latent bugs, and the criterion for proceeding (waiving) the bug should be based on a judgment that the candidate build is at a minimum no worse than the previous build. |
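The general tests above combine several dimensions: hardware generation (MP and B4), fresh versus upgraded installation, and more than one keyboard type. As a rough sketch of how a tester might enumerate the combinations, the helper below builds a simple matrix. The function name <code>build_test_matrix</code> and the keyboard labels are hypothetical placeholders; the actual SKUs and keyboard types are not listed on this page.

```python
from itertools import product

HARDWARE = ["MP", "B4"]                      # named in the checklist above
INSTALL_PATHS = ["fresh install", "upgrade"]  # both are required for the smoke test
KEYBOARDS = ["keyboard-A", "keyboard-B"]      # hypothetical placeholder labels


def build_test_matrix(hardware, install_paths, keyboards):
    """Return one dict per combination that should receive a smoke test."""
    return [
        {"hardware": h, "install": i, "keyboard": k}
        for h, i, k in product(hardware, install_paths, keyboards)
    ]


if __name__ == "__main__":
    for case in build_test_matrix(HARDWARE, INSTALL_PATHS, KEYBOARDS):
        print(case)
```

A full cross product may be more than time allows for an unscheduled release; the list is meant as a starting point from which the QA lead can prune.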
|||
===SIGNATURE=== |
|||
When the candidate build has completed testing, a security and appropriateness review of the candidate build is performed. If approvals of each component are obtained, the candidate build is then signed. If the review uncovers problems, the process should be restarted from [[#CANDIDATE BUILD|CANDIDATE BUILD]] with appropriate changes to security-sensitive components. The availability of the signed build is announced. |
|||
Key in any signing decision should be, "does this build, or could this build, compromise antitheft/activation security"? Changes in the firmware, kernel, olpcrd could potentially compromise security; in general other changes are "safer". |
|||
====Signature Checklist==== |
|||
# Are there security sign-offs on olpcrd, kernel, OFW, and EC? These are central to our security system. FUTURE: an additional audit will be required by the security team, by individuals other than the ones who made the changes. |
|||
# Do the versions of olpcrd, kernel, OFW, and EC match those signed-off? |
|||
# Was the testing of the candidate release successful or was a newly discovered problem waived (per above)? (Whenever practical, the build will have also been tested by a significant number of users in the field to confirm the fixes are correct.) Build to be signed should have a sign-off from the QA lead. |
|||
# FUTURE: Are all source packages present and accounted for? |
|||
# FUTURE: Were all packages built on the correct OLPC-controlled build system(s)? |
|||
# Are only packages fixing the referenced bug(s) changed by the build that is to be signed? |
|||
Only ''after'' this checklist is complete should the build be signed and then made available in the 'candidate' directory on download.laptop.org for release candidates. |
|||
The build master is responsible for ensuring that the checklist has been completed and that the process has been followed. |
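Two of the checklist questions lend themselves to a mechanical pre-check: whether only the expected packages changed, and whether every changed security-sensitive component has a sign-off. The sketch below assumes the build master already has the three sets of names at hand; how they are obtained (for example, from the USR wiki page or a package diff) is not specified here, and the function name <code>review_changed_packages</code> is hypothetical. It does not compare versions, which the checklist also asks for.

```python
# Components the checklist calls central to the security system.
SECURITY_SENSITIVE = {"olpcrd", "kernel", "OFW", "EC"}


def review_changed_packages(changed, expected, signed_off):
    """Flag checklist problems for a candidate build.

    changed    -- names of packages that differ from the previous release
    expected   -- names of packages the USR page lists as fixing the bug(s)
    signed_off -- security-sensitive components with a security sign-off
    """
    changed, expected, signed_off = set(changed), set(expected), set(signed_off)
    # Packages changed without being tied to a referenced bug.
    unexpected = sorted(changed - expected)
    # Security-sensitive components that changed but lack a sign-off.
    missing_signoff = sorted((changed & SECURITY_SENSITIVE) - signed_off)
    return {
        "unexpected": unexpected,
        "missing_signoff": missing_signoff,
        "ok": not unexpected and not missing_signoff,
    }
```

A result with <code>"ok": False</code> is only a prompt for a human to look closer; the signing decision itself remains with the reviewers named in the process.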
|||
===Final Steps=== |
|||
Note that signing the build for release is <b>not</b> the end of the process: the signed release must see a final verification step on write protected systems. |
===FINAL TEST=== |
|||
The final steps are to be performed in order: |
|||
When the signed build is made available, final testing is performed to ensure that the signature process was successful and that operation and upgrade on secured machines is correct. The QA lead signs off on the results of final testing. The USR wiki page is reviewed and verified to accurately describe the final build. (In particular, the checklist contains spaces for a final sanity-check of firmware versions.) The [[#Release Team|Release Team]] signs off on the final USR. |
|||
# The signed build needs to be installed on a write-protected laptop via USB stick—fresh install as well as upgrade; |
|||
# An upgrade from the previous stable build to the signed build via network update must succeed; |
|||
# The signed build must be automatically upgraded from a central server; |
|||
# The signed build can be downgraded to its previous signed build; |
|||
# If a candidate build fails the testing, the candidate must be removed from the candidate directory. |
|||
===RELEASE=== |
|||
When the Release Team and QA lead have signed off on the [[#FINAL TEST|FINAL TEST]] step, the build engineer uploads the signed build to download.laptop.org and updates the appropriate wiki pages to list the new build. An appropriate subset of the USR wiki page should be converted into release notes for the USR. The release engineer then notifies all appropriate/affected parties with information on what they need to do to distribute and apply the release. Note that the targets of a USR may be a small subset of our users. |
|||
The final steps are to be performed after FINAL TEST has been signed off: |
|||
# Move the build from 'candidate' to 'official' on download.laptop.org (removing the candidate entirely); |
|||
# Notify the Quanta ECO mailing list, if indicated, preferably using signed email, and certainly containing checksums of the build and the URL at which the official bits can be found, and if/when the build should be phased into production. An explicit judgment as to whether the build should immediately go into production is required; |
|||
#*Concrete example: if the ECO were to fix problems with a keyboard not currently being produced, disturbing production would be very unwise; |
|||
# Notify the software-eco and devel mailing lists that this new build is available providing the link to the wiki page as release notes (or create a release notes page). |
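The notification step asks for checksums of the build. The page does not say which hash algorithm is used, so the sketch below assumes SHA-256 purely as an example; the function name <code>build_checksums</code> is likewise hypothetical. It reads the file in chunks so that large build images do not need to fit in memory.

```python
import hashlib


def build_checksums(path, algorithms=("sha256",), chunk_size=1 << 20):
    """Return {algorithm: hex digest} for the file at `path`.

    `algorithms` defaults to SHA-256 as an assumption; substitute
    whatever the release procedure actually requires.
    """
    digests = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

The resulting digests can be pasted into the notification email alongside the URL of the official build, so that recipients can verify what they downloaded.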
|||
==Dramatis Personae== |
|||
===Triage Team=== |
|||
[[User:Kimquirk|Kim Quirk]], [[User:Jg|Jim Gettys]], [[User:Wad|John Watlington]], [[User:Gregorio|Greg Smith]] |
|||
===Release Team=== |
|||
[[Profiles/mstone|Michael Stone]], [[User:CScott|C. Scott Ananian]], [[User:Kimquirk|Kim Quirk]], [[User:Jg|Jim Gettys]], [[User:Gregorio|Greg Smith]] |
|||
[[category:ECO]] |
|||
# sign the build, generate known checksums, using the signature procedure. |
|||
# perform the One Hour Smoke Test. |
|||
# test its installation on a write protected MP system. Also test an installation on B4 systems. |
|||
# test to upgrade to the new build from the previous stable build |
|||
# test you can downgrade the new build to the previous stable build |
|||
# test automatic upgrades |
|||
# the build should be made available in the 'candidate' directory on download.laptop.org for release candidates. |
|||
# when Kim signs off on the build, the signed build can be moved from 'candidate' to 'official' on download.laptop.org. |
|||
# Notify the Quanta ECO mailing list, if indicated, preferably using signed email, and certainly containing checksums of the build and the URL at which the official bits can be found. |
Latest revision as of 17:08, 8 February 2011
For current information on OLPC software release processes, see Release Process Home
This page is monitored by the OLPC team.
For Developers
The software ECO process is a process for producing software releases meeting an established quality standard. ECO, or engineering change order, is a term from the hardware world describing the order needed to place a new revision of a product into production.

From time to time there may be critical bug fixes that must be released before the next normal, scheduled releases. In order to ensure that the quality of our released software continually increases, the following procedure should be strictly followed when proposing, creating, and publishing an unscheduled software release.

For process documentation, a copy of the USR Checklist should be made and completed as each step is completed. Be sure to use the 'Protect' tab on your checklist page to ensure that edits are done by logged-in users only.

Process Overview
Process step details

PROPOSAL

Anyone who sees a problem they believe to be critical can initiate the email to the software-eco list at laptop.org to start this discussion (when appropriate, please CC the devel list). This ensures notification of anyone who may need to be involved in this process. The proposal should meet the criteria below, and adhere to the format described in the next section.

Proposal Criteria
(There is a discussion of these criteria here.)

How to Propose

Proposals for unscheduled releases should be submitted to the "software-eco" mailing list, with a "collector" or summary Trac bug that links to any and all other Trac bugs and a link to a wiki page for collecting as much of the following information as is available (See OLPC SW-ECO 2 for an example):
At the time of the initial proposal, a fix (or the "best" fix) may not be known.

REVIEW PROPOSAL

In order to ensure that the proposal solves problems of a truly urgent nature and in order to properly schedule the resources required to fix the stated problems, the Triage Team will carefully and publicly (if possible) review the submitted proposal before committing resources toward its realization.

The Triage Team will consider the following review criteria as they make their decision:
The Triage team will review the proposal, possibly making changes to broaden or limit the scope of the USR, approve specific strategies for addressing the issues necessitating the USR, etc. A member of the Triage team will be named as the Champion of the USR, responsible for shepherding the proposal through the release process. Initial approval of the USR, communicated via email from a member of the Triage team, commits both development and testing resources to work on the USR. Fixes to address the issues necessitating the USR will be developed, and documented on the USR wiki page. When appropriate fixes are approved, the Triage team will communicate the approval via email to software-eco, referencing the specific version of the wiki page approved (again, please CC devel when appropriate). This will start the next step in the process.

CANDIDATE BUILD

Upon notification from the triage team, the build master will prepare a build or series of builds as candidates for the USR, updating the USR wiki page with the changed package(s) included in the USR builds, and the rationale for their inclusion. The build master will announce the availability of each candidate build.

TEST

When candidate build(s) are made available, bug-specific tests as well as the 1 Hour Smoke test should be conducted. The results of these tests should be included as part of the USR wiki page. Test results for each build should be announced. Test results may necessitate additional builds, which will require restarting the process from CANDIDATE BUILD. If the results of testing necessitate changes to the USR proposal that affect its scope, the process should be restarted from the proposal review step. The TEST step is complete when the QA lead signs off on the candidate build.

The testing for this build will include both specific and general tests. Once the release master has created a candidate build (unsigned) for testing, testers can go through the test plan.

Specific Testing
General Testing
If new problems (not previously known as part of testing or bug reports) are discovered during the test process, they must be analyzed to root cause and the software-eco mailing list informed of the findings, and a new decision made on proceeding. These would normally be latent bugs, and the criterion for proceeding (waiving) the bug should be based on a judgment that the candidate build is at a minimum no worse than the previous build.

SIGNATURE

When the candidate build has completed testing, a security and appropriateness review of the candidate build is performed. If approvals of each component are obtained, the candidate build is then signed. If the review uncovers problems, the process should be restarted from CANDIDATE BUILD with appropriate changes to security-sensitive components. The availability of the signed build is announced.

Key in any signing decision should be, "does this build, or could this build, compromise antitheft/activation security"? Changes in the firmware, kernel, olpcrd could potentially compromise security; in general other changes are "safer".

Signature Checklist
Only after this checklist is complete should the build be signed and then made available in the 'candidate' directory on download.laptop.org for release candidates.

The build master is responsible for ensuring the checklist and that the process has been followed.

Note that signing the build for release is not the end of the process: the signed release must see a final verification step on write protected systems.

FINAL TEST

When the signed build is made available, final testing is performed to ensure that the signature process was successful and that operation and upgrade on secured machines is correct. The QA lead signs off on the results of final testing. The USR wiki page is reviewed and verified to accurately describe the final build. (In particular, the checklist contains spaces for a final sanity-check of firmware versions.) The Release Team signs off on the final USR.
RELEASE

When the Release Team and QA lead have signed off on the FINAL TEST step, the build engineer uploads the signed build to download.laptop.org and updates the appropriate wiki pages to list the new build. An appropriate subset of the USR wiki page should be converted into release notes for the USR. The release engineer then notifies all appropriate/affected parties with information on what they need to do to distribute and apply the release. Note that the targets of a USR may be a small subset of our users.

The final steps are to be performed after FINAL TEST has been signed off:
Dramatis Personae

Triage Team

Kim Quirk, Jim Gettys, John Watlington, Greg Smith

Release Team

Michael Stone, C. Scott Ananian, Kim Quirk, Jim Gettys, Greg Smith