Test process sprint for 2007 first software release
We will have to do a great deal of testing in the run-up to the first release of our software to our first deployment countries (milestone: FRS). We can either keep going as we have so far, or we can work harder to find or build the tools needed to make this testing go smoothly.
The purpose of this sprint, currently scheduled for Wednesday, October 17, 2007, is to jump-start the "planning, finding, building, and using" strategy.
The goal is to come up with a workable plan, a roadmap, that describes how we want to act in the run-up to FRS and a bit beyond. The other main goal is to have a good time, so that we can run the event again with even more people in attendance!
From people with a QA background, we would like 1) your advice and 2) descriptions of what you need/want in order to effectively help us.
We plan to start at 2:00 PM EST and run into the evening, until 6:30 or 7:00 PM. Interesting people will be dropping in and out. We will also be on IRC channel #olpc-meeting.
- Making a "Test Activity" that we can all use to manually execute the test plans stored on the wiki
- Teaching interested people how our Tinderbox works so that you can contribute automated tests.
- Playing with ways to get better feedback out of activities (e.g., special machine-readable logging, use of accessibility extensions for scripting, embedding small interpreters)
- Figuring out how to centralize the collection and visualization of the reports generated by the "Test Activity"
- Going through all of the old test plans, trying them out, and fixing up the parts that no longer apply
- Working out the exact written instructions, from start to finish, for running a single activity under emulation (i.e., installing an emulator, importing a bundle, running the bundle -- including networking or mic/camera access).
- Can idle laptops be used in some way?
- Has anyone looked into tools for doing X GUI scripting and snapshot tests?
- Do we know what the threat environment looks like? And what the testing mission/objectives are? Catch last-minute hardware bugs? Help core activity writers shake down their apps? Detect late regressions? Support ofw and kernel development? Determine the top ten things which will make users happier, be they bug fixes or new features or stalled apps? Avoid "problems which would be embarrassing"? What is the relative importance of trials and xogiving? Create infrastructure for FRS? For the next 6 months? Year? Whatever mix gives the biggest bang, in whatever direction, for our highly constrained time and resources? Given those constraints in particular, different goals seem to send us in different directions.
- What exactly are the time and resource constraints?
- What are some innovative brainstormy ideas for breaking out of them?
- Ubuntu Live-CD + 1 GB USB stick = emulated xo. Camera and TamTam sound don't work. Some other issues. But $300 might make a school computer lab into an xo cluster. We could spin up classes exercising, say, Write's collaboration, and reporting problems.
- Create testing jabber/school server? The barrier to set up collaboration is currently rather high. Creating an "everyone here is pretending to be part of the same school, and is here to collaborate and test things" server might make it easier.
- Ubuntu 7.10 is coming out. Can we piggyback on that attention to get testers?
- Create a .deb package so running an emulated xo is simply apt-get install olpc-xo-emulation; olpc-xo-emulation --run ?
- Can trial feedback be made more accessible? If lots of people are asking for X, that should perhaps be part of the test picture.
- How can we get more unscripted testing? To detect things like 'small child carrying laptop will close it when moving from place to place. even in the midst of taking photos. so a closing-breaks-Record bug is more of an issue than one might expect from adult use'.
- There are a number of B4's or better in Boston (like mine). Perhaps set up a system so anyone associated with the project, or otherwise a known quantity, who has kids, can sign up to take a laptop home for a day, with the commitment to have the kids bang on it, observe, and report back results.
- Can we get a volunteer to sit with a laptop in MIT Lobby 10 or the student center, or outside Park Street T, or better, somewhere where people are standing around waiting for something, let people play, and take notes? Actually, two volunteers would work out better, as it's very difficult to both answer questions for the crowd and help and observe the person using the laptop at the same time.
- Do any of the local colleges or extension schools (or non-local) have a QA class? Could we volunteer emulators to be guinea pigs?
- How high-profile do we want our need for testing to be? E.g., given xogiving. Slashdot?
- We need fresh eyes. People who haven't learned to shy away from closing the laptop. Who go to the Library grid and say, neat, err, what do I do with this .xol file?
- Can we get a couple of xo's meeting in a bar once a week? From past experience, that sucks half the bar over for demos. When they are MIT journalism and development majors, an Igbo speaker, and programmers, that might be useful in itself. But it's also a chance to find collaboration issues, like one laptop being unable to mesh with another that is connected to an AP. Some xo's need to meet under some trees, even if they are only in Kendall Sq.
- Can we clone Mike Fletcher's "there is a coffee house in Toronto where an xo lives for play"? But with observation, or a feedback book. We should ask Mike to add a feedback/bug-reporting book if he hasn't already.
- If we do this again, perhaps we might do it in the evening, to more easily get working QA folks to come.
- Ask the xogiving folks to say "as part of your getting this early copy of the laptop, we hope that you will (download and) run the test activity, to help us find bugs in preparation for, and in support of, schools getting them". This could be valuable for expectation management, putting people on "our side" of bugs, rather than letting the bugs come between us.
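One of the sprint topics listed above is getting machine-readable logging out of activities. The following is a minimal sketch of what that could look like, assuming activities are written in Python and use the standard logging module: each event becomes one JSON object per line, which a central collector could later parse. The names JsonLineFormatter and make_activity_logger, and the field names "activity", "level", "event" and "data", are hypothetical and not part of Sugar or any existing OLPC code.

```python
import io
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Format each log record as one JSON object per line (hypothetical schema)."""

    def format(self, record):
        payload = {
            "activity": record.name,
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Optional structured fields, passed as extra={"data": {...}}.
        data = getattr(record, "data", None)
        if data is not None:
            payload["data"] = data
        return json.dumps(payload, sort_keys=True)


def make_activity_logger(name, stream):
    """Return a logger that writes JSON lines for one activity to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep events out of the root logger's handlers
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger.handlers = [handler]  # replace handlers left over from earlier calls
    return logger


# Usage: log one event from a pretend "Write" activity into a buffer.
buffer = io.StringIO()
log = make_activity_logger("Write", buffer)
log.info("document_shared", extra={"data": {"peers": 2}})
print(buffer.getvalue(), end="")
```

Writing to a stream keeps the sketch independent of where logs end up; the same formatter could be attached to a file handler or to logging.handlers.SysLogHandler if a central syslog server is used.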
Ideas/notes from phone discussion: Michael, Grig, Titus, Kim
We discussed the areas of the XO that are unique or new -- generally these indicate the areas of highest risk:
- Mesh/connectivity (difficult to test in emulation)
- Power mgmt/battery life (can't test in emulation)
- Journal/Sugar UI (can be tested in emulation)
- Security/containerization (can be tested in emulation)
Other areas of test:
- Activities (most can be tested in emulation; should be easy and encouraged to test in the community)
- Performance, Journal/Sugar (can be tested in emulation)
- Scalability (cannot be tested in emulation)
- GUI testing is always difficult with specialty hardware; need to weigh cost to automate this versus manual testing
- May need to dedicate a core developer to head up the test harness development
- Should figure out a way to zip up logs from all activities and send to a central server; alternatively, piggyback on the syslog mechanisms and log events as they happen to a central syslog server
- Some testing requires accelerated time - come up with good ways to do that; for example by mocking the system time, or having dials into various components such as garbage collection which would allow us to say 'don't garbage collect now'
- I briefly explored getting slower-than-rt time, to avoid timeouts in non-kqemu qemu. The qemu folks said altering the relationship between real and simulated time would be a trivial patch in one place. MitchellNCharity 13:31, 18 October 2007 (EDT)
- Need to create smoke/automated tests specifically to exercise APIs; these tests should be run inside a test framework such as nose, and should be included in Tinderbox
- We should consider 'record and playback' type testing; record user interactions, play them back, see if anything changed and log changes
- Get some code coverage tools: figleaf springs to mind
- We should plan an 'emissaries' mission - bring 50 laptops to a geek group (like the Python interest group) and get specific developer feedback; help in specific areas.
- Grig: As a general strategy for writing automated tests, I recommend starting with areas that hurt, and new bugs; try to write an automated test for each new bug; in time, go back to other current bugs and write tests for them
- Grig: for integration testing, one way to go would be to use pyvix for driving the XO virtual images in vmware
- Grig: here is a link to desktop GUI testing tools: http://pycheesecake.org/wiki/PythonTestingToolsTaxonomy#GUITestingTools
- Grig: as a social aspect, ask for testing help from the community via mailing lists and blogs (Titus and I can blog about it if you're OK with it); make it easy to join and contribute; maybe we should wait for something to be put in place in terms of automated testing before we go gung-ho with it though
- Community tinderboxing. Lots of things seem to be bottlenecked on tinderbox testing. Why can't any laptop serve as a tinderbox? Connect to net, download test-mumble script, it runs, it emails its results back home. If a rom/build replacement is needed, for folks with "don't care about contents" xo's (eg, I frequently wipe mine anyway), perhaps insert usb with boot/ to return to afterwards, run script, etc. http://smoke.pugscode.org is one example of community tinderboxing.
- Across-the-street school trials. We should be able to dig up 3+ Boston-area people with B4's. So we take a school server, go across the street to MIT, and set up a "school". A couple of hours, and we get some not-in-1cc ground truthing. A laptop with MIT net (even guest) access might serve as a school server? Lots of possibilities for exploring networking issues. Could do a weekly 'meet for lunch, or after work' 1 hr road-trip / fire-drill. The school server needn't be stationary, could walk/wander along too. Intermittent server internet access would actually be a plus. Wander by anyplace with lots of people, to/from/during testing, for some great PR (and "oh, we never tried it that way" test envelope expansion). Set them all up in Spanish, and go somewhere with Spanish speakers. This seems an easy way to do some ground-truth drills with little or no drain on 1cc personnel.
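The notes above suggest mocking the system time for tests that need accelerated time. One way to sketch that in Python is to pass the clock into the component instead of calling time.time() directly, so a test can advance time instantly. Both FakeClock and IdleTimer are hypothetical illustrations, not existing OLPC or Sugar code; the 600-second timeout is an arbitrary example value.

```python
import time


class FakeClock:
    """Stand-in for time.time() that only moves when a test advances it."""

    def __init__(self, start=0.0):
        self._now = float(start)

    def __call__(self):
        return self._now

    def advance(self, seconds):
        self._now += seconds


class IdleTimer:
    """Hypothetical component: idle after `timeout` seconds without activity."""

    def __init__(self, timeout, clock=time.time):
        self._timeout = timeout
        self._clock = clock  # real clock by default, fake clock in tests
        self._last_activity = clock()

    def touch(self):
        """Record user activity at the current (real or simulated) time."""
        self._last_activity = self._clock()

    def is_idle(self):
        return self._clock() - self._last_activity >= self._timeout


# Ten minutes of simulated time pass without any real waiting.
clock = FakeClock()
timer = IdleTimer(timeout=600, clock=clock)
clock.advance(599)
assert not timer.is_idle()
clock.advance(1)
assert timer.is_idle()
timer.touch()
assert not timer.is_idle()
```

Checks written this way are plain functions and assertions, so they could also be collected by a test runner such as nose and run from Tinderbox.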
Leave your name below if you are interested in participating. Note what you want to do, whether you'll be at 1CC in person (for ordering food!), and what medium you're going to be working in. If you see someone you want to test with, you can contact them ahead of time to coordinate.
You might also watch this page so that you stay up to date on how the sprint is developing.
There are tentative plans for a shadow meeting on irc.
And if we can find someone running an iceserver, perhaps we could stream video of the meeting from an xo, so that remote folks can follow non-irc discussion, see any whiteboard sketches, etc.
Also, if you're planning on joining us in person, feel free to contact Sj at 617-529-4266 for directions.
How to help with going through Test plans
- Install the latest unstable build Autoreinstallation_image
- Go to the Testing page
- Choose a Test to review, and put your name in the last column, so that others will not update the same test plan.
- Go through the test plan, changing the Actions and things to Verify, so that they match up with the current activity.
- This is because many of the activities have changed since the test plans were written.
- If there are new features in the activity, make a new test case for those features.
- Add a results link for the test plan. Use Results/News Reader as an example.
- Basically, the sample template for the results is just a checklist of all the verifications in the test plan.
- Go through your test, and add your results to the results page.
- Also, feel free to add a test plan that you believe should exist.
- Michael Stone 19:57, 11 October 2007 (EDT) | Userland security & incremental update developer
- m_stone on #olpc, #sugar; michael (at) laptop.org; Interested in Tinderbox and "Test Activity"
- Chris Ball - Tinderbox maintainer, cjb on IRC and at laptop.org
- Sj talk (on IRC)
- Joel Stanley - working as shenki on IRC, joel (at) laptop.org. Working from Adelaide, Australia (GMT+9:30).
- Reynaldo Verdejo reynaldo on #olpc, #sugar and @opendot.cl; Interested in the Tinderbox (on IRC) [UTC/GMT-4]
- MitchellNCharity 20:56, 11 October 2007 (EDT) - Boston-area volunteer. mncharity on IRC and vendian org.
- Shankar (on IRC and wiki)
- Rowen on IRC, #olpc, #olpcph - OLPCPH Project Manager, IT Engineer, needs an extra machine to do mesh testing, "Test Activity", Tinderbox, emulation, porting (Tiny C / tcc has been ported to XO) and others
- RafaelOrtiz Interested in Test Activity, Tinderbox and documenting the process.
- Kim Quirk Planning, execution, Test Activity
- Alex Latham
- Jessica Baumgart