Community testing meetings/2008-11-06/Displaying testing metrics in motivating ways


Format of test data

The exact form of Activity test results is not entirely clear at present. Here are some data fields we may want to gather (fields we definitely want to gather are in bold); a sketch of a single record follows the list:

  • Activity tested
    • Test case executed
    • Activity version #/build
    • Sugar version #/build (if applicable)
    • Operating system
    • hardware type/version (is it an XO? emulator? something else?)
  • Who tested it
    • Tester contact information
    • Name (for giving credit for work)
    • Other work done by this contributor
  • pass/fail
    • associated bugs (in Trac)
    • free-text comments
  • Miscellaneous
    • date of test execution
    • How long it took to run
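
A minimal sketch of what a single test-result record might hold, written in Python purely for illustration (the field names, types, and defaults are assumptions drawn from the list above, not a settled schema):

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TestResult:
    # What was tested
    activity: str                          # Activity tested, e.g. "Clock"
    test_case: str                         # Test case executed, e.g. "Tests/Activity/Clock"
    activity_version: str                  # Activity version #/build
    operating_system: str                  # e.g. an OS build number
    hardware: str                          # XO, emulator, something else
    # Who tested it
    tester: str                            # name, for giving credit for work
    tester_contact: str
    # Outcome
    result: str                            # "pass" or "fail"
    bugs: List[str] = field(default_factory=list)   # associated Trac tickets
    comments: str = ""                     # free-text comments
    # Miscellaneous
    sugar_version: Optional[str] = None    # Sugar version #/build, if applicable
    test_date: str = ""                    # date of test execution
    duration_minutes: Optional[int] = None # how long it took to run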

An activity may have multiple test cases, but there is an expressed preference for short (5-10 minute) test case scripts, to a) improve the chances that someone will bother to run the case, and b) reduce the chances that problems encountered during testing are in fact caused by other factors that may not be reproducible (i.e., aim for high test-retest reliability of the test itself).

Possible tools

Trac

Of course Trac exists, but it is developer-focused rather than oriented toward the general public. It doesn't collect statistically useful numbers of results, and it gathers mostly negative information; the only positive information is a resolution (ticket closure). If, hypothetically, an activity works as designed every time, there is unlikely to be a Trac ticket about it.

Trac would be much more useful if links to wiki.laptop.org in it worked as expected. Putting [[Community testing]] in a ticket leads to a Trac macro warning. -- skierpage 22:57, 16 November 2008 (UTC)

Semantic MediaWiki

One option is to use Semantic MediaWiki (SMW) to gather and display test data.

Pros:

  • Some OLPC wiki users are already comfortable with SMW.
  • Very public facing/public modifiable
  • tight link to test cases in wiki, looser coupling to bugs in Trac
  • provides some affordance for the semi-automated summarization and presentation of test results in multiple forms
  • ...without the need for a custom database app
  • ...which offers the possibility of using (and reusing) such metrics to satisfy the interests of multiple stakeholders

Cons:

  • Not necessarily obvious to use - there is a learning curve
  • Slower and clumsier to use than a tool created specifically for this purpose
  • To get useful metrics, every test result would have to be a separate wiki page.

Existing implementation

Semantic Forms are already in use to gather test results: if you visit a test case, you can see all the test results that testers have added with the Results [Add Another] button on the test case form. But with the current organization you can't get useful metrics beyond "show test cases that have a bug against them". Semantic MediaWiki#Issues for test results explains the issue in more detail; for example, Tests/Activity/Write/Public sharing has both a pass and a fail, but a query can't show which build passed and which failed. -- skierpage 21:26, 16 November 2008 (UTC)
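
To illustrate the data-modelling issue, here is a sketch in Python (the property names and build numbers are made up): when builds and outcomes are stored as independent multi-valued properties on the test case page, the pairing between them is lost, whereas one record per result keeps it.

# Results attached to the test case page as separate multi-valued properties:
test_case_page = {
    "Build": ["767", "765"],
    "Result": ["pass", "fail"],   # a query cannot tell which build failed
}

# One record (i.e. one wiki page) per test result keeps build and outcome together:
results = [
    {"Test case": "Tests/Activity/Write/Public sharing", "Build": "767", "Result": "pass"},
    {"Test case": "Tests/Activity/Write/Public sharing", "Build": "765", "Result": "fail"},
]

failed_builds = [r["Build"] for r in results if r["Result"] == "fail"]
print(failed_builds)              # ['765'] -- now the failing build is queryable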

Spreadsheet in Google Docs

S Page suggested this around September to someone at OLPC. Test results are

  • small records
  • with lots of repeating info (Build, Test case, Tester)
  • that are hard to model in a simple database

That's exactly what a spreadsheet is good at. OLPC can create a Google Docs spreadsheet and let people add rows to it. Google Docs even has a form interface for adding a result. The spreadsheet is ideally suited to analysis, export, resorting, filtering, etc. Unlike database records and wiki pages, it's easy to cut/copy/paste dozens of records around.
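
As a rough sketch of the kind of analysis this enables (Python; the results.csv file name and the column headers are assumptions matching the fields above), a CSV export of the spreadsheet can be filtered and tallied in a few lines:

import csv
from collections import Counter

# Hypothetical CSV export of the results spreadsheet.
with open("results.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Filter to one test case, and tally pass/fail across all results.
clock_rows = [r for r in rows if r["Test case"].startswith("Tests/Activity/Clock")]
tally = Counter(r["Result"] for r in rows)
print(len(clock_rows), tally)     # e.g. 4 Counter({'pass': 12, 'fail': 3})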

Implementation

S Page made a spreadsheet in Google Docs with all the required fields and actual data from existing tests (the [history] tab had to be used to figure out when the tests were run).

The spreadsheet below can also be viewed at http://spreadsheets.google.com/pub?key=pw29kBR10gwTKfMq18XvNVw

<gspread>pw29kBR10gwTKfMq18XvNVw||300||850</gspread>

It can be edited in two ways: directly in Google Docs, or through a form front-end (see below).

The easiest way to add a test result is to copy most of an existing row and paste it into a new row.

Now to figure out how to make a form and whether I can prefill with username and date.

Notes

Currently anyone can edit. A spreadsheet can be restricted to

  • Edit by invitation only
  • Require signing in

Since all test case names start with Tests/Activity/ followed by the activity name, having "Activity tested" as well as "Test case executed" may be redundant.
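
For example, a sketch (assuming the Tests/Activity/<Activity name>/... naming convention used by the test cases above) of deriving the activity from the test case column:

def activity_from_test_case(test_case):
    # "Tests/Activity/Write/Public sharing" -> "Write"
    return test_case.split("/")[2]

print(activity_from_test_case("Tests/Activity/Clock"))   # Clock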

The only formula I'm using is HYPERLINK to make links, e.g.

=HYPERLINK("http://wiki.laptop.org/go/Tests/Activity/Clock", "Clock")
=HYPERLINK("http://dev.laptop.org/ticket/8581", 8581)

Form front-end to Google Docs

Given a spreadsheet in Google Docs, you can make a web form in Google Docs that adds one line to the spreadsheet.

Carl Klitscher created such a form, http://spreadsheets.google.com/viewform?key=pyBIsSK_3IlsHBpwk1EFNcQ , and the other Wellington XO testers used it in December 2008.

Litmus?

https://litmus.mozilla.org manages test cases and test results.

Stakeholders

OLPC management

Very few activities will truly merit direct OLPC involvement or maintenance (Journal, Terminal, and the few that can be classified by inclusion in Fructose). At the same time, a "bad activity user experience" will nonetheless reflect on OLPC, so there is an OLPC interest in supporting quality mechanisms for activities.

Interests:

  • Being able to recommend (to large deployment customers) Activities, which are inherently developed out-of-house. They need to know which Activities they can stake their reputation on.
  • In order to do this, having some record of testing builds confidence that an Activity (1) will work, and (2) has people working on it that are easy to contact in case a problem comes up.

It is also notable that OLPC management may be able to contribute resources to the Activity testing effort, particularly Activity testing on XOs.

Activity authors

These metrics could be an additional mechanism for feedback.

Interests:

  • Knowing that their Activity is getting used. They're doing this to help kids, after all.
  • Knowing how they can improve their Activity
  • Recognition - always a motivator.

Consider the motivational aspects of a little badge on the activity's wiki page saying "75 passes, no fails", with the badge automatically updated by SMW tabulation of results as they are submitted. Authors become more engaged in recruiting additional testing for their activities and in addressing community-reported failures. We're not talking UA or Good Housekeeping; simple awareness-raising of testing (and the need for community input) may be its most important feature.
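
A sketch of the tabulation behind such a badge, in Python as a stand-in for the SMW query (the result data are made up):

from collections import Counter

# (activity, outcome) pairs, as they might come from an SMW query or the spreadsheet.
results = [("Clock", "pass"), ("Clock", "pass"), ("Write", "pass"), ("Write", "fail")]

def badge(activity):
    tally = Counter(outcome for a, outcome in results if a == activity)
    return "%d passes, %d fails" % (tally["pass"], tally["fail"])

print(badge("Clock"))             # 2 passes, 0 fails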

Activity testers

From the point of view of activity testers, it is essential that there be a mechanism for recording the results of their testing. The metrics could be used motivationally, as a record of their service to the community: consider how Wikipedians proudly self-report metrics such as the number of edits or other community-supporting activities they take part in; similarly, in ticket tracking systems (RT or Trac), metrics are motivational as a measure of contribution. Running a series of test cases becomes a way of collecting OLPC merit badges / karma, as submitted test results are tabulated to the tester's credit. Much of testing will still be "itch-scratching" of the most popular activities, but this would provide an incentive for *recording* those results, and maybe for trying a few other activities to scratch the community-recognition itch.

Activity users

From the point of view of potential activity users, the interest is in having some confidence (despite the usual disclaimers) that downloading and trying an activity will not be disappointing or, worse, disruptive.