Community testing meetings/2008-11-06/Prioritizing activities to test

From Cjl

There is some distinction between building the tools that enable meaningful community testing and efforts to build *community* for community testing, but they are interrelated efforts and, done well, they are mutually reinforcing.

Test Activities we ship first

I'm not 100% sure that spending a whole lot of time thinking it through will yield significantly better results than taking the list of activities in the images used for large deployments (including both countries and G1G1) and ordering them by which activity author steps up first. Ideally all activities hosted on dev get tested at some point; I don't know that there is a lot of meaningful distinction in which order they get their "first testing".

After a first round, events that would merit a second round of testing would include a significant version change (e.g. rebasing on a changed upstream element such as abiword, xulrunner, etc.), a significant build change, or new inclusion by a significant deployment (e.g. Conozco Uruguay). Ideally community testing is something that a new user can just jump into solo (after the hard work of building test cases and infrastructure is done), but there are distinct advantages to scheduled, joint cooperative activity for that initial round of test case creation and testing. Activity author involvement seems critical to me during this phase.

Gathering user data

If there is indeed some interest in gathering real data about usage from users, I do not think any covert means could or should be employed. On the other hand, there is theoretically a "Poll Builder" activity that could possibly be leveraged to this end. A "customer survey" built with it would also serve the purpose of providing a demonstrable use case for that activity to users.

Classes and weightings

Another method of prioritizing (for a first round) would be to define certain classes and weightings and assign semiquantitative scores. I will assign weightings somewhat arbitrarily for discussion purposes. Start with a semi-arbitrary scoring scheme and adjust it intuitively until it gives a sensible ranking. Add or delete classes as needed from a list like the one below; don't get hung up on the numbers themselves, their only meaning is in roughly assigning weight to factors that would otherwise be entirely subjective.

Classes

  • Deployment (0-30): higher score for use by larger (or multiple) deployments (count G1G1 as a single deployment)
  • Target user sophistication (0-20): higher score for less sophisticated (younger) users, on the basis that they may be less fault-tolerant
  • Educational focus (0-20): higher score for more educational activities
  • Activity maturity (0-10): higher score for a higher version number; reward authors who produce revisions
  • Localized (+5): increase weighting for activities with translations
  • Brand new activity bonus (+5): new activities (version 1) probably need scrutiny
  • Sharing bonus (+5): reward author effort to leverage sharing/collaboration features
  • Upstream bonus (+1): increase weighting for activities leveraging upstream development
  • Hardware bonus (+1): increase weighting for activities taking advantage of XO hardware features
  • Fudge factor (0-3)
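
To make the arithmetic concrete, here is a minimal sketch in Python of how such a weighted score might be computed and used to rank activities. The class names, ranges, and the example activities and their scores are illustrative assumptions based on the list above, not actual assignments.

  # Minimal sketch of the weighted-scoring idea above.
  # Ranges/bonuses mirror the class list; the example activities and
  # their scores are made up for illustration, not real assessments.
  RANGES = {
      "deployment": (0, 30),            # larger / multiple deployments
      "user_sophistication": (0, 20),   # less sophisticated (younger) users
      "educational_focus": (0, 20),
      "activity_maturity": (0, 10),
      "localized_bonus": (0, 5),
      "brand_new_bonus": (0, 5),
      "sharing_bonus": (0, 5),
      "upstream_bonus": (0, 1),
      "hardware_bonus": (0, 1),
      "fudge_factor": (0, 3),
  }

  def priority_score(scores):
      """Sum the per-class scores, clamping each to its allowed range."""
      total = 0
      for cls, (low, high) in RANGES.items():
          total += max(low, min(high, scores.get(cls, 0)))
      return total

  # Hypothetical activities with semi-arbitrary scores, ranked by total.
  activities = {
      "ActivityA": {"deployment": 25, "educational_focus": 15,
                    "activity_maturity": 8, "localized_bonus": 5},
      "ActivityB": {"deployment": 10, "brand_new_bonus": 5,
                    "sharing_bonus": 5, "fudge_factor": 2},
  }
  for name in sorted(activities, key=lambda a: priority_score(activities[a]),
                     reverse=True):
      print(name, priority_score(activities[name]))

Adjusting the ranges is the knob described above: change them and rescore until the resulting ranking looks sensible.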

From 1cc's internal QA meeting on 2008-11-05

There were some differing opinions on this from OLPC's internal QA staff; it was agreed, however, that the decision is entirely up to the community test group.

  • Volunteers should test activities we don't test at 1cc. We test the behavior of Activities in a particular system situation.
  • Volunteers should focus on testing the activities that OLPC works on directly, because we consider them core functionality and because we have more direct contact with the developers that make them: Browse, Write, Record, and Chat.
  • Connect to jabber.laptop.org [so you can test Activity behavior independent of the mesh network] because it tests both collaboration and the XS.
  • People with one laptop can create test cases and help write up the 'documentation' of how an activity works.

From Sameer Verma

Here it is. It saw a bunch of traffic on one of the lists...some criticism, some "whaaa..." type responses. I did have a good chat with Walter about it though.

http://spreadsheets.google.com/ccc?key=p_Xhb6KcXLyEViA50CnCaDg&hl=en

Let me know what you think. I think moving in a direction of weighted scoring is generally good because it forces us to get a more stratified input as opposed to "Yay!"