Activity testing automation

From OLPC
Revision as of 22:25, 12 November 2008 by Mchua

Transcript

The transcript this page is taken from is shown below.

<mchua> dah, sorry. getting water took longer than I thought.
<mchua> Ok, so brainstorm time... finding our charter... one sec...
* mchua notes that others are welcome to participate as well
* robertofaga (n=robertof@201-43-56-79.dsl.telesp.net.br) has joined #olpc
* aa has quit (Read error: 110 (Connection timed out))
<mchua> Dream up automation designs to present to group next week. "I'm a
tester. I want to automate this boring thing. What is my ideal interface
to do so / the most beautiful tool I could imagine for it?"
<garycmartin> mchua: sugarbot
<mchua> garycmartin: what makes sugarbot so beautiful?
<garycmartin> mchua: I've emailed zach to see if it runs on XOs, but the
downside seems to be it's been designed more as an infrastructure tool
(jhbuild and buildbot).
<mchua> http://code.google.com/p/sugarbot/
<mchua> http://code.google.com/p/sugarbot/wiki/HowDoesSugarbotWork has
a usage example
<mchua> qualities of sugarbot that I really like:
<garycmartin> mchua: we write test cases for a bunch of the bugs found
as sugarbot scripts. That makes us define them well, and test when fixed.
<mchua> * it's a completely separate thing from the Activity code itself,
and from Sugar
<mchua> garycmartin: in my ideal world, a bug report would not go to
developers for fixin' without having a test attached that fails due to
the bug, and will pass when the bug is resolved.
<mchua> (in my very, very ideal world.)
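The idea of a test that fails because of the bug and passes once it is resolved can be sketched with Python's standard `unittest` module. This is only an illustration: `word_count_buggy`, `word_count_fixed`, `make_regression_test`, and `passes` are hypothetical names invented here, not code from any Activity or from sugarbot.

```python
import unittest


def word_count_buggy(text):
    # Hypothetical bug: splitting on one space miscounts repeated spaces.
    return len(text.split(" "))


def word_count_fixed(text):
    # Hypothetical fix: split on any run of whitespace.
    return len(text.split())


def make_regression_test(func):
    """Build the test case that would be attached to the bug report."""

    class BugReportTest(unittest.TestCase):
        def test_double_space_counts_two_words(self):
            self.assertEqual(func("hello  world"), 2)

    return BugReportTest


def passes(func):
    """Return True if the attached test passes against func."""
    loader = unittest.TestLoader()
    suite = loader.loadTestsFromTestCase(make_regression_test(func))
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

Under these assumptions, `passes(word_count_buggy)` is False while the bug is open and `passes(word_count_fixed)` is True after the fix, which is the property mchua asks a bug report to carry.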
<garycmartin> mchua: We then just need to focus on collecting natural
language type reports and turning them into scripts, that over time
hopefully cover a good chunk of test issues/cases.
<mchua> garycmartin: agreed! what kinds of things and tools - and
qualities/features of those things and tools - do we want?
* dsaxena (n=dsaxena@c-67-160-162-157.hsd1.or.comcast.net) has joined
#olpc
<garycmartin> mchua: Your world is truly a perfect sphere ;-)
<mchua> garycmartin, adricnet: do you want to just take 10 minutes and
type out wishlists as fast as possible, and then review?
<mchua> the brainstorming "quality < quantity" thing
<mchua> garycmartin: yeah, I know. :) I like to start with a platonic
ideal, and then figure out how close to it I can get.
<adricnet> Maybe, I was looking for a link and got pulled away. If the
levers and buttons in Sugarbot are solid enough it sounds pretty ideal..
* ivazquez1 (n=ivazquez@fedora/ignacio) has joined #olpc
<garycmartin> mchua: usually ends up 'oblate spheroid' when all said
and done :-)
<mchua> adricnet: I think Sugarbot is one possible (and very attractive)
option for Activity test automation - the fact that I can't figure out
how to generate and run a test with it in <15min, starting cold, means
that there's work to be done yet
* vpovirk` (n=urk@c-76-17-237-120.hsd1.mn.comcast.net) has joined #olpc
* vpovirk has quit (Read error: 113 (No route to host))
<mchua> I'll start spewing out my wishes, with a 10min timer set (I may
run out before them)
<mchua> these are wishes for an Activity testing framework/procedure,
and will get pulled into sanity-land later ;)
<mchua> * less than 1hr setup required for testers
<mchua> * less than 1hr setup required for developers
<garycmartin> mchua: sugarbot is better as a centrally administered VM
or box that we would post scripts to.
<mchua> * new Activities made compatible with testing tool also in < 1hr
<mchua> garycmartin: like [[Tinderbox]] in hypothetical?
<garycmartin> mchua: yep
<mchua> * central repository for tests, separate from (but clearly linked
to) the repo for that Activity
* dwmw2 is now known as dwmw2_gone
* J5 has quit (Read error: 110 (Connection timed out))
<mchua> * process for translating natural language bug reports into
Activity test scripts
<mchua> * all bugs submitted to developers to fix have such test scripts
attached (test-driven development)
<adricnet> this suggests that the tests should live in the projects
source repo, as $deity intended ..
<mchua> * all automated scripts for an Activity (from fixed bugs or
otherwise) are automatically run against new builds of that Activity,
with a pass/fail display conveyed to the maintainer immediately
<adricnet> I'd like the Sugarbot interface for users to be about like
etoys or turtleart.
* ivazquez has quit (Operation timed out)
* vpovirk` has quit (Remote closed the connection)
<mchua> adricnet: awesome ideas, keep going!
<mchua> garycmartin: I'm trying to rephrase yours into feature
requests (which may or may not already be implemented in
sugarbot/tinderbox/otherwise), but I'm sure you have a lot more good ideas
* ivazquez1 is now known as ivazquez
<mchua> * should be usable by the average 10-year-old
<mchua> (to report the first pass of the Activity bug, not necessarily
to finish the automation script)
<garycmartin> mchua: sugarbot works through X as far as I can tell,
so all activities 'should' be drivable without modification
<mchua> * easily extensible - Activity test automation scripts should be
in a format that's completely specified and published, so that automated
tools could also later generate said test scripts
<adricnet> openqa's selenium and mozilla's litmus
<adricnet> sugarbot needs to have XML output like Selenium seems to..
* vpovirk (n=urk@c-76-17-237-120.hsd1.mn.comcast.net) has joined #olpc
<mchua> garycmartin: it doesn't work through X, iirc; it uses XML-RPC
which calls the functions of the Activity itself (not through the GUI,
I believe)
<garycmartin> * tests need to be aimed at correct releases (do we keep
old tests for supporting old deployments not yet upgraded?)
<mchua> garycmartin: I'd need to read the code more to confirm that
too, though
<mchua> * tests need to be available for old releases for historical
purposes
<garycmartin> mchua: me too by the sounds of it :-)
<adricnet> tests need to stay in the repo effectively forever.
<mchua> * tests should be separately downloadable and locally runnable
outside of the centralized tinderbox-ish setup so that those who wish
to experiment with local changes before pushing to central repo can
<adricnet> although testsuites can be targeted to releases or families
(test these for 8.2.x)
<adricnet> mchua:  ++
<morgs> I'd love to see us (developers) doing true test driven development
by having test frameworks in the code itself...
<mchua> oh! also, test framework should be open-source...
<adricnet> The harness has to run on a single XO as well as on our
virtual behemoth
<mchua> i.e. *no* proprietary anything should be required to do testing
or development for an Activity
<adricnet> pre-depends OSS/Free software and data, yeah
<garycmartin> morgs: is that a war I can hear starting... ;-)
<mchua> morgs: <3
<adricnet> morgs: They ought to have the option. Later, we break out
the stick.
<morgs> It has to be a really big stick :)
<mchua> * guidelines on how devs can make their code "more testable"
(this sentence full of vagaries, not sure how/who can do this)
<garycmartin> morgs: (test driven development can be very slow, painful,
and boring for devs)
<mchua> ding ding ding! my timer says 10 minutes, do we want to keep
spewing ideas for another 5? we have 40 minutes left in this brainstorm
<morgs> garycmartin: yes. So can weeks of lost time tracking down
unnoticed regressions :)
<garycmartin> morgs: (we'll get even fewer devs scratching that itch...)
<adricnet> different testsuites for full integration and for each tiny
dev cycle ..
* morgs isn't ready to scratch that itch, so it will lie until somebody
actually makes it happen
<adricnet> mchua: Seems like we have enough to start arguing about
* vpovirk has quit (Remote closed the connection)
<mchua> * developers should be able to opt-out of having their code
tested, *but* know that all shipped-with-XOs code *must* be tested
<adricnet> Errr .. g'luck on that one..
<mchua> adricnet: yeah, it's a wishlist :)
<mchua> for that matter
<mchua> * a ponoy
<mchua> er,
<mchua> * pony
<adricnet> Ponies for everyone!
<adricnet> And cash. Yay cash!
* GoatCheezWork (n=Miranda@rrcs-97-76-61-66.se.biz.rr.com) has joined
#olpc
<garycmartin> morgs: I know, swings and roundabouts. Do you like a
life of grey coding boredom, or 2 months of joy followed by 2 months of
hell. Somewhere in the middle is a likely sweet spot :-)
<mchua> adricnet, garycmartin, morgs - I'm gonna clean this up into a list
of ideas and pastebin so it's easier to read, can you guys elucidate on
the current state of what it's like to test Activities from the tester/dev
perspective in the meantime? (should take me 5min or less)
<mchua> (particularly interesting: "My god, it's full of pain!" areas,
and "I love this part of the process" things.)
* vpovirk (n=urk@c-76-17-237-120.hsd1.mn.comcast.net) has joined #olpc
<adricnet> hmm .. someone want to name a victim activity
<garycmartin> mchua: with my Moon dev hat on, I test it all through
before each release, but that's because it's simple and I can use all
the possible inputs.
<adricnet> For Capture there's trying to aim the screen at the cat..
<kevix> mel, is this wishlist going on w.l.o or somewhere else?
<garycmartin> mchua: I make sure I both resume and clean start through
all its view modes.
<garycmartin> mchua: I make sure I've tested in the primary languages
to make sure translation strings all come through.
<mchua> kevix: yes - at the end of this we should decide where everything
is getting published
* vpovirk has quit (Client Quit)
<mchua> (aside from on the testing mailing list - kevix, are you
subscribed?)
<garycmartin> mchua: I leave it running for long durations watching both
memory and cpu (for leaks or hogging).
* mchua adds * should catch memory leaks and * should test i18n to
wishlist
<kevix> no. I get the mails from the bug list. not from testing-*-*
<adricnet> Lol. *wishes really hard*
<garycmartin> mchua: I look at other sources of information to make sure
it's not telling me fibs.
<garycmartin> mchua: Oh darn, I'm beeping. Sorry all have to go now
(hard stop for me).
* vpovirk (n=urk@c-76-17-237-120.hsd1.mn.comcast.net) has joined #olpc
<kevix> bye, gary.
<garycmartin> kevix: bye!
* garycmartin has quit (Remote closed the connection)
<mchua> thanks, garycmartin!
<mchua> dah, too late
<mchua> pastebin!
<mchua> kevix, adricnet: http://pastebin.ca/1254589
<mchua> referring to these by line number... which do you think are (1)
highest priority, (2) unrealistic, and (3) already done?
<kevix> snarking the URL
<adricnet> If someone could tell me the state of Sugarbot today ...
* rgs_ (n=rgs@190.128.250.238) has joined #olpc
<mchua> My 'highest priority' list: 27, 18, 5, 11, 8
<mchua> adricnet: http://code.google.com/p/sugarbot/ is all we have,
unless we can track down the project owners (no luck so far)
<mchua> adricnet: it looks like no work has been done on it since
september 8, 2008
<mchua> my 'unrealistic' list: 22 (in that we shouldn't write test
scripts for old and no-longer-relevant releases; that's a lot of burden
- we should keep all the scripts we write, though, so when the current
builds become no-longer-relevant they'll still have their associated
tests with them), and 31-22 are... probably... yeah.
<mchua> I don't think any of these are completely implemented, to my
knowledge (where "implemented" == "in widespread use by the olpc dev/test
community") but [[Tinderbox]] and [[Sugarbot]] have some elements of
these requests in them
<adricnet> What does (a) OLPC Tinderbox setup test for?
<adricnet> Mozilla tinderbox is more useful when there is compilation?
<mchua> adricnet: OLPC-tinderbox is afaik currently not maintained
<mchua> adricnet: the Big Cool Thing it does is that it takes hw
measurements as well
<adricnet> mchua: Roger. do we know what it did?
<adricnet> "hw measurements" ?
<mchua> (there's an XO at 1cc that has tiny voltage probes sticking out
of it for measuring stuff like power consumption for the different builds)
<mchua> adricnet: Not... very well. I mean, we can always read the
code. http://dev.laptop.org/git/projects/tinderbox
* bjordan (n=bjordan@cpc2-hitc2-0-0-cust908.lutn.cable.ntl.com) has
joined #olpc
<mchua> adricnet: this is something I'm supposed to clean up and work on
<adricnet> High voltage!
<adricnet> sorry, random Dolby moment.
<kevix> so activities can be tested by the developers framework and by
an OLPC framework
<mchua> adricnet: ...ostensibly after the g1g1 crazy washes over
<mchua> kevix: what would the difference between the two be?
* robertofaga has quit (Read error: 60 (Operation timed out))
<adricnet> mchua: Sure. Afaik, tbox would be great for "does it all still
build", but I dunno if it does functional testing
* ctyler has quit ("returning to Spare Oom")
<adricnet> Well, there's unit tests, functional tests, and QA ... they
should all at least share some trade languages
<adricnet> I _think_ the comm testing for activities is going to be QA,
which hopefully will come up with some unit/functional tests to feed
the devs..
* ctyler (n=chris@global.proximity.on.ca) has joined #olpc
<mchua> adricnet: tbox does really, really minimal func testing
<mchua> adricnet: is my understanding
<mchua> adricnet: maybe a better way of putting it... *rummages for words*
<morgs> tinderbox seems to basically test that things boot up and
start up.
<mchua> adricnet: "Whenever a new build goes out, tinderbox automatically
runs a certain (python) script on an individual XO that's hooked
up in 1cc."
<mchua> This script (afaik) currently tests whether the build loads,
whether Activities start, and also logs some power measurements
(and... possibly makes sure those measurements fall within a certain
numerical range.)
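The "measurements fall within a certain numerical range" check described above can be sketched as follows. This is not the tinderbox script; `check_measurements`, the measurement names, and the limits are hypothetical values chosen for illustration.

```python
def check_measurements(measurements, limits):
    """Return the names of measurements that are missing or out of range.

    measurements: dict of name -> measured value
    limits: dict of name -> (low, high) inclusive bounds
    """
    failures = []
    for name, (low, high) in limits.items():
        value = measurements.get(name)
        if value is None or not (low <= value <= high):
            failures.append(name)
    return failures


# Hypothetical limits and readings, in watts.
LIMITS = {"idle_watts": (0.5, 2.0), "suspend_watts": (0.0, 0.5)}
READINGS = {"idle_watts": 1.2, "suspend_watts": 0.9}
```

Here `check_measurements(READINGS, LIMITS)` reports `suspend_watts` as out of range; an empty list would mean the build passed the power check.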
<adricnet> Ah, kk
<adricnet> That about syncs up with what I was thinking, cool. And this
will need to be resurrected, later. Cool.
* morgs -> $HOME
<mchua> The script currently running on 1cc-tinderbox can be modified. It
is also possible (but difficult, right now) for others to set up their
own tinderbox machines, and run the same (or different) scripts on them.
<mchua> adricnet: Yeah, and I don't yet have a good view of the scope
of work required for that resurrection.
<mchua> (Alas.)
<mchua> adricnet: since we have 12 min to wrap up, how does this sound,
in order of implementation?
<mchua> 1) make a python library for testing, which can be run externally
against an Activity and return a True/False value as to whether it passed
(possibly/probably resurrecting sugarbot code or design)
<mchua> (this would be separate python files, and import
sugarbot-or-something-like-it, and import the Activity as specified in
some filename in the test-python-file code, and run, and return True
or False.)
<adricnet> Right. Might want to start with Py Test::Unit stuffs
<mchua> Yep.
<mchua> So for a tester or developer, the procedure to run a test would
look like this
<mchua> * download foo_test.py
<mchua> * open Foo.Activity folder
<marcopg> I missed all of this meeting...
<mchua> * throw foo_test.py into Foo.Activity/tests
<mchua> * python foo_test.py
<mchua> * observe results
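A minimal sketch of what a `foo_test.py` for the procedure above might contain, using only Python's standard `unittest` module. Everything specific is an assumption: `FooModel` is a hypothetical stand-in defined inline for logic that a real test would import from Foo.Activity, and `run()` is an illustrative helper that returns the True/False value mchua describes. It is not sugarbot code.

```python
import unittest


class FooModel:
    """Hypothetical stand-in for logic imported from Foo.Activity."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)
        return len(self.items)


class FooTest(unittest.TestCase):
    def test_add_returns_new_count(self):
        model = FooModel()
        self.assertEqual(model.add("a"), 1)
        self.assertEqual(model.add("b"), 2)


def run():
    """Run FooTest and return True if every test passed, else False."""
    loader = unittest.TestLoader()
    suite = loader.loadTestsFromTestCase(FooTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    # "python foo_test.py" then "observe results".
    print("PASS" if run() else "FAIL")
```

Keeping the pass/fail decision in a single `run()` function means the same file could be invoked by hand, by a tinderbox-style script, or by a buildbot step without modification.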
<marcopg> something that I would like to see, is these scripts to not
be tinderbox specific
<marcopg> I'd like to run them also on SL buildbot
<marcopg> (sorry to interrupt your attempt to summarize!)
<mchua> marcopg: not at all!
<adricnet> marcopg: we're dreaming up a harness interface that should
run everywhere, yes :)
<marcopg> adricnet: great!
<mchua> marcopg: ooh, I should ask you about buildbot in about 6
minutes ;)
<marcopg> :)
<mchua> adricnet: after that, 2) would be "make a central repository
for such tests"
<adricnet> that's tricky .. but yeah they have to be kept somewhere
<mchua> adricnet: and then 3) automate the running of all the tests in
(2) on an XO...somewhere... maintained by somebody....
<marcopg> central repository or per activity?
<mchua> adricnet: and then 4) make that "somewhere, maintained by
somebody" XO able to be set up by anybody, anywhere (this probably is the
"resurrect tinderbox" part)
<marcopg> that's something we discussed with zach and I'm not sure what
is better
<adricnet> well yes on three where someone is 1..x people and somewhere
is 1..x XO
<adricnet> Yeah, the virtual thingy ..
<mchua> marcopg: btw, do you know how to get in touch with zach? (or
titus, or grig?) They probably have figured much of this out already
<adricnet> marcopg: It's up for argument .. should these QA level tests
be in repo with the software or all live together somewhere?
<marcopg> mchua: you mean other than sending them mail? ;)
<mchua> adricnet: maybe another way of rephrasing that question is
"when you download an Activity, should that also download the tests for
that Activity?"
<mchua> marcopg: yeah.
<marcopg> adricnet: right, I don't have an answer unfortunately :)
<adricnet> Need to clarify our terms, but yes
<marcopg> mchua: nope, but I sent them mail and they have been responsive
usually
* shenki has quit (Read error: 104 (Connection reset by peer))
<mchua> marcopg: ah ok, I'll try again. maybe it's my mail acting up
(it has been, lately. I'm not sure why.)
<marcopg> if you post about the plans somewhere
<mchua> adricnet: whoo. almost at time for today - anything else?
<marcopg> I can have a look too
<adricnet> l.o mail was down yesterday?
<marcopg> I thought a *little* bit about the issues already
<marcopg> and worked some on infrastructure (buildbot only)
<mchua> I think I have a much better idea of what I want from an Activity
test framework, at least, so this has been helpful to me
<adricnet> Yay helpful.
<mchua> adricnet: anything we should do to make this more adric-helpful,
too?
<mchua> (I'm planning on writing up the notes/plans on the wiki, mailing
to the testing list, asking people to shoot at it)
<adricnet> mchua: Not yet. Need to have some examples of these Comm
Testing tests so that we can argue about format and where to keep them
<mchua> adricnet: Aye, right. Concrete examples, working code...
<mchua> I think there's enough constraints in the design doc that we'll
get from this discussion to toss out a couple prototypes, though
<adricnet> mchua: Well, complete-ish PoCs at least
<marcopg> are you thinking to write your custom scripts? or to base on
existing frameworks?
<adricnet> prototypes, yeah
<mchua> adricnet: cool, then we're done, I think
<mchua> adricnet: thank you!
<mchua> marcopg: base on existing, whenever possible
<mchua> marcopg: I am a lazy bum ;)
<marcopg> heh
<adricnet> Laziness is a virtue
<marcopg> there is sugarbot
<marcopg> and also another one which I can't remember right now mmm
<marcopg> (both for gtk)
<adricnet> Selenium ?
<adricnet> Oh, not that low-level, oops
<marcopg> oh dogtail
<mchua> ah! I think you sent me dogtail
<mchua> I haven't gotten a chance to look at it, yet - been working on
remote testing scripts (...incredibly slowly, alas)
<marcopg> yeah me neither...
<marcopg> would be nice to compare sugarbot and dogtail
<mchua> it sure would.
<marcopg> would be also useful to just ask zach about it
<marcopg> perhaps he looked into dogtail before doing his own thing
<mchua> Ooh, that's a great question to ask zach.
<marcopg> zach was supposed to integrate sugarbot into jhbuild btw
<mchua> marcopg: btw, do you have this channel logged, or do you want
me to send you the log from the start of the brainstorm?
<marcopg> but we haven't heard anything from him about it
<marcopg> busy with school I guess :/
* kevix has quit (Read error: 110 (Connection timed out))
<marcopg> mchua: don't think I have all of it, logs would be great