Scripts for testing multiple XOs

From OLPC
Revision as of 23:14, 7 November 2008 by Garycmartin (talk | contribs) (more code)

Introduction

Problem to solve: From a central machine, execute a bash script on each XO in a given list of IP addresses. The output of the script on each XO should end up in a text file on the central machine (one file per XO, named after the XO's IP address and the name of the script).

Ideal use scenario

Given the following files...

in xo-ip-list.txt
-------------------
12.34.56.01
12.34.56.02
12.34.56.03

in run-this-script.sh
-----------------------
#!/bin/bash          
ps

I should be able to run a command like this, which will remotely run run-this-script.sh on the list of XOs at the IP addresses in xo-ip-list.txt, and save the results to a folder called testresults.

mchua@master-machine:~$ ./test-multiple-xos --iplist xo-ip-list.txt --script run-this-script.sh --folder testresults

...and then see something like this, where each textfile contains the output of run-this-script.sh on the XO whose IP is in its filename.

mchua@master-machine:~$ ls testresults
12.34.56.01-run-this-script.sh.txt
12.34.56.02-run-this-script.sh.txt
12.34.56.03-run-this-script.sh.txt

Procedure so far

  1. generate a public/private key pair on your master machine (for example with ssh-keygen)
  2. append the public key (~/.ssh/id_dsa.pub) to /home/olpc/.ssh/authorized_keys on each XO
  3. make sure authorized_keys is writable only by its owner (chmod g-w is usually all that's needed)
  4. then from your master you can run commands remotely as needed (ssh olpc@192.168.1.5 ps aux)
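
Steps 2 and 3 could be scripted roughly as follows. This is only a sketch: install_key_on_xo is a made-up helper name, it assumes the key pair from step 1 already exists, and it assumes password login to the olpc account still works (ssh will prompt once per XO).

```shell
#!/bin/bash
# Hypothetical helper, not part of any OLPC tool.
# install_key_on_xo IP [PUBKEY]
# Appends PUBKEY (default: the key file named in step 2) to olpc's
# authorized_keys on the XO at IP, then tightens permissions so that
# sshd will accept the file.
install_key_on_xo() {
    local ip="$1"
    local pubkey="${2:-$HOME/.ssh/id_dsa.pub}"
    # The public key travels over ssh's stdin and is appended remotely.
    ssh "olpc@$ip" 'mkdir -p ~/.ssh && chmod 700 ~/.ssh &&
        cat >> ~/.ssh/authorized_keys &&
        chmod 600 ~/.ssh/authorized_keys' < "$pubkey"
}

# Example (prompts for the olpc password on each XO):
#   for ip in $(cat xo-ip-list.txt); do install_key_on_xo "$ip"; done
```

Appending with >> rather than overwriting keeps any keys that are already installed on the XO.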

Things left to do

  • Write harness that gives the output of the script on each XO in text documents in the central machine (one per XO, titled with the XO's IP address, name of the script, and timestamp).
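
A first sketch of such a harness is below. The function names are illustrative, the option names come from the ideal use scenario above, and the output files follow the naming in that scenario (no timestamp yet). It assumes key-based login as the olpc user is already working and that bash is available on the XO.

```shell
#!/bin/bash
# Sketch of the harness; nothing here is an existing OLPC tool.

# run_script_on_xos IPLIST SCRIPT FOLDER
# Runs SCRIPT on every XO listed in IPLIST (one address per line) and
# writes each XO's output to FOLDER/<ip>-<script name>.txt.
run_script_on_xos() {
    local iplist="$1" script="$2" folder="$3"
    local name ip
    name=$(basename "$script")
    mkdir -p "$folder" || return 1
    # Read the list on fd 3 so ssh's stdin stays free for the script.
    while read -r ip <&3 || [ -n "$ip" ]; do
        [ -n "$ip" ] || continue          # skip blank lines
        # "bash -s" makes the remote shell read the script from stdin.
        ssh "olpc@$ip" bash -s < "$script" > "$folder/$ip-$name.txt" 2>&1
    done 3< "$iplist"
}

# test_multiple_xos --iplist FILE --script FILE --folder DIR
test_multiple_xos() {
    local iplist="" script="" folder=""
    while [ $# -gt 0 ]; do
        if [ $# -lt 2 ]; then
            echo "missing value for $1" >&2
            return 2
        fi
        case "$1" in
            --iplist) iplist="$2" ;;
            --script) script="$2" ;;
            --folder) folder="$2" ;;
            *) echo "unknown option: $1" >&2; return 2 ;;
        esac
        shift 2
    done
    if [ -z "$iplist" ] || [ -z "$script" ] || [ -z "$folder" ]; then
        echo "usage: test-multiple-xos --iplist FILE --script FILE --folder DIR" >&2
        return 2
    fi
    run_script_on_xos "$iplist" "$script" "$folder"
}

# Saved as ./test-multiple-xos, the file would end with:
#   test_multiple_xos "$@"
```

To get the timestamp into the file name as well, something like $(date +%Y%m%d-%H%M%S) could be added to the output path.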

Example code

One thing you'll need to resolve is getting the list of all IP addresses. The three XOs here occasionally play tricks on me (DHCP timeout) and change addresses, so I need to clean stale host keys out of ~/.ssh/known_hosts from time to time. I guess you could also tell ssh to switch off its strict host key checking. Here's a really quick stab at scanning a subnet and getting the list of active IP addresses to try some other script on. --Garycmartin

for (( i=1;i<=254;i+=1 )); do ping -q -c 1 -t 1 192.168.1.${i} > /dev/null && echo 192.168.1.${i}; done
192.168.1.1
192.168.1.3
192.168.1.4
192.168.1.5
192.168.1.6
Note: I've reduced the ping timeout to 1 second before assuming no one is home, so the scan takes at most about 1 second per unresponsive address. (The -t 1 timeout is BSD/Mac OS X ping syntax; on Linux ping, -t sets the TTL and the timeout option is -W.) If you're just testing, trying to ctrl-c out of a for loop is a pain :-) Use ctrl-z and then kill %, or wait till the script is done.
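
The same scan can be wrapped in a function so the subnet and range are not hard-coded. This is a sketch only: scan_subnet and PING_TIMEOUT_OPT are names invented here, and the timeout flag is left configurable because it differs between ping implementations (-W on Linux, -t on BSD/Mac OS X).

```shell
#!/bin/bash
# scan_subnet PREFIX [FIRST] [LAST]
# Prints every address PREFIX.FIRST .. PREFIX.LAST that answers one ping.
# PING_TIMEOUT_OPT is a made-up variable for this sketch; it defaults to
# the Linux flag (-W). On Mac OS X, set PING_TIMEOUT_OPT=-t instead.
scan_subnet() {
    local prefix="$1" i="${2:-1}" last="${3:-254}"
    while [ "$i" -le "$last" ]; do
        if ping -q -c 1 "${PING_TIMEOUT_OPT:--W}" 1 "$prefix.$i" > /dev/null 2>&1; then
            echo "$prefix.$i"
        fi
        i=$((i + 1))
    done
}

# Example: scan_subnet 192.168.1 > candidates.txt
```

The output still needs a manual look, since routers and other machines on the subnet answer pings too.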

Once you have a list of IP addresses, put the ones you want into a file. Let's call it xo-ip-list.txt to match the above spec. Here's my example file with 3 XO IP addresses:

192.168.1.4
192.168.1.5
192.168.1.6

If you've set up the public/private keys on all the XOs, you can now run a command on each like this:

for ip in $(cat xo-ip-list.txt); do echo -n "$ip is running build "; ssh olpc@${ip} cat /boot/olpc_build; done
192.168.1.4 is running build 767
192.168.1.5 is running build 767
192.168.1.6 is running build 767
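
If the XOs keep changing addresses, the stale host key problem mentioned earlier can get in the way of loops like this one. Two possible workarounds are sketched below; xo_ssh and forget_xo_key are hypothetical names, while the -o options and ssh-keygen -R are standard OpenSSH. Switching off host key checking gives up protection against impersonation, so it only makes sense on a trusted test network.

```shell
#!/bin/bash
# xo_ssh IP COMMAND...
# Hypothetical wrapper: runs COMMAND as olpc on the XO at IP without
# consulting or updating ~/.ssh/known_hosts.
xo_ssh() {
    local ip="$1"
    shift
    ssh -o StrictHostKeyChecking=no \
        -o UserKnownHostsFile=/dev/null \
        -o ConnectTimeout=5 \
        -o BatchMode=yes \
        "olpc@$ip" "$@"
}

# forget_xo_key IP
# Alternative that leaves checking on: drop one stale entry from
# ~/.ssh/known_hosts so the XO's new key is accepted on the next login.
forget_xo_key() {
    ssh-keygen -R "$1"
}

# Example:
#   for ip in $(cat xo-ip-list.txt); do xo_ssh "$ip" cat /boot/olpc_build; done
```

BatchMode=yes makes ssh fail instead of prompting for a password, and ConnectTimeout=5 stops an XO that has gone away from stalling the whole loop.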

To take this one step further and store the output of each command in a file, all it needs is an output redirect.

for ip in $(cat xo-ip-list.txt); do ssh olpc@${ip} 'echo -n "Running build "; cat /boot/olpc_build' > "$ip"; done
Note that both the echo and the cat commands run on the XO, which makes it easy to redirect both outputs at once. Each XO's output lands in a file named after its IP address.

The scripts so far all run in serial, from one machine to the next, waiting for each result, so with a lot of machines this can get slow. A simple change lets all XOs be contacted at once and run in parallel. Be aware that this may load the network more than you want, especially if you are actually trying to measure network usage or problems related to network congestion. I'd need to try it on a testbed of 100 XOs to see; it's hard to test with just 3 XOs. If you find the network load is too much (it also depends on the script's output size), I'd recommend breaking your list of IP addresses up into several ranges and running parallel commands on each block.

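# -n stops each backgrounded ssh from reading stdin; wait blocks until all have finished
for ip in $(cat xo-ip-list.txt); do ssh -n olpc@${ip} 'echo -n "Running build "; cat /boot/olpc_build' > "$ip" & done; wait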
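
Breaking the list into blocks could look something like this. It is an untested-at-scale sketch: run_in_batches is an invented name, and the right batch size would have to be found by experiment on a real testbed.

```shell
#!/bin/bash
# run_in_batches IPLIST BATCH_SIZE COMMAND
# Runs COMMAND over ssh on every XO in IPLIST, at most BATCH_SIZE at a
# time, saving each XO's output in a file named after its IP address
# in the current directory. Illustrative only.
run_in_batches() {
    local iplist="$1" batch="$2" cmd="$3"
    local ip running=0
    while read -r ip <&3 || [ -n "$ip" ]; do
        [ -n "$ip" ] || continue          # skip blank lines
        # -n keeps a backgrounded ssh from reading the terminal.
        ssh -n "olpc@$ip" "$cmd" > "$ip" 2>&1 &
        running=$((running + 1))
        if [ "$running" -ge "$batch" ]; then
            wait                          # let the whole batch finish
            running=0
        fi
    done 3< "$iplist"
    wait                                  # catch the final, partial batch
}

# Example: ten XOs at a time
#   run_in_batches xo-ip-list.txt 10 'echo -n "Running build "; cat /boot/olpc_build'
```

Each batch waits for its slowest XO before the next one starts, which is simple but not the fastest possible schedule.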