WLAN Race Condition Demonstration

From OLPC

Demonstration of WLAN Race Condition

MV8787 WLAN firmware version 14.66.09.p96 behaves differently under Open Firmware on the XO-4 than version 14.66.09.p80.

Specifically, there appears to be a race condition between the HOST_INTSTATUS register and the WRBITMAP register. WRBITMAP returns the wrong value if it is read too soon after the 8787 firmware is initialized. The sequence is as follows.

Sequence

  • Write 0x03 to HOST_INT_MASK register (function 1 reg 2)
  • Load 8787 firmware
  • Write 0x02 to CONFIG register (function 1 reg 0)
  • (Optional time delay).
  • Read HOST_INTSTATUS register (function 1 reg 3) - it returns 0x02, indicating that it is okay to read the WRBITMAP register.
  • Read WRBITMAP register (function 1 reg 6) ...
    • In version 14.66.09.p80 (Works), the return value is always 0xffff, saying that the command port and the write data ports are available.
    • In version 14.66.09.p96 ...
      • (Fails) If the optional delay is less than (approximately) 90 ms, the return value is 0x0001, saying that only the command port is available.
      • (Works) If the optional delay is greater than (approximately) 90 ms, the return value is 0xffff, saying that the command port and the write data ports are available.

In the failing case, the WRBITMAP register eventually returns the correct value 0xffff, but by then the 0x02 bit in the HOST_INTSTATUS register is no longer set, so software has no valid indication that it should read WRBITMAP again.
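The sequence above can be sketched as a small simulation. This is only a toy model of the behavior reported on this page, not of the real hardware: the names FakeMV8787 and run_sequence are hypothetical, the 90 ms threshold is the approximate figure quoted above treated as a hard cutoff, and the read-to-clear behavior of the 0x02 status bit is an assumption.

```python
# Function 1 register numbers, taken from the sequence above.
CONFIG, HOST_INT_MASK, HOST_INTSTATUS, WRBITMAP = 0, 2, 3, 6


class FakeMV8787:
    """Toy model of the reported behavior (hypothetical, for illustration)."""

    def __init__(self, fw_version, settle_ms=90):
        # Assumption: p80 reports all ports immediately, p96 needs ~90 ms.
        self.settle_ms = 0 if fw_version == "p80" else settle_ms
        self.now_ms = 0          # simulated clock
        self.started_ms = None   # time at which CONFIG was written
        self.status = 0

    def load_firmware(self):
        pass  # placeholder: the real download is outside this sketch

    def delay(self, ms):
        self.now_ms += ms

    def write(self, reg, value):
        if reg == CONFIG and value == 0x02:
            self.started_ms = self.now_ms
            self.status = 0x02   # "okay to read WRBITMAP"

    def read(self, reg):
        if reg == HOST_INTSTATUS:
            value, self.status = self.status, 0   # assumed read-to-clear
            return value
        if reg == WRBITMAP and self.started_ms is not None:
            elapsed = self.now_ms - self.started_ms
            return 0xFFFF if elapsed >= self.settle_ms else 0x0001
        return 0


def run_sequence(dev, delay_ms):
    """Perform the steps listed above; return (HOST_INTSTATUS, WRBITMAP)."""
    dev.write(HOST_INT_MASK, 0x03)
    dev.load_firmware()
    dev.write(CONFIG, 0x02)
    dev.delay(delay_ms)                  # the optional time delay
    status = dev.read(HOST_INTSTATUS)
    bitmap = dev.read(WRBITMAP)
    return status, bitmap


for fw, delay in (("p80", 0), ("p96", 80), ("p96", 100)):
    status, bitmap = run_sequence(FakeMV8787(fw), delay)
    print(f"{fw} delay={delay} intstatus={status:#x} wrbitmap={bitmap:#x}")
```

In this model, HOST_INTSTATUS reads 0x02 in every case, so the status register alone cannot distinguish the working runs from the failing one.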

Demonstration Firmware

I have prepared a version of OFW with tools to demonstrate this problem.

Download http://dev.laptop.org/~wmb/q7b08ma.rom to a USB stick and insert the stick into the XO-4.

Power on the XO-4 and hit the ESC key when you hear the startup jingle. You should see an "ok" prompt.

Reflash the XO-4 with:

  ok flash u:\q7b08ma.rom

Configuring an Access Point

If you can set up an open access point named "OLPCOFW", you can run the tests very easily, as OFW defaults to that name. Otherwise you will have to issue these commands after every power-up:

  ok essid MY_ESSID
  ok wpa MY_WPA_PASSWORD

The essid line is unnecessary if the access point is named "OLPCOFW", and the wpa line is unnecessary if the access point is open.

Example Test Runs

Here is a successful run using the old 8787 firmware:

 ok use-old-fw
 ok set-delay 0
 ok ping 192.168.200.1
 host-intstatus-reg is 0x2  wr-bitmap-reg is 0xffff  delay is 0
 Scan for: OLPCOFW found
 Associate with: OLPCOFW
 DHCP got 192.168.200.64
 203 ms

Here is another successful run using the new 8787 firmware plus a 100 ms delay (which hides the problem):

 ok use-new-fw
 ok set-delay 100
 ok ping 192.168.200.1
 host-intstatus-reg is 0x2  wr-bitmap-reg is 0xffff  delay is 100
 Scan for: OLPCOFW found
 Associate with: OLPCOFW
 DHCP got 192.168.200.64
 203 ms

Here is a failing run, using new firmware, with the delay too short to hide the problem:

 ok use-new-fw
 ok set-delay 80
 ok ping 192.168.200.1
 host-intstatus-reg is 0x2  wr-bitmap-reg is 0x1  delay is 80
   wr-bitmap-value was wrong - rereading after 100 ms ...
 host-intstatus-reg is 0x0  wr-bitmap-reg is 0xffff
 Associate with: OLPCOFW

In the failing run, the first problem indication is "wr-bitmap-reg is 0x1" (instead of 0xffff). When the test software notices that situation, it waits 100 ms and re-reads both the HOST_INTSTATUS and WRBITMAP registers. Notice that WRBITMAP has changed to the correct 0xffff, but HOST_INTSTATUS is 0, so normal software would not know that it should read WRBITMAP.
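The check-and-reread behavior described above might look roughly like the following sketch. It is not the actual test code in the demonstration firmware: check_wr_bitmap, scripted_reader, and the read_reg and sleep_ms callables are hypothetical names, and the scripted values simply replay the registers printed in the failing run.

```python
HOST_INTSTATUS, WRBITMAP = 3, 6   # function 1 register numbers from this page
ALL_PORTS = 0xFFFF                # command port plus all write data ports


def check_wr_bitmap(read_reg, sleep_ms, reread_delay_ms=100):
    """Sample (HOST_INTSTATUS, WRBITMAP) once; if WRBITMAP is not 0xffff,
    wait and sample both registers a second time.

    read_reg(reg) and sleep_ms(ms) are caller-supplied (hypothetical) hooks.
    Returns the list of samples taken.
    """
    samples = [(read_reg(HOST_INTSTATUS), read_reg(WRBITMAP))]
    if samples[0][1] != ALL_PORTS:
        sleep_ms(reread_delay_ms)
        samples.append((read_reg(HOST_INTSTATUS), read_reg(WRBITMAP)))
    return samples


def scripted_reader(script):
    """Stand-in for real register access: replays scripted values per register."""
    queues = {reg: list(values) for reg, values in script.items()}
    return lambda reg: queues[reg].pop(0)


# Replay the register values shown in the failing run above.
sleeps = []
failing = check_wr_bitmap(
    scripted_reader({HOST_INTSTATUS: [0x2, 0x0], WRBITMAP: [0x1, 0xFFFF]}),
    sleeps.append,
)
for status, bitmap in failing:
    print(f"host-intstatus-reg is {status:#x}  wr-bitmap-reg is {bitmap:#x}")
```

The second sample shows the problem in miniature: WRBITMAP has become 0xffff, but HOST_INTSTATUS is 0, so software that reads WRBITMAP only when the 0x02 bit is set would never see the corrected value.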

The final result in the failing case is that the software hangs. The hang occurs because the transmit routine never finds a write data port available: the WRBITMAP value it read was 0x1, which has no write data port bits set.
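To illustrate why a WRBITMAP value of 0x1 leaves the transmit path with nothing to use, here is a minimal sketch of decoding the bitmap. It assumes that bit 0 is the command port and that bits 1 through 15 each represent one write data port; that per-bit layout is inferred from the 0x0001 and 0xffff descriptions above, not from the 8787 documentation. The function names and the bounded polling loop are hypothetical and are not how Open Firmware's driver is written.

```python
CMD_PORT_BIT = 0x0001   # bit 0: command port (per the descriptions above)


def decode_wr_bitmap(bitmap):
    """Return (command_port_available, list_of_available_data_port_bits).

    Assumption: bits 1..15 each stand for one write data port.
    """
    data_ports = [bit for bit in range(1, 16) if bitmap & (1 << bit)]
    return bool(bitmap & CMD_PORT_BIT), data_ports


def wait_for_data_port(read_wr_bitmap, max_polls=10):
    """Poll a caller-supplied WRBITMAP reader a bounded number of times.

    Returns the lowest available data port bit, or raises TimeoutError
    instead of spinning forever when no data port ever appears.
    """
    for _ in range(max_polls):
        _, data_ports = decode_wr_bitmap(read_wr_bitmap())
        if data_ports:
            return data_ports[0]
    raise TimeoutError("no write data port became available")


print(decode_wr_bitmap(0x0001))   # command port only
print(decode_wr_bitmap(0xFFFF))   # command port and all data ports
```

A transmit routine that keeps using a cached 0x1 value is in the situation wait_for_data_port guards against: the list of data ports is empty on every pass, so without a bound on the loop it never makes progress.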