SPI FLASH Recovery/XO-1

Latest revision as of 16:45, 24 October 2012



This procedure is NOT recommended --- it is difficult to perform correctly. An easier way to repair a laptop with this problem is described in Replacing the SPI Flash.

SPI FLASH Recovery

If the SPI boot FLASH (U37) somehow gets bad data in it, so the firmware won't run, you can reload it with a special serial cable.

Hookup

B-test and later boards

The cable connects to CN24 (Recovery Connector) on the B-test boards (CN20 on A-test). G1G1 "C" boards don't have CN24 (Recovery Connector) or the Recovery Mode Jumper Block (both marked with squares in the picture), but they do have a 4-pin J1 connector option. (Image: BtestRecovery.jpg)


In addition to connecting the special cable to the Recovery Connector, during recovery you must also install jumpers in both positions of the Recovery Mode Jumper Block (CN31). It doesn't matter whether you install the jumpers horizontally or vertically; either orientation will work. If you don't have any jumpers of the right size, you can wedge a metal object into the block, connecting all the pins together. A screw about 0.1" in diameter works well, or you can wrap a piece of wire around the 4 posts.


Some related images showing the proper attachment of a CMOS oscillator and 5-pin EC serial header to the XO's motherboard:

USB-to-serial board, with jumper settings
The oscillator attached to the EC serial header with the board's jumper settings
The oscillator connected to a board ready for the 4-pin serial header to be attached to the pins on the left side of the header in this photograph.
A diagram of the bottom of the oscillator showing the four gold pads on its corners, indicating which corner to connect to each pin of the EC serial header.

If the oscillator is properly attached, an oscilloscope will show a nice ~65MHz sine/triangle/square wave on pin 5 of the EC serial header whenever the XO is connected to power. I found that it was fine to leave the control pin on the oscillator floating, although shorting it to the voltage input pin should also work.

The following dictionary is probably older than the one in the spiflashrecovery.tar.gz download below, and you probably have no use for it: Spiflash.dic (an updated, but old, flashing Forth dictionary for recent machines).

Atest Boards

Jumper

For the Atest board, there is no jumper block. You have to short across a couple of pads. They are just to the left of the ene chip, near the middle of the left row of pins on that part; see media:AtestRecoveryJumper.jpg. It's only necessary to have the short in place right after power up, so it's sufficient to hold something (e.g. a screwdriver tip) across them while you turn on the power.

Recovery Cable

The recovery connector is CN20 on the A-test board. It can be found near the EC. It is a five-pin connector. (Do not confuse with J1 which is a four-pin connector to the north).

The best option is to use the same five-pin SPI FLASH recovery cable that is used with the B-test boards and the B1, B2 and B4 systems.

Alternatively, you can use the normal four-pin OLPC serial adapter cable as a recovery cable (that's not true for the B-test boards; they require an external oscillator that's not present on the normal serial adapter). The cable plug has four pins, and the recovery connector has five. You have to insert the plug into the right-hand side of the connector (with the board in the orientation shown above), so that pin 5, the one on the left, is not connected.

Connecting the Linux Host to the OLPC Board

Connect a "null modem" serial cable from your Linux host machine to the DE-9 connector on the special recovery cable. Typically, the null modem cable will have DE-9 female connectors on both ends. If possible, connect it to the host COM port that is associated with /dev/ttyS0 - typically COM1. If your host system doesn't have a built in COM port, you'll need to use a USB-to-serial adapter and figure out the associated device name.
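If you are unsure which device name your adapter received, a short script can list the likely candidates. This is only a sketch: the glob patterns are assumptions based on common Linux naming (/dev/ttyS* for built-in ports, /dev/ttyUSB* and /dev/ttyACM* for USB adapters), and the function name is made up for illustration.

```python
import glob


def candidate_serial_ports(patterns=("/dev/ttyS*", "/dev/ttyUSB*", "/dev/ttyACM*")):
    """Return the sorted device paths that match common serial naming patterns.

    The default patterns are assumptions about typical Linux systems;
    pass your own tuple of patterns if your adapter shows up elsewhere.
    """
    found = set()
    for pattern in patterns:
        found.update(glob.glob(pattern))
    return sorted(found)


# Print every candidate, one per line (prints nothing if none exist).
for port in candidate_serial_ports():
    print(port)
```

Running it once before and once after plugging in a USB-to-serial adapter should show which name is new; that name is the one to give to the "line" command.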

Getting the Software

http://dev.laptop.org/~wmb/spirecovery.tgz contains a Linux application to perform the recovery procedure. The archive contains two files: forth and spiflash.dic . The two files work together. forth is a Linux executable file that provides I/O interfaces that let the Forth programming language work under Linux. spiflash.dic is a Forth language workspace (AKA "dictionary") that includes procedures for performing the serial line recovery protocol.

Get that archive and unpack it into a working directory that contains the boot FLASH image file that you want to install in the OLPC board (that file is typically named linuxbios.rom).

Make sure that the "forth" file has execute permissions.

chmod a+x forth
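Before starting, it can save a power cycle to confirm that the working directory really holds everything. The check below is a minimal sketch: the function name is hypothetical, and the default image name olpc.rom is taken from the "flash" command's default described later on this page.

```python
import os


def preflight(workdir, image_name="olpc.rom"):
    """Return a list of human-readable problems found in the working directory."""
    problems = []

    forth = os.path.join(workdir, "forth")
    if not os.path.isfile(forth):
        problems.append("forth is missing")
    elif not os.access(forth, os.X_OK):
        problems.append("forth is not executable (run: chmod a+x forth)")

    if not os.path.isfile(os.path.join(workdir, "spiflash.dic")):
        problems.append("spiflash.dic is missing")

    if not os.path.isfile(os.path.join(workdir, image_name)):
        problems.append(image_name + " is missing")

    return problems
```

An empty list means the three files are present and forth can be executed; it says nothing about whether the image itself is valid.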

Running the Program

Start the program with

./forth spiflash.dic

It will display a page of instructions describing the various commands, then issue an "ok " prompt.

Look at the software version number on the first line of the instructions that are displayed. If it says version 1.0, you need to download a new copy of the software from the web site. Version 1.0 works, but is more complicated to use. The instructions below are for version 2.0 and later. The most recent version is 2.2.

Here is an example of a typical command sequence for reflashing the entire SPI FLASH, including the EC code and the BIOS.

First make sure that all the connectors are hooked up and the jumpers installed.

The default serial port is /dev/ttyS0. Do the next line, substituting your port name, only if you are using a different port:

 ok line /dev/ttyS4
 ok flash q2d14.rom

You will be prompted to remove all power from the machine and plug it back in to power. Press enter, then, if all goes well:

 Merging Manufacturing Data

<There will be a several-minute delay here. Be patient! It is working.>

 SPI FLASH is type 13 - Spansion, Winbond or ST
 Erasing
 f0000
 Writing
 ff000
 Type verify if you want to verify the data just written.
 Verification will take about 17 minutes...

The default image filename is olpc.rom . That name will be used if you omit the filename on the "flash" command line.

If you don't see the "SPI FLASH is ..." message, i.e. if it fails or hangs, it means that the host system can't talk to the OLPC board. Check the connectors, jumpers, etc, and make sure that you are on the right serial port on the host.

If you have to re-issue the "flash" command, for example because you pressed control-C to exit the program and restarted it, you'll need to power-cycle the OLPC board. The OLPC board will only accept incoming connect attempts after a power-up. If you have already connected to it once, you can't reconnect without power-cycling.

The flashing process should take between 1 and 5 minutes, depending on how much unused (set to 0xff) space there is in the image.
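To get a feel for where in that range a given image will land, you can measure how much of it is unused. The sketch below counts blocks that are entirely 0xff. The 256-byte block size is an assumption, chosen to match the 256-byte write stream described in the Serial Protocol section; the granularity at which the program actually skips blank data is not documented here.

```python
PAGE_SIZE = 256  # assumed granularity, see the note above


def blank_fraction(image, page_size=PAGE_SIZE):
    """Return the fraction of page_size blocks in image that are entirely 0xff."""
    if not image:
        return 0.0
    pages = [image[i:i + page_size] for i in range(0, len(image), page_size)]
    # A short final block counts as blank if every byte in it is 0xff.
    blank = sum(1 for page in pages if page == b"\xff" * len(page))
    return blank / len(pages)


def blank_fraction_of_file(path):
    """Convenience wrapper: read an image file and report its blank fraction."""
    with open(path, "rb") as f:
        return blank_fraction(f.read())
```

For example, blank_fraction_of_file("olpc.rom") returning 0.4 would mean that roughly 40% of the image is unused space.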

Some notes on using the FTDI-based USB-to-serial adapter ("Universal Serial Converter") follow. I found this board very particular about the power-on sequence, so here is what worked for me. The target system had its jumpers installed, all power removed, and the cable connected from CN24 to the USB adapter. Then I plugged power into the target, waited a few seconds, and connected the USB cable to the host. After that I did not plug or unplug any further cables. I typed 'reflash q2e45.rom' and pressed enter.

Note that, unlike some earlier versions of our FLASHing software, this program always updates the entire FLASH in one step. There is no separate procedure for updating the EC code versus the BIOS code, because there is no advantage to preserving the EC code on the new boards, which do not have a separate PLCC socket. Lacking a PLCC socket, if either part of the FLASH is bad, the only alternative is to update the FLASH via the serial recovery procedure. The simplified procedure guarantees that consistent EC and BIOS code will be loaded simultaneously.

Verification

Verifying is quite a bit slower than programming, due to the characteristics of the serial protocol coupled with Linux scheduling issues. The net result is that programming goes at about 5K bytes per second (for most parts), while reading (necessary for verification) goes at 1K bytes per second.

So programming all the bytes of a 1 MiB SPI FLASH takes about 3.5 minutes, while reading or verifying them takes 17 minutes. Ouch!
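The arithmetic behind those two figures can be checked in a few lines. This sketch reads "5K" and "1K" bytes per second as 5000 and 1000; that reading is an assumption, and using 5120 and 1024 instead changes the results only slightly.

```python
FLASH_BYTES = 1024 * 1024  # 1 MiB SPI FLASH


def minutes_to_transfer(total_bytes, bytes_per_second):
    """Return the transfer time in minutes at a steady byte rate."""
    return total_bytes / bytes_per_second / 60


write_minutes = minutes_to_transfer(FLASH_BYTES, 5000)  # programming rate
read_minutes = minutes_to_transfer(FLASH_BYTES, 1000)   # read/verify rate

print(round(write_minutes, 1))  # 3.5
print(round(read_minutes, 1))   # 17.5
```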

Since verifying is so slow, you might consider not doing it. Most of the time the writing process will work just fine, and you can boot the system and let the running firmware do the verification at memory access speed.

Reading the SPI FLASH

To read the contents of the SPI FLASH and save it to a file:

 ok read-flash myfile.rom

This will read the entire 1 MiB FLASH, which will take about 17 minutes.
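Once you have a read-back file, you may want to compare it with the image you flashed. The sketch below reports the first offset at which two byte strings differ; the function names are hypothetical. Keep in mind that the flash step prints "Merging Manufacturing Data", so it seems likely (this is an inference, not something stated on this page) that the FLASH contents legitimately differ from the image file in the manufacturing data area.

```python
def first_mismatch(a, b):
    """Return the offset of the first differing byte, or None if a == b.

    If one input is a prefix of the other, the offset returned is the
    length of the shorter input.
    """
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def compare_files(path_a, path_b):
    """Read two files and return first_mismatch() of their contents."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return first_mismatch(fa.read(), fb.read())
```

For example, compare_files("myfile.rom", "q2d14.rom") returns None when the two files are byte-for-byte identical.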

<Mitch_Bradley> From an xo, the recipe would be:

 ok console-io use-serial-ec <enter>   (turn on dead machine)
 Maybe: ok spi-start spi-identify .spi-id
 ok read-flash u:\foo.img

Exiting

To exit, type

 ok bye

(or ^C).

Powering up the btest Board

  • Turn off the power to the btest board
  • Remove the recovery jumpers
  • Turn the power back on
  • (For recent versions of the EC code) Press the "Power Button" (SW2) which is on the bottom of the board underneath the SD socket. The nearby green LED should come on.
  • (If your board is one of the early preB ones that doesn't have the fix for the CaFe configuration delay) Briefly short JP2, which is a pair of triangular pads below the 5536 chip (where "below" is relative to the orientation where the power connector is at the lower right corner).

Having done all that, LinuxBIOS should start.

Serial Protocol

This section is for the many curious people on the OLPC project. It is not essential to completion of the recovery task.

The serial protocol works by reading and writing, via the serial line, the EC registers controlling the SPI FLASH programming. Writing to an EC register requires sending two serial line bytes - a register number and a data byte. Reading an EC register also requires two serial bytes - the register number sent and the data byte received.

For most SPI FLASH chip types, including Spansion, Winbond and ST, writing is reasonably efficient. After a few bytes of setup, a stream of 256 consecutive data bytes is transmitted. There are two serial bytes per FLASH data byte: a fixed register number followed by a data byte. For the SST part, the write procedure is more complex. The astute reader might wonder why the EC designers did not make the protocol require only one serial byte per FLASH byte. Your erstwhile author wonders that also.
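The two-bytes-per-data-byte framing can be pictured with a small sketch. The register number used in the example (0x10) is a placeholder, not the real EC register, and the setup bytes that precede the stream are not modelled because they are not described on this page.

```python
def frame_page(data, data_reg):
    """Interleave a fixed register number with each data byte.

    data     -- up to 256 FLASH data bytes
    data_reg -- register number sent before every data byte (placeholder
                value; the real EC register number is not given here)
    """
    if len(data) > 256:
        raise ValueError("at most 256 data bytes per stream")
    if not 0 <= data_reg <= 0xFF:
        raise ValueError("register number must fit in one byte")
    out = bytearray()
    for value in data:
        out.append(data_reg)
        out.append(value)
    return bytes(out)


stream = frame_page(bytes(256), 0x10)
print(len(stream))  # 512 serial bytes for 256 FLASH bytes
```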

At 115200 baud, roughly 10 KiB/s of serial data can be transferred, which nets about 5 KiB/s of FLASH data. The time for the FLASH part to perform a write is small compared to the time for transmission via the serial line.

Reading is not so efficient because of the characteristics of the EC. An SPI read cycle requires a "dummy write" cycle before reading a register for the data. The dummy write generates 8 clock cycles on the SPI bus, during which the FLASH chip returns the data. The net result is that reading a byte requires 4 serial bytes; two sends for the dummy write, and a send and receive for the data read. So reading inherently takes twice as long as writing. It works out to 2.5 KiB/s and indeed that throughput can be achieved with a host program running "standalone" without an operating system.
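The rates quoted above follow from the serial byte counts. The sketch below assumes 10 bits on the wire per serial byte (one start bit, eight data bits, one stop bit), which is the usual framing but is not stated on this page; the results are in plain bytes per second and land slightly above the rounded figures in the text.

```python
BITS_PER_SERIAL_BYTE = 10  # assumed 8N1 framing: start + 8 data + stop

SERIAL_BYTES_PER_FLASH_WRITE = 2  # register number + data byte
SERIAL_BYTES_PER_FLASH_READ = 4   # dummy write (2) + register send + data receive


def serial_rate(baud):
    """Serial bytes per second at the given baud rate."""
    return baud / BITS_PER_SERIAL_BYTE


def flash_write_rate(baud):
    """Best-case FLASH bytes written per second."""
    return serial_rate(baud) / SERIAL_BYTES_PER_FLASH_WRITE


def flash_read_rate(baud):
    """Best-case FLASH bytes read per second."""
    return serial_rate(baud) / SERIAL_BYTES_PER_FLASH_READ


print(serial_rate(115200))       # 11520.0
print(flash_write_rate(115200))  # 5760.0
print(flash_read_rate(115200))   # 2880.0
```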

Performance was worse on the installation I tested and is anticipated to be no better on any Linux system. Linux throttles the serial communication to exactly one "write, write, write, read" sequence per millisecond. That works out to reading 1K bytes per second. I've tried many variations of raw serial I/O with different settings for VMIN and VTIME, both blocking and non-blocking, to no avail. Reading more than one byte with a single read() call is not practical, because the "write, write, write" sequence for the next byte can't start until the "read" for the present byte has finished.
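For readers who want to repeat the experiment, this is roughly what a raw serial configuration with explicit VMIN and VTIME looks like using Python's termios module. It is a sketch of the kind of settings mentioned above, not the code used by the recovery program, and as reported above such settings did not remove the one-millisecond throttle. The function names and the default device path are illustrative.

```python
import os
import termios


def raw_attrs(attrs, vmin=1, vtime=0, speed=termios.B115200):
    """Return a copy of a tcgetattr() list adjusted for raw 8N1 I/O."""
    iflag, oflag, cflag, lflag, _ispeed, _ospeed, cc = attrs
    cc = list(cc)  # copy so the caller's list is not modified

    # No software flow control or input character translation.
    iflag &= ~(termios.IXON | termios.IXOFF | termios.ICRNL |
               termios.INLCR | termios.ISTRIP)
    # No output post-processing.
    oflag &= ~termios.OPOST
    # Non-canonical mode, no echo, no signal characters.
    lflag &= ~(termios.ICANON | termios.ECHO | termios.ISIG | termios.IEXTEN)
    # 8 data bits, no parity, ignore modem control lines, enable receiver.
    cflag &= ~(termios.CSIZE | termios.PARENB)
    cflag |= termios.CS8 | termios.CLOCAL | termios.CREAD

    cc[termios.VMIN] = vmin    # minimum bytes before read() returns
    cc[termios.VTIME] = vtime  # read timeout in tenths of a second
    return [iflag, oflag, cflag, lflag, speed, speed, cc]


def open_raw(path="/dev/ttyS0"):
    """Open a serial device and apply the raw settings; returns the fd."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    termios.tcsetattr(fd, termios.TCSANOW, raw_attrs(termios.tcgetattr(fd)))
    return fd
```

With vmin=1 and vtime=0, read() blocks until at least one byte arrives, which matches the one-byte-at-a-time pattern the protocol forces on reads.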

Most PC serial ports can run faster than 115200 baud with chipset-dependent driver code. That would have greatly complicated these procedures.

Change notes for spiflash.dic

  • 2.2 Added clearing of SPI status register to turn off block protect bits in SST parts.
  • 2.1 Fixed a crash problem that only happened when the program couldn't identify the part.
  • 2.0 Streamlined the user interface.
  • 1.0 First released version.