XO Flash Bad Blocks

From OLPC
Revision as of 21:17, 18 June 2011 by FGrose (talk | contribs) (temporary update)
  This page is monitored by the OLPC team.

This page describes how to fix an XO laptop that is having problems with its main NAND Flash storage. It is part of the XO Troubleshooting Guide.

Problem Description

The NAND Flash chip that provides the laptop's primary storage (it holds Linux and Sugar) contains some blocks that do not work. Usually, these are identified at the factory (using an extremely sensitive test) and do not change over the life of the device. Occasionally, however, a device develops additional bad blocks. If a laptop has had repeated problems that were solved only by reinstalling the software, the bad block table may need to be rebuilt.

Diagnostics

Unfortunately, the standard mtd_debug tool (from the mtd-utils package) does not report information about the bad block table (BBT). You can download an improved mtd_debug from http://dev.laptop.org/~martin/ and run it as follows:

 modprobe mtdchar
 /path/to/mtd_debug info /dev/mtd0

The output will include 'mtd.badblockscount'.
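
If you want to record the count, for example to compare it before and after a repair, a small shell helper can pull the number out of the output. This is only a sketch: the function name bad_block_count is made up for illustration, and the exact layout of the 'mtd.badblockscount' line (assumed here to be the field name followed by a number on the same line) is a guess that you should check against what your copy of the tool prints.

```shell
#!/bin/sh
# Hypothetical helper: read mtd_debug output on stdin and print the number
# that follows "mtd.badblockscount" on the same line.
# ASSUMPTION: the field appears as e.g. "mtd.badblockscount = 7"; adjust the
# pattern if the improved mtd_debug prints it differently.
bad_block_count() {
    sed -n 's/^.*mtd\.badblockscount[^0-9]*\([0-9][0-9]*\).*$/\1/p' | head -n 1
}
```

It would be used as /path/to/mtd_debug info /dev/mtd0 | bad_block_count. If the field is missing from the output, the helper prints nothing.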

Repair

You can ask Open Firmware to check the NAND Flash ROM and add newly-bad blocks to the "bad block table" that tells the system not to use them. This requires a developer key for the laptop.
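
Which of the two procedures below applies depends on the firmware version. If you are handling many laptops, the decision can be sketched as a shell function. This is an illustrative helper, not part of any OLPC tool: the name fixbbt_procedure is hypothetical, you pass the version string in yourself, and it assumes that Q2-series version names such as Q2D16, Q2D17 and Q2E01 sort in release order when compared as plain text.

```shell
#!/bin/sh
# Hypothetical helper: print "workaround" if the given Open Firmware version
# is Q2D16 or earlier, "direct" if it is newer, "unknown" otherwise.
# ASSUMPTION: Q2-series versions order correctly under a plain byte-wise sort.
fixbbt_procedure() {
    version=$1
    case "$version" in
        Q2[A-Z][0-9][0-9]*) ;;                 # looks like Q2D16, Q2E01, ...
        *) echo "unknown"; return 1 ;;
    esac
    # Compare only the five-character base name, ignoring any suffix.
    short=$(printf '%s' "$version" | cut -c1-5)
    # The smaller of the two strings comes out first from sort.
    first=$(printf '%s\n%s\n' "$short" "Q2D16" | LC_ALL=C sort | head -n 1)
    if [ "$first" = "$short" ]; then
        echo "workaround"                      # Q2D16 or earlier
    else
        echo "direct"                          # newer than Q2D16
    fi
}
```

For example, fixbbt_procedure Q2D14 reports that the longer procedure is needed, while fixbbt_procedure Q2E01 reports that the single test command is enough.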

Using firmware Q2D16 or earlier

If you are using Open Firmware Q2D16 or earlier, the procedure requires that you type a few lines because of a firmware bug:

 ok dev /nandflash
 ok : xx dup aa ;
 ok patch xx aa full
 ok dend
 ok test /nandflash::fixbbt

Using firmware Q2D17, Q2E01 or later

If your Open Firmware is newer than Q2D16, just type:

 ok test /nandflash::fixbbt

This procedure takes several minutes, because it tests every block on the NAND Flash with several data patterns. The procedure is safe for good NAND Flash, because it saves the previous block data before the test and restores it afterwards, so you won't lose any data on good Flash blocks. There is, however, a small chance of data loss in the event that a block goes bad during (i.e. as a result of) the test, preventing restoration of the previous contents.

Something different?

Either this NAND is close to complete failure, or else the bad block table has become corrupted.

You could try recreating the bad block table as follows:

 ok select /nandflash  scrub!  unselect