XO Troubleshooting Guide

From OLPC
Revision as of 06:06, 1 June 2008 by Nicabod (talk | contribs) (The laptop can't charge the battery)

This is a troubleshooting guide for the XO laptop. It is geared toward troubleshooting production units running firmware version Q2D03 or greater.

Still very much a work in progress! Feel free to add to it.

Problems powering on

For an introduction to the different boot options selectable while powering on, see the Cheat codes.

All of these tests assume that a known good source of power is available, either a charged battery or power adapter. To debug problems with the power source, and battery charging issues, see Power and Battery problems.

Is the Power LED On ?

When the power button is pressed once, the power LED doesn't turn on.

(insert photo of 'stuck' power button vs. 'unstuck' power button)

Please try Resetting the Embedded Controller.

If that doesn't work, test that the power adapter and battery used in the test are working (using another laptop). A laptop may be unable to power itself from its battery or power adapter (for a number of reasons).

If both power sources are working, then the motherboard is broken. The possible failure modes are numerous, and deserve a second troubleshooting guide.

(This section needs a step-by-step through the normal start-up sequence with branching options for each possibility. This section needs to take into account possibilities which include broken power LED, boot sound disabled in software, screen backlight broken or disconnected, possibilities of booting from a non-standard source (was an SD card accidentally left in the slot?) or secure vs. insecure startup, etc. I would also argue only diagnosis steps should appear here, with recovery steps located in a different section.)


The display doesn't activate

As the XO starts, the power LED turns on and the screen should be initialized with text or graphics displayed. The backlight for the screen may or may not be active.

Power LED on but display doesn't activate

The power LED indicates that the EC has enabled the power to the CPU. OpenFirmware is next in the boot sequence. The next major visible boot step is turning on the display. The LCD display should begin to show text or graphics.

If this does not happen, then the boot sequence may not be reaching OpenFirmware, or OpenFirmware may be crashing early in the boot process. The way to tell is to look at the Microphone activity LED. The default state of the Mic LED is lit. One of the first things OpenFirmware does is turn off the Mic LED. If the power LED and the Microphone LED are both lit, then a serious boot error has occurred and the motherboard needs replacing.

If the power LED is lit and the Mic LED is not, then OpenFirmware has started. In that case the backlight for the screen may not be active, the LCD may be non-functional, or the q2d06 brick may have occurred.
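The LED checks above amount to a small decision table. The following is a minimal Python sketch of that reasoning only; the function name and the wording of the returned strings are illustrative assumptions, not part of any OLPC tool.

```python
def diagnose_early_boot(power_led_lit: bool, mic_led_lit: bool) -> str:
    """Map the two LED observations to the diagnosis described in this section."""
    if not power_led_lit:
        # The EC has not enabled power to the CPU.
        return "Power LED off: see 'Is the Power LED On ?'"
    if mic_led_lit:
        # OpenFirmware turns the Mic LED off early in boot; if it is still
        # lit, the boot sequence never got that far.
        return "Serious boot error: the motherboard needs replacing"
    # OpenFirmware ran far enough to turn the Mic LED off.
    return "OpenFirmware started: suspect backlight, LCD, or the q2d06 brick"


if __name__ == "__main__":
    print(diagnose_early_boot(power_led_lit=True, mic_led_lit=True))
```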

Display does not initialize. Boot sound does not play

This usually indicates a broken motherboard or a Real Time Clock battery problem in conjunction with early firmware.

Display does not initialize, but the boot sound plays

If the display doesn't initialize, but the boot sound plays, see Display Problems.

Early production machines had firmware version Q2D06, which had a bug that caused the machine not to start if the Real Time Clock chip's date had been reset to 0. This could happen if that chip lost all power, including its backup power from an internal "coin cell" battery. To make matters worse, some of the early units were manufactured with defective holders for that coin cell battery, so that the battery could become loose or even pop out entirely. To diagnose the problem, you must open the enclosure; to repair it you will need special tools as described below. If the coin cell battery is loose or out of its holder and the system is an early production unit, this is probably the problem. To be completely sure, you must connect a serial terminal to J1 on the motherboard, via a special low-voltage interface adapter. If the system has this problem, you will see "Page Fault" on the serial console shortly after power up.

Repairing a "bad RTC setting" system

Repairing a system with the "early firmware, bad RTC setting" problem (aka the q2d06 brick) requires several steps. If you have lots of XO machines, you might consider using the bad system for spare parts. Otherwise, you can repair it as follows:

  • First, you must re-seat the coin cell battery in its holder and secure it so it doesn't come out again. One way to secure it is to put a drop of glue where the battery contacts the holder, away from the metallic contact. The best glue that I have found for this purpose is clear solvent-based household cement. Technically, it is "nitro cellulose" cement. It is also known as "model airplane glue", marketed under various trade names such as "Duco Cement" and "Tarzan's Grip". Loctite "Stik'n Seal" is a consumer adhesive; solvents are toluene and hexane (keep away from children!). It's extremely flammable, as well. Stock number is 01-23782, 1 oz. metal tube. (Alleskleber, in Germany, maybe?) Stronger adhesives like epoxy or cyanoacrylate (super glue) would probably work too, but it might be difficult to remove the battery later without damaging the holder. Don't even think of Gorilla Glue; that's polyurethane, which foams as it cures. RTV silicone would be good, except that the common variety releases acetic acid as it cures, which is corrosive.
  • After securing the battery, you must replace the boot firmware with a version that is newer than Q2D06 (use the latest official release). There are two ways:
  1. Option 1 requires electronic rework (soldering and desoldering) skills and tools, and access to a standalone SPI FLASH programmer device. You unsolder the SPI FLASH chip, reprogram it (or a new one) with a recent firmware version, then resolder the fresh chip.
  2. Option 2 requires a serial terminal with a low-voltage interface adapter that will connect to J1 inside the XO. The steps are:
    1. Insert a USB key containing the new firmware version.
    2. Connect the serial terminal (115200,8,n,1) and power on the machine. You should see "Page Fault" on the serial terminal, followed by an "ok" prompt.
    3. Type the following command line at the ok prompt, substituting the correct .rom filename:
 ok probe-pci probe-usb  flash u:\q2d14.rom

After you have updated the firmware, you must set the RTC clock. There are several ways:

  • If you can boot to Linux (because the machine is permanently activated or you have a developer key), you can use the Linux "hwclock" command.
  • If you have an open wireless network that is connected to the internet, you can type this at the ok prompt (replacing "MY SSID" as appropriate):
ok ssid MY SSID
ok ntp-set-clock
  • Otherwise, you can use OFW to set the date as follows. First determine the date and time in UTC, then:
ok select /rtc
ok decimal  0 10 22  27 5 2008  set-time
ok unselect

The above example is for 2008-05-27 22:10:00 UTC. The numbers are entered starting with the least significant: second, minute, hour, day, month, year.

  • If you can't get to the ok prompt in the normal way because the machine is secure and you don't have a developer key, you can still set the date via a serial terminal. Power up the system and immediately start typing "iiiiii..." on the serial terminal. You should get an ok prompt within about 2 seconds. (You can't reflash the firmware from this ok prompt, but you can set the clock).

Erase any extra 'i' characters with Backspace, then type the "select /rtc ..." command sequence as shown above.
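Because set-time takes its arguments least significant first, it is easy to enter them in the wrong order. As a rough illustration, the Python sketch below builds the "decimal ... set-time" line from a UTC timestamp. The helper name ofw_set_time_command is hypothetical and is not part of Open Firmware; only the command line it produces follows the example above.

```python
from datetime import datetime, timezone


def ofw_set_time_command(when: datetime) -> str:
    """Return the 'decimal ... set-time' line for a timezone-aware datetime."""
    if when.tzinfo is None:
        # Refuse naive datetimes so local time is never mistaken for UTC.
        raise ValueError("pass a timezone-aware datetime")
    utc = when.astimezone(timezone.utc)
    # Order: second, minute, hour, then day, month, year.
    return (
        f"decimal  {utc.second} {utc.minute} {utc.hour}  "
        f"{utc.day} {utc.month} {utc.year}  set-time"
    )


if __name__ == "__main__":
    example = datetime(2008, 5, 27, 22, 10, 0, tzinfo=timezone.utc)
    print("ok select /rtc")
    print("ok " + ofw_set_time_command(example))
    print("ok unselect")
```

Typing the printed lines (without the leading "ok", which is the prompt) at the ok prompt reproduces the manual procedure.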

Machine boots normally, but no boot sound plays

If no boot sound is played, but the machine boots normally and has audio, it is likely that the user has changed the default boot volume to 0. While the boot sound is playing, a user can adjust the volume using the volume adjust keys. This modified volume setting is saved and used for future boots. Try increasing the volume right after starting the laptop a few times, and see if the boot sound returns.

If no boot sound is played, and the machine boots normally but has no audio, see Audio Problems.

The display says "Connect to power to proceed"

Not quite the correct wording. Anyone remember the exact words ?

Early versions of the firmware (before Q2D14) would stop execution if a firmware update was scheduled but two sources of power (a battery and a power adapter) weren't both present ([#5422]). If this is the problem, provide both sources of power and reboot. The laptop should proceed with a firmware update and boot normally.
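Several checks in this guide depend on comparing firmware versions, such as "before Q2D14" here. The Python sketch below shows one way to make that comparison. It assumes that version names always have the form "Q2", one letter, then two digits, and that they sort by letter and then by number; that naming rule and the helper names are assumptions made for illustration, not something this guide states.

```python
import re
from typing import Tuple

# Assumed pattern: "Q2", one letter, two digits (for example Q2D14).
_FIRMWARE_NAME = re.compile(r"^Q2([A-Z])(\d{2})$", re.IGNORECASE)


def firmware_key(version: str) -> Tuple[str, int]:
    """Turn a name such as 'Q2D14' into a sortable (letter, number) key."""
    match = _FIRMWARE_NAME.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised firmware name: {version!r}")
    return (match.group(1).upper(), int(match.group(2)))


def stops_without_both_power_sources(version: str) -> bool:
    """True for firmware older than Q2D14, the versions affected here."""
    return firmware_key(version) < firmware_key("Q2D14")


if __name__ == "__main__":
    for name in ("Q2D06", "Q2D14"):
        print(name, stops_without_both_power_sources(name))
```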

The display is showing an XO icon

This means that Open Firmware has started the boot process.

XO icon with a "sad face"

This means that Open Firmware couldn't find a signed operating system on the internal flash memory. (It will also look on USB memory sticks and SD cards.)

You can get more information from Open Firmware by holding the "check" button (above the power button) after powering on. That will make Open Firmware display more detailed messages about what it is doing during the secure boot process. The messages are in English only.

Try upgrading or re-installing the software.

XO icon with a single dot below it

If the laptop powers up, but stops when just displaying the XO icon in the middle, with a single dot below it, it means that something goes wrong when the Linux OS starts operation. Try booting with the

( TEXT MISSING HERE! (^_^) )

One solution to try is upgrading or re-installing the software.

XO icon with a serial number and three icons below it

If the laptop powers up, but stops when displaying the XO icon in the middle of the screen, followed by a serial number (e.g. CSN74902B22) and three icons (SD disk, USB disk, Network signal strength), it is looking for its activation lease. This should eventually print "Activation lease not found" at the top of the screen and power-off soon thereafter.

The solution is to re-activate the laptop. Obtain a copy of the lease (or a new lease) from your country activation manager, place it (named "lease.sig") on the root directory of a USB key and boot the laptop. See What to do with your activation keys.
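Preparing the USB key is a single file copy, but the file name and location matter. A minimal Python sketch is shown here, assuming the USB key is already mounted; the function name and the example paths are hypothetical, and only the target name "lease.sig" in the root directory comes from the text above.

```python
import shutil
from pathlib import Path


def install_lease(lease_source: Path, usb_root: Path) -> Path:
    """Copy a lease file to the root of a mounted USB key as 'lease.sig'."""
    if not usb_root.is_dir():
        raise NotADirectoryError(f"USB key not mounted at {usb_root}")
    target = usb_root / "lease.sig"  # must sit in the root directory
    shutil.copyfile(lease_source, target)
    return target


# Example call (both paths are placeholders for your own system):
# install_lease(Path("my-lease-file"), Path("/media/usbkey"))
```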

Display problems

The display doesn't turn on

Use a strong light shining on the display to confirm that the problem isn't that the backlight won't turn on. If you see text or graphics on the display, the problem is with the backlight.

Use a known good display to check whether the display or the motherboard is broken.

If the known good display also does not turn on, the motherboard is broken and should be replaced.

One half of the display looks bad

Is the display cable properly connected ?

Disassemble the laptop to access the larger flex cable from the display to the motherboard. Make sure that it is properly seated in its connector and properly clamped down. The white stripe on the cable should be close to and parallel to the black tab clamping the cable down.

Photo needed here, a closeup of the display connector

If the connector is broken (the small black tab fails to stay in place), the motherboard will need replacement.

Is the display broken ?

Use a known good display to check whether the display or the motherboard is broken.

If the known good display shows the same problem, the motherboard is broken and should be replaced.

The display fades to white

The backlight won't turn on

The backlight isn't even

In this case, a vertical pattern of light and dark is seen on the display. The difference between light and dark regions is strongest at the bottom of the screen.

Is the backlight cable properly connected ?

Disassemble the laptop to access the small flex cable from the display to the motherboard. Make sure that it is properly seated in its connector, and properly clamped down.

Photo needed here, a closeup of the backlight connector

If the connector is broken (the small black tab fails to stay in place), the motherboard will need replacement.

A temporary solution is to wind a small strip of paper several times, jamming a small piece of it into the connector to hold the flex cable in place. A piece of tape covering the connector provides additional stability.

Is the lightbar broken ?

If the cable is well seated, then it is likely that the actual lightbar in the display has failed. This lightbar can be replaced (the display itself is probably fine).

Is the motherboard backlight driver broken ?

If replacing the lightbar fails to correct the problem, it is a motherboard problem. Check the voltages across R147, R148, and R149 when the backlight is operating. They should be equal. If they are not, replace the corresponding switch transistor (Q13, Q14, or Q15, respectively) with a generic NPN low power switching transistor (2N3904 equivalent).
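The comparison of the three resistor voltages can be sketched in Python as follows. The resistor-to-transistor mapping is taken from the paragraph above. The 0.05 V tolerance, the use of the median as the reference value, and the function name are assumptions made for illustration; this guide only says the voltages should be equal.

```python
from statistics import median
from typing import Dict, List

# Mapping from the text: each sense resistor and its switch transistor.
SWITCH_FOR_RESISTOR = {"R147": "Q13", "R148": "Q14", "R149": "Q15"}


def suspect_transistors(voltages: Dict[str, float],
                        tolerance: float = 0.05) -> List[str]:
    """Return the transistors whose resistor voltage differs from the others.

    `voltages` maps resistor names to measured volts. `tolerance` is an
    assumed allowance for measurement error, not a value from the guide.
    """
    reference = median(voltages.values())
    return [
        SWITCH_FOR_RESISTOR[name]
        for name in sorted(voltages)
        if abs(voltages[name] - reference) > tolerance
    ]


if __name__ == "__main__":
    # Made-up readings for illustration only.
    print(suspect_transistors({"R147": 0.60, "R148": 0.60, "R149": 0.10}))
```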


Do all LEDs light up ?

Test: Open the screen (2 tiny screws). Reconnect cables, boot up the XO. Some LEDs light up, some don't.

Fix: Resolder broken LEDs.

Audio and Video problems

A stereo sweep pattern is output for testing the speakers/headset jack when running the Open Firmware hardware tests. A microphone test, where audio is recorded, then played back (at low volume) is also included. See Hardware Self-Test.

No sound through the speakers

Garbled sound through the speakers

No sound through the headphones

If the speakers function correctly, this is caused by the headphone connector.

The microphone doesn't work

The microphone jack doesn't work

Images from the camera are black

Keyboard and Touchpad problems

Does the keyboard generate unexpected responses ?

  1. Stuck keys
  2. Badly connected cables

Does the touchpad only allow vertical (horizontal) movement ?

USB problems

A full set of eye patterns is generated for testing the wiring of each USB port when running the Open Firmware hardware tests. See Hardware Self-Test.

One of the USB ports doesn't work

This is typically due to a broken connector or solder joint at the affected connector.

All of the USB ports receive power from a common switch, so if there is a problem with power it is likely due to the above reasons.

None of the USB ports work

Measure the +5V supply at the USB connector (or across C509 or C551). If it is not present, the USB power switch and current limiter, U56, may be broken. It may be bypassed by a 0 ohm resistor at R466 for testing purposes.

If the audio (speakers) is also non-functional, then the problem is the +5V supply on the motherboard.
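The two checks above can be summarised as a short decision function. This Python sketch only restates the reasoning in this section; the function name and the wording of the results are illustrative assumptions.

```python
def diagnose_usb_power(five_volts_present: bool, speakers_work: bool) -> str:
    """Suggest the likely fault when none of the USB ports work."""
    if five_volts_present:
        # This guide gives no further step when +5V reaches the connector.
        return "+5V is present at the connector: not a USB power fault"
    if not speakers_work:
        # Audio failing as well points at the shared +5V supply.
        return "+5V supply on the motherboard"
    return "USB power switch and current limiter U56 (bypass at R466 to test)"


if __name__ == "__main__":
    print(diagnose_usb_power(five_volts_present=False, speakers_work=True))
```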

Power and Battery problems

The laptop won't run using the battery

The laptop won't run using the power adapter

The laptop emits a high pitched whine when using the power adapter

The laptop can't charge the battery

  1. Indication: The charging LED stays yellow and never turns green.
  2. Debug:
    1. Check if the battery connector in the chassis has 7 volts.
    2. Check if the battery connector on the motherboard has 7 volts.
    3. If either voltage is missing, this is due to a loose cable / connector. Check wiring and connectors.
  3. Fixes:
    1. Bend the pins in the connectors
    2. Resolder / replace wires
Common Procedures

Resetting the Embedded Controller

The XO embedded controller (EC) occasionally becomes confused. To reset it, remove all power sources from the laptop:

  1. Take the battery out and remove the power adapter
  2. Wait 10 seconds to allow the embedded controller to lose power and reset
  3. Replace at least one source of power (battery or power adapter)

The battery LED should flash orange momentarily (about a quarter of a second) when power is first reapplied. If you do not see this flash, you either have a motherboard hardware problem or faulty EC firmware installed. The general solution to both of these problems is to replace the motherboard.

Boot options

Different boot options are available through pressing buttons around the screen during the initial boot process (immediately after pressing the power button). These are included here for completeness. More information is available at Cheat codes.

  • '✓' (check) game pad key: forces a more detailed display while booting. See Startup Diagnosis for more details. This is useful for debugging activation problems.
  • 'O' game pad key: alternate between the current boot image and a previous one. In laptops which have never been upgraded, there is no previous boot image to use.
  • Rocker left: invoke hardware self-test.

Hardware Self-Test

Open Firmware includes hardware diagnostics routines for most major components of the laptop. They are triggered by pressing down the left hand side of the "Rocker switch" to the left of the screen while booting a laptop.

The following components of the laptop are tested sequentially by the hardware diagnostics:

  • Battery - The current status of the battery is read from it and printed out
  • SPI flash - The manufacturing data of the laptop is printed out
  • Memory - The SDRAM on the motherboard is quickly tested.
  • Processor - The Processor is exercised. Press any key to skip to the next test.
  • USB - The USB ports are exercised (for use with an oscilloscope).
  • Audio - A stereo sweep is output over the speakers (headphones, if plugged in) and then audio is recorded using the microphone, and output (at low volume) over the speakers.
  • Camera - Video is displayed on the screen from the camera for twenty seconds
  • SD Storage - Any SD storage is quickly (and non-destructively) tested
  • NAND Flash - The motherboard's internal NAND Flash storage is quickly (and non-destructively) tested.
  • Display - The display is only marginally tested with color bars, then the drawing capabilities of the CPU are displayed for a while. Press any key to skip to the next test.
  • WLAN - The firmware is loaded, the network co-processor booted and communicated with.
  • RTC
  • Timer
  • Touchpad - Press any key to exit.
  • Keyboard - Press ESC to exit.

If using firmware later than Q2D08, you can pause between individual tests by holding down the "rotate" button on the left hand side of the screen (below the "Rocker switch").