XO LiFePO4 Recovery Procedure

From OLPC
Revision as of 09:42, 17 October 2014 by Jeremy.mcmillan (talk | contribs) (Upon failure)
  This page is monitored by the OLPC team.

This is a repair procedure for XO laptop batteries, and is part of the Troubleshooting and Repair Guides. The problem symptom addressed by this procedure is often called "the charge balance problem" by those in the OLPC community.

Problem Symptoms

A small percentage of LiFePO4 batteries are exhibiting a problem where they do not charge correctly. The parameters that the Embedded Controller (EC) watches while charging change in such a way that the EC thinks the battery is full and marks it as full. On some batteries this seems to happen in the first few minutes and on others it happens much later on. The end result in all cases is an abrupt shutoff when the voltage on the battery drops below the critical level where it can sustain operation. This is currently believed to be caused by a cell imbalance but full analysis is still pending.

Most (but perhaps not all) of the affected batteries have a serial number where the first 3 digits are less than 008. You can read the serial number from a label in the center of the underside of the battery labelled "S/N."
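As a rough illustration of the serial number rule above, the check could be sketched as follows. This assumes the serial is typed in exactly as printed after "S/N" and that its first three digit characters are the ones compared against 008; the function name and the sample serials are made up for the example. A match only suggests the battery may be affected, since not every affected battery falls in this range.

```python
def serial_in_affected_range(serial: str, cutoff: int = 8) -> bool:
    """Return True if the first three digits of the serial are below `cutoff`.

    Assumption: the first three digit characters of the printed serial form
    the number that the text compares against 008.
    """
    digits = [ch for ch in serial if ch in "0123456789"]
    if len(digits) < 3:
        raise ValueError("expected at least three digits in the serial number")
    return int("".join(digits[:3])) < cutoff


# Made-up serial numbers, for illustration only.
for sample in ("00712345678", "00812345678"):
    print(sample, serial_in_affected_range(sample))
```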

A label to the left side of the battery states the manufacturer of the battery (e.g. "Made by BYD Company Limited") in small text. The procedure on this page is intended only for BYD batteries; however, we believe that batteries of the other type (GP/GoldPeak) were only shipped in small quantities with XOs for early pilot projects.

Verifying the problem

The first step in verifying whether this is the problem with your battery is to check its current capacity. The tool to do this is olpc-pwr-log. You must download this script from http://dev.laptop.org/~rsmith/pwr_scripts/olpc-pwr-log and transfer it to your XO. One way is to save that link on another computer and transfer the file to the XO via a USB flash drive or SD card. Alternatively, you can download it directly on the XO by running wget in a Terminal Activity:

wget -O olpc-pwr-log http://dev.laptop.org/~rsmith/pwr_scripts/olpc-pwr-log  

(*NOTE: This is a capital O not a zero, and pwr-log is hyphenated, not underscored)

There have been a few reports of Windows corrupting the file during download. If you download from a computer running a Windows OS then please get the .zip version instead and extract it to your storage media. olpc-pwr_log.zip is located in the same place as olpc-pwr-log.

Once it's downloaded and on the XO, then make sure that it's executable by typing

chmod +x olpc-pwr-log

Then you run the script with

./olpc-pwr-log

After a battery is detected, a series of numbers will start listing on the screen. See XO Power Draw for the details.

Basic procedure

  1. plug your XO into external power and allow the battery to charge until the EC thinks it is full, i.e. the green charge LED is lit.
  2. start olpc-pwr-log and let it report at least 1 line of data
  3. unplug external power
  4. let the XO run on battery until it dies
  5. remove the battery
  6. power the XO back up on external power without the battery.
  7. start olpc-pwr-log
  8. insert the battery
  9. allow the battery to charge until the charge LED is green again.
  10. exit the script with 'ctrl-c'

Exceptions

In many cases the battery, despite being very low, will have been marked as fully charged by the EC. If this is the case then the EC will not attempt to enable the charging circuits. Yet if you unplug external power the XO shuts down immediately or within a few seconds, and the battery is not marked as low. In this state it's not possible to obtain any data with olpc-pwr-log. To fix this, the Open Firmware commands in batman.fth allow you to reset the state of the battery and manually mark it low. This lets the EC resync with the actual state of the battery and enable charging. See batman.fth in XO Troubleshooting Battery for the details.

Reading the log file

The numbers we care about for diagnosing this problem are SOC (column 2), Voltage in uV (column 3), Current in uA (column 4) and Net ACR in mAh (last column). In batteries that are suffering from the "won't charge" problem the CV point (7.4 Volts) of the battery is reached much too soon. Then the voltage continues to rise, and when it goes above 7.5V the over voltage shutoff of the battery is activated. The EC senses that the current has dropped below the level that indicates the battery is full and marks it full even though very little charge was delivered to the battery. This pattern is fairly easy to recognize. Below is an example:

1214213331,72,6524560,1418229,3192,898,Charging,72,0

time passes

1214213649,76,7116870,1293098,3544,1185,Charging,76,119
1214213659,76,7171770,1280208,3560,1194,Charging,76,123
1214213670,76,7241920,1266536,3578,1203,Charging,76,127
1214213680,76,7511540,130,3589,1205,Not charging,76,127
1214213690,76,7507880,260,3576,1205,Not charging,76,127
1214213701,76,7511540,0,3567,1205,Not charging,76,127
1214213711,81,7506660,130,3551,1205,Not charging,81,127
1214213721,81,7512150,260,3539,1205,Not charging,81,127
1214213732,81,7510320,130,3521,1205,Not charging,81,127
1214213742,81,7510320,390,3508,1205,Not charging,81,127
1214213752,81,7506050,390,3493,1205,Not charging,81,127
1214213763,97,6745990,390,3477,1205,Full,97,127

In the above log the charging current is running along normally at about 1.2 amps, but at time 1214213680 the voltage suddenly jumps to 7.5V and the current falls to 130 uA, which is down in the sensor noise. The EC first switches to CV mode (time 1214213711) and then, because it is in CV mode and the charge current is < 200mA, the EC marks the battery as full. This happens in the space of 114 seconds (1214213763-1214213649). This is much too quick. A normal charge would take a lot longer. The key measurement, however, is the net ACR, which is the last column. It shows that across the entire time olpc-pwr-log was running the charging system only delivered 127 mAh. The LiFePO4 batteries are rated between 2800 and 3100 mAh, so for a full continuous charge this number should be at least 2800 mAh. Anything less for a new battery means that your battery is not charging to its rated capacity.
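The check described above can also be done with a short script instead of by eye. The following is only a sketch: it relies on the column positions named on this page (SOC, voltage, current, Net ACR last) plus the status text seen in column 7 of the sample, and it assumes that any line not starting with a numeric timestamp is a header and can be skipped. The function and field names are made up for the example.

```python
import csv
from typing import Iterable, List, NamedTuple

RATED_MIN_MAH = 2800  # lower end of the rated capacity quoted in the text


class Reading(NamedTuple):
    timestamp: int     # column 1, seconds
    soc_percent: int   # column 2
    voltage_v: float   # column 3, converted from uV
    current_a: float   # column 4, converted from uA
    status: str        # column 7 in the sample log
    net_acr_mah: int   # last column


def parse_log(lines: Iterable[str]) -> List[Reading]:
    """Parse data rows; rows without a leading numeric timestamp are skipped."""
    readings = []
    for row in csv.reader(lines):
        if len(row) < 8 or not row[0].strip().isdigit():
            continue  # assumed to be a header or blank line
        readings.append(Reading(
            timestamp=int(row[0]),
            soc_percent=int(row[1]),
            voltage_v=int(row[2]) / 1_000_000,
            current_a=int(row[3]) / 1_000_000,
            status=row[6].strip(),
            net_acr_mah=int(row[-1]),
        ))
    return readings


def summarize(readings: List[Reading]) -> dict:
    """Report elapsed time and Net ACR up to the first 'Full' row (or the last row)."""
    if not readings:
        raise ValueError("no data rows found")
    first_full = next((r for r in readings if r.status == "Full"), None)
    end = first_full or readings[-1]
    return {
        "elapsed_s": end.timestamp - readings[0].timestamp,
        "net_acr_mah": end.net_acr_mah,
        "peak_voltage_v": max(r.voltage_v for r in readings),
        "marked_full": first_full is not None,
        # Marked full with less than the rated minimum delivered: the symptom above.
        "undercharged": first_full is not None and end.net_acr_mah < RATED_MIN_MAH,
    }


excerpt = [
    "1214213649,76,7116870,1293098,3544,1185,Charging,76,119",
    "1214213680,76,7511540,130,3589,1205,Not charging,76,127",
    "1214213763,97,6745990,390,3477,1205,Full,97,127",
]
print(summarize(parse_log(excerpt)))
```

Note that the "undercharged" flag only means something when the log covers a charge that started from an empty battery, as in the basic procedure.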

If your batteries are old then a slight decrease is normal. Each battery charge/discharge cycle decreases the capacity of a battery a tiny bit. The OLPC batteries are rated for at least 50% remaining capacity after 2000 cycles. This means that if you discharge/charge your battery every day then you will lose about 9% per year.
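The 9% figure follows from spreading the rated loss evenly over the rated cycles. A minimal sketch of that arithmetic, assuming the fade is linear per cycle (the rating is only a lower bound, so a real battery may do better):

```python
def yearly_capacity_loss_percent(cycles_per_day: float = 1.0,
                                 rated_cycles: int = 2000,
                                 loss_at_rated_cycles: float = 50.0) -> float:
    """Estimate capacity lost per year, assuming the same loss on every cycle."""
    loss_per_cycle = loss_at_rated_cycles / rated_cycles  # 0.025 % per cycle
    return loss_per_cycle * cycles_per_day * 365


print(round(yearly_capacity_loss_percent(), 3))  # one full cycle every day
```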

If you have any questions about the data reported in the olpc-pwr-log then open a help ticket by sending e-mail to help at laptop dot org with your log file attached and someone will help answer your questions.

Recovery Procedure

By using a slow charge procedure it appears to be possible to recover the problem batteries. The long term results of this procedure are still unknown, which is why it is still considered experimental. The procedure itself is simple, but each case is unique: the user has to watch for the conditions that indicate success, watch for other conditions that require restarting the procedure, and record the results of each stage. It may also take a long time (up to 40 hours).

The procedure plays with the charging system in a way that limits the amount of current used to charge the battery. The idea is to slow charge the battery (from the capacity where charging would normally end prematurely) and keep the voltage from rising above the over voltage trip point. This allows the unbalanced cell to charge up while not damaging the cell that is fully charged.

You start the procedure with the battery fully charged according to normal means (plug in power cable, wait for battery LED to go green). At this point, you know how much charge is in the battery, because you measured its capacity above (using olpc-pwr-log). Note that sometimes there is very little difference between fully charged and empty, e.g. the battery in the example above is so screwed up that empty to fully charged was only 127mAh (a fraction of the rated capacity).

Procedure

  1. obtain a developer key and unlock the XO
  2. charge your battery normally
  3. remove the battery
  4. power up the XO and go to the 'ok' prompt
  5. fload batman.fth (for details see XO Troubleshooting Battery#batman.fth) (note: you do not need to run any of the commands on that page other than loading batman.fth)
  6. insert the battery
  7. run bat-recover
  8. watch and wait

At this stage the screen should start streaming lines of data. This data is similar to the data reported by olpc-pwr-log, but it is read directly from the gas gauge rather than reported by the EC. The fields are:

  1. Battery Temperature
  2. Battery Current
  3. Battery Voltage
  4. Net ACR

Each value has the units after it. Current, Voltage and Net ACR are the values that need to be monitored.

The overall objective is to charge the battery so that approx 3000 mAh (or more) has been delivered. In your calculation, you must include the amount you believe was delivered to the battery through the regular charge. So, in the example above, you would hope that bat-recover can deliver a net ACR of 2873mAh to the battery, because you know that 127mAh has already been delivered (3000 - 127 = 2873).

A better example to show the importance of this calculation would be a not-quite-so-faulty battery where you have determined the battery capacity to be 1000mAh and have delivered that much charge, so you would now be hoping that bat-recover can deliver a further 2000mAh as reported in the Net ACR field (3000 - 1000 = 2000) to bring the battery up to rated capacity.
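The target calculation in the two examples above is a single subtraction. A small sketch, using the approximate 3000 mAh goal from the text as the default (the function name is made up for the example):

```python
def bat_recover_target_mah(already_delivered_mah: int, goal_mah: int = 3000) -> int:
    """Net ACR that bat-recover still needs to deliver to reach the goal."""
    return max(goal_mah - already_delivered_mah, 0)


print(bat_recover_target_mah(127))   # the badly affected battery from the log
print(bat_recover_target_mah(1000))  # the not-quite-so-faulty battery
```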

A perfect bat-recover run would be a continuous run where the slow-charging happens until the Net ACR field (and your own calculation) indicates that the battery has reached 2800 mAh or more, and at about the same time, the voltage should rise to 7.4V, causing the charging to stop. The current should also fall below 100mA at this time.

However, there are some conditions to look out for:

  • Occasionally, the over voltage circuit will be tripped prematurely, causing charging to stop. This usually happens as the voltage approaches 7.4V.
  • The charging current may drop to a zero or near-zero figure. At this point, the battery is charging, but only at an obscenely slow speed.

In both cases, note down the Net ACR that has been delivered, reset the XO, and restart the procedure. You should modify your goal for the Net ACR figure to account for the amount of charge that was delivered in the previous run.
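The decision made while watching the bat-recover output can be written down as a simple rule of thumb. The sketch below is not part of batman.fth; you would enter the values by hand from the screen. The 2800 mAh and 100 mA figures come from the description of a perfect run, while the 10 mA "stalled" threshold is an assumption borrowed from the worked example on this page.

```python
DONE_MAH = 2800          # total charge at which a run counts as finished
TAPER_CURRENT_MA = 100   # current expected to fall below this at the end
STALLED_CURRENT_MA = 10  # assumed threshold for "zero or near-zero" current


def next_action(current_ma: float, total_delivered_mah: float) -> str:
    """Suggest what to do given the charge current and the total charge so far.

    total_delivered_mah = regular charge + earlier runs + Net ACR of this run.
    """
    if total_delivered_mah >= DONE_MAH and current_ma < TAPER_CURRENT_MA:
        return "complete"
    if current_ma <= STALLED_CURRENT_MA:
        # Covers both an over voltage trip and a stalled charge current.
        return "restart"
    return "continue"


print(next_action(current_ma=350, total_delivered_mah=1500))
print(next_action(current_ma=10, total_delivered_mah=1750))
print(next_action(current_ma=0, total_delivered_mah=2935))
```

On "restart", note down the Net ACR from the current run before resetting, since it has to be added to the running total.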

Because the batman.fth recovery procedure has to completely disable the embedded controller while running, all the buttons and the keyboard are disabled, including the power button. The only way to stop or reset the procedure is to remove the battery and external power.

If the problem battery is continually tripping the over voltage without getting anywhere near 3000 mAh of charge then please e-mail OLPC (help at laptop dot org) with the summary of your results.

Once the procedure is completed use olpc-pwr-log to measure your available capacity by doing a discharge and re-charge cycle with the EC operating normally.

Example

  1. We suspect a cell-balance problem in our new XO battery due to unexpected, abrupt shutoffs.
  2. We use olpc-pwr-log while charging from empty to confirm that regular charging is only delivering 1320mAh to the battery before the battery is marked as full. This is much less than the rated capacity of 2800-3000mAh.
  3. Without discharging the battery, we start bat-recover. Our goal is to deliver approximately a further 1680mAh to the battery using bat-recover (3000 - 1320 = 1680).
  4. Some time later, the charging current drops as low as 10mA so we decide that we need to reset the laptop. First, we note down the Net ACR as reported by bat-recover, which is 430mAh.
  5. After resetting the laptop, we run bat-recover again, with a goal of delivering a further 1250mAh to the battery (3000 - 1320 - 430 = 1250).
  6. Some time later, the charging voltage gets too high and charging stops. We note down the Net ACR from this session of 660mAh, then we reset the laptop.
  7. After the reset, we once again start bat-recover. Our goal is now to deliver 590mAh (3000 - 1320 - 430 - 660 = 590).
  8. Some time later, the voltage gets too high and charging stops again, but the Net ACR from this run is 525mAh. This means we have now delivered 2935mAh to the battery (1320 + 430 + 660 + 525 = 2935). This is an acceptable figure, so we end the process here.
  9. Finally, we discharge the battery to the point where the laptop shuts off, then we use olpc-pwr-log to repeat the capacity test described at the top of this page. This time, the Net ACR field from olpc-pwr-log reports that 2972mAh was delivered to the battery during a regular charge. Again, this is a very satisfactory number, and at this point we can be happy that we have rebalanced the cells and that we are achieving the rated capacity of the battery.
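The bookkeeping in the example above can be reproduced with a running tally. This sketch uses the figures from the example (1320mAh from the regular charge, then bat-recover runs of 430, 660 and 525mAh) and the approximate 3000mAh goal; the function name is made up for the example.

```python
def tally_runs(initial_mah, run_net_acr_mah, goal_mah=3000):
    """Return (label, delivered_so_far, remaining_goal) after each stage."""
    delivered = initial_mah
    stages = [("regular charge", delivered, goal_mah - delivered)]
    for number, net_acr in enumerate(run_net_acr_mah, start=1):
        delivered += net_acr
        stages.append((f"bat-recover run {number}", delivered, goal_mah - delivered))
    return stages


for label, delivered, remaining in tally_runs(1320, [430, 660, 525]):
    print(f"{label}: {delivered} mAh delivered, {remaining} mAh to go")
```

The "to go" value after each stage is the goal for the next bat-recover run.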

Other options

A previous version of this page indicated that you should fully discharge the battery before starting bat-recover. You would then use bat-recover to try and deliver the full 3000mAh. While we do not believe this to be necessary, you might consider trying it if the above procedure fails to increase the battery capacity to normal levels. Note that bat-recover may take up to 40 hours to deliver 3000mAh.

A user following the above instructions reported that after fully discharging the battery, charging to 50% (with the usual means) then starting bat-recover, he was able to restore the battery capacity in just 12 hours (much quicker than using bat-recover on an empty battery).

Upon failure

If the above bat-recover procedure fails to deliver charge to the battery, it suggests that the battery's internal resistance is too high to receive charge. Repeating the bat-recover process may work, or an expert with proper equipment may be able to recondition the battery by disassembling it and "bottom balancing" the cells if one cell is charged up (high resistance) but the other is dead (unable to produce enough current). This may be the case if the battery will accept some charge but its capacity is still unusably limited after bat-recover. If the battery will not accept a charge at all, the chemistry of at least one of the cells is destroyed, and the battery should be treated as broken and unrecoverable. Good cells from batteries with one failed cell may be paired and balanced for use in rebuilt batteries.