Suspend and resume

From OLPC
Revision as of 05:50, 3 November 2007 by Gnu (talk | contribs) (Remove "Stub" indication)

The XO can suspend and resume with little effort; as there are no moving parts to spin down, total time to resume can possibly be brought down to below 100ms. As of this revision, the resume time is dominated by the time spent spewing debugging messages out the serial port; when we last checked with those messages turned off, we were somewhere in the 200ms range.

(The gen2 laptop should NOT attach its network via USB -- or at least not via an externally visible, powered USB bus.) Short suspensions would allow for suspension between keystrokes for slow applications, and for aggressive suspend modes that conserve significant amounts of power.

To modify a B2 to do a partial job of suspending, see B2 Suspend ECR. Further fixes were made between B4 and C2.

Categories of Suspensions

There's a fair bit of confusion about suspend and resume. There are many different power-saving features, some of which are suspend/resume and some of which are not. There are different kinds of suspends. This page is an effort to bring some clarity to the subject.

The two fundamental categories of suspends are 'manual' and 'automatic'. Manual suspends are done on specific command by the user, e.g. by closing the laptop lid. Such suspends should not be resumed without manual action by the user (e.g. opening the lid). Automatic suspends are done by the laptop on its own initiative, without input from or consultation with the user. These should be transparent to the user -- the user should not be able to tell whether the laptop is suspended or not. They should be automatically resumed whenever the laptop needs to do something.

The two kinds of suspends should not be confused, but numerous suspend bugs are reported because of exactly this confusion: for example, manual suspends that wake up when a packet arrives, when a key is hit, or when the battery charge changes.

Automatic Suspensions

The laptop should suspend automatically whenever the CPU and DMA are expected to be idle for a significant period of time. This is strictly a power-saving measure, and should not result in any user-visible changes to the screen or peripherals. Anything and everything that requires CPU attention should cause it to power up immediately. Before suspending, the system will set a timer to ensure that it wakes up before the next scheduled event in the kernel or any user process.
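The wake-timer rule above can be sketched as follows. This is only an illustration, not the kernel's actual logic: the function name, the `min_sleep_ms` threshold, and the use of 200 ms as the resume latency (taken from the rough figure quoted at the top of this page) are all assumptions.

```python
import math
from typing import Optional, Sequence


def plan_wake_time(now_ms: float,
                   event_times_ms: Sequence[float],
                   resume_latency_ms: float = 200.0,  # assumed, from the ~200ms figure
                   min_sleep_ms: float = 50.0         # hypothetical break-even threshold
                   ) -> Optional[float]:
    """Return the absolute time at which the wake timer should fire.

    Returns None when the idle period is too short to be worth a suspend,
    and math.inf when nothing is scheduled, in which case only an
    interrupt (key, packet, lid, ...) would wake the system.
    """
    if not event_times_ms:
        return math.inf
    # Wake early enough that resume has finished before the event is due.
    wake_at = min(event_times_ms) - resume_latency_ms
    if wake_at - now_ms < min_sleep_ms:
        return None
    return wake_at
```

With the defaults, an earliest event 1000 ms away gives a wake time of 800 ms, while an event only 220 ms away is too close to be worth suspending for.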

Unfortunately, the USB bus must be powered down when the CPU is; and the network chip is attached to USB. Thus the chip has a separate "wakeup" wire to ask the system to resume when the network chip needs service. Another impact is that the system can't suspend when the USB bus is in use by an external device (unless it's a USB mass storage device and has been fully allowed to write any cached info and quiesce itself).

A further complication is that the USB implementation constantly does DMA, to poll the plugged-in devices, whether or not any real user visible I/O activity is happening. Thus, the decision to enter an automatic suspension must ignore the DMA caused by the USB idle polling, as well as ignoring the presence of the network chip on the USB bus, in order to successfully decide to suspend.

Because DMA must be inactive when the CPU is powered down (there's no bus arbiter), the CPU will not be able to suspend whenever audio input or output is occurring; nor when the video camera is in use. Pending reads and writes to flash, SD, or network will all prevent automatic suspend.
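The conditions in the last three paragraphs can be collected into one decision function. The sketch below is a model for discussion, not the real implementation; every class, field, and function name in it is hypothetical. Note that the USB idle polling and the on-board network chip are recorded in the state but deliberately never consulted.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class UsbDevice:
    name: str
    is_mass_storage: bool = False
    quiesced: bool = False  # cached data written out and device idle


@dataclass
class SystemState:
    audio_active: bool = False
    camera_active: bool = False
    pending_io: Set[str] = field(default_factory=set)  # e.g. {"flash", "sd", "network"}
    external_usb: List[UsbDevice] = field(default_factory=list)
    usb_idle_polling_dma: bool = True  # always happening; ignored on purpose
    network_chip_on_usb: bool = True   # has its own wakeup wire; ignored on purpose


def auto_suspend_blockers(state: SystemState) -> List[str]:
    """List the reasons an automatic suspend must not happen right now."""
    blockers = []
    if state.audio_active:
        blockers.append("audio DMA")
    if state.camera_active:
        blockers.append("camera DMA")
    for target in sorted(state.pending_io):
        blockers.append(f"pending {target} I/O")
    for dev in state.external_usb:
        # Only a fully quiesced mass storage device may lose bus power.
        if not (dev.is_mass_storage and dev.quiesced):
            blockers.append(f"external USB device: {dev.name}")
    return blockers


def can_auto_suspend(state: SystemState) -> bool:
    return not auto_suspend_blockers(state)
```

Returning the list of blockers rather than a bare boolean makes it easier to report why a particular laptop never suspends.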

Automatic suspension is implemented with a new Linux kernel infrastructure called "cpuidle", which allows the CPU to decide what to do when it becomes idle.

Manual Suspensions

Once automatic suspensions are working, the only time that end-users would normally want to explicitly suspend the laptop is by closing the lid. It's conceivable that scenarios could be invented in which the user wants to suspend the laptop explicitly, such as when a USB device is plugged in and automatic suspend won't work. But they can do so by closing the lid. The laptop should only resume when the lid is opened, or to do an automatic full shutdown when the battery is about to go dead. Key presses, mouse motion, and network packets should all be ignored (with the lid closed, keys and mouse motions are ghosts, possibly from jostling or dropping the laptop). Indeed, the keyboard, mouse, USB, SD card, camera, and network should all be put into maximum power-saving mode (ideally with all power removed) when a manual suspension happens.
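The difference in wake-up policy between the two kinds of suspend can be written down as a small table. This is a sketch of the policy proposed on this page, with hypothetical enum and function names; it does not describe what any particular build currently does.

```python
from enum import Enum, auto


class SuspendKind(Enum):
    MANUAL = auto()     # lid closed by the user
    AUTOMATIC = auto()  # entered by the laptop on its own initiative


class WakeEvent(Enum):
    LID_OPEN = auto()
    BATTERY_CRITICAL = auto()
    BATTERY_CHARGE_CHANGE = auto()
    KEY = auto()
    MOUSE = auto()
    NETWORK_PACKET = auto()


class Action(Enum):
    RESUME = auto()
    SHUTDOWN = auto()
    IGNORE = auto()


def on_wake_event(kind: SuspendKind, event: WakeEvent) -> Action:
    """Decide what a suspended laptop should do about an event."""
    if kind is SuspendKind.AUTOMATIC:
        # Anything that needs CPU attention resumes immediately.
        return Action.RESUME
    if event is WakeEvent.LID_OPEN:
        return Action.RESUME
    if event is WakeEvent.BATTERY_CRITICAL:
        # Wake only to perform an automatic full shutdown.
        return Action.SHUTDOWN
    # Keys, mouse motion, packets, charge changes: ignored while manually suspended.
    return Action.IGNORE
```

The bugs mentioned earlier, such as a manual suspend waking up on a network packet, correspond to the `IGNORE` rows being treated as `RESUME`.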

There should be no need for the convention used in e.g. build 622, that pressing the power button for a brief time causes the system to suspend. Not only does this free up the power button for another use, but it also eliminates significant confusion about what conditions should cause a suspended laptop to wake up.

Perhaps the power button should be repurposed for "suspend to disk" (i.e. suspend to flash), which would allow the system to go into a state that consumes really tiny amounts of power (for only the EC), without requiring a reboot.

Manual suspensions should in general not be necessary in ebook mode. If the book is not being looked at, it should be closed. An ebook that is suspended with the power button and then stuffed into a backpack is likely to be woken up when the power button gets jostled. The main additional power saving available in manual suspend comes from turning off the screen.

Testing Suspensions

The most important tool for testing suspension is a setup for monitoring the power consumption of the laptop. The "Tinderbox" was designed to measure power consumption, and various measurements have been taken with it, but there's no automated (e.g. nightly) testing that can tell us whether our software, as it evolves, is burning more or less power than it did yesterday.

Richard Smith suggests examining the coulomb counter in the battery to determine how much energy is used over time. (cat /sys/devices/platform/olpc-battery.0/power_supply/olpc-battery/accum_current; it is in bizarre units.) That's a great idea, but requires that the tests be run with AC power off. This is easy for manual testing, but might require additional infrastructure in an automated testbed.
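A test script could read the counter before and after a workload and report the difference. The sketch below assumes that the file contains a single integer; the helper names are made up for illustration. Because the units are not documented here, it reports only the raw difference, and whether the counter rises or falls while discharging is left for the reader to check on real hardware.

```python
from pathlib import Path
from typing import Callable, Union

# Path as given above; only present on an XO running a kernel with olpc-battery.
ACCUM_CURRENT = Path(
    "/sys/devices/platform/olpc-battery.0/power_supply/olpc-battery/accum_current"
)


def read_accum(path: Union[str, Path] = ACCUM_CURRENT) -> int:
    """Read the raw coulomb counter (assumed to be one integer in the file)."""
    return int(Path(path).read_text().strip())


def measure_delta(workload: Callable[[], object],
                  path: Union[str, Path] = ACCUM_CURRENT) -> int:
    """Run workload() and return the raw change in the counter (after - before).

    No unit conversion is attempted. Run with AC power disconnected,
    otherwise the reading also includes charging current.
    """
    before = read_accum(path)
    workload()
    return read_accum(path) - before
```

For example, `measure_delta(lambda: time.sleep(600))` (after `import time`) would give the raw counter change over ten idle minutes, which can be compared between builds.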

For testing, it may be useful to retain a developer's ability to cause a manual suspension on command. If a configuration knob is set to enable this testing, the power button could continue to be used to suspend without closing the laptop.

Other Non-Suspend Power Saving Strategies

When the notebook is in ebook mode (and the keyboard and mouse are covered up), those peripherals should be unpowered, or should be put into their lowest possible power consumption.

The SD interface may require special consideration. The most common case is that there is nothing in the slot; this should not block automatic suspension, and indeed an empty slot should be almost completely powered down at all times, whether the CPU is suspended or not. Ideally the system should be able to detect an inserted card, but should spend almost zero power until a card is inserted. I do not yet know if the hardware is capable of this. If it isn't, perhaps inserting an SD card should require a specific software or hardware action by the user (e.g. pressing the power button, or clicking on something in the Journal).

The next most common case, an inserted but largely unused SD/MMC memory card, can eventually be handled by letting it write its cache and quiesce, then silently powering it down, and restoring its power later whenever the software needs to access it. For increased safety of its file system integrity, the file system should probably be sync'd or quietly internally dismounted before powering down the slot (and quietly remounted at the first access; /proc/mounts would continue to show it as mounted throughout, but the file system structures on the media would be synchronized as if dismounted).
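The idle power-down scheme for the slot can be modelled as a small state machine. This is a sketch only: the class, the three callbacks (`sync`, `power_off`, `power_on`), and the idle timeout are hypothetical stand-ins for whatever the driver and file system layers would actually provide.

```python
from typing import Callable


class PowerManagedSlot:
    """Powers a storage slot down after an idle period, and back up on access."""

    def __init__(self,
                 sync: Callable[[], None],       # flush file system state to the card
                 power_off: Callable[[], None],  # remove power from the slot
                 power_on: Callable[[], None],   # restore power and revalidate the card
                 idle_timeout_s: float):
        self._sync = sync
        self._power_off = power_off
        self._power_on = power_on
        self.idle_timeout_s = idle_timeout_s
        self.powered = True
        self.last_access_s = 0.0

    def access(self, now_s: float) -> None:
        """Call before any read or write; silently restores power if needed."""
        if not self.powered:
            self._power_on()
            self.powered = True
        self.last_access_s = now_s

    def tick(self, now_s: float) -> None:
        """Call periodically; powers the slot down once it has been idle long enough."""
        if self.powered and now_s - self.last_access_s >= self.idle_timeout_s:
            self._sync()  # media must be consistent before power is removed
            self._power_off()
            self.powered = False
```

The mount table is untouched throughout, which matches the idea that /proc/mounts keeps showing the card as mounted. The same wrapper could apply to a USB mass storage device.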

The same strategy would be fruitful for USB mass storage devices. They need not burn power forever when plugged in and used briefly.

USB non-mass-storage devices, such as network interfaces, serial ports, keyboards, mice, etc., may sit idle for a long time "unused" by the CPU and user software, yet have the potential to interrupt at any time from an external event. All of these devices would prevent automatic suspensions, and thus they cause battery life to drop dramatically. It may be useful to have a user-settable control which would allow such devices to be powered down when not in active use. A user who sets this control would be overriding the system's "safe" default decision to always keep the device powered on (and thus to avoid all automatic suspensions).