Per-Activity Power Usage

From OLPC
Revision as of 21:26, 16 April 2008 by Bemasc (Finished first version)

Background

As OLPC approaches its goal of highly effective power management, it will eventually be the case that different Activities cause the XO to draw different amounts of battery power. There has been some suggestion that users may wish to know how each Activity affects their battery life. By providing users with feedback about which Activities draw the most power, we may help them to extend their battery life, and also apply appropriate pressure on developers to increase the efficiency of their Activities. There might even be security benefits, in the case of activities running hidden spambots or password-crackers.

There are two main questions that a user might ask about each Activity: "how much power does it tend to use?", and "how much power is it using right now?". We may call these "historical" and "instantaneous" power usage.

The proposed power management systems derive most of their effectiveness from turning off the CPU when software does not need it. The system provides several important diagnostics for power management. The first is a Coulomb Counter, which allows the system software to determine how much charge has been drawn from the battery. The second is a wakeup history, which allows us to determine when the CPU was powered on or off.

Instantaneous Power Usage

Instantaneous power usage is not well-defined on a per-activity basis due to the multitasking nature of the OS. The instantaneous total power usage, by contrast, would not be difficult to determine: for example, one could sample the Coulomb Counter at one-minute intervals and note the rate of discharge.
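The total discharge rate can be sketched as follows. This is a minimal illustration, not the XO's actual interface: the function name, the assumption that readings arrive in coulombs, and the 6.5 V figure are all placeholders chosen for the example.

```python
def average_power_watts(charge_start_c, charge_end_c, interval_s, voltage_v):
    """Mean power over one sampling interval.

    charge_start_c / charge_end_c are accumulated-discharge readings in
    coulombs (an assumption; a real counter may report other units).
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    drawn_c = charge_end_c - charge_start_c  # coulombs drawn in the interval
    current_a = drawn_c / interval_s         # amperes = coulombs per second
    return current_a * voltage_v             # watts = amperes * volts


# Illustrative numbers only: 30 C drawn over 60 s at an assumed 6.5 V.
power = average_power_watts(1000.0, 1030.0, 60.0, 6.5)
```

Shorter intervals give a more responsive figure but amplify counter quantisation noise, so the one-minute interval is itself a tunable choice.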

If the system uses aggressive power management, then there may be several excellent proxies for instantaneous per-activity power usage. The best such proxy might be a combination of CPU utilization and system wakeups. The details of how to combine these two depend on the suspend system's idle detection strategies. For example, if idleness is detected solely on the basis of the time until the next scheduled event, then the best strategy might be to assign the entire power usage during each CPU-on period to the activity whose wakeup event started that period, or to the active activity if the CPU-on period was triggered by the user. If the idle detection is based on CPU usage, then one might instead divide power usage among activities in proportion to CPU usage. There are various intermediate options between these two; the best solution will probably require empirical tuning.
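The two extreme attribution strategies described above might look like the sketch below. The record layout (`waker`, `joules`, `cpu_s`) is hypothetical, invented here for illustration; the real wakeup history may expose quite different fields.

```python
def attribute_by_wakeup(periods):
    """Charge each CPU-on period entirely to the activity that woke the CPU."""
    totals = {}
    for period in periods:
        waker = period["waker"]
        totals[waker] = totals.get(waker, 0.0) + period["joules"]
    return totals


def attribute_by_cpu(periods):
    """Split each CPU-on period's energy in proportion to CPU time used."""
    totals = {}
    for period in periods:
        cpu_total = sum(period["cpu_s"].values())
        if cpu_total == 0:
            continue  # nothing ran; no basis for splitting this period
        for activity, cpu in period["cpu_s"].items():
            share = period["joules"] * cpu / cpu_total
            totals[activity] = totals.get(activity, 0.0) + share
    return totals


# Made-up CPU-on periods (hypothetical fields and values).
periods = [
    {"waker": "Browse", "joules": 4.0, "cpu_s": {"Browse": 1.0, "Write": 3.0}},
    {"waker": "Write", "joules": 2.0, "cpu_s": {"Browse": 0.5, "Write": 0.5}},
]
by_wakeup = attribute_by_wakeup(periods)
by_cpu = attribute_by_cpu(periods)
```

On this sample the two strategies rank the activities in opposite order, which is why the choice has to follow the actual idle detection policy.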

Historical Power Usage

If instantaneous power usage numbers can be calculated with satisfactory accuracy, then the historical power usage can be calculated trivially, using simple techniques like exponential weighting. However, it seems likely that the instantaneous power usage statistics will not be especially accurate, due to the heuristic nature of the methods.
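As a minimal sketch of exponential weighting, assuming one instantaneous estimate per activity arrives at regular intervals (the function name and the smoothing factor `alpha` are illustrative choices, not part of any existing design):

```python
def update_average(previous, sample, alpha=0.1):
    """Exponentially weighted moving average of power samples.

    alpha near 1 tracks recent samples closely; alpha near 0 gives a
    long memory. The first sample seeds the average.
    """
    if previous is None:
        return sample
    return alpha * sample + (1.0 - alpha) * previous


# Illustrative wattage estimates for a single activity.
history = None
for watts in [2.0, 2.0, 4.0]:
    history = update_average(history, watts, alpha=0.5)
```

Only one number per activity needs to be stored, which keeps the bookkeeping cheap.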

For historical power usage, we have a much more powerful tool: the Coulomb Counter. However, the Coulomb Counter only tells us aggregate system power usage. In order to determine the contribution of each activity to power usage, we must correlate the activity use history with the power usage history. This requires a model of how activities interact with regard to power usage.

The Simplest Model: Additive power

The simplest model for power usage is that Activity <math>i</math> draws a wattage <math>w_i</math>. We let <math>w_0</math> represent the power used by the base system when no activities are running. If the set of running Activities at a given moment is <math>R</math>, then the total power usage is <math>T = w_0 + \sum_{i\in R} w_i</math>.

Whenever the user starts or stops an activity under battery power, the system should log the change in coulombs since the last reading, the time since the last reading, and the list of activities that were running since the last reading. Multiplying the charge by the system voltage, each such record provides an equation for the number of Joules used: <math>J = \left(w_0 + \sum_{i\in R} w_i\right) t</math>, where <math>t</math> is the duration of that interval. To combine all of these measurements, we create a sparse matrix <math>A</math> with one row for each reading and one column for each Activity, plus a column 0 for the base system. An element <math>A_{ji}</math> is 0 if Activity <math>i</math> was not running during reading <math>j</math>. If Activity <math>i</math> was running during reading <math>j</math>, <math>A_{ji}=t_j</math>; since the base system is always running, <math>A_{j0}=t_j</math> for every reading. We also create a column vector <math>b</math> with one entry for each reading, with <math>b_j</math> equal to the number of Joules used during reading <math>j</math>.
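Building the matrix and the energy vector from such a log could be sketched as follows. The record format (coulombs, seconds, set of running activities), the activity names, and the 6.5 V value are assumptions for illustration; a dense list of lists stands in for the sparse matrix, and an always-on "base" column is included for <math>w_0</math>.

```python
def build_system(records, activities, voltage_v):
    """Turn logged readings into (columns, A, b).

    records: list of (coulombs_drawn, seconds, set_of_running_activities).
    A has one row per reading and one column per entry of `columns`;
    b holds the Joules used during each reading.
    """
    columns = ["base"] + list(activities)
    A, b = [], []
    for coulombs, seconds, running in records:
        # The base system is always on; an activity contributes its
        # running time only if it was running during this reading.
        row = [seconds if (name == "base" or name in running) else 0.0
               for name in columns]
        A.append(row)
        b.append(coulombs * voltage_v)  # Joules = coulombs * volts
    return columns, A, b


# Made-up log entries, for illustration only.
records = [
    (10.0, 60.0, set()),
    (30.0, 60.0, {"Browse"}),
    (50.0, 120.0, {"Browse", "Write"}),
]
columns, A, b = build_system(records, ["Browse", "Write"], voltage_v=6.5)
```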

We may now formulate the wattage problem as <math>Aw = b</math>, where <math>w</math> is a column vector of the per-activity wattages that we wish to know. The matrix <math>A</math> is in general nonsquare, so this system cannot be solved by inversion. However, there are many well-known fast techniques for determining the least-squares solution to an over- or under-determined linear system, such as Conjugate Gradients on the Normal Equations (CGNR). Additionally, due to measurement error or modelling error, it is possible that the least-squares solution may produce negative wattages. To prevent this in an optimal fashion, we may restrict every entry of <math>w</math> to be non-negative and use fast Quadratic Programming methods to solve the restricted problem, which is known as non-negative least squares.
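For a sense of what the restricted solve involves, here is a small sketch that uses projected coordinate descent. This method is swapped in purely for brevity; it is neither CGNR nor a general Quadratic Programming solver, and a real implementation would likely use a library routine. The function name and the sample data are illustrative.

```python
def nnls_coordinate_descent(A, b, sweeps=500):
    """Approximately minimise ||A w - b||^2 subject to w >= 0.

    A is a dense list of rows; each sweep updates one wattage at a time
    and clamps it at zero.
    """
    rows, cols = len(A), len(A[0])
    w = [0.0] * cols
    residual = list(b)  # residual = b - A w, and w starts at zero
    col_norms = [sum(A[r][c] ** 2 for r in range(rows)) for c in range(cols)]
    for _ in range(sweeps):
        for c in range(cols):
            if col_norms[c] == 0.0:
                continue  # activity never ran; leave its wattage at zero
            step = sum(A[r][c] * residual[r] for r in range(rows)) / col_norms[c]
            new_value = max(0.0, w[c] + step)  # project onto w >= 0
            delta = new_value - w[c]
            if delta != 0.0:
                for r in range(rows):
                    residual[r] -= A[r][c] * delta
                w[c] = new_value
    return w


# Made-up, exactly consistent data: base 1 W, two activities at 2 W and 0.5 W.
A = [[60.0, 0.0, 0.0], [60.0, 60.0, 0.0], [120.0, 120.0, 120.0]]
b = [60.0, 180.0, 420.0]
watts = nnls_coordinate_descent(A, b)
```

With noisy or under-determined logs the result is only a least-squares estimate, and activities that always run together cannot be told apart by any solver.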

Problems

This model is linear and additive; the presence of multiple activities does not affect how much power each one draws. This assumption is not very good, especially for high-power activities. If two activities would each draw full power independently, running them simultaneously will not double the power usage, because the hardware cannot draw more than its maximum. This model also does not account for level of user activity, network environment, or any other factors that may cause variation in power usage. Nonetheless, it is likely to be quite effective at distinguishing between the Activities that tend to draw a great deal of power and those that do not.

One possible improvement would be to allocate <math>w_1</math> to the backlight level, <math>w_2</math> to the DCON, etc. This will improve accuracy somewhat, but does not solve the fundamental problems with this model.