Activity ring

From OLPC
Revision as of 18:20, 27 September 2007 by Danw (talk | contribs) (→‎Shared memory: mention PSS)
The Activity ring in the home view.

According to the OLPC Human Interface Guidelines:

[T]he section of the ring that a given activity occupies directly represents the amount of memory that the particular activity requires to run, providing immediate visual feedback about memory constraints and exposing a means for resource management that doesn't require knowledge of the underlying architecture.

Several problems stand in the way of implementing this perfectly. The end result is that you cannot use the activity ring to debug the memory usage of an activity. The details are explained below.

Shared memory

Most activities share a lot of the same code (the Python interpreter, the GTK and D-Bus libraries, etc.). Fortunately, the kernel will only load a single copy of each of these binaries/libraries into memory, rather than loading a separate identical copy for each process. Unfortunately, this complicates the activity ring metaphor, because it means that much of the memory in the system doesn't neatly belong to a single process.

Currently the ring does this:

  1. Any shared memory mapping which is used by the sugar shell is completely ignored by the activity ring. For example, since libgtk-x11-2.0.so is used by the shell even when no activities are running, we consider all of its memory to be used by the shell.
  2. Any shared memory mapping used by more than one activity (but not the shell) is evenly divided among the activities that share it. So if you have 3 activities sharing a 9MB library, the ring pretends each of them is responsible for 3MB of that memory. If you then close one of the three activities, the 9MB would be re-split so that now each of the two remaining activities is using 4.5MB of it.

This means that starting or stopping activities will usually result in small adjustments to the amount of memory claimed by other activities (although we try to hide this fact by delaying those small adjustments until the ring is no longer visible on screen).
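The two rules above can be sketched in a few lines of Python. This is only an illustration, not the shell's actual code: the function name attribute_memory, the "shell" label, and the shape of the mappings dictionary are all assumptions made for the example, and sizes are in MB to match the numbers used above.

```python
from collections import defaultdict

def attribute_memory(mappings, shell="shell"):
    """Charge each mapping to the activities that use it.

    mappings: dict of mapping name -> (size_mb, set of process names).
    The process names are assumed to be activities, plus optionally
    the shell. Returns a dict of activity name -> attributed MB.
    """
    usage = defaultdict(float)
    for size_mb, users in mappings.values():
        if shell in users:
            continue  # rule 1: anything the shell maps is ignored
        for proc in users:
            # rule 2: split evenly among the sharers; a private
            # mapping (one user) is charged in full to that user
            usage[proc] += size_mb / len(users)
    return dict(usage)

# Three activities sharing a 9MB library are charged 3MB each.
three = attribute_memory({"libshared.so": (9.0, {"a", "b", "c"})})
# After one of them exits, the same 9MB is re-split as 4.5MB each.
two = attribute_memory({"libshared.so": (9.0, {"a", "b"})})
```

The re-split in the second call is exactly the small adjustment that the ring delays until it is off screen.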

(This measurement is similar to the PSS measurement in recent -mm kernels, except that PSS divides shared library mappings among all processes, not just activities, and of course, it doesn't special-case mappings used by the shell. It's unclear whether that comes out to "more accurate" or "less accurate" overall, but given how much simpler/faster to calculate it is, we probably want to use it once we ship a kernel that provides PSS.)
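For comparison, here is a rough sketch of how a per-process PSS total could be read on a kernel that exports it. It assumes the kernel writes one "Pss:" line per mapping, in kB, to /proc/<pid>/smaps; the helper names total_pss_kb and read_pss_kb are made up for this example.

```python
def total_pss_kb(smaps_text):
    """Sum the 'Pss:' lines of a /proc/<pid>/smaps dump (values in kB)."""
    total = 0
    for line in smaps_text.splitlines():
        # Match only the per-mapping "Pss:" field, not other fields
        # whose names merely begin with "Pss".
        if line.startswith("Pss:"):
            total += int(line.split()[1])
    return total

def read_pss_kb(pid):
    """Read and total the PSS of a live process (assumes Pss is exported)."""
    with open(f"/proc/{pid}/smaps") as f:
        return total_pss_kb(f.read())

# A shortened, invented smaps excerpt for illustration.
SAMPLE = """\
00400000-0040b000 r-xp 00000000 08:01 1234 /usr/bin/python
Size:                 44 kB
Rss:                  40 kB
Pss:                  10 kB
b7e00000-b7f00000 r-xp 00000000 08:01 5678 /usr/lib/libgtk-x11-2.0.so
Size:               1024 kB
Rss:                 900 kB
Pss:                 300 kB
"""
sample_total = total_pss_kb(SAMPLE)
```

Unlike the ring's own calculation, this total includes a share of mappings that the shell also uses, since the kernel knows nothing about the shell's special status.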

Other possibilities for shared memory handling

There is no way to make this problem go away completely; the memory really is being shared by multiple processes, so there's no way we can reliably pretend that it isn't:

  • If we divide it evenly among them (like we do now), we get the small fluctuations.
  • If we instead assign all of the memory for a shared mapping to the first activity to use it, that would badly distort the wedge sizes, and would end up causing even larger fluctuations when it was necessary to reassign the memory to a different activity when exiting the first one.
  • If we just ignore the memory used by shared mappings altogether and have the ring only represent private memory mappings, we would misrepresent the amount of memory used by activities that have a "shared" library that no other activity uses (e.g., libabiword, libxul).

However, there are a few things we could do to squeeze some extra accuracy out if needed:

  • In addition to subtracting out shared mappings used by the shell, we could subtract shared mappings used by the datastore, shell service, and other background non-activity processes. (But it's not clear that these other processes use many mappings that the shell itself doesn't.)
  • Alternatively, we could just make the shell import libraries it doesn't actually need, but which are used by the majority of activity processes.

Non-activity memory

The ring shows the memory used by activities, and the free memory, but does not show the memory used by non-activity processes. As a result, the total amount of memory represented by the entire ring is not constant, and neither is the amount of memory represented by any particular percentage of the ring.

This means that an activity using a constant amount of memory may nevertheless grow or shrink in the ring, based on the memory used by non-activity processes.

In particular:

  • Starting a non-activity process (such as the developer console, or dhcpd) will reduce the amount of RAM available to activities, causing the free memory wedge to shrink, and thus causing the activity wedges to grow. (Likewise, stopping a non-activity process will cause activity wedges to shrink.)
  • If a non-activity process, such as the shell, the datastore, or a system daemon, has a memory leak, the effect on the activity ring will be that all of the activity wedges will slowly grow.
  • If an activity has a "helper process" of some sort, that process's memory usage won't be accounted for properly.
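The arithmetic behind the first two bullets can be made concrete with a small sketch. The function wedge_fractions and all the numbers below are illustrative assumptions (sizes in MB), and the minimum wedge size is ignored here.

```python
def wedge_fractions(total_ram, non_activity, activities):
    """Return the fraction of the ring taken by each activity and by free memory.

    The ring represents only activity memory plus free memory, so its
    total shrinks whenever non-activity processes use more RAM.
    """
    used = sum(activities.values())
    free = total_ram - non_activity - used
    ring_total = used + free  # equals total_ram - non_activity
    fractions = {name: mem / ring_total for name, mem in activities.items()}
    fractions["free"] = free / ring_total
    return fractions

activities = {"Browse": 40, "Write": 24}

# With 96MB used by non-activity processes the ring represents 160MB,
# so Browse's constant 40MB occupies 40/160 = 0.25 of the ring.
before = wedge_fractions(256, 96, activities)

# If non-activity usage grows to 128MB the ring represents only 128MB,
# and the same 40MB now occupies 40/128 = 0.3125 of the ring.
after = wedge_fractions(256, 128, activities)
```

Browse has not allocated anything in between, yet its wedge grows from a quarter of the ring to almost a third.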

Other possibilities for non-activity memory handling

The ring could be redesigned to include the "system memory" as another wedge. Then changes in the amount of non-activity memory used would just make the system memory wedge grow and the free memory wedge shrink, and the activity wedges would be unchanged. This is probably a bad idea though; regardless of how much memory the system processes actually use, keeping it constantly visible would probably result in lots of complaints about it being too much.

If there are activities that need to use helper processes, we should come up with a way for them to indicate that to the shell, so that the shell can include the memory used by the helper as part of the activity's memory.

Minimum wedge size

If wedges in the activity ring got too small, the icons would have to be shrunk, and eventually you wouldn't be able to recognize them. Also, it would just look bad.

Currently, the activity ring doesn't let wedges get smaller than 1/10 of the ring. This causes three problems:

  1. Although you can distinguish "big" activities from "huge" ones, you can't distinguish "small" ones from "tiny" ones; they both take up the same minimum wedge size.
  2. Since small activities take up more than their fair share of the ring, this forces larger activities to take up less than their fair share, meaning that existing wedges may have to shrink slightly to accommodate a newly-added minimum-size wedge. (Although as with the shared memory fluctuations, we try to hide this by making the adjustment when the ring is not visible.)
  3. As the number of activities gets larger, the amount of the ring taken up by minimum-size wedges increases, leaving less room to distinguish the sizes of the other activities. (When you are running 10 activities, each will take up exactly 1/10 of the ring, and you have no information at all about their relative sizes.)
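One way such a minimum could be enforced is sketched below, to show where the three problems come from. This is not necessarily how the shell does it: the function clamp_wedges is hypothetical, every wedge passed in is treated alike (the special handling of the free-memory wedge is not modelled), and all sizes are assumed to be positive.

```python
MIN_FRACTION = 0.1  # no wedge may be smaller than 1/10 of the ring

def clamp_wedges(sizes, min_fraction=MIN_FRACTION):
    """Map name -> memory size to name -> fraction of the ring.

    Wedges that would fall below min_fraction are pinned to it, and the
    rest of the ring is shared proportionally among the other wedges.
    """
    if len(sizes) * min_fraction >= 1.0:
        # No room left to show relative sizes: everything is equal.
        return {name: 1.0 / len(sizes) for name in sizes}
    pinned = set()
    while True:
        remaining = 1.0 - min_fraction * len(pinned)
        unpinned_total = sum(v for k, v in sizes.items() if k not in pinned)
        fractions = {
            k: min_fraction if k in pinned else v / unpinned_total * remaining
            for k, v in sizes.items()
        }
        too_small = {k for k, f in fractions.items()
                     if k not in pinned and f < min_fraction}
        if not too_small:
            return fractions
        # Pinning wedges shrinks the others, so check again.
        pinned |= too_small

ring = clamp_wedges({"huge": 120, "big": 60, "small": 4, "tiny": 1})
```

In this example "small" and "tiny" both end up at exactly 1/10 despite a fourfold difference in memory use (problem 1), while "huge" and "big" have to share the remaining 8/10 rather than the roughly 97% of the ring that their memory use would justify (problem 2).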

Other possibilities for minimum wedge size handling

At the moment, things work reasonably well on the XOs. Under sugar-jhbuild things are worse, because non-XO machines will generally have much more memory (RAM + swap), and so the free-memory wedge tends to dominate the ring. If future XO models have more RAM (and don't have correspondingly more bloated activities), the same thing would happen there.

One fix would be to make the ring larger in diameter, and possibly also push the icons further towards the outer edge, so that a standard-sized icon could fit into a smaller (percentagewise) wedge.

On the other hand, if you have lots of memory, you have less need to know the exact size of activities, so the loss of accuracy wouldn't really matter.

Other issues

Swap

Although the XOs have no swap, the kernel can still "swap out" pages of binaries and shared libraries (and icon and font caches, etc), because they are always available on disk, so it can load them back in later if it needs them. This means that sometimes even if there is almost no free memory, it is still possible to start another activity, because the system can just swap out pages from less-recently-used activities to free up some RAM.

The kernel keeps track of whether pages are "active" or "inactive", where "inactive" means that the page is eligible to be swapped out if needed. If it exported this information to us in some useful way, we might be able to use it to come up with a better estimate of free memory. (I'm pretty sure the Inactive field in /proc/meminfo is not usable, because it includes inactive dirty pages as well, which can't actually be swapped out on the XO.) Presumably we would add the inactive pages to the free memory wedge, but not remove them from the activity wedges? Either way, this will cause fluctuations in the ring.
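For reference, the fields in question can be pulled out of /proc/meminfo as in the sketch below. The helper names are invented for this example, and the sum it computes is only an optimistic upper bound: as noted above, Inactive also counts dirty pages that cannot be dropped on the XO.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> number.

    Most values are reported in kB; the unit suffix is ignored.
    """
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields

def optimistic_free_kb(fields):
    """Upper bound on reclaimable memory: free pages plus all inactive pages."""
    return fields["MemFree"] + fields["Inactive"]

# Invented values for illustration; on a real system, pass in
# open("/proc/meminfo").read() instead.
SAMPLE = """\
MemTotal:       237000 kB
MemFree:          8000 kB
Inactive:        30000 kB
"""
estimate = optimistic_free_kb(parse_meminfo(SAMPLE))
```

A usable estimate would need the kernel to report only the inactive pages that are actually backed by files on disk.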

Another, somewhat drastic, possibility would be to build the XO kernel with swap completely disabled. This would mean we'd have less usable memory, but in return we'd get more predictable/explicable behavior.

Memory used by other processes on behalf of an activity

In some cases, an activity may cause some other process to use more memory. For example, dbus-daemon needs to do some amount of bookkeeping for each dbus service run by an activity, and activities may cache pixmaps and other resources in the X server. (Firefox in particular is notorious for using lots of X server memory, although the Browse activity appears to be much better.)

We don't currently track this memory at all. If it turns out to be an issue, we could find ways to account for it. (For example, the X-Resource extension lets us find out how much memory the X server is using on behalf of each client.)