9.1.0 requirements: Difference between revisions

From OLPC

Revision as of 09:05, 14 August 2008


Release 9.1.0 Overview

This is a time-based release per the process at:
Release Process Home

The process is not final. It is a set of rough guidelines still being worked out and subject to change.


Technical Strategy

Faster, more robust, better integration of activities, more collaboration...

See requirement definition at Requirements

Sales - Deployment Strategy

See customers below.

Pedagogical Strategy

....

Primary Release Drivers

Date: The goal (not confirmed) is to make this release public sometime between December 2008 and June 2009.

Customers

Features


Target Deployments

See: User talk:Gregorio#Shipment Quantities and Languages

Uruguay

Spanish
100K x XOs?

http://wiki.laptop.org/go/OLPC_Uruguay

Peru

Spanish
100K x XOs?

http://wiki.laptop.org/go/OLPC_Peru

http://wiki.laptop.org/go/Peru_activity_pack

40K XOs coming from 656

100K XOs probably coming from 708 (exact build tbd) Gregorio 10:46, 31 July 2008 (UTC)

Mexico

Spanish
50K x XOs

Mongolia

Mongolian?
20K x XOs

Rwanda

French
10K x XOs

Haiti

Kreyol
13K x XOs

Ethiopia

Amharic
5k x XOs

Cambodia

Khmer
1000 XOs

Afghanistan

Dari
3000 x XOs

Thailand

Thai
500 x XOs

India

Devanagari
500

Brazil

Portuguese
200 x XOs

Arabic

Sabra and Shatila (Lebanon)
500 x XOs

Oceania

English?
500 x XOs

Italy

Italian
600 x XOs

Turkey

Turkish?
15k x XOs

Senegal

French
1000 x XOs

Argentina

Equatorial Guinea

Panama

Spanish

Birmingham

English

South Carolina

English

New York City

English

G1G1

English, other?

China

?

Pakistan

?

Priorities from Deployments

Should list what, who and why on each item. Should link to detailed requirements and use cases eventually.

See User talk:Gregorio#Priorities from Carla

Activity Related Work

Requests from Birmingham.
New activities:
Firefox, Flash, mplayer
A spreadsheet (I'm going to use gnumeric until Socialcalc is ready for production)
A more full featured pdf reader (yes, I know there's the Read activity, but I'm planning on installing the full evince and making sure Firefox can open pdfs with it)
A document editor the user can insert images into (the full Abiword would be great for this, but the current workaround [after editing a file] is to open write, then hit ctrl+n)

General requests:
Make it easy to sugarize third party applications.

Support Java in Browse. Needed to see Scratch web site

Longer Battery Life

Requirement Definition

A list of UI actions which may affect power usage: Requirements#Power Management Requirements

Background on power draw, including a breakdown of watts used in different modes: http://wiki.laptop.org/go/XO_Power_Draw#The_Numbers

Our job is to increase battery life in as many modes as possible.

Suggestions from John/Gnu for increasing battery life follow.

Significant technical actions that could reduce the laptop's power draw:

  • Allow users to disable the mesh: reduces power consumption of the WiFi chip, allows the WiFi chip to enter power-save mode, and most importantly, allows lid-closed suspend to totally power off the WiFi chip. #6955
  • When the lid is closed, and the mesh is off (WiFi could be off or on), the laptop should totally power off the WiFi chip. This would lengthen the duration of lid-closed suspend (before totally draining the battery) from about 8 hours to several days. WiFi with an access point rather than mesh is now the default configuration in most schools, so this would have a dramatic impact on most deployments, particularly those with off-grid power. #7879
  • Put the EC into deep sleep when in a lid-closed suspend. This will cut its current draw from about 60 mA to about 20 mA (check with rsmith), which would further increase the duration of a lid-closed suspend before the battery drains.
  • Speed up Resume, which currently takes more than 1000 ms. A cheap shot at this is to unload the USB bus modules. The Extreme power management setting was supposed to do this ("USB-disabling") but it doesn't yet. This would make auto-suspends much less visible to the interactive performance of the laptop, probably cutting the resume time to 500ms, which would let us enable auto-suspend in more circumstances. Beyond this cheap hack, we should actually fix the USB drivers so that you CAN be using the USB -- the wifi/mesh in particular -- and still not hang for half a second uselessly on every resume.
  • Fix serious bugs in resume and networking. These have so far prevented us from turning on autosuspend by default for three major releases (650, update.1, and 8.2.0), and in each release, we decide at the last minute that we can't fix these bugs because it's too late in the release cycle and we didn't fix them earlier. Multicast, ARP, mDNS, the Presence Service, and collaboration must all work in the presence of autosuspend. Much work has gone into fixing these, but neither the final nits, nor polish, nor system-level testing in a network testbed, has occurred.
  • Reduce packet traffic during sharing -- particularly multicast traffic, which loads many machines (and wakes them up from autosuspend). This will require diagnosis and improvement of the presence and collaboration protocols. Some work was done on this, after the high packet traffic melted the WiFi bandwidth in early deployments (e.g. turn on 30 laptops in the mesh, none of them can usefully get anything done), but our main response was to tell later deployments "Don't use the mesh, use access points" -- a significant reversal that we should keep working to fix.
  • Improve cpu idle detection. Currently we can't auto-suspend very often (only after >1 minute of no user interaction) because it's so heavy-handed. The cpuidle infrastructure in the kernel should be able to tell us when it's safe to suspend because no process wants access to the CPU for the next few seconds. Most of the pieces are there.
  • Reduce useless polling by kernel, daemons, and activities. There's still a major Python bug that causes multi-threaded pyGTK2 programs to poll uselessly once a second (#4680, #4677). Much work was done to close it, but there is still a small sprint required to get it done. This not only reduces power use directly, but also allows the system to suspend because it doesn't appear that activities need to do some work soon.

Who requested

Kim, Carla, John/Gnu

Touchpad

Requirement Definition

Who requested

Improved in 8.2.0
Priority request from Carla

Collaboration

9.1.0 Collaboration Requirements Phase 1

Allow for intuitive mesh connection and activity-sharing.
Must support first two use cases defined at: Use Cases#Collaboration Examples

Requirements#Mesh / Connectivity_Requirements

Requirements#AbiWord Sharing Requirements

http://lists.laptop.org/pipermail/sugar/2008-July/007407.html

May need to include that link in developer section below, tbd

Collaboration Requirement Phase 2

Interschool, other use cases.

Who requested

Priority request from Carla and David.
Also several threads on OLPC-Sur.
Peru technical leaders

Inter-school communication

What I see as most missing and most necessary is a safe space for collaboration between students at different schools, even in different countries.
http://lists.laptop.org/pipermail/localization/2008-July/001249.html

Groups

4043

Performance and Reliability

Faster activity launching
6472

- File/activity open

- File save

- Task switching

- Activity or main GUI responsiveness to cursor

- Hardware

Suggestions from Marco

From: http://lists.laptop.org/pipermail/sugar/2008-July/007471.html

1 General UI sluggishness.

Without going too deep into the stack, I think there is a lot of space for improvement even in the python code. Michael mentioned icon cache and mesh view layout, both areas of the code that didn't get any love for a while and where I'm convinced we can make big, quick progress. I'm planning to spend a lot of my time on this in the next release cycle.

2 Datastore performance.

I'd expect improving jffs2 performance would be a big win there. The UI can probably be smarter in the way it uses the datastore too. Finally, we can try to improve at the datastore level, but there we clash with the rewrite plans. I think we should find an agreement on the datastore design and just try and get it done for 9.1.0; there are too many things blocking on it.

We need to allocate time and focus to do some good profiling and then attack the problems we find. That's why I'm advocating for the next Sugar release cycle to be largely about performance improvements, bug fixes and code refactoring, with the possible exception of the datastore rewrite.

File Management

Candybag: Idea for a new activity, Candy Bag. You open a bag (i.e. you launch the CandyBag activity), then you put journal entries in it, then sharing this activity means that your friends can grab a candy in your bag.
From: http://lists.laptop.org/pipermail/devel/2008-July/017459.html

See also point #3 at
http://wiki.laptop.org/go/9.1.0#e-mail_from_Ben_S

Copy all files from the journal to a USB device with one keystroke.
From Bastien

Allow listing, copying and finding files with standard Unix commands in the terminal view.
See also point #1 at: http://wiki.laptop.org/go/9.1.0#e-mail_from_Ben_S

Localization

Spell checking as part of l10n. (Sayamindu and Sur list)

Allow adding a language after the build is created and without OLPC intervention

Chinese support (Kim)

Better RTL support

Security Activation and Deployability

School server push of XO images

Requirement: The XO should be able to get the latest build from the school server

  • The administrator makes the desired build available in the designated directory. When ready, the administrator requests to 'push' the build to all laptops as they come online. Both of these actions should be available through an easy-to-use UI at the school server.
  • A test requirement is the ability to create a whitelist of serial numbers that should be upgraded with the push.
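
The whitelist test requirement can be illustrated with a minimal sketch. The function name, the serial number format, and the rule that "no whitelist means push to everyone" are assumptions made for illustration; none of them come from the requirement itself.

```python
def select_push_targets(online_serials, whitelist=None):
    """Return the serial numbers that should receive the pushed build.

    whitelist=None is taken to mean that no whitelist is configured, so
    every laptop that comes online is eligible (an assumed behaviour).
    """
    if whitelist is None:
        return sorted(online_serials)
    allowed = set(whitelist)
    return sorted(s for s in online_serials if s in allowed)


# Hypothetical serial numbers, for illustration only.
online = ["SHF00000001", "SHF00000002", "SHF00000003"]
print(select_push_targets(online, whitelist=["SHF00000002"]))
```

A test deployment would pass a short whitelist first, then drop it once the build has been verified on those machines.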

Source: Peru

Activation lease security feature

Requirement - if the laptop is stolen and doesn't contact its local school server within some period of time (the activation lease time), then it will tell the user that it will not activate on next boot and provide the date and time.

Requirement - it is not possible to set the date on the laptop to keep it within the lease period or to force it to outside the lease management. This might mean you cannot change the date or there is no root access, or it might mean an alternate time source is used... (not trying to solve the problem, just want to note this requirement).
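
The two lease requirements above can be sketched as a simple expiry check. This is only an illustration: the function names and the message wording are hypothetical, and `trusted_now` stands in for whatever tamper-resistant time source ends up being used, which the requirement deliberately leaves open.

```python
from datetime import datetime, timedelta


def check_lease(last_contact, trusted_now, lease_length):
    """Return (expired, expiry_time) for an activation lease.

    trusted_now must come from a clock the user cannot change;
    how to obtain such a clock is outside this sketch.
    """
    expiry = last_contact + lease_length
    return trusted_now >= expiry, expiry


def lease_message(last_contact, trusted_now, lease_length):
    """Return the warning to show the user, or None if the lease is valid."""
    expired, expiry = check_lease(last_contact, trusted_now, lease_length)
    if expired:
        return (f"Lease expired at {expiry:%Y-%m-%d %H:%M}; "
                "this laptop will not activate on next boot.")
    return None


# Illustrative values: a 14-day lease, last renewed on 1 August 2008.
last = datetime(2008, 8, 1, 8, 0)
print(lease_message(last, datetime(2008, 8, 16, 9, 0), timedelta(days=14)))
```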

Source: Peru

#6689 and #7878: Expand customization key capabilities.

XO Monitoring

Requirement: Provide XS database and an API so that countries can create reports and monitoring for various aspects of XOs:

  • Version of code
  • Which laptops are being seen each day
  • Total number of laptops being seen per day
  • Number of laptops accessing the internet
  • List of URLs being accessed
  • List of URLs per laptop
  • Which activities are being used per laptop
  • Number of minutes in each activity per laptop; can we determine (and subtract) idle time?
  • Number of minutes in each activity within and outside of school hours (perhaps this means we capture the 'start' time of each activity and allow the reporting to decide if this is during or outside of school hours).
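
As a rough illustration of the last item, a report could classify each recorded session by its start time. The record layout, the school hours, and the weekday rule below are all assumptions; the requirement only says that the start time is captured and that the reporting side decides what counts as school hours.

```python
from collections import defaultdict
from datetime import datetime, time

# Assumed school hours; a real deployment would configure these.
SCHOOL_START = time(8, 0)
SCHOOL_END = time(13, 0)


def minutes_by_period(sessions):
    """Sum minutes per (serial, activity), split by school hours.

    sessions: iterable of (serial, activity, start_datetime, minutes).
    A session is classified by its start time only, Monday to Friday.
    """
    totals = defaultdict(lambda: {"in_school": 0, "out_of_school": 0})
    for serial, activity, start, minutes in sessions:
        in_school = (start.weekday() < 5
                     and SCHOOL_START <= start.time() < SCHOOL_END)
        key = "in_school" if in_school else "out_of_school"
        totals[(serial, activity)][key] += minutes
    return {k: dict(v) for k, v in totals.items()}


# Hypothetical session records, for illustration only.
sessions = [
    ("SHF00000001", "Write", datetime(2008, 8, 14, 9, 0), 30),
    ("SHF00000001", "Write", datetime(2008, 8, 14, 18, 0), 15),
]
print(minutes_by_period(sessions))
```

Classifying by start time keeps the laptop-side logging simple, at the cost of misattributing sessions that run across the end of the school day.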

Background and implementation ideas:
I think there might be some work in the XO to make the information available; and a database and API spec from the school server. This is not as high priority for their deployment as the passive lease management, but I believe this feature will be important for any deployment. We should try to get feedback from other deployments as to the information we want to collect.

Source: Peru

Customized Startup Images

Must allow changing startup picture by copying an image to the XO. This should not require any OLPC developer intervention and should not need a new or special build.

http://lists.laptop.org/pipermail/devel/2008-July/017299.html

Source: Colombia and Mexico before.

Already present in 8.1.0. See: http://wiki.laptop.org/go/Tweaking_the_boot_animation

Should allow setting the language at the factory or when upgrading/re-imaging. The idea is to not require the kids to click through the control panel to get to their language. One idea is to include it in the choices made on first boot (e.g. name and color); better would be to set the language at install.
Source: Bastien for Haiti.

Allow Key Customers to Sign Their Own Builds

Uruguay request

Network Manager Connections

G1G1 encryption.

802.11i (AKA 802.1x) Uruguay

GUI for managing network connections.

Better default network connection choice algorithm (e.g. if USB-Ethernet is connected, use it first; source: A Callahan). This algorithm should be "smart" in its first choice but should also be adjustable in the GUI.
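
One possible shape for such an algorithm is a default priority list that the GUI can override. The connection type names and the default ordering below are assumptions for illustration; only "USB-Ethernet first" comes from the request above.

```python
# Assumed connection kinds and default order; only "USB-Ethernet first"
# is taken from the requirement.
DEFAULT_PRIORITY = ["usb-ethernet", "wifi-ap", "mesh"]


def choose_connection(available, user_priority=None):
    """Pick the preferred connection among those currently available.

    user_priority, if given, replaces the default ordering (this models
    the "adjustable in the GUI" part). Kinds missing from the ordering
    rank last, with ties broken alphabetically for determinism.
    """
    priority = list(user_priority) if user_priority else DEFAULT_PRIORITY

    def rank(kind):
        return priority.index(kind) if kind in priority else len(priority)

    candidates = sorted(available, key=lambda k: (rank(k), k))
    return candidates[0] if candidates else None


print(choose_connection({"mesh", "usb-ethernet"}))
print(choose_connection({"mesh", "wifi-ap"}, user_priority=["mesh", "wifi-ap"]))
```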

Unadorned and unedited user feedback

6th grade kids feedback on build 656:
http://sextosdela37.blogspot.com/2008/04/analizando-el-uso-de-las-laptop-en-el.html

Priorities from Carla: User talk:Gregorio#Priorities from Carla

Comments from technical user in Ecuador:
http://lists.laptop.org/pipermail/olpc-sur/2008-July/000408.html

GUI and Usability

Request from Uruguay for HW alerts:
http://lists.laptop.org/pipermail/sugar/2008-July/007086.html

Design ideas from Scott and Eben in thread, e.g. I hope our alert system will use the freedesktop.org standard: http://www.galago-project.org/specs/notification/index.php

More prominent help information - Brianne, G1G1
One idea from Eben. From: http://lists.laptop.org/pipermail/sugar/2008-August/007547.html
All of our initial discussions on help focused around a contextual help system, and I still hope that this is where we'll be taking this in the future. By embedding (?) icons within the secondary palette menus for various devices, objects, activities, and even individual buttons and controls, we can provide a way to launch into the help activity and dive directly to the relevant info for the activity, control, etc. selected. In addition, I'd like to support a community driven help system by which, in addition to the activity/olpc provided help, it's possible for kids to add tips, tricks, images, tutorials, and other info to these sections for later consumption by peers.

This is a noble, but ambitious goal, which is why a simple and static help activity is the present solution, and why it's only integrated into the system at a single point - the activity itself.

Show size of files in journal and in general make it easier.

Priorities from Engineering

Upgrade journal per: http://wiki.laptop.org/go/OLPC_Human_Interface_Guidelines/The_Laptop_Experience/The_Journal

Less flickering in starting activities and other screen changes - Michael

Fedora 10 rebase?

Faster Upgrade

Make it easier to upgrade many laptops in the field. Several cases:

Updates:

  1. XOs in warehouse need update before being shipped.
  2. XOs in school with no WAN
  3. XOs in school with thin WAN pipe (BW <= 1Mb/s)
  4. XOs in school with good WAN (BW >= 1Mb/s)

Possibly sub-variants of in school case for Mesh, Wireless AP, XS.


Improved Clipboard

Here is a rough specification of the intended design for a usable multi-item clipboard. I'll add a list of related tickets shortly.

e-mail from Ben S

http://lists.laptop.org/pipermail/sugar/2008-July/007390.html

1. The datastore
Sugar's design calls for a centralized rich data storage system, the datastore. The datastore provides secure, limited file access to Activities, manages file metadata, maintains a differentially compressed history of all work, ensures reliable backups to a trusted server, and mediates the connection to removable media. Every one of these features is crucial to Sugar's functioning, and almost none are really working at this time. We cannot afford another release based on the present datastore, as it fails to implement the features we require, and is unreliable even in the features it supposedly implements.

Solution:
There have, at this point, been at least five distinct proposals for a next-generation datastore design, all differing in underlying implementation and user-facing functionality. We need to have a Once And For All datastore summit, draw up a compromise datastore design, and implement it. We can do this by 9.1.0, if we are willing to make it a priority.

Additional Links:

- Expose the file system on save, open and in the journal. Source: Marvin Minsky

2. OS Updates
We now have hundreds of thousands of laptops deployed in the field, running a variety of OS versions. OLPC cannot afford to support a multitude of decrepit versions, and children cannot afford to suffer defects that have long since been fixed. We need a reliable, fast, update system that does not rely on the network, so that children everywhere can move to the latest version of Sugar without losing their data. The update system must support tremendously invasive upgrades, like repartitioning the NAND and replacing JFFS2, because we expect to do this in short order.

Solution:
A secure usb autoreinstallation stick is required. It is not technically challenging to implement, but it must be made a priority, and then be made widely available and idiot-proof.

3. File Sharing
Students and teachers have no good way to distribute files directly from one person's Journal to another. If all Activities that open a file do not implement Collaboration, then there is simply no way to transfer that file over the network. This is the most basic possible network functionality --- FTP was standardized in 1971 --- but it is completely missing from our system.

Solution:
A number of technical proof-of-concept programs have been written for distributing files, using methods like HTTP over stream tubes and Cerebro's Teleport. There is an excellent set of UI mockups for this functionality (see Specifications/Object_Transfers). All that is left is to Get It Done.

Additional Links:

4. Activity Modification
A keystone of the Sugar design has always been the user's ability to edit any Activity, and to cement this a "View Source" key was designed right into the hardware. This functionality is simply missing, and that prevents us from making our principal claim regarding an emphasis on user modification.

Solution:
"Develop" must be polished, finished, and included by default. This will require modifications to the core system, in order to support an endless variety of slightly modified Activities. It will also require work on the Develop program itself. If volunteer efforts are not moving fast enough, OLPC must ensure that someone is working on the problem as a professional.

Additional links:

5. Bitfrost
Sugar, as it currently stands, is among the least secure operating systems ever, far less secure than any modern Linux or Windows OS. I can easily write an Activity that, when run by the user, escalates to root privileges and does anything I like with the system. Given Sugar's competitive status against Windows XO, this failing threatens the very existence of the project. The Sugar designs have long stated that safely running untrusted code from a classmate is a key goal for learning, but the current software accomplishes precisely the opposite.

Solution:
NO ONE IS WORKING ON BITFROST. That's right. Everyone who was working on Sugar security (after activation) has either left OLPC or moved into another role. Someone must be assigned to continue the security work, or it will certainly never make progress. Anyone who _does_ take on this challenge will start from a much better position than previously, because many of the Vserver features have moved into the mainline kernel over the last few versions. The kernel now contains a number of new, powerful isolation and control primitives.

6. Power management
Power management is the raison d'etre of the XO hardware. It is the reason that the hardware took four times as long to develop as a standard laptop, the reason that we suffer from the closed Marvell operating system, the reason that OLPC's best engineers flew around the globe fighting with details of voltage and capacitance. In an increasingly crowded low cost laptop market, it is one of OLPC's few remaining distinctions. As of 8.2.0, aggressive suspend is available, but its functionality is still far from the target.

Solution:
Enabling aggressive power management is a major challenge, perhaps more difficult than anything else on this list. We know what is required for a first step: ensure that we can reliably wake up from a hardware timer.

This single feature would be enough to enable a basic sleepy approach that is truly transparent to software. Other work includes removing USB from the critical path on resume. Aggressive suspend may not be ready for 9.1.0, but if no one is working on it, it will never be ready.

Suggestions from CScott

  • Better "legacy app" compatibility; this should allow us to run existing apps like firefox, pidgin, and gimp in a reasonable manner:
    • Replace matchbox window manager with XMonad
    • Switch to standard freedesktop.org startup notification mechanism
    • Implement freedesktop.org notifications mechanism for alerts (low battery, low disk space, available software update)
    • Refactor home/friends/mesh view as operations on root window, so they make sense in a multiwindow setup.
  • Improved antitheft mechanisms:
    • ECO fix and EC improvements
    • Security control panel, with "am I stolen" and lease renewal buttons: #1502
    • olpcrd work: #7397
  • Improved update mechanisms
    • Real COW for pristine versions, allowing...
    • ...binary-diff updates over http (avoiding rsync in many cases)
    • Integration of core OS updates with software update control panel
    • Integration of software updates with notification system
  • Security/isolation work
    • Implementation of P_SF_CORE, P_SF_RUN
    • Mechanism to validate updates to loopholed activities & allow user to manage exceptions
    • Persistent activity storage
    • NSS modules for rainbow users and for olpc/root users
    • Rainbow
  • Journal improvements
    • Proper display of 'new versions' of activities
    • Tagging system to allow folder-like management for advanced users
    • Datastore rework
    • Direct execution of activities from datastore (avoid installation step): #7713
    • Ability to have multiple versions of an activity installed concurrently. (API work necessary: #7713)
  • Translation improvements
    • translation system should look in local, then activity, then system translation tables, then repeat for each in a set of fallback languages (e.g. Quechua, Spanish, English)
    • separable translation packs
    • wiki-like editing of translatable labels in the UI
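
The translation lookup order suggested in the list above (local, then activity, then system tables, repeated for each fallback language) can be sketched as follows. The nested-dictionary table layout and the language codes are assumptions made for illustration; this is not how Sugar or gettext stores catalogs.

```python
SCOPES = ("local", "activity", "system")


def translate(msgid, languages, tables):
    """Look up msgid following the proposed fallback order.

    languages: ordered fallback list, e.g. ["qu", "es", "en"].
    tables: {scope: {language: {msgid: text}}} (hypothetical layout).
    Returns the untranslated msgid if no table has an entry.
    """
    for lang in languages:
        for scope in SCOPES:
            text = tables.get(scope, {}).get(lang, {}).get(msgid)
            if text is not None:
                return text
    return msgid


# Illustrative tables: a locally edited Quechua label, with Spanish
# provided by the system tables.
tables = {
    "local": {"qu": {"Stop": "Sayay"}},
    "system": {"es": {"Stop": "Detener", "Keep": "Guardar"}},
}
print(translate("Stop", ["qu", "es", "en"], tables))
print(translate("Keep", ["qu", "es", "en"], tables))
```

Searching all three scopes for one language before moving to the next means a system-level Quechua string wins over a locally edited Spanish one, which matches the order described in the suggestion.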

Replacement File System

The current file system used on the XO (JFFS2) was developed in the days of 32/64MiB NOR flash devices and does not scale well to the 1GiB NAND devices we are using today, and it certainly will not deal well with larger devices in gen 1.5 and gen 2 systems. The main issues are that JFFS2 needs to scan all blocks at mount time and that the garbage collection algorithm consumes more and more CPU cycles as the system fills up. The first issue manifests itself as large delays at boot to mount the file system when it is full (need to quantify this), and the second issue simply means fewer CPU cycles can be dedicated to user task processing. In the last few years, there has been much activity in the embedded Linux world on alternatives to JFFS2, and we need to investigate these to determine which one best fits our needs:

  • YAFFS2 : Has been around for several years and in active use in the embedded space. It does not provide compression out of the box as the devices that use it often carry already compressed data such as MP3 and MPG files.
  • UBIFS: A new file system recently merged into the kernel designed for use on large flash devices. Supports compression out of the box and supports disabling compression on a per-inode basis, which is something we desire.
  • LogFS : Another JFFS2 replacement designed for fast operation and scalability. Still fairly new and not in very active use AFAIK.
  • Btrfs : A very new file system that is primarily targeted at large storage systems, with features such as snapshotting, RAID support, and online fsck. David Woodhouse has expressed that he will look into getting this to work on NAND, but it is too new a code base to consider for 9.1.
  • Managed NAND : (Not an option for existing XO machines, but might be a direction for future hardware) The industry seems to be moving away from raw NAND in favor of a NAND-chip plus microcontroller solution. The microcontroller implements a Flash Translation Layer that hides the low level NAND details, presenting a higher level interface that looks like an ordinary hard disk, so an ordinary filesystem layout like ext3 could be used. The driving force behind managed NAND is rapid hardware evolution toward multi-level-cell NAND chips and smaller process geometries, which change important NAND parameters like page sizes and ECC. The situation is similar to what happened with hard disks in the 80's, where raw-disk interfaces gave way to SCSI and IDE, which hide the ugly low-level details that changed every technology generation. Managed NAND options include LBA-NAND from Toshiba (pin-compatible with NAND chips), eMMC from multiple vendors (with an SD/MMC-compatible hardware interface), and various "disk on chip" products (IDE-compatible interface).

Development Process

  • Analyze and quantify the use patterns that trigger issues with JFFS2.
  • Gather requirements for a new file system that go beyond solving above issues.
  • Develop/integrate a set of tests for FS performance and reliability that take into account our use patterns, the above issues, and any other features we wish to validate.
  • Study the three main alternatives (YAFFS2, UBIFS, LogFS) to determine what requirements they meet
  • Run our FS test suite on above alternatives
  • Analyze test results and requirements study and make a decision on which alternative to use.
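
One test that such a suite might contain is a timed burst of small synced writes, run against a mount point of each candidate file system. The sketch below is only an assumption about what a single test could look like; the function name, file count, and file size are arbitrary, and it is not an agreed test plan.

```python
import os
import tempfile
import time


def time_small_file_writes(directory, count=200, size=4096):
    """Write `count` files of `size` bytes each, syncing every file.

    Returns elapsed wall-clock seconds. The payload is random, so a
    compressing file system cannot shrink it; a fuller suite would also
    test compressible data.
    """
    payload = os.urandom(size)
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"f{i:05d}.bin")
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
    return time.perf_counter() - start


# Demonstration against a temporary directory; a real run would point
# at a directory on the file system under test.
with tempfile.TemporaryDirectory() as d:
    elapsed = time_small_file_writes(d, count=50)
    print(f"50 files written in {elapsed:.3f}s")
```

Repeating the run at different fill levels would help quantify the degradation described for JFFS2 as the device fills up.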

Deployment Process

Once a file system is chosen from the above process, we will have to:

  • Integrate the file system into our kernel if it is not upstream (YAFFS2 or LogFS)
  • Integrate the file system support packages into our build system and file system images
  • Update our build scripts to generate the proper image type for

Activity Impact

This change should not have any impact on activity developers as it is at a very low level of the stack.

End User Impact

End users will see faster boot up time and will not see (as much) performance degradation as the file system fills up.

Key Modules and Relevant Module Roadmaps