System software: Difference between revisions

From OLPC
Jump to navigation Jump to search
No edit summary
Line 78: Line 78:


** Since I assume each school will also have an uplink gateway, maybe the Offline version of Wikipedia could be put on that, and cached on the individual units as they access it?
** Since I assume each school will also have an uplink gateway, maybe the Offline version of Wikipedia could be put on that, and cached on the individual units as they access it?

*** The english wikipedia is around 1.4GB (text only). The other language wikipedias seem to be smaller. I suppose someone could go through and delete a large amount of useless information, but that would be a pretty big job. Still, if pruning useless content from wikipedia were to become a sort of community project, I don't doubt there would be a large enough amount of volunteers that wouldn't mind spending some time on it.


==== Grid computing ====
==== Grid computing ====

Revision as of 01:52, 24 May 2006

Software Ideas

System Software

A version of Touch Typing software to teach these kids to touch type, the faster you can work with a keyboard whatever age you are then the faster you can get on with solving the worlds problems and letting the world know about your solutions... ' eg unjustified government spending on military budgets that will eventully only lead to one thing, more War to justify more spending etc..."

And does someone not need to make clearer in your marketing for support of this project that it does not have to be the same person who turns the crank as types at the keyboard, that there is a shortage of electricity in developing nations not hands to turn cranks?

Operating System Selection

I would like to seriously question using Linux as the basis of the laptop. It's a path of constant tweaking and tuning. Linux is also further away from the "pure" microkernel architectures than the BSDs. In short Linux is like a F1 tuned for racing - but not as reliable. When working on a familiar hardware platform the BSDs become very robust and nice platform to work with. And more importantly without the horrible quality problems of the Linux kernel. FreeBSD kernel is also magnitudes more secure - it just is cleaner and better thought and developed with the focus first on stability and reliablity. Unlike Linux.. The bottom line is that the project has got the possibility to choose from others than just Linux as well. I'd go for Frisbee or mentioned reasons but there are others too.

  • (While I agree that *BSD might be a better option -- perhaps OpenBSD in particular for reasons of stability, et cetera -- I think the disparaging remarks about Linux-based operating systems were poorly conceived, exaggerated, and unnecessary as presented. - Chad Perrin)
  • Indeed: BSD is a good alternative. Another alternative may be an older Linux kernel. The 2.4 kernel line is pretty stable and good also. The current Linux kernel version is 2.6
  • (User-contributed) Yes, the older the Linux kernel is, the more stable it becomes. For example, the current 1.08 version of PuppyLinux is based on the 2.4 kernel, and can confidently deliver satisfactory GUI experience despite the resource constraints in the OLPC machine.


Why UNIX(like) at all?
Recently Negroponte stated that a "slimmer Linux" is needed for the $100 laptop. Like others here, I question why it has to be Linux. But I'll take that notion even further: Why does the OS even need to be Unix or a Unix work alike? Why is Negroponte so bent on Linux? There are other FOSS options that are much more focused on the desktop, yet still have Posix compliance.

  • Haiku-OS is an FOSS implementation of the BeOS tool kits with a FOSS kernel. It maintains binary and source level compatibility with BeOS R5, so all pre-existing free R5 software will work on it. (There is quite a bit)
  • Syllable OS Is a lot like BeOS. (But it's not BeOS) It's getting pretty mature.

Both these options are FOSS. If you really want the Unixy goodness, both also happen to have Posix compliant shells, so much of the Unix CLI stuff is portable. Both have fast native GUI systems and very modern design behind the internals. Both are extremely light weight compared to BSD or Linux + X + desktop env. And if one was really concerned with the stability of the kernel, it might be possible to run one of these alternative environments on top of a BSD or Linux kernel instead of their native kernels.

It might also be possible to buy out the BeOS R5 codebase from Palm Inc. or have them donate it to the project. They don't seem interested in doing anything with it. Then the OLPC project could release the the source under GPL and get help from the Haiku and Yellow Tab guys to whip it into shape for the $100 laptop. I really can't convey how great BeOS or Haiku would be project. It's a perfect match for a system with the $100 laptops specs.

Peer To Peer Distribution, for Electronic Text, Software, Email

Extending the original idea from below... this is more general then just about electronic text, though. In lack of a better term, let me call it "built-in support for non-real-time Internet connectivity", provided as shared service and usable by apps.

For example, I myself often read some web pages that I had downloaded while on the network at home while traveling, disconnected from a network, and of course when clicking on a link you get some stupid technical error message. Why can't the thing remember I want to read the linked page later and "queue" it somewhere? This idea is probably more much more relevant in some OLPC scenarios than it is for myself; what if you are connected to the "Internet by Motorbike" say only once every two weeks, as in the Motoman project in Cambodia?

This applies to many forms of data, from electronic content be it a complete ebook, HTML page, Email or some software to download - or publishing of content such as homepage or blog updates, etc. (I think OneWorld has an XML-based publishing along those lines; but could be confusing it with something else.) Making it possible (and easy!) to request, and publish, data from one device, which then forwards the reqest to another, and ultimately forward to Internet when connected. Doesn't it make you feel like good ol' FIDO Net is back?

Vorburger 20:06, 9 February 2006 (EST)

I believe one of the most useful purposes of the laptop will be to distribute electronic text (i.e. e-books). The distribution of any information without a persistent network connection will be difficult. I imagine a situation where 1 out of 100 or 1000 kids may have access to a network connection. The peer to peer network could be used to distribute e-books from that single network connection to 1000 kids. I'd like to propose the design of a peer-to-peer network client designed specifically for this purpose.

A simple, graphical language-localized client would be designed to present a catalog of e-books. The client would pull down a listing of books available in a certain age-range for a targeted language. The student would pick texts that he or she has interest in. This list of requests would consist of a very small packet of data, perhaps a unique identifier of the device and a unique identifier of the text. When the device sees another device, it would off load it's packet to the peer device and vice-versa. Each device would contain a listing of requests from all of the peers that it came in contact with. The next time a device connects to the Internet, it would pull as many texts as allowable by pre-defined memory limits (say 3-5 meg). As the device comes into contact with other devices, it would deliver the texts to the other devices. Hopefully, over time, the requestor would be delivered some of the texts originally requested. As each text is delivered, a delivery or cancellation notice would be sent back through the peer network.

The peer to peer network should gather performance intelligence over time. It should be able to guess which routes have better chances of making a request and returning a delivery.

If a proof of concept proves to work well, the peer to peer network would be extended to handle two-way communication for interaction such as email or the submission and grading of assignments.

--65.7.133.163 04:41, 27 January 2006 (EST) Jason Hoekstra - jason@solexinc.com


To continue on with what Jason and Volburger said, there is a fundamental conflict in OLPC. On one hand, you need to keep costs down and must therefore have small persistent storage. On the other hand, the purpose of the laptop is for learning, and learning requires the storage of information. What a child needs to learn with is really nothing more than an encyclopaedia and a simple way to navigate it, but it is probably not feasible to store an entire encyclopaedia in the available space, especially if you include multimedia, which you definitely should.

The approach suggested by Jason, which is not a bad one at all, is to retrieve information based on interest. I have some experience in mobile, ad hoc, sensor, and peer to peer networking, and what he's proposing is something similar to rumor routing and directed diffusion. There are a couple of problems with the suggestion, though. The first is that you will likely have a lot more requests for books than space to keep them. In a store and forward network with limited space and unpredicatable mobility, you are unlikely to hang onto enough books that you will be able to satisfy the requests. The second problem is that in all likelihood, some children will have far more Internet access than others, due to geographical location, being able to afford transport, or whatever the case may be. This may form a book distribution tree rooted at a few children with many children as leaves, which will cause significant distribution problems. Basically, a few children will need to store and forward books for a large number of children, and will quickly fill up their space without satisfying many requests. This depends on the particular country and situation of course, but in general, the branching factor and depth of the distribution tree can have serious repercussions on how many of the requests are satisfied.

I personally agree with Jason that a distributed, peer to peer filesystem is necessary. You may only have half a gigabyte of space, but there are a lot of half gigabytes running around, hopefully within a few hops of each other. So step one is that you need a routing protocol to form multihop ad hoc networks. There is a lot of literature on this; look at MobiHoc if you need a starting point. Step two is that you need a discovery protocol to learn what books are available and who has them, the aforementioned catalogue. Note that this catalogue needs to be updated as well, but the updates can be distributed using controlled flooding. And the reason I'm writing this whole thing is that I think you need a decent data (book) dissemination protocol. You are dealing with a sparsely connected network of resource starved nodes, with intermittent connectivity and esoteric mobility patterns. I believe you need some sort of centralised logic, such as a tracker in the Bittorrent protocol. If the computers belonging to the children that have Internet access can cooperate on which books to download and store, and if the computers themselves can try to form a rough topology of who is connected to who (in terms of the books they want), then you are in a much better position to allocate your resources to satisfy the most requests.

I hope this helps and I wish you the very best of luck in your endeavours.

Emerson Farrugia (emerson AolpcT runelands D0T net) - March 17th, 2006


E-mail client application

Perhaps will the project prefer to use a website like GMail for e-mail purposes. If not, as the author of tinymail, I'm willing to make sure tinymail will be suitable for the device. Tinymail is a project that aims to create a E-mail client development infrastructure for creating E-mail clients for small devices. At this moment it can show large IMAP and POP folders using less than 5 megabytes of memory. Tinymail is licensed as LGPL.

https://svn.cronos.be/svn/tinymail/trunk

-- Philip Van Hoof <pvanhoof at gnome dot org>

Distributed Filesystem?

Will the Wikipedia Offline fit into 512 MB (or even 1 GB) ? Even if it does, how about some software and other textbooks loaded at the same time? Clearly, the storage on one device is very limited... but: What if data could be spread over several laptops, a sort of built-in distributed filesystem like Coda or MogileFS - do these make any sense on a device like this, with the goal of enhancing storage capacity through distribution? In a school, every of say 100 children has 1/100th of Wikipedia - instead of clogging each device with a complete copy.

Vorburger 20:06, 9 February 2006 (EST)

  • Wikipedia will fit only in DVD. Small Linuxes can be embedded in that DVD, but for now, only Morphix and PuppyLinux are ready for CD/DVD media, with PuppyLinux well-suited to the limited resources of the OLPC machine.
    • Since I assume each school will also have an uplink gateway, maybe the Offline version of Wikipedia could be put on that, and cached on the individual units as they access it?
      • The english wikipedia is around 1.4GB (text only). The other language wikipedias seem to be smaller. I suppose someone could go through and delete a large amount of useless information, but that would be a pretty big job. Still, if pruning useless content from wikipedia were to become a sort of community project, I don't doubt there would be a large enough amount of volunteers that wouldn't mind spending some time on it.

Grid computing

It would be interesting if software were included to allow meshed machines to create an ad-hoc grid/cluster computer. It would be useful for things like compiling software, rendering and other CPU intensive tasks. (Stuff that I imagine some of the more advanced users, High School age, might want to do). A distributed file system would be a central part of that.

  • A practical alternative, one that can be done now, is to use content in DVD (as suggested in the previous section). Some "hotspots" covered by these DVD-augmented laptops can be setup in a community, providing distributed servers for giving out content as well as hosting discussions. As the OLPC machine has USB port, adding DVD drive to it is not difficult. - Raffy, April 27, 2006.

Better-performing Flash Filesystem

The proposed JFFS2 filesystem was designed for NOR-type Flash memory, which has very different timing characteristics from the cheaper NAND-type Flash memory used in USB thumb drives and, presumably, the laptop. YAFFS is a GPL'ed open-source journalling filesystem designed specifically for NAND Flash memory that is claimed to use less RAM for its tables and generally outperform JFFS2, and they are working on YAFFS2, which is tweaked to be faster and to work with the new larger, 2KB-page-size NAND devices.

YAFFS has the following technical advantages over JFFS2:

  • It uses NAND Flash memory better, making it faster (about 2X), more space-efficient and wearing the memory chips out less quickly
  • It is faster at mounting a filesystem: a hand-waving example of startup time for a 128MB device is 3 seconds instead of 25
  • It uses far less RAM for its internal tables
  • It scales better: JFFS2 is said to fall apart above 256MB because its internal data structures get too big while YAFFS is known to work well up to 2GB (the laptop currently aims at 512MB)
  • It stores error-correcting codes for all data, which is essential since NAND Flash is supplied not 100% perfect and degrades over time
  • YAFFS provides some features lacking from JFFS2 (hard links, memory mapped file writing)

JFFS2 has the following advantages over YAFFS:

  • It has built-in write-time data compression
  • It is included in the standard Linux kernel

YAFFS has a home page and a there is a technical article which goes into depth on the differences between NOR and NAND flash memory and the drawbacks of using JFFS2 with the NAND type.

It would be worth running comparative performance tests on the two filesystems, because there are big potential performance wins on several fronts. In-filesystem compression isn't everything, slows all file operations down and, when used without error correcting codes onto an unreliable medium, risks major data loss.

Martin Guy 4 March 2006

Jörn Engel is currently working on a new flash file system called logfs. It is not yet clear if it will hit the mainline kernel in time for consideration for the first generation laptop, but it is progressing fast. It should combine all the advantages listed for either for the two file systems above with a new clean design. In particular, the mount time and memory footprint is independent from the device size, unlike the existing file systems.

I don't think that YAFFS can be considered an option for OLPC at this point because of missing compression and the quality of the code.

arnd 12 March 2006

3d software rendering

As the system does not include hardware accelerated 3d rendering, a software rendering library may be included to wrap the OpenGL (OGL/ES maybe) API and create rendering code on the fly. This, even on a machine with limited clock speed can provide a rendering performance paragonable to that of some integrated 3d chipsets, especially if the resolution is kept low. There are some existing tools that can be leveraged for this; for example, Vincent is an OpenGL/ES implementation that provides software rendering for constrained devices like cell phones; SwShader, precursor of transgamings' SwiftShader and many others. Having (limited) OpenGL capability does add some capabilities to the device without requiring additional hardware.

Software Installation, Package Manager, Central Repository

How relevant is a polished end-user friendly Package Manager? With limited memory, are you more likely to uninstall and try another application and install back one? In the beginning, how important is it to be able to very easily get patched new versions of the software? Underlying question: Is a central repository of applications desirable? Completely open, anybody can submit their (pre-compiled) package?

Vorburger

Should there be an easy way to install and remove applications from the device without corrupting the system image? I am thinking of something like klik (http://klik.atekon.de/). -- DPalmerJr

I am on a team developing a deeply embedded losely connected ARM-based Linux system (64 MiB RAM, 512 MiB disc). We have discovered the hard way that it's best to support in-field upgrades -- right from day 1. Even with an effective release management + testing/validation team, specs will change, improvements will be made, bugs will slip through. Our devices are connected via slow satellite links and connect to our infrastructure as infrequently as once per month. We cannot feed a lot of data through the link without blowing our power budget. Even if/when we are willing to risk an over-the-air in-field upgrade, we may not have the bandwidth/power budget. We have found conventional package managers (dpkg, rpm) are too coarse-grained when dealing with skinny pipes and power budgets. A package manager supporting deltas would be preferable. We have even considered downloading source patches and re-compiling on the embedded device. Your network will be faster than ours, so YMMV.

System development + testing will benefit from a slick patch/upgrade mechanism too.

I don't think it's unreasonable to expect to upgrade the devices via the mesh cluster - upgrade one device and the rest can upgrade from it. Use public-key-encryption to sign 'blessed' packages.

I consider a well-thought-out, secure, trustable, user-controlable package management system to be critical to system stability, extensibility, maintainability, and ultimately to the success of this project. -- BCL

Laptop as USB-Drive

It would probably be useful if the laptop could be accessed as a USB-Drive, like a digital camera.. In the Software Development context hackers could probably also configure File Sharing via the WiFi... but simple "USB cross cabling" could be interesting to end-users because it's: a) most simple, b) secure, probably OK to give access to entire filesystem, if locally attached, c) doesn't need Wifi; the nearest Internet Cafe in a bigger town will let children/teacher USB-connect their laptop to one of their stations to copy over a newly downloaded application, but not have a Wifi basestation; at least not where I have travelled in India.

Vorburger

Maybe a software can be developed for this. Since the system is going to be "Linux Based", just accesing the filesystem should allow to configure almost everything. A software that gives access to the filesystem (and emulate a camera or an USB thumb), could be included. Or maybe, a special cable provided with the laptop (that uses one special of the 3 USB ports) could allow direct access to filesystem. (or with a switch somewhere in the laptop that even without power makes it work as a USB-Drive, even with the posibility of charging batteries while connected).

Gandolfi

Hard-Reset built-in

Curious kids will certainly easily manage to screw up the software side of the device - and they should! A built-in hard-reset that can re-initialize the OS etc. from ROM; sort of like some modern laptops have a hidden partition on the HDD that can re-install without the usual Recovery CD, could be useful.

You always have the problem of personal data, files, and configuration settings. Some solution for that would have to be provided; e.g. easily copy to your friend's device over the wireless network?

Vorburger

There's a problem in the Microsoft Windows world with newly-installed systems. You have to go on-line to get the latest security patches from Microsoft. But as soon as you go on-line with an unpatched system you're at risk of infection from viruses.

The reset operation could be integrated with the patch/upgrade mechanism whereby the system will only install secure signed OS-level packages until either the system or the user decides it's OK to open the doors for business. -- BCL

Font technology

Which font technology is to be used?

The character encoding will be Unicode. In that case it is important that an advanced font technology such as OpenType or Graphite is available. It is also necessary to have rendering software for screen and printer that can intelligently combine glyphs from fonts. The choices are Uniscribe (Windows), ATSUI (Macintosh), SIL Graphite (Linux or Windows), Pango (Linux and any other system that can run Free Software), or TrollTech Scribe (Linux or compatible). Of these, Graphite is the most powerful, but it is not yet in widespread use. Pango is the most widely deployed rendering engine for Linux.

The best, of course :-). Fontconfig does fonts substitution on a linguistic level, beyond what Windows and the Mac does. Pango is probably the most advanced layout library around, though further work for some scripts is needed. The graphite description says that Sil is working on integrating it with Pango. - jg

For European languages such as French and Spanish an ordinary font technology such as TrueType is fine. For languages using Latin script yet using accented characters which do not each have a precomposed Unicode character, including many in Africa, an advanced font format is necessary. This is so that glyph substitution can take place to convert a sequence of a base character followed by a combining accent into a "looks right" display. Any rendering engine with any font containing the appropriate glyphs can put an accent mark over a character, but only OpenType can specify exactly where the mark should go for best appearance.

Freetype, used by almost everything these days on open source formats, handles a plethora of font types, from Type 1, to TrueType, to OpenType; note that anyone wanting to introduce yet another font format had best be examining how to do it as a Freetype plugin - jg

Arabic script systems (Arabic, Farsi, Urdu, etc.) need an advanced font technology and an advanced rendering engine. Chinese does not need an advanced font technology system. For languages of the Indian subcontinent typewriter-like displays can be achieved without an advanced font technology. For full support of conjunct ligatures an advanced font technology is needed, and similarly for other Asian alphabets (Sinhalese, Lao, Khmer, Myanmar, Tibetan, Mongolian, etc.).

We know of some open issues with Thai & pango, but believe that they can be solved and that Pango handles most languages already (e.g. Arabic, the Indic languages. Please help determine where further work may be needed. - jg

The eutofont font format

Some time ago William Overington devised a font format using character codes from the Unicode Private Use Area.

(Note by Ed Cherlin: Every font format allows the use of PUA codes. They are used for writing systems not encoded in Unicode, such as Klingon.)

(Note by William Overington: Ed Cherlin states "Every font format allows the use of PUA codes." Yes. Yet that is an item different from what I was trying to say. I was trying to say that the eutofont font format actually uses Unicode Private Use Area code points in the font format itself with the effect that a font is expressible as a sequence of Unicode Private Use Area characters. So, if the eutofont format were used to produce a font of just the twenty-six letters of the English alphabet, all of them regular Unicode characters, the font would be a string of Unicode characters, most of them from the Private Use Area. An end user need not be aware that Private Use Area codes have been used in the producing of the font and the end user does not need to use the Private Use Area codes directly.)

As far as I know it has never been implemented. However, I mention it here in case readers might like to have a look at the documents and decide whether it might be of any use for the laptop project.

I named it the eutofont font format.

http://www.users.globalnet.co.uk/~ngo/eutofont.htm

The eutofont font format has glyph substitution facilities and also has chromatic font capability.

The system could be extended if font technology needs are required which the eutofont font format presently described does not support.

Please note the use of Fontconfig on open source systems for font naming and substitution - jg

Regarding the use of Private Use Area codes: by using them a compactness of font size would be possible which an XML based font system might not be able to achieve: in due course, if the eutofont font system were successful, maybe codes would be added to regular Unicode, though that would lose some compactness as the codes would not be in plane zero; however, in the short term the Private Use Area codes would be needed.

William Overington

11 March 2006

Automated Language Localization of some Preset Sentences

I have for some time been interested in whether it would be of practical use (rather than just fun in researching what can and cannot be done) to have a collection of sentences and part sentences defined and translated into many languages, each sentence or part sentence having a code number, with the idea that an author may construct a message using one such code number or a sequence of such code numbers and then the code numbers could be used by a software system in the computer of a recipient of the message in conjunction with a small database of code numbers and the text of the sentences in a chosen language so as to produce a localized message displayed for the recipient.

For example, suppose that there were only two sentences from which to choose and that these have been encoded as sentences 21011 and 21012.

The English database would contain the following.

21011 It is raining.

21012 It is snowing.

The French database would contain the following.

21011 Il pleut.

21012 Il neige.

The database could be translated into as many languages as desired and possible.

So, if someone whose preferred language is English is authoring a message and wishes to send the message "It is raining." then he or she looks throgh the database using whatever search tools that are available at his or her location and encodes the message as 21011 and then sends it.

So, if someone whose preferred language is French receives the message then the text "Il pleut." can be displayed automatically.

So, if there were more sentences than that and also sentences with a parameter such as for "The temperature here is P1 degrees Celsius." where the value of parameter 1 is sent as a digit string (possibly including a decimal point) to accompany the 21852 code of the parameterized sentence, and that list of sentences were available in many languages, then, for example, weather information could be broadcast on a pan-European basis on an interactive television channel and localized automatically in interactive televisions in, for example, England, France, Italy, Finland and Latvia.

As to how to encode such a system, well there are various possibilities. I started off using a deliberately unusual yet valid sequence of regular Unicode characters to act as a key that would be most unlikely to occur in any other use context, namely a comet, a circumflex accent and an enclosing keycap design. I have also looked at using Unicode Private Use Area characters. It has been suggested to me that XML would be the best approach, though I have reservations as I would like a system where a short sequence could be added into a plain text file without having to restructure the whole document, however I am unsure of that so it is possible that XML would be the way to go.

I am wondering whether the technique, whether using the key or the Private Use Area codes, or using XML, or otherwise, could be useful for autolocalizing some part of the education process. For example, a sentence such as "Please tell your teacher that you have now completed the task." and such as "You have chosen the correct answer." and "Well done.".

I did a little with the idea theoretically some time ago.

http://www.users.globalnet.co.uk/~ngo/c_c00000.htm

I never got it beyond English!

A later development was to incorporate the LOCODE concept so as to specify names of places that were to be localized, such as the way Firenze is expressed as Florence in English and London is expressed as Londres in French.

http://www.unece.org/cefact/locode/service/main.htm

William Overington

17 March 2006

Email Client requirements

Email is the only well known internet application that doesn't depend on a working TCP/IP connection to the internet. It's model is the paper postal service where there are only one or two connections per day, when the postie visits the letterbox.

It is very likely that these laptops will be in the situation where the link to the outside world will be a fragile connection running at very low speeds. If it's a modem line it's likely that the quality is so poor that echo cancellation will fail; this will limit the speed to 2400bps duplex (higher if half duplex). This is not enough for a shared web connection for thirty kids.

This is okay for email with some rules:

  • The email client must be self contained.
  • The MTA must be light and capable of very versatile store and forward without help from DNS.
  • The MTA on the client must be capable of ad-hoc forwarding. ie the child can tell it to give their mail to another client, one who's going to school today.
  • The client must have good facilities for splitting files into multiple emails (and joining) so a maximum message size of say 16kb would not be a problem.
  • The ability to put the mail on a USB key. The bandwidth of a real postie with a pocket full of USB keys could be rather high.

A good model for this might be the old FidoNet networks, though a cleaner addressing scheme would be nice.

Having just email is not as limiting as you might imagine you can access most of the internet by email.

-- Robert de Bath -- March 2006

PS: I just did the math, I've got a 1Gbyte flash key so my bandwidth on the daily commute to work is 99kbps!