System software
Revision as of 12:18, 5 June 2006
Software Ideas
Usability
Include in the OS an on-screen translucent (watermark) representation of the keyboard that indicates which key is pressed. This would help users learn to touch type, since they would not have to look down at the keyboard. It could be turned on or off (better yet, varied in opacity from 0 to 100%). Otherwise, how many will even know to try typing while looking at the screen? It would also help in low light [as was suggested under hardware by another]: many users won't have power for lighting, yet will only be able to use the laptop once school and chores are completed, which leaves only late in the day, and many will be in latitudes with SHORT winter days.
System Software
A version of touch-typing software to teach these kids to touch type: the faster you can work with a keyboard, whatever your age, the faster you can get on with solving the world's problems and letting the world know about your solutions, e.g. unjustified government spending on military budgets that will eventually lead to only one thing: more war to justify more spending, etc.
And shouldn't the marketing for this project make clearer that the person who turns the crank does not have to be the person who types at the keyboard? Developing nations have a shortage of electricity, not of hands to turn cranks.
Peer To Peer Distribution, for Electronic Text, Software, Email
Extending the original idea from below... this is more general than just electronic text, though. For lack of a better term, let me call it "built-in support for non-real-time Internet connectivity", provided as a shared service and usable by apps.
For example, I often download web pages while on the network at home and then read them while traveling, disconnected from any network. Of course, clicking on a link then gives you some stupid technical error message. Why can't the thing remember that I want to read the linked page later and "queue" it somewhere? This idea is probably much more relevant in some OLPC scenarios than it is for me: what if you are connected to the "Internet by Motorbike" only once every two weeks, say, as in the Motoman project in Cambodia?
This applies to many forms of data: electronic content, be it a complete ebook, an HTML page, an email or some software to download, as well as the publishing of content such as homepage or blog updates. (I think OneWorld has an XML-based publishing system along those lines, but I could be confusing it with something else.) The goal is to make it possible (and easy!) to request and publish data from one device, which then forwards the request to another, which ultimately forwards it to the Internet when connected. Doesn't it make you feel like good ol' FIDO Net is back?
Vorburger 20:06, 9 February 2006 (EST)
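The queueing idea above could be sketched roughly as follows. This is only an illustration, not a design from the project: the class name DeferredQueue, its methods, and the fetch callback are all hypothetical, and a real service would also need persistent storage and some way to route replies back to the requester.

```python
import json
from collections import deque


class DeferredQueue:
    """Hypothetical store-and-forward queue for requests made while offline."""

    def __init__(self):
        self.pending = deque()

    def request(self, url):
        # Remember the request instead of failing with a network error.
        if url not in self.pending:
            self.pending.append(url)

    def export(self):
        # Serialize the queue so it can be handed to another device.
        return json.dumps(list(self.pending))

    def merge(self, payload):
        # Accept requests carried over from another device.
        for url in json.loads(payload):
            self.request(url)

    def flush(self, fetch):
        # Try every pending request; keep the ones that still fail.
        results = {}
        remaining = deque()
        while self.pending:
            url = self.pending.popleft()
            try:
                results[url] = fetch(url)
            except OSError:
                remaining.append(url)
        self.pending = remaining
        return results
```

A laptop would call request() when a link cannot be followed, hand export() to a neighbour or a motorbike courier, and the device that eventually reaches the Internet would call flush() with a real download function.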
Distributed Filesystem?
Will the offline Wikipedia fit into 512 MB (or even 1 GB)? Even if it does, what about software and other textbooks loaded at the same time? Clearly, the storage on one device is very limited... but what if data could be spread over several laptops, in a sort of built-in distributed filesystem like Coda or MogileFS? Do these make any sense on a device like this, with the goal of enhancing storage capacity through distribution? In a school, each of, say, 100 children would hold 1/100th of Wikipedia, instead of every device being clogged with a complete copy.
Vorburger 20:06, 9 February 2006 (EST)
- Wikipedia will fit only on a DVD. A small Linux distribution can be embedded on that DVD, but for now only Morphix and PuppyLinux are ready for CD/DVD media, with PuppyLinux well suited to the limited resources of the OLPC machine.
- Since I assume each school will also have an uplink gateway, maybe the Offline version of Wikipedia could be put on that, and cached on the individual units as they access it?
- The English Wikipedia is around 1.4GB (text only). The other language Wikipedias seem to be smaller. I suppose someone could go through and delete a large amount of useless information, but that would be a pretty big job. Still, if pruning useless content from Wikipedia were to become a sort of community project, I don't doubt there would be a large enough number of volunteers who wouldn't mind spending some time on it.
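One simple way to decide which laptop holds which article, as a rough sketch of the "each child has 1/100th" idea, is to hash the article title. This is not how Coda or MogileFS work; the function names owner_of and shard are made up for illustration, and the scheme assumes a fixed, known number of laptops.

```python
import hashlib


def owner_of(title, laptop_count):
    """Map an article title to a laptop index in the range 0..laptop_count-1."""
    digest = hashlib.sha256(title.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % laptop_count


def shard(titles, laptop_count):
    """Group titles by the laptop that would store them."""
    shards = {index: [] for index in range(laptop_count)}
    for title in titles:
        shards[owner_of(title, laptop_count)].append(title)
    return shards
```

Every laptop can compute owner_of() locally, so no central index is needed. The obvious weakness is that an article becomes unreachable whenever its owner is absent or switched off, so a usable version would probably store each article on several laptops.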
Grid computing
It would be interesting if software were included to allow meshed machines to create an ad-hoc grid/cluster computer. It would be useful for things like compiling software, rendering and other CPU intensive tasks. (Stuff that I imagine some of the more advanced users, High School age, might want to do). A distributed file system would be a central part of that.
- A practical alternative, one that can be done now, is to use content on DVD (as suggested in the previous section). Some "hotspots" covered by these DVD-augmented laptops can be set up in a community, providing distributed servers for giving out content as well as hosting discussions. As the OLPC machine has a USB port, adding a DVD drive to it is not difficult. - Raffy, April 27, 2006.
Better-performing Flash Filesystem
The proposed JFFS2 filesystem was designed for NOR-type Flash memory, which has very different timing characteristics from the cheaper NAND-type Flash memory used in USB thumb drives and, presumably, the laptop. YAFFS is a GPL'ed open-source journalling filesystem designed specifically for NAND Flash memory that is claimed to use less RAM for its tables and generally outperform JFFS2, and they are working on YAFFS2, which is tweaked to be faster and to work with the new larger, 2KB-page-size NAND devices.
YAFFS has the following technical advantages over JFFS2:
- It uses NAND Flash memory better, making it faster (about 2X), more space-efficient and wearing the memory chips out less quickly
- It is faster at mounting a filesystem: a hand-waving example of startup time for a 128MB device is 3 seconds instead of 25
- It uses far less RAM for its internal tables
- It scales better: JFFS2 is said to fall apart above 256MB because its internal data structures get too big while YAFFS is known to work well up to 2GB (the laptop currently aims at 512MB)
- It stores error-correcting codes for all data, which is essential since NAND Flash is supplied not 100% perfect and degrades over time
- YAFFS provides some features lacking from JFFS2 (hard links, memory mapped file writing)
JFFS2 has the following advantages over YAFFS:
- It has built-in write-time data compression
- It is included in the standard Linux kernel
YAFFS has a home page, and there is a technical article which goes into depth on the differences between NOR and NAND flash memory and the drawbacks of using JFFS2 with the NAND type.
It would be worth running comparative performance tests on the two filesystems, because there are big potential performance wins on several fronts. In-filesystem compression isn't everything: it slows all file operations down and, when used without error-correcting codes on an unreliable medium, risks major data loss.
Martin Guy 4 March 2006
Jörn Engel is currently working on a new flash file system called logfs. It is not yet clear if it will hit the mainline kernel in time to be considered for the first-generation laptop, but it is progressing fast. It should combine all the advantages listed for either of the two file systems above with a new, clean design. In particular, the mount time and memory footprint are independent of the device size, unlike in the existing file systems.
I don't think that YAFFS can be considered an option for OLPC at this point because of missing compression and the quality of the code.
arnd 12 March 2006
Some corrections to the YAFFS marketing blurb:
- Error correction is done by the NAND subsystem and not by the filesystem. It's a necessity for NAND flash, and the NAND subsystem has provided that protection since the very beginning. JFFS2 just uses what's there. No need to reinvent the wheel.
- JFFS2 worked on 2k page size chips before YAFFS2 showed up
- JFFS2 has raised the bar in the boottime and scaling p*ssing contest. David improved mount time of a 512MiB FLASH down to less than 8 seconds and the RAM consumption has been reduced significantly too.
- JFFS2 works out of the box with the MTD subsystem, while YAFFS needs tweaks and patches and is hard to adapt to hardware ECC controllers
3d software rendering
As the system does not include hardware-accelerated 3d rendering, a software rendering library could be included to wrap the OpenGL (maybe OpenGL ES) API and generate rendering code on the fly. Even on a machine with limited clock speed, this can provide rendering performance comparable to that of some integrated 3d chipsets, especially if the resolution is kept low. This could allow educational software to use 3d rendering (physics and mathematics software could take advantage of this). There are some existing tools that can be leveraged for this: for example, Vincent, an OpenGL ES implementation that provides software rendering for constrained devices like cell phones; SwShader, the precursor of TransGaming's SwiftShader; and many others. Having (limited) OpenGL capability adds some capabilities to the device without requiring additional hardware.
Software Installation, Package Manager, Central Repository
How relevant is a polished, end-user-friendly package manager? With limited memory, are you more likely to uninstall one application to try another, and then reinstall the first one later? In the beginning, how important is it to be able to get patched new versions of the software very easily? Underlying question: is a central repository of applications desirable? Should it be completely open, so that anybody can submit their (pre-compiled) package?
Should there be an easy way to install and remove applications from the device without corrupting the system image? I am thinking of something like klik (http://klik.atekon.de/). -- DPalmerJr
I am on a team developing a deeply embedded, loosely connected ARM-based Linux system (64 MiB RAM, 512 MiB disc). We have discovered the hard way that it's best to support in-field upgrades right from day 1. Even with an effective release management and testing/validation team, specs will change, improvements will be made, and bugs will slip through. Our devices are connected via slow satellite links and connect to our infrastructure as infrequently as once per month. We cannot feed a lot of data through the link without blowing our power budget. Even if/when we are willing to risk an over-the-air in-field upgrade, we may not have the bandwidth/power budget. We have found that conventional package managers (dpkg, rpm) are too coarse-grained when dealing with skinny pipes and power budgets. A package manager supporting deltas would be preferable. We have even considered downloading source patches and re-compiling on the embedded device. Your network will be faster than ours, so YMMV.
System development + testing will benefit from a slick patch/upgrade mechanism too.
I don't think it's unreasonable to expect to upgrade the devices via the mesh cluster - upgrade one device and the rest can upgrade from it. Use public-key-encryption to sign 'blessed' packages.
I consider a well-thought-out, secure, trustable, user-controllable package management system to be critical to system stability, extensibility, maintainability, and ultimately to the success of this project. -- BCL
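To illustrate what "supporting deltas" could mean at the file level, here is a minimal sketch that compares per-file checksums so that only changed files need to cross a slow link. It is not how dpkg or rpm work, the names manifest and delta are hypothetical, and signature checking of 'blessed' packages is deliberately left out.

```python
import hashlib


def manifest(files):
    """Build a {path: sha256 hex digest} table from a {path: bytes} mapping."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def delta(installed, released):
    """Return (paths to download, paths to delete) given two manifests."""
    changed = sorted(path for path, digest in released.items()
                     if installed.get(path) != digest)
    removed = sorted(path for path in installed if path not in released)
    return changed, removed
```

A device would send its manifest (a few dozen bytes per file) and receive only the files listed as changed. Finer-grained schemes, such as binary diffs within a file, would save more bandwidth at the cost of more computation on the device.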
Laptop as USB-Drive
It would probably be useful if the laptop could be accessed as a USB drive, like a digital camera. In the software development context, hackers could probably also configure file sharing via the WiFi, but simple "USB cross cabling" could be interesting to end users because it: a) is the simplest option; b) is secure, so it is probably OK to give access to the entire filesystem when locally attached; c) doesn't need WiFi. The nearest Internet cafe in a bigger town will let children or teachers connect their laptop by USB to one of its stations to copy over a newly downloaded application, but will not have a WiFi base station; at least not where I have travelled in India.
- Why take the laptop to the big town when you can take a thumb drive instead? Better yet, why not just wait for the content to come to you on a CD-ROM? Send an email by motorcycle-net to order the content you want, and next week the motorcycle brings it on CD during the regular delivery. Works in Vietnam.
Maybe software can be developed for this. Since the system is going to be "Linux based", just accessing the filesystem should make it possible to configure almost everything. Software that gives access to the filesystem (and emulates a camera or a USB thumb drive) could be included. Or maybe a special cable provided with the laptop (one that uses a designated one of the 3 USB ports) could allow direct access to the filesystem, or a switch somewhere on the laptop could make it work as a USB drive even without power, perhaps even with the possibility of charging the batteries while connected.
Hard-Reset built-in
Curious kids will certainly easily manage to screw up the software side of the device - and they should! A built-in hard-reset that can re-initialize the OS etc. from ROM; sort of like some modern laptops have a hidden partition on the HDD that can re-install without the usual Recovery CD, could be useful.
You always have the problem of personal data, files, and configuration settings. Some solution for that would have to be provided; e.g. easily copy to your friend's device over the wireless network?
This is a very good point. If we use a compressed read-only file (or partition) for most of the filesystem (especially the part under /usr), we can not only stuff a lot more software in there, but resetting also becomes a much simpler operation. Basically, all it has to do is untar a "factory default" tar file (or something like that) into the writable part of the flash storage.
We could have a boot option, where the user would type "reset" or something like that, to boot a "rescue" kernel and initrd that just did this operation. -- Paulo Marques
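The "untar a factory default" step could look roughly like the sketch below. The function name factory_reset and both paths are placeholders, not anything defined by the project, and the sketch simply wipes the writable area, so personal files would have to be copied elsewhere first, as noted above.

```python
import shutil
import tarfile
from pathlib import Path


def factory_reset(default_tar, writable_dir):
    """Wipe the writable area and restore it from a factory-default tar file."""
    writable = Path(writable_dir)
    # Remove everything the user (or a broken program) has changed.
    if writable.exists():
        shutil.rmtree(writable)
    writable.mkdir(parents=True)
    # Unpack the pristine defaults shipped with the read-only system image.
    with tarfile.open(default_tar) as archive:
        archive.extractall(writable)
```

In the scenario described here this would run from the "rescue" kernel and initrd, with default_tar stored on the read-only part of the flash.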
There's a problem in the Microsoft Windows world with newly-installed systems. You have to go on-line to get the latest security patches from Microsoft. But as soon as you go on-line with an unpatched system you're at risk of infection from viruses.
The reset operation could be integrated with the patch/upgrade mechanism whereby the system will only install secure signed OS-level packages until either the system or the user decides it's OK to open the doors for business. -- BCL
Font technology
Which font technology is to be used?
The character encoding will be Unicode. In that case it is important that an advanced font technology such as OpenType or Graphite is available. It is also necessary to have rendering software for screen and printer that can intelligently combine glyphs from fonts. The choices are Uniscribe (Windows), ATSUI (Macintosh), SIL Graphite (Linux or Windows), Pango (Linux and any other system that can run Free Software), or TrollTech Scribe (Linux or compatible). Of these, Graphite is the most powerful, but it is not yet in widespread use. Pango is the most widely deployed rendering engine for Linux.
- This is a rather strange comment. Why the questions? The OLPC uses Linux with GTK which includes Pango as a component. It also uses FreeType which means that the OLPC uses cross-platform TrueType fonts.
The best, of course :-). Fontconfig does font substitution on a linguistic level, beyond what Windows and the Mac do. Pango is probably the most advanced layout library around, though further work for some scripts is needed. The Graphite description says that SIL is working on integrating it with Pango. - jg
For European languages such as French and Spanish, an ordinary font technology such as TrueType is fine. For languages that use Latin script with accented characters which do not each have a precomposed Unicode character, including many in Africa, an advanced font format is necessary. This is so that glyph substitution can take place to convert a sequence of a base character followed by a combining accent into a "looks right" display. Any rendering engine with any font containing the appropriate glyphs can put an accent mark over a character, but only an advanced font format such as OpenType can specify exactly where the mark should go for best appearance.
Freetype, used by almost everything these days on open source formats, handles a plethora of font types, from Type 1, to TrueType, to OpenType; note that anyone wanting to introduce yet another font format had best be examining how to do it as a Freetype plugin - jg
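The point about accented characters without a precomposed Unicode form can be checked with Unicode normalization. In this small sketch the open e (U+025B), a letter used in a number of African orthographies, is chosen as the example; the helper name compose is just for illustration.

```python
import unicodedata


def compose(text):
    """Apply Unicode NFC normalization, which uses precomposed forms where they exist."""
    return unicodedata.normalize("NFC", text)


# "e" + COMBINING ACUTE ACCENT has a precomposed form (U+00E9).
e_acute = compose("e\u0301")
# LATIN SMALL LETTER OPEN E + COMBINING ACUTE ACCENT has none,
# so it remains a base character followed by a combining mark.
open_e_acute = compose("\u025b\u0301")

print(len(e_acute))       # 1
print(len(open_e_acute))  # 2
```

In the second case the font and rendering engine have to position the accent themselves, which is where mark positioning in an advanced font format matters.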
Arabic script systems (Arabic, Farsi, Urdu, etc.) need an advanced font technology and an advanced rendering engine. Chinese does not need an advanced font technology system. For languages of the Indian subcontinent typewriter-like displays can be achieved without an advanced font technology. For full support of conjunct ligatures an advanced font technology is needed, and similarly for other Asian alphabets (Sinhalese, Lao, Khmer, Myanmar, Tibetan, Mongolian, etc.).
We know of some open issues with Thai and Pango, but believe that they can be solved and that Pango handles most languages already (e.g. Arabic, the Indic languages). Please help determine where further work may be needed. - jg
The eutofont font format
Some time ago William Overington devised a font format using character codes from the Unicode Private Use Area.
(Note by Ed Cherlin: Every font format allows the use of PUA codes. They are used for writing systems not encoded in Unicode, such as Klingon.)
(Note by William Overington: Ed Cherlin states "Every font format allows the use of PUA codes." Yes. Yet that is an item different from what I was trying to say. I was trying to say that the eutofont font format actually uses Unicode Private Use Area code points in the font format itself with the effect that a font is expressible as a sequence of Unicode Private Use Area characters. So, if the eutofont format were used to produce a font of just the twenty-six letters of the English alphabet, all of them regular Unicode characters, the font would be a string of Unicode characters, most of them from the Private Use Area. An end user need not be aware that Private Use Area codes have been used in the producing of the font and the end user does not need to use the Private Use Area codes directly.)
As far as I know it has never been implemented. However, I mention it here in case readers might like to have a look at the documents and decide whether it might be of any use for the laptop project.
I named it the eutofont font format.
http://www.users.globalnet.co.uk/~ngo/eutofont.htm
The eutofont font format has glyph substitution facilities and also has chromatic font capability.
The system could be extended if font technology needs are required which the eutofont font format presently described does not support.
Please note the use of Fontconfig on open source systems for font naming and substitution - jg
Regarding the use of Private Use Area codes: by using them a compactness of font size would be possible which an XML based font system might not be able to achieve: in due course, if the eutofont font system were successful, maybe codes would be added to regular Unicode, though that would lose some compactness as the codes would not be in plane zero; however, in the short term the Private Use Area codes would be needed.
William Overington
11 March 2006
Automated Language Localization of some Preset Sentences
I have for some time been interested in whether it would be of practical use (rather than just fun in researching what can and cannot be done) to have a collection of sentences and part sentences defined and translated into many languages, each sentence or part sentence having a code number. An author would construct a message using one such code number or a sequence of them. Software in the recipient's computer would then use the code numbers, together with a small database mapping code numbers to the text of the sentences in a chosen language, to display a localized message for the recipient.
For example, suppose that there were only two sentences from which to choose and that these have been encoded as sentences 21011 and 21012.
The English database would contain the following.
21011 It is raining.
21012 It is snowing.
The French database would contain the following.
21011 Il pleut.
21012 Il neige.
The database could be translated into as many languages as desired and possible.
So, if someone whose preferred language is English is authoring a message and wishes to send the message "It is raining.", then he or she looks through the database using whatever search tools are available at his or her location, encodes the message as 21011 and sends it.
If someone whose preferred language is French receives the message, then the text "Il pleut." can be displayed automatically.
Now suppose that there were more sentences than that, including sentences with a parameter, such as "The temperature here is P1 degrees Celsius.", where the value of parameter 1 is sent as a digit string (possibly including a decimal point) to accompany the 21852 code of the parameterized sentence. If that list of sentences were available in many languages, then, for example, weather information could be broadcast on a pan-European basis on an interactive television channel and localized automatically in interactive televisions in, for example, England, France, Italy, Finland and Latvia.
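The lookup described above can be sketched in a few lines. The codes 21011, 21012 and 21852 and the English and French texts for the first two come from the examples above; the French wording for 21852, the {P1} placeholder syntax, and the names SENTENCES and localize are assumptions made for this illustration only.

```python
# One small table of coded sentences per language.
SENTENCES = {
    "en": {
        21011: "It is raining.",
        21012: "It is snowing.",
        21852: "The temperature here is {P1} degrees Celsius.",
    },
    "fr": {
        21011: "Il pleut.",
        21012: "Il neige.",
        # Illustrative translation, not taken from the original proposal.
        21852: "La température ici est de {P1} degrés Celsius.",
    },
}


def localize(message, language):
    """Render a list of (code, parameters) pairs in the recipient's language."""
    table = SENTENCES[language]
    return " ".join(table[code].format(**params) for code, params in message)


message = [(21011, {}), (21852, {"P1": "4.5"})]
print(localize(message, "en"))
print(localize(message, "fr"))
```

Only the code numbers and parameter values travel with the message; each recipient needs just the table for his or her own language.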
As to how to encode such a system, well there are various possibilities. I started off using a deliberately unusual yet valid sequence of regular Unicode characters to act as a key that would be most unlikely to occur in any other use context, namely a comet, a circumflex accent and an enclosing keycap design. I have also looked at using Unicode Private Use Area characters. It has been suggested to me that XML would be the best approach, though I have reservations as I would like a system where a short sequence could be added into a plain text file without having to restructure the whole document, however I am unsure of that so it is possible that XML would be the way to go.
I am wondering whether the technique, whether using the key, the Private Use Area codes, XML, or something else, could be useful for autolocalizing some part of the education process. Examples would be sentences such as "Please tell your teacher that you have now completed the task.", "You have chosen the correct answer." and "Well done.".
I did a little with the idea theoretically some time ago.
http://www.users.globalnet.co.uk/~ngo/c_c00000.htm
I never got it beyond English!
A later development was to incorporate the LOCODE concept so as to specify names of places that were to be localized, such as the way Firenze is expressed as Florence in English and London is expressed as Londres in French.
http://www.unece.org/cefact/locode/service/main.htm
William Overington
17 March 2006
Email Client requirements
Email is the only well-known internet application that doesn't depend on a working TCP/IP connection to the internet. Its model is the paper postal service, where there are only one or two connections per day, when the postie visits the letterbox.
It is very likely that these laptops will be in the situation where the link to the outside world will be a fragile connection running at very low speeds. If it's a modem line it's likely that the quality is so poor that echo cancellation will fail; this will limit the speed to 2400bps duplex (higher if half duplex). This is not enough for a shared web connection for thirty kids.
This is okay for email with some rules:
- The email client must be self contained.
- The MTA must be light and capable of very versatile store and forward without help from DNS.
- The MTA on the client must be capable of ad-hoc forwarding, i.e. the child can tell it to give their mail to another client, one who's going to school today.
- The client must have good facilities for splitting files into multiple emails (and joining them again), so that a maximum message size of, say, 16 KB would not be a problem.
- The ability to put the mail on a USB key. The bandwidth of a real postie with a pocket full of USB keys could be rather high.
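The splitting and joining requirement in the list above might be sketched like this. The part format (index, total, base64 text) and the function names are invented for the illustration; a real client would more likely use an existing convention for multi-part messages.

```python
import base64


def split_for_email(data, max_body=16 * 1024):
    """Split bytes into numbered base64 parts whose bodies fit in max_body characters."""
    # Base64 turns every 3 raw bytes into 4 characters, so shrink the raw chunk to match.
    raw_chunk = (max_body // 4) * 3
    chunks = [data[i:i + raw_chunk] for i in range(0, len(data), raw_chunk)] or [b""]
    total = len(chunks)
    return [(number, total, base64.b64encode(chunk).decode("ascii"))
            for number, chunk in enumerate(chunks, start=1)]


def join_from_email(parts):
    """Reassemble the original bytes from parts that may arrive in any order."""
    ordered = sorted(parts)
    total = ordered[0][1]
    if [part[0] for part in ordered] != list(range(1, total + 1)):
        raise ValueError("missing or duplicate parts")
    return b"".join(base64.b64decode(part[2]) for part in ordered)
```

Because each part carries its own index and the total count, the parts can travel by different routes (modem, a friend's laptop, a USB key) and still be reassembled, and a missing part is detected rather than silently ignored.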
A good model for this might be the old FidoNet networks, though a cleaner addressing scheme would be nice.
Having just email is not as limiting as you might imagine: you can access most of the internet by email.
-- Robert de Bath -- March 2006
PS: I just did the math, I've got a 1Gbyte flash key so my bandwidth on the daily commute to work is 99kbps!
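The figure above can be reproduced with a short calculation, assuming that "1Gbyte" means 2^30 bytes and that the key makes one trip every 24 hours; the function name is made up for the example.

```python
def sneakernet_kbps(capacity_bytes, seconds_per_trip):
    """Average throughput, in kilobits per second, of carrying a storage device."""
    return capacity_bytes * 8 / seconds_per_trip / 1000


# A 1 GiB flash key carried once per day.
print(round(sneakernet_kbps(2**30, 24 * 3600)))  # 99
```

That is roughly forty times the 2400bps modem line mentioned earlier, although with a latency of a whole day.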
Motorcycle E-mail Network
This is an excellent idea and should be part of the core OLPC project. Here is how it is currently being done in rural Cambodia. http://www.parish-without-borders.net/cditt/cambodia/dailylife/2004/rural-internet.htm
Remember, the OLPC is NOT A LAPTOP. It is a system comprising laptops, children, teachers, applications, content, USB-devices, etc.
WLAN MAC Address
There might be privacy issues related to the WLAN MAC address. The MAC is somewhat similar to the unique serial number in the CPU-ID, except that it is additionally broadcast around. "Quick, she/he is leaving, let's start eating the apples." A WLAN mesh might allow for relatively fine-grained position tracking.
Network Protocol
I think the most important single choice is the mesh protocol, because it is likely to have a longer deployment than any implementation of the hardware, OS, or application software.
I figured that the best mesh protocol would minimize total routing waste, in order to reduce power use. Computation will use less power as technology advances, but transmission power is going to be limited by physics at some point.
I researched mesh protocols on Wikipedia.
The hazy-sighted link state protocol just stood out among the choices. It is mathematically optimized to minimize network waste. This means that it minimizes power and won't be easily improved-upon. It also has a fairly old, well-debugged, publicly deployed open source implementation that runs on diverse hardware, and is about the right size and shape (small).
The least surprising choice is probably OLSR (which periodically floods the network with limited routing data). The simplest protocol is probably AODV (a distance vector protocol that floods the network with routing information). The others seem to be research projects, or proprietary, and I would avoid them, even though some are specifically geared to power saving.
Ray Van De Walker 10:34, 26 May 2006 (EDT)