Network principles

From OLPC
Revision as of 13:37, 28 April 2008 by CScott (talk | contribs) (Update principles section; moving on to concrete proposal.)
  This page is monitored by the OLPC team.


NOTE: The contents of this page are not set in stone, and are subject to change!

This page is a draft in active flux ...
Please leave suggestions on the talk page.


OLPC's deployment strategy places laptops in many different network environments, with varying services available and levels of connectivity. In the first part of this document, we propose four basic principles for the software comprising our network stack. These principles divide our connectivity goals into functionality that OLPC endeavors to provide and tasks that are the responsibility (or option) of the deployment. Further, they guide our software architecture, distinguishing the new services OLPC will design and implement from the traditional network services we will reuse wherever we can. So far as possible, even the new services OLPC provides will reuse existing network protocols or userland APIs, and will remain compatible with other hosts on the inter-networks we create.

In the second part of this document, we dive more deeply into concrete proposals for network software architecture adhering to these principles. We propose a naming scheme for XOs and describe how it can be implemented using dynamic DNS. We then discuss protocols for friend and resource discovery, and describe how these can be effectively decoupled from the core XO software to allow community development of alternate directories and browsers.

Network Principles

We propose four basic principles underlying our network infrastructure and software architecture: disconnected local networks, direct peer communication, human-readable globally unique identifiers, and direct presence and status queries.

No Assumption of Universal Connectivity

We cannot assume universal connectivity between arbitrary XOs, or even connectivity between XOs and points on the global internet. Every deployment is a walled garden of some size; some deployments may endeavor to include the entire global internet inside that garden (network weather permitting); other deployments may include only a single school inside their walls; and in the limit case the "garden" may consist of only one or two XOs underneath a tree.

We will always endeavor to provide the best service possible within whatever walls we find ourselves. If we can only communicate with other XOs within a school, we will provide full collaboration within that school even though students cannot collaborate with other schools. If direct connectivity is possible between schools within a certain deployment, we fully support collaboration between schools, even if connectivity to the global internet is not available.

This principle also allows us to gracefully handle many kinds of network degradation, present in any real-world deployment. Even in the face of transient asymmetries or disconnections within the network we will provide the "best possible" behavior with the fullest achievable range of services. Central failures may interfere with collaboration among far-flung sites, but they should never interfere with the ability to use the XO locally.

Direct XO-to-XO peer communication

Our basic networking features are built around direct XO-to-XO peer-to-peer communication. Additional servers may be used as aids or proxies, but the fundamental means to query the state of an XO or to collaborate with its user is to directly connect to it.

By direct communication we mean the standard socket API and IP protocols on which the internet is built. We are not building an overlay network or providing bespoke routing or addressing services; we are using the "plain old network". Auxiliary servers or services may be utilized, but only on an "as available" basis or transparently (as in the case of routers or tunnels) consistent with the end-to-end principle.

There are three main ways in which direct XO-to-XO communication fails:

NATs or firewalls are used for control.
In some deployments, XOs inside a school are placed behind a firewall or NAT to enforce web filtering or other control mechanisms. This can be unfortunate, but OLPC will not attempt to subvert this. We acknowledge that these mechanisms may create walled gardens, but that decision is the responsibility of the deployment. If the school wants to provide collaboration with other schools or individuals outside its walled garden, it is the deployment's responsibility to poke holes in its walls, not OLPC's.
NATs used due to limited IPv4 address space.
In some deployments, NATs or firewalls are used to work around limits in the number of available globally-routable IPv4 addresses. IPv6 is an excellent solution to this problem, but IPv6 deployment is quite poor at present. OLPC cannot single-handedly deploy IPv6 to every school endpoint; it is the responsibility of the schools or deployments to provide IPv6 connectivity, via 6-to-4 or tunneling. Again, walled gardens may result from the limited IPv4 address space; if collaboration past the NAT is desired, deployments must provide the IPv4 or IPv6 connectivity required to enable this. (OLPC will probably endeavor to assemble a best-of-breed IPv6 tunnel solution as an aid to deployment teams, but locating, installing, and maintaining the IPv6 endpoint is the responsibility of the deployment.)
NATs or firewalls used for other reasons.
In some instances NATs or firewalls may be imposed externally due to circumstances beyond direct control, and are not actually desired by the deployment. Again, the IPv4 or IPv6 tunneling needed to bypass this is the responsibility of the deployment. We will make an effort to use http/port 80 for services to provide "best possible operation" when reasonable.

Human-readable unique identifiers for each XO

Direct communication implies that there is a usable address for each XO. We insist that this address be (a) human-readable, (b) invariant and indirect, and (c) globally unique.

Human-readable names promote compatibility with other network hosts: the addressing information used to connect to an XO via ssh or a chat client should be concise and logical. As a counter-example, using a 256-bit hash of a public key as the primary identification mechanism is not acceptable; even a scheme that converts the binary information into quasi-readable word sequences does not yield an acceptably concise or logical address. An address for student 'Scott' at a school in Cambridge, MA, USA, should look something like scott.1cc-cambridge-ma.us.xs.laptop.org; we will describe a concrete proposal below. A key detail here is that this name identifies the XO, not the school server or some other helper. This allows direct communication consistent with the previous principle.

Although the name is a direct reference to the machine, the mapping from name to routable address is indirect. There may be several means to map the name to an address or service on the machine, and some of these means might take advantage of proxies or servers. The key property is that there is a single name that anyone anywhere on the network uses to refer to a particular XO -- the name does not depend on the means by which it is mapped to an address, route, or service. Because of the walled garden reality, not everyone will be able to successfully map a name to a usable address, route, or service at all times -- but when network conditions change (the student goes home, school connectivity is restored, etc.) the invariant name will still be usable. Use of a shortname like 'scott' for link-local communications violates this principle.

Globally-unique names ensure freedom from conflicts when walled gardens merge or students go home to a different network environment. This implies some central coordination, but not much: if deployments are assigned unique prefixes by OLPC, and schools are assigned unique names by the deployment, then we need only ensure that (for example) students are assigned unique names within the school. This can be done either probabilistically in the absence of a school server (see proposal below) or by registration with a school server. Collision detection (as in Apple's zeroconf/Bonjour) is a big help in practice, allowing the uniqueness constraint to be enforced "on-demand" when necessary.

The strong constraints on identifiers proposed in this section have privacy implications, which we propose to address using a capable pseudonym system. We only insist that there is at least one name for each XO. To preserve privacy we insist that there may be many such names. The name scott.1cc-cambridge-ma.us.xs.laptop.org may be the official name used for schoolwork, but the child should also be able to set up freedom76.olpc.cypherpunks.org (for example) as a pseudonym for extracurricular work. In this example, the cypherpunks infrastructure could provide additional privacy using protected proxies, onion routing, and other features to whatever degree desired. Ideally, selecting among numerous pseudonyms would be easy to do from the home screen of the Sugar interface.

As a practical matter, note that the requirements outlined here match standard DNS very well. There are A and AAAA records to map the invariant names indirectly to addresses, as well as extensible record types (DNS-SD) to map the names to other services. Domain names are designed to be human-readable, and there are existing standards to extend DNS to non-ASCII character sets. The delegation mechanisms for subdomains are an appropriate way to give countries control of their namespaces. There may be multiple mechanisms to perform DNS services for the user -- selecting from different available servers and protocols as well as variants like mDNS -- but we believe that the DNS API is the right abstraction for this task.
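As a rough illustration of what "the DNS API as the abstraction" could look like from an application's point of view, the sketch below maps an XO name to candidate addresses through the standard resolver call. The function name resolve_xo, the default port 22 (ssh), and the injectable resolver parameter are assumptions made for this example, not part of the proposal.

```python
import socket

def resolve_xo(name, port=22, resolver=socket.getaddrinfo):
    """Map an invariant XO name to candidate addresses via the system resolver.

    Whether the answer came from unicast DNS, mDNS, or a hosts file is hidden
    behind the resolver API. An empty list means the name cannot be mapped
    from inside the current walled garden.
    """
    try:
        infos = resolver(name, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IPv4 or IPv6 address.
    return [info[4][0] for info in infos]
```

A caller would simply try the returned addresses in order, for example resolve_xo("scott.1cc-cambridge-ma.us.xs.laptop.org"), and treat an empty result as "not reachable right now" rather than as an error in the name.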

Direct presence interrogation

This document proposes a strict separation between discovery and presence mechanisms. Discovery concerns the mechanisms used to find collaboration partners, other students in your classroom, or the teacher's XO in order to participate in guided classwork. A key principle is that discovery is unconstrained. We will attempt to provide best-of-breed mechanisms for discovering other local hosts, but in practice most non-local discovery needs to piggyback on social mechanisms: XO names posted using Orkut or other social networking sites, added to local or global purpose-specific wikis or static web pages, or communicated over existing IM or email networks. The concise and human readable names we assign to XOs aid their spread using these existing mechanisms, and we anticipate a variety of third-party "friend finder" activities which will expand the discovery possibilities further. One such might list the local XOs whose users are interested in chess, for example, as a way to find game partners when traveling to new network environments.

Presence is the means by which the current status of these discovered friends or resources is ascertained. Although names are invariant, there is no guarantee of connectivity with a particular friend at any given time, and even if the friend is accessible, they may be playing checkers rather than chess right now, for example.

The fundamental presence mechanism is direct: one XO connects directly to a service running on the other and queries for its status. Although the number of possible links in a network grows according to the square of the number of nodes, the interconnectedness of real social networks is quite limited: most users have 20 or so friends, with a few "super nodes" having 100 or so. Directly querying "real friends" should not be expensive in bandwidth or time.

That said, our existing presence mechanisms also attempt to determine the presence of a potentially unbounded number of "fake friends", who share the same local network or "school server" but who are not actually known to the user. To manage this case, we strictly limit the rate and size of presence queries. A suggested limit is 1 query per second, with 1kB maximum query and response sizes. This limit ensures that overall traffic scales reasonably in the worst case even as the number of nodes on the network increases.
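The effect of the suggested limit can be worked out with simple arithmetic. The sketch below assumes "1kB" means 1024 bytes and counts only the queries an XO initiates (not the queries it answers for others); the constant and function names are illustrative.

```python
QUERIES_PER_SECOND = 1
MAX_QUERY_BYTES = 1024     # assumption: 1kB = 1024 bytes
MAX_RESPONSE_BYTES = 1024

def worst_case_bytes_per_second():
    """Upper bound on presence traffic initiated by one XO, per second."""
    return QUERIES_PER_SECOND * (MAX_QUERY_BYTES + MAX_RESPONSE_BYTES)

def full_refresh_seconds(friend_count):
    """Time needed to poll every known friend once at the rate limit."""
    return friend_count / QUERIES_PER_SECOND
```

Under these assumptions a node never initiates more than 2048 bytes of presence traffic per second, regardless of how many "fake friends" it knows about; what grows with the number of friends is the refresh interval (20 seconds for 20 friends, 100 seconds for 100).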

We suggest the following basic presence algorithm: first, find the friend whose status information is "most out of date" (likely weighting "real friends" over "fake friends"). Directly query that friend, updating the "last checked" information in their status even if the query fails. Wait an appropriate amount of time to satisfy the rate limit, and repeat.
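The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: the Friend record, the REAL_FRIEND_WEIGHT value, and the query callable (which stands in for the direct status query and is assumed to raise OSError when the friend is unreachable) are all hypothetical.

```python
import time
from dataclasses import dataclass

REAL_FRIEND_WEIGHT = 4.0  # hypothetical weighting of real over fake friends

@dataclass
class Friend:
    name: str
    real: bool = True           # "real friend" vs. "fake friend"
    last_checked: float = 0.0   # time of the last query attempt
    status: object = None       # last known status, None if unknown

def staleness(friend, now):
    """Age of the friend's status, weighted so real friends refresh sooner."""
    age = now - friend.last_checked
    return age * (REAL_FRIEND_WEIGHT if friend.real else 1.0)

def presence_step(friends, query, now):
    """Query the friend whose status is most out of date."""
    friend = max(friends, key=lambda f: staleness(f, now))
    try:
        friend.status = query(friend.name)
    except OSError:
        friend.status = None    # unreachable from this walled garden
    friend.last_checked = now   # updated even if the query failed
    return friend

def presence_loop(friends, query, interval=1.0):
    """Repeat forever, waiting between queries to honor the rate limit."""
    while friends:
        presence_step(friends, query, time.monotonic())
        time.sleep(interval)
```

Because last_checked is also what other mechanisms would update when they learn a friend's status out of band, a friend refreshed that way naturally drops to the back of the queue without any special handling in the loop.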

We have formulated the above algorithm so that it transparently handles the bleed-through between discovery and presence which is inherent with some local discovery technologies. For example, if we are using Cerebro to find "fake friends" on the local mesh, it will also provide basic presence information for those friends in the same protocol messages. This information updates our local status behind the back of the presence algorithm, saving our direct rate-limited queries for non-local friends outside the scope of Cerebro. Similar arguments hold for "fake friends" discovered via a school server or other mechanism. The key point here is that all hosts should support direct interrogation for presence, even if other efficient mechanisms are used for aggregate presence in some situations.

Below we will propose using a lightweight XMPP server on the XO to provide a standardized presence service which interoperates well with the "buddy presence" mechanisms used by existing Jabber-compatible IM clients, like Pidgin and Google Talk.

The purpose of discovery and presence information is fundamentally to enable collaboration. Our principles above dictate that collaboration mechanisms are built fundamentally using direct peer-to-peer communication. There may be several such collaboration technologies we deploy, including direct socket connection for legacy applications to non-XO hosts, wrapped sockets for legacy protocol communications to XO hosts (which can provide stronger end-to-end authentication and security), as well as multi-pointer X, Tubes, RPC mechanisms, and other related technologies.

An architecture proposal

FIXME: CONTENT BEYOND THIS POINT IS NOT WRITTEN; I JUST HAVE THE DRAFTS AND NOTES BELOW.

Dynamic DNS / DNS-SD w/o remote notification for discovery.

I propose using DNS names for XOs which look like:

I propose installing a handler for a new URI type in our browse application. The links will look like:

 friend:name.xxx.school.country.xs.laptop.org

where:

name is a Punycode encoding of the XO nickname. Technically, the IDN ToASCII mapping operation is performed on the nickname, truncated on the right if necessary so the result is 63 characters or less; see http://en.wikipedia.org/wiki/Internationalized_domain_name.

xxx is an encoded version of the XO public key. The number is written in a variable base number system where the first three digits are base 36, base 37, and base 26 and the digits are mapped into characters starting with lowercase alphabetic, then numeric, then a hyphen. That gives 36 * 37 * 26 = 34,632 possible three-character values, so if I've done my math correctly (http://en.wikipedia.org/wiki/Birthday_paradox ), this requires about 220 students to have the same name before a collision has a 50% chance of occurring. If the server uses an independent means to prevent duplicate nicknames, the xxx can be replaced with 'fun'.

'school.country.xs.laptop.org' is filled in by registration with a school server. If you do not have access to a school server, then you can register with xofriends.org (or another independent service) and use that suffix.
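The name and xxx components described above can be sketched in a few lines, together with a check of the birthday estimate. Several details here are assumptions, since the notes do not specify them: the key is reduced to a number by taking SHA-256 of the key bytes modulo the number of combinations, the base-36 digit is the leftmost character, and the nickname is trimmed one character at a time until its encoded form fits.

```python
import hashlib
import math

ALPHA = "abcdefghijklmnopqrstuvwxyz"
DIGITS = "0123456789"
# One alphabet per position: base 36, base 37, base 26.
ALPHABETS = (ALPHA + DIGITS, ALPHA + DIGITS + "-", ALPHA)
COMBINATIONS = math.prod(len(a) for a in ALPHABETS)  # 36 * 37 * 26

def nickname_label(nickname):
    """IDN ToASCII of the nickname, trimmed on the right to fit a DNS label."""
    while nickname:
        try:
            # Python's "idna" codec rejects labels longer than 63 characters.
            return nickname.encode("idna").decode("ascii")
        except UnicodeError:
            nickname = nickname[:-1]
    raise ValueError("nickname is empty or cannot be encoded")

def key_label(public_key):
    """Three-character label from the public key (derivation is an assumption)."""
    n = int.from_bytes(hashlib.sha256(public_key).digest(), "big") % COMBINATIONS
    chars = []
    for alphabet in reversed(ALPHABETS):  # peel off the rightmost digit first
        n, digit = divmod(n, len(alphabet))
        chars.append(alphabet[digit])
    return "".join(reversed(chars))

def xo_name(nickname, public_key, suffix):
    """Assemble name.xxx.suffix, e.g. suffix='school.country.xs.laptop.org'."""
    return f"{nickname_label(nickname)}.{key_label(public_key)}.{suffix}"

def names_for_half_collision(combinations=COMBINATIONS):
    """Birthday approximation: n such that P(collision) is about 50%."""
    return math.sqrt(2 * combinations * math.log(2))
```

With 34,632 combinations the approximation gives a value just over 219, which agrees with the "about 220 students" figure quoted above.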

Clicking on a link of this form would add this person to your buddy list. Communicating with this form of buddy would, in parallel, (a) attempt to contact the IPv6 Link-Local address formed from the lower 64 bits of the SHA-256 of the complete friend domain string (not including the URI scheme or colon) and (b) attempt to look up the hostname and contact the IPv4 or IPv6 address returned. (If the DNS responds, you SHOULD use this address for further communication in this session, since it may persist even if you roam off your current mesh.) A simple service at a well-known port would confirm status and list sharable activities.
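Step (a) can be sketched as below. Two choices are assumptions not settled by the notes: "lower 64 bits" is read as the last 8 bytes of the digest interpreted as a big-endian integer, and the domain is lowercased before hashing so that all hosts derive the same address. The function names are illustrative.

```python
import hashlib
import ipaddress

def friend_domain(uri_or_domain):
    """Strip the 'friend:' scheme if present; lowercase (assumed canonical form)."""
    prefix = "friend:"
    if uri_or_domain.startswith(prefix):
        uri_or_domain = uri_or_domain[len(prefix):]
    return uri_or_domain.lower()

def link_local_address(uri_or_domain):
    """fe80::/64 address whose interface ID is the low 64 bits of SHA-256(domain)."""
    domain = friend_domain(uri_or_domain)
    digest = hashlib.sha256(domain.encode("ascii")).digest()
    interface_id = int.from_bytes(digest[-8:], "big")  # assumed byte order
    return ipaddress.IPv6Address((0xFE80 << 112) | interface_id)
```

Since the address depends only on the name, any host that knows a friend's name can compute where to find that friend on the local link without consulting a server.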

Via a network manager hook, XOs should report their current IPv4/IPv6 addresses to the 'school.country.xs.laptop.org' part of their local domain name, which will export it via the standard dynamic DNS mechanisms.


Rather than invent a new 'friend' URI scheme, an alternative is to use the standard XMPP scheme:

xmpp:xo@nickname.xxx.school.country.xs.laptop.org?roster;name=Full%20Name

(see http://www.xmpp.org/extensions/xep-0147.html ) (Thanks, Robert McQueen.)
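For illustration, a link of this shape could be generated as follows. The helper name roster_uri is hypothetical; the node part "xo" and the roster query type are taken from the example above.

```python
from urllib.parse import quote

def roster_uri(host, full_name, node="xo"):
    """Build an xmpp: URI asking the recipient to add this XO to their roster."""
    # Percent-encode the display name so spaces and other reserved
    # characters survive inside the query component.
    return f"xmpp:{node}@{host}?roster;name={quote(full_name, safe='')}"
```

For example, roster_uri("nickname.xxx.school.country.xs.laptop.org", "Full Name") reproduces the URI shown above.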

The xmpp server on the laptop responds to roster and chat requests at that address like a 'normal' jabber client, so that we interoperate with iChat, etc. When a school server is present, it may publish a SRV record for _xmpp-server._tcp.nickname.xxx.school.country.xs.laptop.org specifying that it is handling xmpp requests for that user. (See http://tools.ietf.org/html/rfc3921#page-88 ).

When attempting to connect to the xmpp server on an XO, we extend rfc3921 in one regard: after (or in parallel with) attempting to resolve the hostname via DNS, we hash it into a link-local address and attempt to contact an xmpp server at that address.


Tomeu asked me to elaborate on the user-facing use case that this proposal is addressing.

Fundamentally, I'm trying to get us out of the 'directory' business. We need to provide developers the opportunity to implement other means to find friends and activity partners, whether via established technology (facebook, livejournal, wiki, static web pages, google talk friends lists) or new ideas (Cerebro, "friend finder" activities, etc). Without discounting the neighborhood view entirely for location-specific queries (where's the nearest person who wants to play chess?) this opens the door for other ways to find buddies, esp. non-local buddies.

More relevant for Peru, this mechanism as described can also function without the interaction of any server. Of course, I'm not (immediately) addressing the question of how the friends are discovered -- but the point is to allow lots of different solutions to the discovery process to exist. Maybe you type them in by hand, maybe you use the camera to scan a bar code, maybe you use mDNS, etc, etc. All I'm defining is what a buddy address looks like after you 'discover' it, and how you can establish a direct IP connection to your buddy.


Credits

Author
C. Scott Ananian (cscott a t laptop.org)