Network principles: Difference between revisions

m (→‎DNS names: Fix formatting.)
No edit summary
 
(19 intermediate revisions by 6 users not shown)
Line 1: Line 1:
{{OLPC}}
{{OLPC}}
{{draft}}
{{outdated}}
{{TOCright}}

OLPC's deployment strategy places laptops in many different network
OLPC's deployment strategy places laptops in many different network
environments, with varying services available and levels of
environments, with varying services available and levels of
Line 56: Line 56:
with the ability to use the XO locally.
with the ability to use the XO locally.


=== Direct XO-to-XO peer communication ===
=== Direct XO-to-XO serverless communication ===


Our basic networking features are built around direct XO-to-XO
Our basic networking features are built around direct XO-to-XO
peer-to-peer communication. Additional servers may be used as aides
'''serverless''' (peer-to-peer) communication.
Additional servers may be used as aides
or proxies, but the fundamental means to query the state of an XO or
or proxies, but the canonical means to query the state of an XO or
to collaborate with its user is to directly connect to it.
to collaborate with its user is to directly connect to it; no server
need be involved.


By direct communication we mean the standard socket API and IP
By '''direct''' communication we mean the standard socket API and IP
protocols on which the internet is built. We are not building an
protocols on which the internet is built. We are not building an
overlay network or providing bespoke routing or addressing services;
overlay network or providing bespoke routing or addressing services;
we are using the "plain old network". Auxiliary servers or services
we are using the "plain old network". Auxiliary servers, tunnels, or services
may be utilized, but only on an "as available" basis or transparently
may be utilized, but only on an "as available" basis or transparently
(as in the case of routers or tunnels) consistent with the
(as in the case of routers or tunnels) consistent with the
Line 92: Line 94:
'''Human-readable''' names promote compatibility with other network hosts:
'''Human-readable''' names promote compatibility with other network hosts:
the addressing information used to connect to an XO via ssh or a chat
the addressing information used to connect to an XO via ssh or a chat
client should be concise and logical. As a counter-example, we will
client from a non-XO host should be concise and logical ("[http://www.skyhunter.com/marcs/petnames/IntroPetNames.html memorable]" using Marc Stiegler's terminology). As a counter-example, we will
state that using a 256-bit hash of a public key as the
state that using a 256-bit hash of a public key as the
primary identification mechanism is not acceptable; even using a
primary identification mechanism is not acceptable; even using a
Line 181: Line 183:
may be playing checkers rather than chess right now, for example.
may be playing checkers rather than chess right now, for example.


The fundamental presence mechanism is direct: one XO connects directly
The canonical presence mechanism is direct: one XO connects directly
to a service running on the other and queries for its status.
to a service running on the other and queries for its status.
Although the number of possible links in a network grows according to
Although the number of possible links in a network grows according to
Line 190: Line 192:


That said, our existing presence mechanisms also attempt to determine
That said, our existing presence mechanisms also attempt to determine
the presence of a potentially unbounded number of "fake friends", who
the presence of a potentially unbounded number of ''strangers'', who
share the same local network or "school server" but who may not be
share the same local network or school server but who may not be
actually known to the user. To manage this case, we
actually known to the user. To manage this case, we
'''strictly limit the rate and size of presence queries'''.
'''strictly limit the rate and size of presence queries'''.
A suggested limit is 1 query per second, with 1kB maximum query and
A suggested limit is 1 query per second, with 1kB maximum query and
response sizes. This limit ensures that overall traffic scales
response sizes. This limit ensures that overall traffic scales
reasonably in the worst case even as the number of potential "fake
reasonably in the worst case even as the number of potential strangers
friends" on the network increases.
on the network increases.


We suggest the following basic presence algorithm: first, find the
We suggest the following basic presence algorithm: first, find the
friend whose status information is "most out of date", favoring real
friend whose status information is "most out of date", favoring real
friends over fake friends according to some weight mechanism.
friends over strangers according to some weight mechanism.
Directly query that out-of-date friend, updating the "last checked"
Directly query that out-of-date friend, updating the "last checked"
information in their status even if the query fails. Wait an
information in their status even if the query fails. Wait an
Line 209: Line 211:
handles the bleed-through between discovery and presence which is
handles the bleed-through between discovery and presence which is
inherent with some local discovery technologies. For example, if we
inherent with some local discovery technologies. For example, if we
are using [http://cerebro.mit.edu Cerebro] to find "fake friends" on
are using [http://cerebro.mit.edu Cerebro] to find strangers on
the local mesh, it will also provide basic presence information for
the local mesh, it will also provide basic presence information for
those friends in the same protocol messages. This information updates
those friends in the same protocol messages. This information updates
our local status behind the back of the presence algorithm, saving our
our local status behind the back of the presence algorithm, saving our
direct rate-limited queries for non-local friends outside the scope of
direct rate-limited queries for non-local friends outside the scope of
Cerebro. Similar arguments hold for "fake friends" discovered via a
Cerebro. Similar arguments hold for strangers discovered via a
school server or other mechanism. The key point is that all
school server or other mechanism. The key point is that all
hosts should support direct interrogation for presence, even if other
hosts should support direct interrogation for presence, even if other
Line 242: Line 244:
where:
where:
* '''name''' is a Punycode encoding of the XO nickname. Technically, the IDN ToASCII mapping operation is performed on the space-elided and punctuation-removed nickname, truncated on the right if necessary so the result is 63 characters or less; see http://en.wikipedia.org/wiki/Internationalized_domain_name.
* '''name''' is a Punycode encoding of the XO nickname. Technically, the IDN ToASCII mapping operation is performed on the space-elided and punctuation-removed nickname, truncated on the right if necessary so the result is 63 characters or less; see http://en.wikipedia.org/wiki/Internationalized_domain_name.
* '''xxx''' is an encoded version of the the XO public key. The number is written in a variable base number system where the first three digits are base 36, base 37, and base 26 and the digits are mapped into characters starting with lowercase alphabetic, then numeric, then a hyphen. If I've done my math correctly (see http://en.wikipedia.org/wiki/Birthday_paradox ), this requires about 220 students to have the same name before a collision has a 50% chance of occurring. If the server uses an independent means to prevent duplicate nicknames, the xxx can be replaced with '<tt>xo</tt>'.
* '''xxx''' is an encoded and truncated version of the XO public key. If N is the (odd) numeric value of the public key, write the three low-order digits of the quantity (N-1)/2, least-significant digit first, in a variable-base number system where the digits are base 36, base 37, and base 26, respectively, and the digits are mapped into characters starting with lowercase alphabetic a-z, then numeric 0-9, then a hyphen. If I've done my math correctly (see http://en.wikipedia.org/wiki/Birthday_paradox ), this requires about 220 students to have the same name before a collision has a 50% chance of occurring. (If we used two characters instead of three, only about 44 students would be required.) If the server uses an independent means to prevent duplicate nicknames, the xxx can be replaced with '<tt>xo</tt>'.
* <tt>school.country.xs.laptop.org</tt> is filled in by registration with a school server. If you do not have access to a school server, then you can register with <tt>xofriends.org</tt> or another independent service, which will provide an appropriate suffix. Pseudonyms can be generated with alternate suffixes via the same means.
* '''school.country.xs.laptop.org''' is filled in by registration with a school server. If you do not have access to a school server, then you can register with <tt>xofriends.org</tt> or another independent service, which will provide an appropriate suffix. Pseudonyms can be generated with alternate suffixes via the same means.
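The name and xxx components described above can be sketched in Python as follows. This is a minimal illustration, not the registration code itself: the function names (<tt>nickname_label</tt>, <tt>key_suffix</tt>, <tt>students_for_even_odds</tt>) are hypothetical, "punctuation" is assumed to mean the Unicode P* categories, and the public key is assumed to be available as a Python integer.

```python
import unicodedata
from encodings.idna import nameprep  # IDNA 2003 string preparation (stdlib)

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"  # a-z, then 0-9, then hyphen
BASES = (36, 37, 26)  # first, second and third character respectively


def _to_ascii(label: str) -> str:
    """IDN ToASCII for a single label: nameprep, then Punycode if needed."""
    prepared = nameprep(label)
    if prepared.isascii():
        return prepared
    return "xn--" + prepared.encode("punycode").decode("ascii")


def nickname_label(nickname: str, max_len: int = 63) -> str:
    """Drop spaces and punctuation, encode, and truncate on the right."""
    # Assumption: "punctuation" = Unicode categories P*, "space" = Z*.
    kept = "".join(ch for ch in nickname
                   if unicodedata.category(ch)[0] not in "PZ")
    while kept:
        label = _to_ascii(kept)
        if len(label) <= max_len:
            return label
        kept = kept[:-1]  # truncate the input until the encoded label fits
    raise ValueError("nickname has no usable characters")


def key_suffix(public_key: int) -> str:
    """Three characters from the low-order digits of (N-1)/2, LSD first."""
    if public_key % 2 == 0:
        raise ValueError("public key value is expected to be odd")
    value = (public_key - 1) // 2
    chars = []
    for base in BASES:
        value, digit = divmod(value, base)
        chars.append(ALPHABET[digit])
    return "".join(chars)


def students_for_even_odds(slots: int) -> int:
    """Smallest group size whose collision probability reaches 50%."""
    p_unique, n = 1.0, 0
    while p_unique > 0.5:
        p_unique *= (slots - n) / slots
        n += 1
    return n
```

With these definitions, <tt>students_for_even_odds(36 * 37 * 26)</tt> reproduces the "about 220 students" figure for the three-character suffix. Note that the base-36 first digit can never be a hyphen and the base-26 last digit is always a letter, so the suffix cannot begin or end with a hyphen.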

The three-character public-key-based interpolation into the domain name is present only to allow more reliable setup in the (temporary or permanent) absence of a school server. If we are using a centralized identity service, obviously the service can (and should) enforce uniqueness, and the interpolation is unnecessary. But we would like to be able to do initial setup of an XO with a known name for the school server but no school server actually present, allowing the school server to be added to the deployment later. Three characters seems like a reasonable price to pay for this flexibility. The school server can do very mild authentication based on these characters when it is first set up, but principally we are basing this string on the public key just to avoid introducing a new source of randomness.

If a school server detects a conflict -- two students named "Michael" in the same school with the same low-order bits of their public key -- the interface should prompt the child to choose a different nickname, as in [http://www.oreilly.com/catalog/bonjour/index.html zeroconf]. ''(I could be convinced that it should instead just choose a different random three-character disambiguation component, and we should discard the explicit link between this component and the public key. Discussion welcome!)''


It is easy to second-guess the choice of the Punycode mechanism used
It is easy to second-guess the choice of the Punycode mechanism used
by IDN; but there are serious advantages to using an existing and
by IDN; but there are serious advantages to using an existing and
deployed mechanism instead of rolling out own. Using IDN, our XO names are
deployed mechanism instead of rolling our own. Using IDN, our XO names are
vulnerable to
vulnerable to
[http://en.wikipedia.org/wiki/IDN_homograph_attack homoglyph attacks],
[http://en.wikipedia.org/wiki/IDN_homograph_attack homoglyph attacks],
but these will be mitigated by Bitfrost mechanisms: the combination of
but these are managed by Bitfrost mechanisms: the combination of
name and public key identifies a friend; having a similar name doesn't
name and public key identifies a friend; having a similar name doesn't
allow you to masquerade as another unless you also have their key
allow you to masquerade as another unless you also have their key
information.
information.

Note that we are not violating [https://zooko.com/distnames.html Zooko's Triangle]; this proposal follows closely his "strategy 4": our DNS names are "memorable" but they are not self-authenticating. As in SSH, our first introduction to a new buddy caches our buddy's public key. The DNS name, like the buddy name and color, is just a [http://www.skyhunter.com/marcs/petnames/IntroPetNames.html pet name]; authentication is performed by checking the public key. See [[#XO-to-XO Security]] for further discussion of transparent authentication.


=== Name resolution ===
=== Name resolution ===
Line 269: Line 277:
persist even if you roam off your current mesh.)
persist even if you roam off your current mesh.)


: ''See [[dnshash]] for a prototype implementation of this concept.'' --[[User:Mstone|Michael Stone]] 23:43, 15 July 2009 (UTC)
In the future we may implement additional DNS mechanisms, implementing

In the future we may implement additional DNS mechanisms, implemented
using the standard Name Service Switch mechanism of glibc. A variant
using the standard Name Service Switch mechanism of glibc. A variant
of mDNS is a possibility, although standard mDNS does not appear to
of mDNS is a possibility, although standard mDNS does not appear to
Line 276: Line 286:
use peer-to-peer services like [http://opendht.org/ OpenDHT] to
use peer-to-peer services like [http://opendht.org/ OpenDHT] to
implement wide-area serverless dynamic DNS. Again, the key property
implement wide-area serverless dynamic DNS. Again, the key property
is that the DNS abstraction is maintained; any alternate service
is that the DNS abstraction is maintained; any alternate service used will
provides the same interface to user code, and maps the same invariant
provide the same interface to user code and map the same invariant
domain names.
domain names.


Line 291: Line 301:


In the Browse activity, clicking on a link of this form would add this
In the Browse activity, clicking on a link of this form would add this
person to your buddy list. Communicating with a this form of buddy
person to your buddy list. Collaborating with this buddy
would resolve the domain name of the address according to the
would begin by resolving the domain name of the address according to the
mechanisms of the previous section, and then directly contact the
mechanisms of the previous section, and then contacting the
XMPP service at that address.
XMPP service at that address.


Friends are represented internally using the domain name only; there
Friends are represented internally as user@domain. When entering friends via manual keyboard input, specialized barcode reader, or other discovery mechanism, only the
domain name may be necessary, but maintaining the mostly-invariant 'xo@' portion of buddies internally allows more uniformity in our limited dealings with non-XO Jabber friends (for chat, for example).
is no "user@" portion. When entering friends via manual keyboard input,
specialized barcode reader, or other discovery mechanism, only the
domain name is necessary.


=== Presence service ===
=== Presence service ===
Line 322: Line 330:
extensions -- I should be able to friend and chat with a Google Talk
extensions -- I should be able to friend and chat with a Google Talk
user even if the Google servers support only standard XMPP.
user even if the Google servers support only standard XMPP.

== Extensions and improvements ==
In this section I outline a few additional pieces which can (a) connect groups of users behind NATs, (b) allow school children to maintain their DNS identity even if the schoolserver is deliberately inaccessible from their homes, and (c) leverage the indirection afforded by DNS to improve the end-to-end security of connections between XOs without affecting non-XO clients speaking legacy protocols. We also briefly discuss implementation issues for disconnected networks.

=== Tunnels ===
As discussed above, it is undesirable to attempt to automatically route around NATs or egress firewalls. However, in cases where that is explicitly desired, IPv6-over-IPv4 (6to4) tunnels are a logical and recommended means for doing so. For example, a school in Peru might partner with a "sister school" in Africa, establishing an IPv6 tunnel between their otherwise firewalled education networks. In this case the tunnel could be established between the Peru and Africa school servers.

A similar situation might occur when a child takes their laptop home, to a NAT'ed home network which doesn't allow collaboration. In this scenario, it is desirable to have the 6to4 client endpoint on the XO itself; the server endpoint might be on the machine hosting the XO's dynDNS entry. In the dynDNS/registration protocol, it seems reasonable to expect the host might return tunnel details for the XO to use.

Note that this is a deliberately ''unambitious'' proposal. We are not attempting to provide globally-routable IPv6 addresses to every XO, or to automatically discover and utilize tunnels. The difficult realities of maintaining IPv6 network endpoints make this largely infeasible. Instead we are proposing a small extension to the DNS registration mechanism that allows creation of ad-hoc and globally disconnected IPv6 networks to transparently facilitate otherwise-difficult collaboration. Importantly, we are ''not'' exposing the "NAT-busting" to the application level: we are just making the underlying standard network enclose slightly more nodes than it did previously.

=== External DNS ===
The use of alternate pseudonyms, expressed as differing DNS names, allows a child to continue collaborating at home even if their school server is firewalled off from home access. However, we expect that many countries will find it desirable to allow the kids to keep their "school identity" even while maintaining a closed school network.

We propose setting up a single "external DNS" server on the accessible internet to allow this. This DNS server reports itself as authoritative for the various school.country.xs.laptop.org domains to the outside world (although the school servers are still authoritative inside the school network) and allows kids to register their current network addresses to update their DNS mappings from home. Note that the XO's domain name will report different addresses inside and outside the school network; that's fine!

This lightweight "external DNS" server could also provide tunnel services, as described above, to better connect kids behind home NATs.

=== XO-to-XO Security ===
The indirection afforded by DNS allows us to use code like [ftp://ftp.porcupine.org/pub/security/index.html tcp_wrappers] to transparently authenticate XO hosts, without negatively affecting interoperability with legacy software and systems.

When a friend is added to our friends list, an XO-specific protocol is used to contact the machine and obtain its public key, as ssh does. If this succeeds, the friend is specially marked as an "XO friend".

A lookaside DNS service can be taught to consult our list of "XO friends" when doing a lookup. If the given domain name is for an XO friend, a localhost address can be returned instead of the "real" address. Connecting to the localhost address wraps each socket so that communication is encapsulated with SSL (checking the "server's" key and providing our key as a client certificate) before being forwarded on to a specific port at the "real" network address. A listener at that port performs the other side of the SSL connection before forwarding traffic to its "true" destination at that host.

In this manner end-to-end authentication of XOs can be layered on top of the network without any modification of activity software or protocols. If you use a web browser to surf to http://myfriend.someschool.xs.laptop.org from your XO, the identity of your friend is transparently verified without teaching the web browser any new tricks. Because these wrappers only apply when the domain name corresponds to a known XO friend, surfing to (say) http://xkcd.com is unaffected.

This implements the "ssh model" of authentication, roughly. It is vulnerable to the same man-in-the-middle and key-compromise attacks ssh is; nevertheless this model has been shown to have significant practical value.
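The lookaside step described in this section might be sketched as below. This is only an illustration under stated assumptions: the class name <tt>LookasideResolver</tt>, the idea of handing each XO friend its own loopback address from a <tt>127.64.0.0/16</tt> pool (so a local wrapper can tell friends apart), and the <tt>resolve_real</tt> callback are all hypothetical. The SSL wrapping itself is not shown.

```python
import ipaddress
from typing import Callable, Dict


class LookasideResolver:
    """Return a loopback address for known XO friends, else the real answer."""

    def __init__(self, resolve_real: Callable[[str], str],
                 pool: str = "127.64.0.0/16") -> None:
        # resolve_real stands in for the ordinary DNS lookup (hypothetical).
        self._resolve_real = resolve_real
        # Assumption: one loopback address per friend, drawn from this pool.
        self._free = iter(ipaddress.ip_network(pool).hosts())
        self._friend_to_local: Dict[str, str] = {}
        # Reverse map a local wrapper would use to find the real peer name.
        self.local_to_friend: Dict[str, str] = {}

    @staticmethod
    def _normalize(name: str) -> str:
        # DNS names are case-insensitive; ignore a trailing root dot.
        return name.rstrip(".").lower()

    def add_xo_friend(self, name: str) -> str:
        """Mark a name as an XO friend and reserve a loopback address for it."""
        key = self._normalize(name)
        if key not in self._friend_to_local:
            local = str(next(self._free))
            self._friend_to_local[key] = local
            self.local_to_friend[local] = key
        return self._friend_to_local[key]

    def resolve(self, name: str) -> str:
        """XO friends resolve to loopback; everything else is untouched."""
        local = self._friend_to_local.get(self._normalize(name))
        if local is not None:
            return local
        return self._resolve_real(name)
```

A wrapper listening on the returned loopback address would then look up <tt>local_to_friend</tt>, resolve the friend's real address, and perform the authenticated connection; names that are not XO friends never touch this path.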

=== Disconnected operation ===
It is desirable to allow access to cached internet content even without access to the internet. A local caching HTTP proxy on the XO provides a simple solution, but many XOs will likely contain duplicate content. A peer-to-peer cache provides one alternative, but a transparent proxy on a school server can do better, since the school server contains much more available space.

However, XOs making web requests during disconnected operation will still attempt to resolve DNS names to addresses before initiating network requests. The school server can provide an "offline DNS cache" in the same way it provides an "offline HTTP cache", but in fact we can do better. In [[#Name_resolution|the name resolution section]] above we have already presented the seeds of an answer: XOs have an alternative name resolution mechanism which resolves DNS names to a link-local IPv6 address based on a hash of the DNS name in the absence of "authoritative" DNS service. The school server's DNS server can use this strategy to provide short lifetime non-link-local IPv6 addresses for DNS names in the absence of upstream DNS, which then allows it to intercept web requests for those addresses from XOs and properly serve them from its cache.

After connectivity is restored, the school server should route connections to the advertised link-local address to the "real" internet address of the host for the duration it advertised for the domain's mapping to the link-local address.
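To make the hash-based mapping concrete, here is a minimal sketch of deriving an IPv6 address from a DNS name. The details are assumptions for illustration only: the text does not specify the hash function or bit layout, so SHA-256 and "first 64 bits of the digest as the interface identifier" are placeholders, and <tt>hashed_address</tt> is a hypothetical name.

```python
import hashlib
import ipaddress


def hashed_address(dns_name: str, prefix: str = "fe80::/64") -> ipaddress.IPv6Address:
    """Map a DNS name to a stable address inside the given /64 (or shorter) prefix."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen > 64:
        raise ValueError("prefix must leave 64 bits for the interface identifier")
    # DNS names are case-insensitive; ignore a trailing root dot.
    canonical = dns_name.rstrip(".").lower().encode("ascii")
    # Assumption: SHA-256, keeping the first 64 bits of the digest.
    digest = hashlib.sha256(canonical).digest()
    interface_id = int.from_bytes(digest[:8], "big")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)
```

An XO would use the default link-local prefix, while a school server could pass a different prefix it controls (for example a unique-local prefix such as <tt>fd00::/64</tt>, again an assumption) so that it can route and intercept traffic for those addresses.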


== Credits ==
== Credits ==
; Author
; Author
: C. Scott Ananian (cscott a t laptop.org)
: C. Scott Ananian (cscott a t laptop.org)

[[Category:Network]]
[[Category:Software ideas]]

Latest revision as of 23:18, 5 August 2013

  This page is monitored by the OLPC team.
The contents of this page are considered outdated and some of the information may be stale. Please use information here with caution, or update it.

OLPC's deployment strategy places laptops in many different network environments, with varying services available and levels of connectivity. In the first part of this document, we propose four basic principles for the software comprising our network stack. These principles divide our connectivity goals into functionality OLPC endeavors to provide and tasks that are the responsibility (or option) of the deployment. Further, they guide our software architecture, distinguishing the new services OLPC will design and implement from the traditional network services we will utilize and aggressively leverage. So far as possible, even the new services OLPC provides will attempt to reuse existing network protocols or userland APIs, and to remain compatible with other hosts on the inter-networks we create.

In the second part of this document, we dive more deeply into concrete proposals for network software architecture adhering to these principles. We propose a naming scheme for XOs and describe how it can be implemented using dynamic DNS. We then discuss protocols for friend and resource discovery, and describe how these can be effectively decoupled from the core XO software to allow community development of alternate directories and browsers.

Network Principles

We propose four basic principles underlying our network infrastructure and software architecture: disconnected local networks, direct peer communication, human-readable globally unique identifiers, and direct presence and status queries.

No Assumption of Universal Connectivity

We cannot assume universal connectivity between arbitrary XOs, or even connectivity between XOs and points on the global internet. Every deployment is a walled garden of some size; some deployments may endeavor to include the entire global internet inside that garden (network weather permitting); other deployments may include only a single school inside its walls; and in the limit case the "garden" may consist of only one or two XOs underneath a tree.

We will always endeavor to provide the best service possible within whatever walls we find ourselves. If we can only communicate with other XOs within a school, we will provide full collaboration within that school even though students cannot collaborate with other schools. If direct connectivity is possible between schools within a certain deployment, we fully support collaboration between schools, even if connectivity to the global internet is not available.

This principle also allows us to gracefully handle many kinds of network degradation, present in any real-world deployment. Even in the face of transient asymmetries or disconnections within the network we will provide the "best possible" behavior with the fullest achievable range of services. Central failures may interfere with collaboration among far-flung sites, but they should never interfere with the ability to use the XO locally.

Direct XO-to-XO serverless communication

Our basic networking features are built around direct XO-to-XO serverless (peer-to-peer) communication. Additional servers may be used as aides or proxies, but the canonical means to query the state of an XO or to collaborate with its user is to directly connect to it; no server need be involved.

By direct communication we mean the standard socket API and IP protocols on which the internet is built. We are not building an overlay network or providing bespoke routing or addressing services; we are using the "plain old network". Auxiliary servers, tunnels, or services may be utilized, but only on an "as available" basis or transparently (as in the case of routers or tunnels) consistent with the end-to-end principle.

There are three main ways in which direct XO-to-XO communication fails:

NATs or firewalls are used for control.
In some deployments, XOs inside a school are placed behind a firewall or NAT to enforce web filtering or other control mechanisms. This can be unfortunate, but OLPC will not attempt to subvert this. We acknowledge that these mechanisms may create walled gardens, but that decision is the responsibility of the deployment. If the school wants to provide collaboration with other schools or individuals outside its walled garden, it is the deployment's responsibility to poke holes in its walls, not OLPC's.
NATs used due to limited IPv4 address space.
In some deployments, NATs or firewalls are used to work around limits in the number of available globally-routable IPv4 addresses. IPv6 is an excellent solution to this problem, but IPv6 deployment is quite poor at present. OLPC cannot single-handedly deploy IPv6 to every school endpoint; it is the responsibility of the schools or deployments to provide IPv6 connectivity, via 6-to-4 or tunneling. Again, walled gardens may result from the limited IPv4 address space; if collaboration past the NAT is desired, deployments must provide the IPv4 or IPv6 connectivity required to enable this. (OLPC will probably endeavor to assemble a best-of-breed IPv6 tunnel solution as an aid to deployment teams, but locating, installing, and maintaining the IPv6 endpoint is the responsibility of the deployment.)
NATs or firewalls used for other reasons.
In some instances NATs or firewalls may be imposed externally due to circumstances beyond direct control, and are not actually desired by the deployment. Again, the IPv4 or IPv6 tunneling needed to bypass this is the responsibility of the deployment. We will make an effort to use http/port 80 for services to provide "best possible operation" when reasonable.

By relying on IPv4/IPv6 routability, deployments can improve their ability to collaborate independent of changes to the software on the XO.

Human-readable unique identifiers for each XO

Direct communication implies that there is a usable address for each XO. We insist that this address be (a) human-readable, (b) invariant and indirect, and (c) globally unique.

Human-readable names promote compatibility with other network hosts: the addressing information used to connect to an XO via ssh or a chat client from a non-XO host should be concise and logical ("memorable" using Marc Stiegler's terminology). As a counter-example, we will state that using a 256-bit hash of a public key as the primary identification mechanism is not acceptable; even using a scheme to convert binary information into quasi-readable word sequences does not yield an acceptably concise or logical address. An address for student 'Scott' at a school in Cambridge, MA, USA, should look something like scott.1cc-cambridge-ma.us.xs.laptop.org; we will describe a concrete proposal below. A key detail here is that this name identifies the XO, not the school server or some other helper. This allows direct communication consistent with the previous principle.

Although the name is a direct reference to the machine, the mapping from name to routable address is indirect. There may be several means to map the name to an address or service on the machine, and some of these means might take advantage of proxies or servers. The key property is that there is a single name that anyone anywhere on the network uses to refer to a particular XO: the name does not depend on the means by which it is mapped to an address, route, or service. Because of the walled garden reality, not everyone will be able to successfully map a name to a usable address, route, or service at all times, but when network conditions change (the student goes home, school connectivity is restored, etc.) the invariant name will be usable. Use of a shortname like 'scott' for link-local communications violates this principle.

Globally-unique names ensure freedom from conflicts when walled gardens merge or students go home to a different network environment. This implies some central coordination, but not much: if deployments are assigned unique prefixes by OLPC, and schools are assigned unique names by the deployment, then we need only ensure that (for example) students are assigned unique names within the school. This can either be performed probabilistically in the absence of a school server (see proposal below) or by registration with a school server. Collision detection (as in Apple's zeroconf/Bonjour) is a big help in practice, allowing the uniqueness constraint to be enforced "on-demand" when necessary.

The strong constraints on identifiers proposed in this section have privacy implications, which we propose to address using a capable pseudonym system. We only insist that there is at least one name for each XO. To preserve privacy we insist that there may be many such names. The name scott.1cc-cambridge-ma.us.xs.laptop.org may be the official name used for schoolwork, but the child should also be able to set up freedom76.olpc.cypherpunks.org (for example) as a pseudonym for extracurricular work. In this example, the cypherpunks infrastructure could provide additional privacy using protected proxies, onion routing, and other features to whatever degree desired. Ideally, selection among numerous pseudonyms would be made easily on the home screen of the Sugar interface.

As a practical matter, note that the requirements outlined here match standard DNS very well. There are A and AAAA records to map the invariant names indirectly to addresses, as well as extensible record types (such as the SRV and TXT records used by DNS-SD) to map the names to other services. Domain names are designed to be human-readable, and there are existing standards to extend DNS to non-ASCII character sets. DNS subdomain delegation is an appropriate mechanism to give countries control of their namespaces. There may be multiple mechanisms to perform DNS services for the user, selecting from different available servers and protocols as well as variants like mDNS, but we believe that the DNS API is the right abstraction for this task.

Direct presence interrogation

This document proposes a strict separation between discovery and presence mechanisms. Discovery concerns the mechanisms used to find collaboration partners, other students in your classroom, or the teacher's XO in order to participate in guided classwork. A key principle is that discovery is unconstrained. We will attempt to provide best-of-breed mechanisms for discovering other local hosts, but in practice most non-local discovery needs to piggyback on social mechanisms: XO names posted using Orkut or other social networking sites, added to local or global purpose-specific wikis or static web pages, or communicated over existing IM or email networks. The concise and human readable names we assign to XOs aid their spread using these existing mechanisms, and we anticipate a variety of third-party "friend finder" activities which will expand the discovery possibilities further. One such might list the local XOs whose users are interested in chess, for example, as a way to find game partners when traveling to new network environments.

Presence is the means by which the current status of these discovered friends or resources is ascertained. Although names are invariant, there is no guarantee of connectivity with a particular friend at any given time, and even if the friend is accessible, they may be playing checkers rather than chess right now, for example.

The canonical presence mechanism is direct: one XO connects directly to a service running on the other and queries for its status. Although the number of possible links in a network grows according to the square of the number of nodes, the interconnectedness of real social networks is quite limited: most users have 20 or so friends, with a few "super nodes" having 100 or so. Directly querying "real friends" should not be expensive in bandwidth or time.

That said, our existing presence mechanisms also attempt to determine the presence of a potentially unbounded number of strangers, who share the same local network or school server but who may not be actually known to the user. To manage this case, we strictly limit the rate and size of presence queries. A suggested limit is 1 query per second, with 1kB maximum query and response sizes. This limit ensures that overall traffic scales reasonably in the worst case even as the number of potential strangers on the network increases.

We suggest the following basic presence algorithm: first, find the friend whose status information is "most out of date", favoring real friends over strangers according to some weight mechanism. Directly query that out-of-date friend, updating the "last checked" information in their status even if the query fails. Wait an appropriate amount of time to satisfy the rate limit, and repeat.
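The loop described above can be sketched as follows. This is a minimal illustration, not the implementation: the `Buddy` record, the numeric `weight` field, the "staleness times weight" scoring rule, and the `query` callback are all assumptions made for the example. Only the 1 query per second rate and the 1 kB size cap come from the text.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional

QUERY_INTERVAL = 1.0      # suggested rate limit: one query per second
MAX_MESSAGE_BYTES = 1024  # suggested cap on query and response sizes


@dataclass
class Buddy:
    name: str
    weight: float = 1.0          # assumption: real friends get a larger weight than strangers
    last_checked: float = 0.0    # time of the last query attempt, successful or not
    status: Optional[bytes] = None


def pick_next(buddies: List[Buddy], now: float) -> Buddy:
    """Return the buddy whose status is 'most out of date', scaled by weight."""
    return max(buddies, key=lambda b: (now - b.last_checked) * b.weight)


def poll_once(buddies: List[Buddy], query: Callable[[str], bytes], now: float) -> Buddy:
    """Directly query the stalest buddy and record the attempt."""
    buddy = pick_next(buddies, now)
    try:
        reply = query(buddy.name)
        # Oversized replies are treated as failures rather than truncated.
        buddy.status = reply if len(reply) <= MAX_MESSAGE_BYTES else None
    except OSError:
        buddy.status = None
    buddy.last_checked = now  # updated even if the query failed
    return buddy


def presence_loop(buddies, query, clock=time.monotonic, sleep=time.sleep):
    """Repeat forever, waiting between queries to satisfy the rate limit."""
    while True:
        poll_once(buddies, query, clock())
        sleep(QUERY_INTERVAL)
```

Because `last_checked` is just a field on the record, any other mechanism that learns a buddy's status can update it directly, and the loop will then naturally spend its queries elsewhere.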

We have formulated the above algorithm so that it transparently handles the bleed-through between discovery and presence which is inherent with some local discovery technologies. For example, if we are using Cerebro to find strangers on the local mesh, it will also provide basic presence information for those friends in the same protocol messages. This information updates our local status behind the back of the presence algorithm, saving our direct rate-limited queries for non-local friends outside the scope of Cerebro. Similar arguments hold for strangers discovered via a school server or other mechanism. The key point is that all hosts should support direct interrogation for presence, even if other efficient mechanisms are used for partial aggregate presence in some situations. Below we will propose using a lightweight XMPP server on the XO to provide a standardized direct presence service which interoperates well with the "buddy presence" mechanisms used by existing Jabber-compatible IM clients, like Pidgin and Google Talk.

The purpose of discovery and presence information is fundamentally to enable collaboration. Our principles above dictate that collaboration mechanisms are built using direct peer-to-peer communication. There may be several such collaboration technologies we deploy, including direct socket connection for legacy applications to non-XO hosts, wrapped sockets for legacy protocol communications to XO hosts (in which case we can provide stronger end-to-end authentication and security), as well as multi-pointer X, Tubes, RPC mechanisms, and other related technologies.

An architecture proposal

DNS names

XOs are identified using DNS names of the form:

name.xxx.school.country.xs.laptop.org

where:

  • name is a Punycode encoding of the XO nickname. Technically, the IDN ToASCII mapping operation is performed on the space-elided and punctuation-removed nickname, truncated on the right if necessary so the result is 63 characters or less; see http://en.wikipedia.org/wiki/Internationalized_domain_name.
  • xxx is an encoded and truncated version of the XO public key. If N is the (odd) numeric value of the public key, write the first three low-order digits of the quantity (N-1)/2, least-significant digit first, in a variable-base number system where the first three digits are base 36, base 37, and base 26, respectively, and the digits are mapped into characters starting with lowercase alphabetic a-z, then numeric 0-9, then a hyphen. If I've done my math correctly (see http://en.wikipedia.org/wiki/Birthday_paradox), this requires about 220 students to have the same name before a collision has a 50% chance of occurring. (If we used two characters instead of three, only 36 students would be required.) If the server uses an independent means to prevent duplicate nicknames, the xxx can be replaced with 'xo'.
  • school.country.xs.laptop.org is filled in by registration with a school server. If you do not have access to a school server, then you can register with xofriends.org or another independent service, which will provide an appropriate suffix. Pseudonyms can be generated with alternate suffixes via the same means.
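The naming rules in the list above can be sketched in Python as below. This is an illustration under stated assumptions: the function names are hypothetical, "punctuation-removed" is approximated by keeping only characters for which `str.isalnum()` is true, the nickname is lowercased before encoding, and Python's built-in `idna` codec stands in for the IDN ToASCII operation.

```python
import string

# Digit alphabet from the text: a-z, then 0-9, then a hyphen (37 characters).
ALPHABET = string.ascii_lowercase + string.digits + "-"
BASES = (36, 37, 26)  # bases of the three digits, least-significant first


def key_suffix(n: int) -> str:
    """Encode the three low-order mixed-base digits of (N-1)/2 for an odd N."""
    if n < 1 or n % 2 != 1:
        raise ValueError("public key value must be a positive odd integer")
    q = (n - 1) // 2
    chars = []
    for base in BASES:
        q, digit = divmod(q, base)
        chars.append(ALPHABET[digit])
    return "".join(chars)


def nickname_label(nickname: str) -> str:
    """Elide spaces and punctuation, then truncate on the right until the
    ToASCII result fits in a 63-character DNS label."""
    cleaned = "".join(ch for ch in nickname if ch.isalnum()).lower()
    while cleaned:
        try:
            label = cleaned.encode("idna").decode("ascii")
        except UnicodeError:
            label = None  # too long (or otherwise rejected): drop a character
        if label is not None and len(label) <= 63:
            return label
        cleaned = cleaned[:-1]
    raise ValueError("nickname has no encodable characters")


def xo_name(nickname: str, n: int, suffix: str) -> str:
    """Assemble name.xxx.<suffix>, where suffix comes from registration."""
    return f"{nickname_label(nickname)}.{key_suffix(n)}.{suffix}"
```

For the collision estimate, the three digits give 36 × 37 × 26 = 34,632 possible suffixes, and the usual birthday approximation sqrt(2 ln 2 × 34,632) comes to roughly 219, consistent with the figure of about 220 quoted in the list.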

The three-character public-key-based interpolation into the domain name is present only to allow more reliable setup in the (temporary or permanent) absence of a school server. If we are using a centralized identity service, obviously the service can (and should) enforce uniqueness, and the interpolation is unnecessary. But we would like to be able to do initial setup of an XO with a known name for the school server but no school server actually present, allowing the school server to be added to the deployment later. Three characters seems like a reasonable price to pay for this flexibility. The school server can do very mild authentication based on these characters when it is first set up, but principally we are basing this string on the public key just to avoid introducing a new source of randomness.

If a school server detects a conflict -- two students named "Michael" in the same school with the same low-order bits of their public key -- the interface should prompt the child to choose a different nickname, as in zeroconf. (I could be convinced that it should instead just choose a different random three-character disambiguation component, and we should discard the explicit link between this component and the public key. Discussion welcome!)

It is easy to second-guess the choice of the Punycode mechanism used by IDN; but there are serious advantages to using an existing and deployed mechanism instead of rolling our own. Using IDN, our XO names are vulnerable to homoglyph attacks, but these are managed by Bitfrost mechanisms: the combination of name and public key identifies a friend; having a similar name doesn't allow you to masquerade as another unless you also have their key information.

Note that we are not violating Zooko's Triangle; this proposal follows closely his "strategy 4": our DNS names are "memorable" but they are not self-authenticating. As in SSH, our first introduction to a new buddy caches our buddy's public key. The DNS name, like the buddy name and color, is just a pet name; authentication is performed by checking the public key. See #XO-to-XO Security for further discussion of transparent authentication.

Name resolution

Standard dynamic DNS mechanisms can be used to keep the DNS for name.xxx.school.country.xs.laptop.org up to date, via a NetworkManager hook which updates the school server. This is sufficient for deployments with a school server.

If a school server is unavailable, we can provide a resolver which will form an IPv6 link-local address from the lower 64 bits of the SHA-256 of the domain name. This provides "serverless" link-local resolution of friends. (If a true DNS server responds, you SHOULD use that address for further communication in this session, since it may persist even if you roam off your current mesh.)
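A minimal sketch of this hashed resolution, using only the Python standard library, might look like the following. It is not the dnshash prototype. It assumes that "lower 64 bits" means the last eight bytes of the digest, that the name is lowercased and stripped of any trailing dot before hashing, and that the result is placed under the fe80::/64 prefix; the function name is hypothetical.

```python
import hashlib
import ipaddress

# fe80::/64, the link-local prefix, as the upper 64 bits of the address.
LINK_LOCAL_PREFIX = bytes.fromhex("fe80000000000000")


def hashed_link_local(domain_name: str) -> ipaddress.IPv6Address:
    """Map a DNS name to a link-local IPv6 address derived from its SHA-256."""
    # Assumption: normalize case and trailing dot so equivalent names agree.
    normalized = domain_name.rstrip(".").lower().encode("ascii")
    digest = hashlib.sha256(normalized).digest()
    interface_id = digest[-8:]  # assumed reading of "lower 64 bits"
    return ipaddress.IPv6Address(LINK_LOCAL_PREFIX + interface_id)
```

Any two XOs that apply the same normalization and the same byte-order convention compute the same address for a given name without contacting a server, which is the property the scheme depends on.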

See dnshash for a prototype implementation of this concept. --Michael Stone 23:43, 15 July 2009 (UTC)

In the future we may implement additional DNS mechanisms, implemented using the standard Name Service Switch mechanism of glibc. A variant of mDNS is a possibility, although standard mDNS does not appear to behave well on wireless mesh networks and is explicitly limited to domains under .local. One can also imagine variants which use peer-to-peer services like OpenDHT to implement wide-area serverless dynamic DNS. Again, the key property is that the DNS abstraction is maintained; any alternate service used will provide the same interface to user code and map the same invariant domain names.

Friend links in HTML

Rather than invent a new URI scheme for XO friends, we propose to use the standard XMPP scheme:

xmpp:xo@nickname.xxx.school.country.xs.laptop.org?roster;name=Full%20Name

The "Full Name" in this scheme refers to the actual Unicode XO name specified by the child, which may differ from the abbreviated and encoded version used in the domain name string.
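As an illustration, a friend link of this form could be assembled as follows. The helper name `friend_uri` is hypothetical; the only library call used is `urllib.parse.quote`, which percent-encodes the full name as UTF-8.

```python
from urllib.parse import quote


def friend_uri(domain: str, full_name: str, user: str = "xo") -> str:
    """Build an xmpp: roster link for an XO friend."""
    # safe="" so that every reserved character in the name is percent-encoded.
    return f"xmpp:{user}@{domain}?roster;name={quote(full_name, safe='')}"
```

For example, `friend_uri("nickname.xxx.school.country.xs.laptop.org", "Full Name")` reproduces the link shown above, and a non-ASCII full name is carried as percent-encoded UTF-8 while the domain part keeps its encoded form.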

In the Browse activity, clicking on a link of this form would add this person to your buddy list. Collaborating with this buddy would begin by resolving the domain name of the address according to the mechanisms of the previous section, and then contacting the XMPP service at that address.

Friends are represented internally as user@domain. When entering friends via manual keyboard input, specialized barcode reader, or other discovery mechanism, only the domain name may be necessary, but maintaining the mostly-invariant 'xo@' portion of buddies internally allows more uniformity in our limited dealings with non-XO Jabber friends (for chat, for example).

Presence service

The XMPP server on the laptop responds to roster and chat requests like a normal Jabber client, so that we interoperate with iChat, Google Talk, Pidgin, etc. When a school server is present, it MAY publish a SRV record for

_xmpp-server._tcp.nickname.xxx.school.country.xs.laptop.org 

specifying that it is handling XMPP requests for that user, according to RFC 3921. This delegation allows the school server to proxy requests in some cases, for example if the school server is directly addressable from outside a school but the internal mesh network of XOs is not.

Additional presence information specific to the XO, for example enumeration of sharable activities, is exported using a simple service at a well-known port. This could potentially be performed with compatible extensions to XMPP, other XMPP usernames, or it could involve a simple http or other service. The XO software should interoperate well with buddies who do not support the XO-specific extensions -- I should be able to friend and chat with a Google Talk user even if the Google servers support only standard XMPP.

Extensions and improvements

In this section I outline a few additional pieces which can (a) connect groups of users behind NATs, (b) allow school children to maintain their DNS identity even if the school server is deliberately inaccessible from their homes, and (c) leverage the indirection afforded by DNS to improve the end-to-end security of connections between XOs without affecting non-XO clients speaking legacy protocols. We also briefly discuss implementation issues for disconnected networks.

Tunnels

As discussed above, it is undesirable to attempt to automatically route around NATs or egress firewalls. However, in cases where that is explicitly desired, IPv6-over-IPv4 (6to4) tunnels are a logical and recommended means for doing so. For example, a school in Peru might partner with a "sister school" in Africa, establishing an IPv6 tunnel between their otherwise firewalled education networks. In this case the tunnel could be established between the Peru and Africa school servers.

A similar situation might occur when a child takes their laptop home, to a NAT'ed home network which doesn't allow collaboration. In this scenario, it is desirable to have the 6to4 client endpoint on the XO itself; the server endpoint might be on the machine hosting the XO's dynDNS entry. In the dynDNS/registration protocol, it seems reasonable to expect the host might return tunnel details for the XO to use.

Note that this is a deliberately unambitious proposal. We are not attempting to provide globally-routable IPv6 addresses to every XO, or to automatically discover and utilize tunnels. The difficult realities of maintaining IPv6 network endpoints make this largely infeasible. Instead we are proposing a small extension to the DNS registration mechanism that allows creation of ad-hoc and globally disconnected IPv6 networks to transparently facilitate otherwise-difficult collaboration. Importantly, we are not exposing the "NAT-busting" to the application level: we are just making the underlying standard network enclose slightly more nodes than it did previously.

External DNS

The use of alternate pseudonyms, expressed as differing DNS names, allows a child to continue collaborating at home even if their school server is firewalled off from home access. However, we expect that many countries will find it desirable to allow the kids to keep their "school identity" even while maintaining a closed school network.

We propose setting up a single "external DNS" server on the accessible internet to allow this. This DNS server reports itself as authoritative for the various school.country.xs.laptop.org domains to the outside world (although the school servers are still authoritative inside the school network) and allows kids to register their current network addresses to update their DNS mappings from home. Note that the XO's domain name will report different addresses inside and outside the school network; that's fine!

This lightweight "external DNS" server could also provide tunnel services, as described above, to better connect kids behind home NATs.

XO-to-XO Security

The indirection afforded by DNS allows us to use code like tcp_wrappers to transparently authenticate XO hosts, without negatively affecting interoperability with legacy software and systems.

When a friend is added to our friends list, an XO-specific protocol is used to contact the machine and obtain its public key, as ssh does. If this succeeds, the friend is specially marked as an "XO friend".

A lookaside DNS service can be taught to consult our list of "XO friends" when doing a lookup. If the given domain name is for an XO friend, a localhost address can be returned instead of the "real" address. Connecting to the localhost address wraps each socket so that communication is encapsulated with SSL (checking the "server's" key and providing our key as a client certificate) before being forwarded on to a specific port at the "real" network address. A listener at that port performs the other side of the SSL connection before forwarding traffic to its "true" destination at that host.
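The lookup decision alone (not the SSL wrapping or the listener) can be sketched as below. Everything here is hypothetical: the class name, the choice to give each XO friend its own address in 127.0.0.0/8 so that the wrapper can tell friends apart, and the injected `fallback` resolver, which defaults to the standard `socket.gethostbyname`.

```python
import ipaddress
import socket


class LookasideResolver:
    """Return a loopback address for known XO friends, else resolve normally."""

    def __init__(self, fallback=socket.gethostbyname):
        self._fallback = fallback
        self._by_domain = {}   # friend domain -> loopback address
        self._by_address = {}  # loopback address -> friend domain

    def add_xo_friend(self, domain: str) -> str:
        """Mark a domain as an XO friend and assign it a loopback address."""
        domain = domain.lower()
        if domain not in self._by_domain:
            # Assumption: allocate 127.0.0.2, 127.0.0.3, ... in order.
            offset = len(self._by_domain) + 1
            address = str(ipaddress.IPv4Address("127.0.0.1") + offset)
            self._by_domain[domain] = address
            self._by_address[address] = domain
        return self._by_domain[domain]

    def resolve(self, domain: str) -> str:
        """Friends get their loopback address; other names are unaffected."""
        wrapped = self._by_domain.get(domain.lower())
        return wrapped if wrapped is not None else self._fallback(domain)

    def friend_for(self, address: str):
        """Let the wrapper find which friend a loopback connection is for."""
        return self._by_address.get(address)
```

A wrapper listening on the assigned loopback address would call `friend_for` to learn the real destination and the key to check; that part is deliberately left out of this sketch.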

In this manner end-to-end authentication of XOs can be layered on top of the network without any modification of activity software or protocols. If you use a web browser to surf to http://myfriend.someschool.xs.laptop.org from your XO, the identity of your friend is transparently verified without teaching the web browser any new tricks. Because these wrappers only apply when the domain name corresponds to a known XO friend, surfing to (say) http://xkcd.com is unaffected.

This implements the "ssh model" of authentication, roughly. It is vulnerable to the same man-in-the-middle and key-compromise attacks ssh is; nevertheless this model has been shown to have significant practical value.

Disconnected operation

It is desirable to allow access to cached internet content even without access to the internet. A local caching HTTP proxy on the XO provides a simple solution, but many XOs will likely contain duplicate content. A peer-to-peer cache provides one alternative, but a transparent proxy on a school server can do better, since the school server contains much more available space.

However, XOs making web requests during disconnected operation will still attempt to resolve DNS names to addresses before initiating network requests. The school server can provide an "offline DNS cache" in the same way it provides an "offline HTTP cache", but in fact we can do better. In the name resolution section above we have already presented the seeds of an answer: XOs have an alternative name resolution mechanism which resolves DNS names to a link-local IPv6 address based on a hash of the DNS name in the absence of "authoritative" DNS service. The school server's DNS server can use this strategy to provide short lifetime non-link-local IPv6 addresses for DNS names in the absence of upstream DNS, which then allows it to intercept web requests for those addresses from XOs and properly serve them from its cache.

After connectivity is restored, the school server should route connections to the advertised link-local address to the "real" internet address of the host for the duration it advertised for the domain's mapping to the link-local address.

Credits

Author
C. Scott Ananian (cscott a t laptop.org)