

This document proposes a design for networking based on previously realized Network Principles. It then explores and elaborates the design with analysis, example configuration, and experimental results. It concludes by crediting those who have contributed to the design and by describing future work inspired by current results.


Its abstract purpose is to advance the Network Principles project by explaining how you might build a system based on those principles with currently available tools and by doing a first round of modeling and prototyping in order to gain some analytic and empirical evidence about whether those principles are sound.

Its concrete purpose is to provide internetworking and naming technology to XO-users (and other interested parties) that seamlessly and predictably supports the XO's most important low-latency network scenarios as well as is possible with existing software.

Its social and conceptual purpose is to provide a design that is satisfactory in several ways in which previous networking and collaboration substrates were not, as described in the following principles of design quality:

do no harm -- our users are not volunteers, so don't waste their time
play well with others, since we want a large ecosystem and lots of testing
be realistic, so that we don't promise the impossible
be predictable, so that we can tell people what will work and what will fail in advance
prevent failure, by means of proof, simulation, and wise habits
tolerate failure, by removing inappropriate single points of failure
route around failure, by means of self-test procedures and preplanned maneuvers (manual overrides)

Notes on quality principles:

  1. do no harm and play well with others mean that we believe that previous designs unnecessarily harmed their users by wasting scarce resources (e.g. time, trust, and the capacity to learn) and opportunities (e.g. to grow the ecosystem). Other "harms" inflicted by previous designs include having badly confused their users, having over-promised their scalability, and having been unable to articulate or to meet basic "go/no-go" availability requirements.
  2. realism and predictability are intended to evoke the following "litmus test" questions:
    • how well does the design conform to the physical realities (bandwidth, latency, power, failure, and error) and to the social realities (ignorance, interdiction, authority, and autonomy) that define its niche?
    • is there a public, written, and peer-reviewed design document describing the design?
  3. prevent, tolerate, and route around are all direct usability goals that no networking design intended for use by real humans (particularly by teachers!) should ignore.



Prerequisite concepts: scenario, network, medium, link, bridge, router, tunnel, internetwork

We want to offer maximally efficient and robust support for our ideal network scenarios (nos. 1 and 9, denoted with bold text, below) while offering seamless support for optional network enhancements like fancy links, routers, tunnel endpoints, and transit agreements that may be provided by the surrounding ecosystem of deployment organizations, universities, individuals, and commercial entities.

Network Scenarios:

  1. access to at least one shared-media link, like an 802.3 hub or the 2.4 GHz radio band
  2. a more efficient link, like an 802.3 switch or an 802.11 access point
  3. a bridge, like an XS or a good access point, between two or more otherwise separate single-link networks
  4. a local router, like an XS, routing between two or more otherwise separate (but potentially complicated) local networks
  5. a restrictive local router which provides some IPv4 connectivity but which drops IPv6 traffic
  6. credentials for some sort of dedicated local tunnel endpoint (like a SOCKS proxy or an HTTP proxy)
  7. a remote router offering us some sort of access to a larger internetwork, typically via (perhaps restricted) IPv4
  8. credentials for some sort of dedicated remote tunnel endpoint (like an SSL or IPsec VPN or a 6to4 tunnel, etc.)
  9. a remote router offering great access to a larger internetwork, like the IPv4 or IPv6 Internet


Prerequisite concepts: link, layer, network, address, internetwork, name

Based on these scenarios, we imagine our network as being organized into three kinds of composable layers:

  1. a link layer, usually implemented via 802.3 wired Ethernet, 802.11b/g wifi in either ad-hoc or infrastructure mode, or various sorts of tunneling over IPv4, perhaps across NATs and firewalls,
  2. an internetworking layer, based on IPv6 (tutorial documentation), and
  3. a naming layer, based on DNS, for binding logical addresses from networks with different failure modes to stable human-memorable names

We find this layered conceptual model helpful for estimating dependency ("what has to work before this layer can work?") and cost ("what does it cost to traverse this layer?").


IPv6 Configuration

Prerequisite concepts: peer, server, IPv6, link, router, prefix, tunnel, dnshash


Your job is to be an IPv6 node. Consequently, when you bring up your interfaces,

  1. You might discover an IPv6 router advertising on one of your links.
    • (See sysctl net.ipv6.conf.all.accept_ra and related variables.)
  2. You might try out dhcp6c.
  3. You might have some kind of IPv4 connectivity. If so, connect to the Internet or to other internetworks of your choice.
    • (miredo and openvpn seem particularly easy to configure and hence to experiment with...)
  4. Use dnshash to add guessable link-local addresses to all your interfaces.
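
To make the idea of a "guessable" link-local address concrete, here is a minimal sketch of deriving one from a DNS name. This is not dnshash's actual algorithm, which this document does not specify; the function name, the choice of SHA-256, and the use of the first 64 bits of the digest are all assumptions made for illustration.

```python
import hashlib
import ipaddress

def guessable_link_local(name):
    """Hypothetical derivation: hash a DNS name into an fe80::/64 address."""
    # DNS names are case-insensitive, so normalize before hashing.
    digest = hashlib.sha256(name.lower().encode("utf-8")).digest()
    # Use the first 64 bits of the digest as the interface identifier.
    interface_id = int.from_bytes(digest[:8], "big")
    return ipaddress.IPv6Address((0xFE80 << 112) | interface_id)

# Any peer that knows the name can compute the same address without a server.
print(guessable_link_local("peer.example.org"))
```

The point of such a scheme is that two peers on the same link who have only exchanged a name can find each other without any router or DNS server being reachable.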


Your job is to be an IPv6 router and a DNS server. One of several situations might obtain:

  1. You might discover an IPv6 router advertising one or more IPv6 prefixes on your outbound link(s).
  2. You might have some kind of IPv4 connectivity. If so, connect to the Internet or to other internetworks of your choice.
  3. You might be under a tree. If so, generate a Unique Local Address prefix.
  4. (Use dnshash to add guessable link-local addresses to all your links?)

When done, use radvd or dhcp6d to share addresses.
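
For the "under a tree" case, a Unique Local Address prefix can be generated locally. The sketch below assumes the simplest reading of RFC 4193: the fd00::/8 prefix followed by a random 40-bit Global ID, giving a /48. RFC 4193 suggests deriving the Global ID from a hash of a timestamp and an EUI-64; drawing it from a random source, as done here, is a simplification. The function name is made up for this example.

```python
import ipaddress
import secrets

def random_ula_prefix():
    """Return a random /48 inside fd00::/8 (locally assigned ULA space)."""
    global_id = secrets.randbits(40)
    # Bits 127..120 hold 0xfd, bits 119..80 hold the 40-bit Global ID.
    value = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((value, 48))

prefix = random_ula_prefix()
# The first /64 inside the /48 is a natural candidate to advertise on a link.
subnet = next(prefix.subnets(new_prefix=64))
print(prefix, subnet)
```

The resulting /64 is the kind of prefix that could then be handed to the route advertisement daemon.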

DNS Configuration

Prerequisite concepts: server, discovery, presence, authentication

One of the server's most important jobs is to get itself on appropriate internetworks so that it can dynamically map stable (DNS) names to unstable names (IPv6 addresses) for itself and its peers.


Peers need help locating one or more DNS servers. See RFC 4339 for available mechanisms; pay particular attention to RDNSS discovery.


Here are two approaches for solving the update problem, based on how peers might want to communicate with DNS servers:

  1. Use a DNS UPDATE client like ipcheck or ddclient with shared keys with a DNS server like BIND.
  2. Run a bespoke control protocol over an existing secure tunnel, e.g. something based on XML-RPC over HTTPS with client certificates, or on access to a restricted shell over SSH.

(NB: In order to perform this update, it will usually have been necessary for the peer to have been cryptographically introduced to the server.)

Security Ideas

Prerequisite concepts: spoofing, petname, authentication, confidentiality, integrity, availability, DNS resolver, DNS nameserver, dnscurve, security association, asymmetric cryptography

This optional section is included merely to offer some hints about where we think communications security ought to be headed.

  1. Spoofing, Integrity, Confidentiality. See communications security and petnames for some background. A very rough road along which something reasonable might lie:
    • Use physical introduction to CNAME to <key>
    • Then, my dnscurve-compatible DNS resolver will refuse to give me addresses unless the nameserver I contact for cscott proves knowledge of cscott's private key.
    • Then I have a nice basis with which to configure IPsec security associations.
  2. System Integrity
  3. DoS


Cost Model

Prerequisite concepts: bandwidth, latency, jitter, availability, model, unicast, multicast, broadcast, network stack

First, some baseline analysis:

Suppose we have a wireless link with capacity C.
Suppose we have N nodes.
Suppose each node n wants to maintain f(n) connections.
If f(n) = 1 then we could allocate up to C/N per connection.
If f(n) = N then we could allocate up to C/N^2 per connection.

Instructive values: C=30 Mbps, N=40, f(n)=N ==> 19 Kbps / conn. Conclusion: beware O(N^2) behavior.
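
The baseline arithmetic above can be checked with a few lines of Python. This is only a sketch of the even-split model described here; the function name is made up for illustration.

```python
def per_connection_bps(capacity_bps, nodes, conns_per_node):
    """Upper bound on per-connection bandwidth when capacity is split evenly."""
    total_connections = nodes * conns_per_node
    return capacity_bps / total_connections

C = 30e6  # link capacity in bits per second
N = 40    # number of nodes on the link

print(per_connection_bps(C, N, 1))  # f(n) = 1: 750000.0 bps per connection
print(per_connection_bps(C, N, N))  # f(n) = N: 18750.0 bps, i.e. about 19 Kbps
```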

Several important numbers that we need to predict and to measure include bandwidth and latency figures:

tx == transmit, rx == receive, btx == broadcast

btx/tx/rx - ICMPv6+IPv6+phys           - router discovery (RD)
btx/rx    - ICMPv6+IPv6+phys           - duplicate address detection (DAD)
tx/rx     - ICMPv6+IPv6+phys           - NS neighbor discovery (ND)
tx/rx     - UDP+IPv6+phys              - DNS query
tx/rx     - JSON+SSH+TCP+IPv6+phys     - DNS update

where "phys" describes the equations' dependence on the "physical" layer's 
frame overhead and MTU

notable "phys" layers:

Ethernet           -- ad-hoc wifi, infra wifi, 802.11s mesh, switch, hub
TLS+UDP+IPv4       -- openvpn
L2TP+IPsec+IPv4    -- racoon, isakmpd, openswan, etc.
UDP+IPv4           -- teredo

Baseline overheads:

Ethernet: 18
IPv4: 20 + options
IPv6: 40 + options
ICMPv6: 4
ICMPv6 RA: 16 + prefix+{32} + mtu?{8} 
UDP: 8
TCP: 20 + options?
TLS: 5 + mac?{16,20,32} + pad?{4,8,16}
D-Bus: 12 + type-array
XMPP MUC: 50 + jids
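
As a rough illustration of how these baseline figures combine, the sketch below sums the fixed header sizes for a protocol stack such as the "UDP+IPv6+phys" DNS query listed earlier, taking Ethernet as the "phys" layer. It uses the standard 8-byte UDP header and ignores options and other variable-length parts; the function and table names are assumptions for this example.

```python
# Fixed header sizes in bytes (options and variable-length parts excluded).
FIXED_OVERHEAD = {
    "Ethernet": 18,  # 14-byte header plus 4-byte frame check sequence
    "IPv4": 20,
    "IPv6": 40,
    "ICMPv6": 4,
    "UDP": 8,
    "TCP": 20,
}

def stack_overhead(layers):
    """Total fixed overhead, in bytes, of one packet through the given layers."""
    return sum(FIXED_OVERHEAD[layer] for layer in layers)

def efficiency(payload_bytes, layers):
    """Fraction of bytes on the wire that are payload."""
    return payload_bytes / (payload_bytes + stack_overhead(layers))

dns_query_stack = ["UDP", "IPv6", "Ethernet"]
print(stack_overhead(dns_query_stack))  # 66 bytes of headers per DNS query
```

A small DNS query can therefore easily be dominated by header bytes, which is one reason these overheads are worth predicting and measuring.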

Diagnosis Tools

Prerequisite concepts: diagnostic, link, address, name, route, socket, packet log, dig, ping, traceroute, netcat, bandwidth test, log bundle

Start recording a typescript so that we can see what you did.

mkdir -p "$TESTDIR" && cd "$TESTDIR"
ulimit -c unlimited

Check that you've got the right DNS name for the person you want to talk to.

echo $NAME > peer

Dump your addresses, routes, and perhaps your open connections.

hostname --fqdn | tee host
ip addr show | tee addrs
ip route show | tee ipv4_routes
ip -6 route show | tee ipv6_routes
netstat -anp | tee conns

If you have wireless devices,

iwconfig | tee iwconfig
iwlist scan | tee iwlist_scan

Fire up tcpdump:

tcpdump -w packets -s0 &

Resolve that name to addresses. Check that the addresses seem sane.

dnshash lookup $NAME | tee peer_addrs_dnshash
dig $NAME | tee peer_addrs_dig
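
One way to check that a resolved address "seems sane" is to classify its scope before trying to route to it. The following sketch uses Python's standard ipaddress module; the category labels and the function name are choices made for this example, not part of any tool mentioned here.

```python
import ipaddress

ULA = ipaddress.ip_network("fc00::/7")  # Unique Local Addresses (RFC 4193)

def classify(addr):
    """Return a coarse scope label for a textual IP address."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return "ipv4"
    if ip.is_loopback:
        return "loopback"
    if ip.is_multicast:
        return "multicast"
    if ip.is_link_local:
        return "link-local"
    if ip in ULA:
        return "unique-local"
    return "other"

for a in ["fe80::1", "fd12:3456:789a::1", "ff02::1", "2001:470:1f07:6f7::1"]:
    print(a, classify(a))
```

A link-local answer is only usable with an explicit interface (hence the -I $IFACE in the ping6 commands), so knowing the scope up front helps explain later failures.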

See who's answering broadcasts:

ping6 -I $IFACE ff02::1

Route to the addresses:

ping6 -I $IFACE $ADDR | tee ping
traceroute6 $ADDR | tee traceroute
tracepath6 $ADDR | tee tracepath

Connect to the address:

# echo "SSH-2.0-Hi" | nc6 $ADDR 22
# printf "GET / HTTP/1.0\r\n\r\n" | nc6 $ADDR 80
# ssh $ADDR
# curl -I http://$ADDR/
# ...

Conduct a bandwidth test:

iperf -V -c $ADDR

Collect logs from your application and send them to developers:

kill -SIGINT %1
cd ..
tar c $TESTDIR | lzma -c > logs.tar.lzma

Self-Test Algorithm

In order for things to "just work", there are many subgoals that need to be satisfied. The purpose of the self-test algorithm is to speed up debugging by quickly and reliably identifying subgoals whose named requirements are satisfied but whose characteristic test fails.

The form of the self-test algorithm will be a decision-list which may, in the future, be incorporated into software.

A rough outline of that decision list is:

Do we have all the network interfaces that we should?
Is each interface attached to a link?
Does each interface have a link-local address?

Is every interface able to ping itself?
Does link-layer broadcast return responses?
Does network-layer broadcast return responses?

# assuming that we have a partner on the same link
Can we ping our partner?
Can we hear our partner pinging us?
Does there seem to be reasonable bandwidth on our link?

# assuming we have a link-local partner with a name
Do we and our partner have byte-identical names written down?
Can we both resolve the name to a link-local address?
Do we get the same address?
Can we both ping the address?
Can we connect to a service running at the address (e.g. ssh)?

# assuming that we have a router
Can we ping our router?
Can we traceroute someone upstream of the router?
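
The decision list above could be incorporated into software along the following lines. This is a hedged sketch, not an existing implementation: the Subgoal type, the run_self_test function, and the stubbed test results are all hypothetical, and real tests would shell out to tools such as ip, ping6, and dig.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Subgoal:
    name: str
    test: Callable[[], bool]                 # the characteristic test
    requires: List[str] = field(default_factory=list)  # named requirements

def run_self_test(subgoals):
    """Return subgoals whose requirements passed but whose own test failed."""
    passed: Dict[str, bool] = {}
    suspects = []
    for goal in subgoals:
        if not all(passed.get(req, False) for req in goal.requires):
            passed[goal.name] = False  # skipped because a requirement failed
            continue
        ok = goal.test()
        passed[goal.name] = ok
        if not ok:
            suspects.append(goal.name)
    return suspects

# Stubbed results standing in for real probes.
checks = [
    Subgoal("interface present", lambda: True),
    Subgoal("link-local address", lambda: True, ["interface present"]),
    Subgoal("ping partner", lambda: False, ["link-local address"]),
    Subgoal("resolve partner name", lambda: True, ["ping partner"]),
]
print(run_self_test(checks))  # ['ping partner']
```

Only "ping partner" is reported: the name-resolution check is skipped rather than blamed, because its requirement never passed.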



Link-local configuration

Try out dnshash on an isolated access point, ad-hoc network, switch, or hub.

Observations: very pleasant!

VPN server configuration

In this experiment, we're going to configure openvpn and radvd on a machine with a public IPv4 address. Truthfully, this combination is probably overkill, but the task of constructing it seemed likely to offer valuable experience, e.g. for someone who wants to bridge multiple kinds of tunnel endpoint or who wants to load-balance lots of peers between a couple of endpoints.

# Install our VPN and route advertisement software.
apt-get install openvpn radvd
# yum -y install openvpn radvd
# add nobody:nobody
groupadd nobody
useradd nobody
usermod -a -G nobody nobody

# Configure radvd
cat > /etc/radvd.conf <<EOF
interface tap0
{
        AdvSendAdvert on;
        MinRtrAdvInterval 30;
        MaxRtrAdvInterval 100;
        prefix 1234:db8:1:0::/64
        {
                AdvOnLink on;
        };
};
EOF

# enable forwarding everywhere
sysctl -w net.ipv6.conf.all.forwarding=1

# flush the forwarding table
ip6tables -F FORWARD

# really, I /want/ a multi-user version of
# openvpn --dev tap --user nobody --group nobody --verb 6
# but I'm not sure how to get that. instead, I'll use some fake keys and no ciphers.
mkdir -p keys && cd keys
tar xf sample-keys.tar.bz2 && cd sample-keys

# create a multi-user tunnel
openvpn --mode server --client-to-client --dev tap --user nobody --group nobody --verb 6 --opt-verify --tls-server --client-connect /bin/true --auth-user-pass-optional --duplicate-cn --auth-user-pass-verify /bin/true via-env --dh ./dh1024.pem --ca ./ca.crt --cert client.crt  --key client.key --script-security 3 --auth none --cipher none &

# at any rate, bring up the interface so that we get link-local addresses
ip link set tap0 up

# turn on the route advertisement daemon
radvd -d 5 -m stderr &

VPN client configuration

The purpose of this experiment was to test the VPN configuration described immediately above.

# install vpn client
apt-get install openvpn
# yum -y install openvpn

# add nobody:nobody
groupadd nobody
useradd nobody
usermod -a -G nobody nobody

# download fake keys.
mkdir -p keys && cd keys
tar xf sample-keys.tar.bz2 && cd sample-keys

# connect to the vpn
openvpn --user nobody --group nobody --dev tap --remote --tls-client --ca ca.crt --cert ./client.crt --key client.key --auth none --cipher none &

# bring up the interface
ip link set tap0 up

# find other people
ping6 -I tap0 ff02::1

# if using dnshash, attach
dnshash attach <your>.<domain>.<name>

# ... test, as described above ...


Observations:

  • TLS imposes a high latency cost, even with null algorithms.
  • TAP devices work rather nicely, at least for tiny networks.
  • Be careful of firewall rules!
  • radvd is perhaps unnecessary with a single virtual ethernet -- dnshash "suffices" -- though it might be useful for routing between several load-balanced ethernets.
  • The default IP sorting rules and route priorities mean that it may take a long time for a connecting app like ssh or nc6 to connect to the /correct/ dnshash address.

6to4: Hurricane Electric

It turned out to be extremely easy to register for a tunnel endpoint with Hurricane Electric's free registration form. My prefix was assigned in minutes and HE provided excellent instructions on how to configure my end of the tunnel.

(Unfortunately, my Linksys WRT54G router will not forward IPv4 protocol #41 traffic (IPv6 in IPv4) without free firmware like DD-WRT.)

Update: 6to4 via Hurricane Electric seems to work really well after flashing OpenWRT onto the router!

6to4: sixxs

Sixxs also promptly approved my account request. More information to follow soon.

Workable DynDNS

There seems to be no widely adopted standard API for (remotely) modifying DNS zone files. For example, the standardized DNS UPDATE protocols defined by RFCs 2136 and 3007 seem to be sparsely implemented at best. Other approaches, like draft-jennings-app-dns-update-02, have not been standardized. Finally, there are open problems with truth maintenance, as described in other unstandardized work such as draft-sekar-dns-ul-01.

So what are our real options?

The simplest thing that could possibly work would be to SSH or SSL to the DNS server we want to update. A successful SSH or SSL authentication binds together a username or client CN (which identifies the subdomain to update) and an IP address which we can use to generate the new RRset for that subdomain.

This will work well so long as we can commit up-front to an address and port number for our "olpcdyndns" server to listen on. Unfortunately, it seems likely that large-scale providers of this olpcdyndns service will want to be able to provide service to multiple independent domains from a single machine, e.g. via vhosting.

To support vhosting, we need a way to communicate address/port information from the server to the client (for availability) and from the client back to the server (for integrity).

The server-to-client communication may be handled without undue difficulty by using DNS-SD to inform clients what port to connect to.

In the simplest case, suppose that we want to set up DNS-SD for a fixed instance named "primary" at olpcdyndns host <foo>.

In that case, we can use a single SRV record with priority 0, weight 0, zone


and whatever hostname and port we like to point to our real olpcdyndns server.

On the client, we can extract the specified host and port with

SRV=$(dig +short -t srv primary._olpcdyndns1._tcp.<foo>)
PORT=$(echo "$SRV" | cut -d' ' -f3)
HOST=$(echo "$SRV" | cut -d' ' -f4-)
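
The same extraction can be done in Python, which also makes it easier to validate the fields. The sketch below assumes the usual "priority weight port target" layout of a short-form SRV answer line; the type and function names, and the sample hostname, are hypothetical.

```python
from typing import NamedTuple

class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str

def parse_srv(line):
    """Parse one short-form SRV answer: 'priority weight port target'."""
    priority, weight, port, target = line.split(None, 3)
    return SrvRecord(int(priority), int(weight), int(port), target.strip())

def pick(records):
    """Choose the record with the lowest priority value (most preferred)."""
    return min(records, key=lambda r: r.priority)

# Sample answer text; the hostname is a placeholder.
answer = "0 0 3001 dyndns.example.org."
record = parse_srv(answer)
print(record.target, record.port)
```

Weighted selection among records of equal priority is left out here, since the simplest case above uses a single record with weight 0.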

Auxiliary information, if we had any, could be acquired via

TXT=$(dig +short -t txt primary._olpcdyndns1._tcp.<foo>)

If you want to get fancy, you could also loop over _olpcdyndns services with something like:

for PTR in $(dig +short -t ptr _olpcdyndns1._tcp.<foo>); do
  SRV=$(dig +short -t srv "$PTR")
  echo "$PTR: $SRV"
done

Next, what should we run on this carefully communicated host+port combination?

Depending on our preference, we can either use

 ssh -p $PORT $HOST /usr/bin/olpc-dyndns-1-ssh-update

or we can use SSL with SNI like so: (with openssl >= 0.9.8j)

 openssl s_client -connect $HOST:$PORT -servername <foo> -cert <my_cert> -key <my_key>

to trigger /usr/bin/olpc-dyndns-1-ssl-update through something like stunnel or ucspi-ssl's sslserver.

The mythical olpc-dyndns-1-ssh-update can read SSH_CONNECTION to find out the connecting IP; the mythical olpc-dyndns-1-ssl-update can read REMOTE_HOST and SSL_CLIENT_DN (with stunnel) or the sslserver equivalents.
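
As a sketch of what the SSH-side update script might do first, the following reads the client address out of SSH_CONNECTION, which sshd sets to four space-separated fields: client IP, client port, server IP, and server port. The function names are hypothetical, and the step of actually writing the new RRset is deliberately left out.

```python
import ipaddress
import os

def client_address(environ=os.environ):
    """Return the connecting client's IP from sshd's SSH_CONNECTION variable."""
    fields = environ["SSH_CONNECTION"].split()
    if len(fields) != 4:
        raise ValueError("unexpected SSH_CONNECTION format")
    # Drop any IPv6 zone suffix such as '%eth0' before parsing.
    return ipaddress.ip_address(fields[0].split("%")[0])

def record_type(addr):
    """DNS record type that would carry this address."""
    return "AAAA" if addr.version == 6 else "A"

# Example environment; the addresses are documentation placeholders.
env = {"SSH_CONNECTION": "2001:db8::5 51234 2001:db8::1 22"}
addr = client_address(env)
print(record_type(addr), addr)
```

Because the address comes from the authenticated connection itself, the client does not get to claim an arbitrary address for its subdomain.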



in /etc/ssh/sshd_config will cause SSH to log key fingerprints as well as accounts in case you want to manage everything from a single account. There doesn't seem to be any way (at present) to find out the key fingerprint of an active SSH session except by log-munging. (grr!).


Going with openssl:

openssl genrsa -out ca.key 1024
openssl req -new -x509 -nodes -sha1 -days 9999 -key ca.key -out ca.cert
cat ca.cert ca.key > ca.pem

openssl genrsa -out client.key 1024
openssl req -new -nodes -sha1 -days 9999 -key client.key -out client.csr

openssl x509 -req -in client.csr -out client.cert -CA ca.cert -CAkey ca.key -days 9999 -CAcreateserial
openssl verify -CAfile ca.pem client.cert
cat client.cert client.key > client.pem

cat > hiya <<EOF
chmod a+x ./hiya

# with ipv4 on localhost:
stunnel -p ca.pem -v 2 -A ca.cert -d 3001 -f -P "" -l ./hiya
openssl s_client -connect localhost:3001 -cert client.cert -key client.key -CAfile ca.cert
# openssl s_client doesn't support ipv6; see, e.g. openssl #1365, #1832
sudo dnshash attach
stunnel -p ca.pem -v 2 -A ca.cert -d -f -P "" -l ./hiya 
ncat -6 -v --ssl --ssl-key client.key --ssl-verify --ssl-cert client.cert 3001


According to draft-jabley-dnsop-missing-mname-00, dyndns updates are supposed to go to the MNAME field of the SOA record of <foo>.

PRIMARY_MASTER=$(dig +short -t soa <foo> | cut -d' ' -f1)

djbdns doesn't contain native support for IPv6. However, the Debian package 'dbndns' seems to have added this support.

If you lack it, it's easy to calculate the entries for your AAAA records like so:

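cat > tinydns_aaaa <<'EOF'
#!/usr/bin/env python3
# Emit a tinydns-data generic record (type 28 = AAAA) for <name> at <ip>.
import sys, socket
if len(sys.argv) < 4:
    sys.exit("usage: tinydns_aaaa <name> <ip> <ttl>")
name, ip, ttl = sys.argv[1:4]
# Escape each of the 16 address bytes as a three-digit octal \ooo sequence.
rdata = "".join("\\%03o" % b for b in socket.inet_pton(socket.AF_INET6, ip))
print(":%s:28:%s:%s" % (name, rdata, ttl))
EOF
chmod a+x tinydns_aaaa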
./tinydns_aaaa fe80::1 86400
cat > tinydns_srv <<'EOF'
#!/usr/bin/env python3
# Emit a tinydns-data generic record (type 33 = SRV).
import sys
if len(sys.argv) < 7:
    sys.exit("usage: tinydns_srv <service> <priority> <weight> <port> <name> <ttl>")

def format_short(n):
    # 16-bit big-endian integer as two octal-escaped bytes
    return "\\%03o\\%03o" % (n // 256, n % 256)

def format_name(n):
    # DNS wire format: length-prefixed labels terminated by a zero byte
    return "".join("\\%03o%s" % (len(a), a) for a in n.rstrip(".").split(".")) + r'\000'

service = sys.argv[1]
priority = format_short(int(sys.argv[2]))
weight = format_short(int(sys.argv[3]))
port = format_short(int(sys.argv[4]))
name = format_name(sys.argv[5])
ttl = sys.argv[6]

print(":%s:33:%s%s%s%s:%s" % (service, priority, weight, port, name, ttl))
EOF
chmod a+x tinydns_srv
./tinydns_srv 0 0 3001 86400

however, if you've got the version with the IPv6 patches, then go ahead with something like

quick reference:


Should be straightforward to use dnsmasq to provide an IPv6 front to an old-school tinydns...

also useful background:


Installed OpenWRT on my Linksys WRT54G (v2.0). Very easy.


Found that I could no longer ping my IP address from crank.

Examined firewall:

iptables -t mangle -L

Good, no mangling.

iptables -t nat -L

Some NAT, but just a couple of MASQUERADE rules.

iptables -t filter -L

Lots of filtering. In more detail:

iptables -t filter -L INPUT

Some complicated chains:

  • syn_flood rate-limits TCP connection control packets.
  • input_rule is empty
  • input has subchains for zone_wan and zone_lan.
  • zone_lan accepts everything.
  • zone_wan rejects everything not accepted by input_wan.

Okay, let's add an accept rule to input_wan:

iptables -t filter -A input_wan -p icmp -j ACCEPT

Alternately, add:

config 'rule'
        option 'target' 'ACCEPT'
        option '_name' 'ping'
        option 'src' 'wan'
        option 'proto' 'icmp'

to /etc/config/firewall (or to /etc/firewall.user?)


Now that I'm answering pings, I can set up an IPv6 tunnel with the Hurricane Electric tunnelbroker. Easy.

Then install 6tunnel:

opkg install 6tunnel
cat > /etc/config/6tunnel <<EOF
config 6tunnel
        option tnlifname     'he-ipv6'
        option remoteip4        ''
        option localip4         ''
        option localip6         '2001:470:1f06:6f7::2/64'
        option prefix           '2001:470:1f07:6f7::1/64'
EOF
/etc/init.d/6tunnel start


To make use of my new tunnel, I need to advertise my prefix to my LAN. We do this with radvd.

Note that the prefix here that we want to advertise is called the 'routed /64' by tunnelbroker.

cat > /etc/config/radvd <<EOF
config interface
        option interface 'lan'
        option AdvSendAdvert 1
        option AdvManagedFlag 0
        option AdvOtherConfigFlag 0
        option AdvHomeAgentFlag 0
        option ignore 0

config prefix 
        option interface 'lan'
        option prefix '2001:470:1f07:6f7::/64'
        option AdvOnLink 1
        option AdvAutonomous 1
        option AdvRouterAddr 0
        option ignore 0
EOF
/etc/init.d/radvd start


OpenVPN is a pain to install on OpenWRT because it depends on OpenSSL, which is too big.

Fortunately, we can hack around that:

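cat > /bin/myopenvpn <<EOF
#!/bin/sh
# Unpack libopenssl under /tmp, then run openvpn against that copy.
# BASE was undefined in the original; assumed here to be the caller's directory.
BASE=\$PWD
cd /tmp
opkg update
opkg download libopenssl
mkdir -p ssl
tar Ozxf libopenssl* ./data.tar.gz | tar zxC ./ssl
mv ssl/usr/lib/* ssl; rm -rf ssl/usr
cd \$BASE
env LD_LIBRARY_PATH=/tmp/ssl openvpn "\$@"
EOF
chmod a+x /bin/myopenvpn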

Then edit /tmp/opkg-lists/snapshots to remove openvpn's dependency on libopenssl.


Follow CA instructions. 
Make sure you put the right CN in your server cert.


openssl dhparam -out dh1024.pem 1024


ntpclient -h -s
cd /etc/openvpn  # or wherever you put your certs
myopenvpn --mode server --client-to-client --dev tap --user nobody --group nogroup --tls-server --ca ./ca.pem --cert server.pem --key server.pem --dh dh1024.pem --proto tcp-server &
ip link set tap0 up
brctl addif br-lan tap0


openvpn --user nobody --group nobody --dev tap --tls-remote openwrt --tls-client --ca ca.cert --cert ./client.pem --key client.pem --proto tcp-client --remote openwrt &
ip link set tap0 up


(If you've contributed and don't see your name, don't fret -- just add yourself with a word or two explaining your contribution!)

Future Work