Network2: Difference between revisions
{{TOCright}} |
Last updated: [[User:Mstone|Michael Stone]] 04:15, 15 January 2010 (UTC) | '''[[Network2/Paper|paper version]]''' |
|||
= Introduction = |
|||
Sugar's desired realtime collaboration experience can only be provided atop a robust and efficient network stack designed to accommodate automated diagnosis and standardized workarounds -- anything less only wastes students' and teachers' time and patience, contrary to our [[OLPC Human Interface Guidelines/Design Fundamentals/Key Design Principles|human interface guidelines]]. |
|||
Last updated: [[User:Mstone|Michael Stone]] 05:11, 18 July 2009 (UTC) |
|||
This '''unfinished''' essay summarizes an attempt to work out a simple way to realize this sort of network experience, with existing software and hardware, while also demonstrating the sort of thinking which might help other parts of the system achieve the same standard of quality. |
|||
This document proposes a design for networking based on previously realized [[Network Principles]]. It then explores and elaborates the design with analysis, example configuration, and experimental results after which it concludes by crediting those who have contributed to the design and by explaining future work inspired by current results. |
|||
'''Quick links''': '''[[Network2/Paper|the Paper]]''' : (finished/''unfinished'' sections) |
|||
Its purpose is to advance the Network Principles project by explaining how you might build a system based on those principles with currently available tools and by doing a first round of modeling and prototyping in order to gain some analytic and empirical evidence about whether those principles are sound. |
|||
* Prior work: [[networking]], [[collaboration]], [[network principles]] |
|||
Some important quality criteria to consider while reading it include: |
|||
* Background: [[Network2/Purpose|purpose]], [[Network2/Scenarios|scenarios]], [[Network2/Architecture|architecture]] |
|||
* Designs: [[Network2/Design|naming and internetworking]], ''[[Network2/Security|security ideas]]'', [[Network2/Diagnosis|diagnosis techniques]] |
|||
* Analyses: ''[[Network2/Dynamics|cost model]]'', ''[[Network2/Self-test|self-test algorithm]]'' |
|||
* Experiments: [[Network2/Experiments/Dnshash|dnshash]], [[Network2/Experiments/Openvpn|openvpn]], ''[[Network2/Experiments/HE|6to4: HE]]'', ''[[Network2/Experiments/Sixxs|6to4: Sixxs]]'', ''[[Network2/Experiments/Simulation|simulation]]'', ''[[Network2/Experiments/openwrt|openwrt]]'', ''[[Network2/Experiments/tinydns|olpcdyndns1]]'' |
|||
'''Personal goals...''' |
|||
; primum, non nocere |
|||
: People usually think that sufferance of free software is voluntary but this is not so for our users. |
|||
: ''First, do no harm.'' |
|||
# "I want to use familiar tools in my activities -- like Twisted, curl, ssh, rsync, and email -- under a tree, in a walled garden, and out on the public Internet, without modification or wrappers."
|||
; no lock-in |
|||
# "I want a design that has 20% fewer ways to fail, and that offers manual overrides for the failure modes that remain." |
|||
: Does the success of the design depend on any ideas which lack pre-existing interoperating implementations? |
|||
# "I want to chop 2-3 levels from the current collaboration stack's 6-level 'fast-path'." |
|||
: ''What existing software is it presently incompatible with? What does that cost to change?'' |
|||
# "I want to collaborate with people who only have web browsers -- they outnumber people with Jabber clients by millions." |
|||
'''Finally, to help out''', please improve my writing, experiment with my ideas, and share this work with your friends! |
|||
; no ponies |
|||
: How well does the design conform to the physical and social realities which define its niche? |
|||
: ''For example: bandwidth, latency, error, ignorance, interdiction, authority, autonomy...'' |
|||
==Subpages== |
|||
; no required single points of availability failure |
|||
{{Special:PrefixIndex/{{PAGENAMEE}}/}} |
|||
: It might be expensive, but it should be possible to implement this design with no single points of failure. |
|||
: ''How well does the design identify dependencies between components?'' |
|||
When judging, please also note that the design is '''not yet complete''' in several important respects: |
|||
* it has only a stub of a bandwidth model, |
|||
* its self-test algorithm is not yet written (though good diagnostic primitives are systematically identified),
|||
* it lacks truly clear implementation guidance and comprehensive sample code, and |
|||
* there are unresolved questions about |
|||
** how routing and timeouts should be configured so that peers search their target address space in a useful fashion |
|||
** how [[communications security]] might best be provided. |
|||
[[Category:Network2]] |
|||
= Design = |
|||
[[Category:Subsystems]] |
|||
== Network Architecture == |
|||
We imagine our network as organized into three layers: |
|||
# a ''link layer'', usually implemented via 802.3 wired Ethernet, 802.11b/g wifi in either ad-hoc or infrastructure mode, or various sorts of tunneling over IPv4, perhaps across NATs and firewalls, |
|||
# an ''internetworking layer'', based on [http://tools.ietf.org/html/rfc2460 IPv6] ([http://tldp.org/HOWTO/html_single/Linux+IPv6-HOWTO/ tutorial documentation]), and |
|||
# a ''naming'' layer, based on [http://tools.ietf.org/html/rfc1034 DNS], for binding logical addresses from networks with different failure modes to stable human-memorable names. |
|||
The motivation for this breakdown is that we want to offer "seamless" support for each of the following general network scenarios, in which we might have: |
|||
# access to at least one shared-media link. |
|||
# a more efficient link, like an 802.3 switch or an 802.11 access point. |
|||
# a bridge, like an XS or a good access point, between two or more otherwise separate single-link networks. |
|||
# a local router, like an XS, routing between two or more otherwise separate (but potentially complicated) local networks |
|||
# credentials for some sort of dedicated local tunnel endpoint (like a SOCKS proxy or an HTTP proxy) |
|||
# a remote router offering us some sort of access to a larger internetwork, typically via (perhaps restricted) IPv4 |
|||
# credentials for some sort of dedicated remote tunnel endpoint (like an SSL or IPsec VPN or a 6to4 tunnel, etc.) |
|||
# a remote router offering great access to a larger internetwork |
|||
Our architecture is also motivated by the need to default to protocols and configurations that offer reasonable service ''while not harming the reliability'' of the links, networks, tunnels, and internetworks to which we have been granted access. |
|||
== Peer IPv6 Configuration == |
|||
Your job is to be an IPv6 node. Consequently, when you bring up your interfaces, |
|||
# You might [http://tools.ietf.org/html/rfc2461 discover] an IPv6 router [http://tools.ietf.org/html/rfc2463 advertising] on one of your links. |
|||
#* (See <tt>sysctl net.ipv6.conf.all.accept_ra</tt> and related variables.) |
|||
# You might try out [https://fedorahosted.org/dhcpv6/ dhcp6c]. |
|||
# You might have some kind of IPv4 connectivity. If so, [http://www.sixxs.net/faq/connectivity/?faq=comparison connect] to the Internet or to other internetworks of your choice. |
|||
#* ([http://www.remlab.net/miredo/ miredo] and [http://openvpn.net/ openvpn] seem particularly easy to configure and hence to experiment with...) |
|||
# Use [[dnshash]] to add guessable link-local addresses to all your links. |
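To make "guessable" concrete: if a link-local address is derived deterministically from a name, then anyone who knows the name can compute the address without asking a server. The sketch below illustrates that idea only. It is '''not''' dnshash's actual algorithm; the function name <tt>name_to_link_local</tt> and the choice of SHA-256 are assumptions made for illustration.

```shell
# Hypothetical sketch: derive a link-local IPv6 address from a DNS name.
# This is NOT dnshash's real algorithm, only an illustration of the idea.
name_to_link_local() {
    # Hash the name and keep the first 64 bits (16 hex digits) as the interface ID.
    digest=$(printf '%s' "$1" | sha256sum | cut -c1-16)
    # Split the 16 hex digits into four 16-bit groups under the fe80::/64 prefix.
    printf 'fe80::%s:%s:%s:%s\n' \
        "$(printf '%s' "$digest" | cut -c1-4)" \
        "$(printf '%s' "$digest" | cut -c5-8)" \
        "$(printf '%s' "$digest" | cut -c9-12)" \
        "$(printf '%s' "$digest" | cut -c13-16)"
}
```

Two peers that agree on the byte-identical name will compute the same address, which is why the self-test outline later asks whether both sides have the same name written down.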
|||
== Server IPv6 Configuration == |
|||
Your job is to be an IPv6 router and a [http://tools.ietf.org/html/rfc1035 DNS server]. One of several situations might obtain: |
|||
# You might discover an IPv6 router advertising one or more IPv6 prefixes on your outbound link(s). |
|||
# You might have some kind of IPv4 connectivity. If so, [http://www.sixxs.net/faq/connectivity/?faq=comparison connect] to the Internet or to other internetworks of your choice. |
|||
# You might be under a tree. If so, generate a [http://tools.ietf.org/html/rfc4193 Unique Local Address] prefix. |
|||
# (Use [[dnshash]] to add guessable link-local addresses to all your links?) |
|||
When done, use [http://www.litech.org/radvd/ radvd] or [https://fedorahosted.org/dhcpv6/ dhcp6d] to share addresses. |
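For the "under a tree" case, a Unique Local Address prefix is <tt>fd</tt> followed by a 40-bit Global ID, giving a /48. RFC 4193 describes a sample hash-based way to pick the Global ID; the sketch below takes a shortcut and simply reads 40 random bits from <tt>/dev/urandom</tt>. The function name <tt>gen_ula_prefix</tt> is hypothetical.

```shell
# Sketch: generate a random RFC 4193 style ULA /48 prefix.
# Assumption: 40 bits from /dev/urandom stand in for the RFC's sample algorithm.
gen_ula_prefix() {
    # Read 5 random bytes (40 bits) and print them as 10 hex digits.
    global_id=$(od -An -N5 -tx1 /dev/urandom | tr -d ' \n')
    # Lay the digits out as fdXX:XXXX:XXXX::/48.
    printf 'fd%s:%s:%s::/48\n' \
        "$(printf '%s' "$global_id" | cut -c1-2)" \
        "$(printf '%s' "$global_id" | cut -c3-6)" \
        "$(printf '%s' "$global_id" | cut -c7-10)"
}
```

A /64 carved out of the generated /48 could then be placed in the <tt>prefix</tt> stanza of <tt>radvd.conf</tt>.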
|||
== Server DNS Configuration == |
|||
One of the server's most important jobs is to get itself on appropriate internetworks so that it can dynamically map stable (DNS) names to unstable names (IPv6 addresses) for itself and its peers. |
|||
Unfortunately, the most reliable and secure means of updating these mappings is likely to be bespoke -- [http://tools.ietf.org/html/rfc2136 RFC 2136] is not widely implemented and specifies no concrete security protocol while DNSSEC seems immature at present. |
|||
Consequently, I propose the following strawman update protocol -- exchange an RFC-2136 UPDATE packet and response over your favorite authenticated RPC protocol with the nameserver. |
|||
(My favorite protocol for this sort of thing is currently "json-over-SSH-to-python-and-make", but variations (ucspi-ssl, 9p, etc.) make me smile.) |
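As a rough illustration of the "json-over-SSH" variant, a peer might serialize the fields of an update and pipe them to the nameserver over an authenticated SSH session. Everything specific here is an assumption: the JSON field names, the function name <tt>make_update_json</tt>, and the remote account and command in the usage comment are hypothetical, and the JSON object only stands in for the contents of an RFC 2136 UPDATE.

```shell
# Hypothetical message format: the field names below are assumptions.
make_update_json() {
    # $1 = fully qualified name, $2 = new IPv6 address, $3 = TTL in seconds
    printf '{"op": "update", "name": "%s", "type": "AAAA", "ttl": %d, "address": "%s"}\n' \
        "$1" "$3" "$2"
}

# Example use (not run here; host and remote command are placeholders):
#   make_update_json peer.example.org 2001:db8::1 300 | ssh updater@ns.example.org dns-update
```

SSH supplies the authentication that RFC 2136 leaves unspecified; the server side would still have to validate that the authenticated key is allowed to update the given name.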
|||
''(Other possibilities: maybe DNSSEC isn't so hard? Maybe DNSCurve will be usable? See [http://ipcheck.sourceforge.net/ ipcheck] and [http://ddclient.sourceforge.net ddclient] for contemporary work...)'' |
|||
== Peer DNS Configuration == |
|||
Peers which have been registered with one or more servers need to update those servers when their addresses change using the protocol described above. |
|||
(Note: http://tools.ietf.org/html/rfc4339) |
|||
== Security Ideas == |
|||
# Spoofing, Integrity, Confidentiality. See [[communications security]] and [http://passpet.org/ petnames] for some background. A very rough road along which something reasonable ''might'' lie: |
|||
#* Use [http://dev.laptop.org/git/projects/barcode physical introduction] to CNAME <tt>cscott.michael.laptop.org</tt> to <tt>''<key>''.cscott.laptop.org</tt>. |
|||
#* Then, my [http://dnscurve.org dnscurve]-compatible DNS resolver will refuse to give me addresses unless the nameserver I contact for cscott proves knowledge of cscott's private key. |
|||
#* Then I have a nice basis with which to configure IPsec security associations. |
|||
# System Integrity |
|||
# DoS |
|||
= Analysis = |
|||
== Bandwidth Usage == |
|||
Several important numbers that we need to predict and to measure: |
|||
tx == transmit, rx == receive, btx == broadcast |
|||
btx/tx/rx - ICMPv6+IPv6+phys - router discovery (RD) |
|||
btx/rx - ICMPv6+IPv6+phys - duplicate address detection (DAD) |
|||
tx/rx - ICMPv6+IPv6+phys - NS neighbor discovery (ND) |
|||
tx/rx - UDP+IPv6+phys - DNS query |
|||
tx/rx - JSON+SSH+TCP+IPv6+phys - DNS update |
|||
where "phys" describes the equations' dependence on the "physical" layer's |
|||
frame overhead and MTU |
|||
notable "phys" layers: |
|||
Ethernet -- ad-hoc wifi, infra wifi, 802.11s mesh, switch, hub |
|||
TLS+UDP+IPv4 -- openvpn |
|||
L2TP+IPsec+IPv4 -- racoon, isakmpd, openswan, etc.
|||
UDP+IPv4 -- teredo |
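The per-packet part of such a model is plain addition: "phys" framing, plus the fixed 40-byte IPv6 header, plus whatever rides above IPv6. The sketch below works one case, a DNS query over wired Ethernet. The header sizes are standard; the function names and the 50-byte DNS message size are assumptions for illustration, and fragmentation at the MTU is ignored.

```shell
# Bytes on the wire for one packet: link framing + IPv6 header + upper-layer bytes.
IPV6_HDR=40     # fixed IPv6 header
UDP_HDR=8       # UDP header
ETH_FRAMING=18  # Ethernet II header (14) + frame check sequence (4)

wire_bytes() {
    # $1 = per-frame "phys" overhead, $2 = bytes carried above IPv6
    echo $(( $1 + IPV6_HDR + $2 ))
}

dns_query_bytes() {
    # $1 = per-frame "phys" overhead, $2 = DNS message size in bytes
    wire_bytes "$1" $(( UDP_HDR + $2 ))
}

# A 50-byte query over Ethernet: 18 + 40 + 8 + 50 = 116 bytes.
dns_query_bytes "$ETH_FRAMING" 50
```

Swapping in a different first argument models the tunnelled "phys" layers listed above, where the overhead is the whole encapsulating stack rather than a single frame header.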
|||
== Debugging Techniques == |
|||
Start recording a typescript so that we can see what you did. |
|||
TESTDIR=`pwd`/testing |
|||
mkdir -p "$TESTDIR" && cd "$TESTDIR"
|||
script |
|||
ulimit -c unlimited |
|||
Check that you've got the right DNS name for the person you want to talk to. |
|||
NAME=the.right.person |
|||
echo $NAME > peer |
|||
Dump your addresses, routes, and perhaps your open connections. |
|||
hostname --fqdn | tee host |
|||
ip addr show | tee addrs |
|||
ip route show | tee ipv4_routes |
|||
ip -6 route show | tee ipv6_routes |
|||
netstat -anp | tee conns |
|||
If you have wireless devices, |
|||
iwconfig | tee iwconfig |
|||
iwlist scan | tee iwlist_scan |
|||
Fire up tcpdump: |
|||
tcpdump -w packets -s0 & |
|||
Resolve that name to addresses. Check that the addresses seem sane. |
|||
dnshash lookup $NAME | tee peer_addrs_dnshash |
|||
dig $NAME | tee peer_addrs_dig |
|||
See who's answering broadcasts: |
|||
ping6 -I $IFACE ff02::1 |
|||
Route to the addresses: |
|||
ping6 -I $IFACE $ADDR | tee ping |
|||
traceroute6 $ADDR | tee traceroute |
|||
tracepath6 $ADDR | tee tracepath |
|||
Connect to the address: |
|||
nc6 $ADDR $PORT |
|||
# echo "SSH-2.0-Hi" | nc6 $ADDR 22 |
|||
# printf "GET / HTTP/1.0\r\n\r\n" | nc6 $ADDR 80 |
|||
# ssh $ADDR |
|||
# curl -I http://$ADDR/ |
|||
# ... |
|||
Conduct a bandwidth test: |
|||
iperf -V -c $ADDR
|||
Collect logs from your application and send them to developers: |
|||
kill -SIGINT %1 |
|||
cd .. |
|||
tar c $TESTDIR | lzma -c > logs.tar.lzma |
|||
== Self-Test Algorithm == |
|||
In order for things to "just work", there are many subgoals that need to be satisfied. The purpose of the self-test algorithm is to speed up debugging by quickly and reliably identifying subgoals whose named requirements are satisfied but whose characteristic test fails. |
|||
The form of the self-test algorithm will be a decision-list which may, in the future, be incorporated into software. |
|||
A rough outline of that decision list is: |
|||
Do we have all the network interfaces that we should? |
|||
Is each interface attached to a link? |
|||
Does each interface have a link-local address? |
|||
Is every interface able to ping itself? |
|||
Does link-layer broadcast return responses? |
|||
Does network-layer broadcast return responses? |
|||
# assuming that we have a partner on the same link |
|||
Can we ping our partner? |
|||
Can we hear our partner pinging us? |
|||
Does there seem to be reasonable bandwidth on our link? |
|||
# assuming we have a link-local partner with a name |
|||
Do we and our partner have byte-identical names written down? |
|||
Can we both resolve the name to a link-local address? |
|||
Do we get the same address? |
|||
Can we both ping the address? |
|||
Can I connect to a service running at the address (e.g. ssh)?
|||
# assuming that we have a router |
|||
Can we ping our router? |
|||
Can we traceroute someone upstream of the router? |
|||
... |
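One possible shape for such a decision list in software is an ordered sequence of named checks that stops at the first failure, so that the report names the earliest unmet subgoal. The sketch below is only that skeleton; the function name <tt>run_decision_list</tt> and the <tt>description=command</tt> argument format are assumptions, and the real checks are left to the caller.

```shell
# Run named checks in order; stop at the first failure and report it.
# Each argument has the form "description=command" (no "=" in the description).
run_decision_list() {
    for entry in "$@"; do
        desc=${entry%%=*}   # text before the first "="
        cmd=${entry#*=}     # text after the first "="
        if sh -c "$cmd" >/dev/null 2>&1; then
            echo "PASS: $desc"
        else
            echo "FAIL: $desc"
            return 1
        fi
    done
    echo "all checks passed"
}

# Example use (not run here; eth0 is a placeholder interface name):
#   run_decision_list \
#     "interface has a link-local address=ip -6 addr show dev eth0 scope link | grep -q fe80" \
#     "link answers all-nodes multicast=ping6 -c 2 -I eth0 ff02::1"
```

Ordering the checks from the link layer upward means a failure report also implies that everything beneath it passed.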
|||
== Advice for Coders == |
|||
There are two critical changes that you'll need to make to your design in order to really make it sing. |
|||
First, you'll want to add some mechanism for your users to type in hostnames that they want you to connect to. This lets them do all sorts of cool stuff like: |
|||
* copy-and-paste links from websites or Cerebro,
* type in names from a physical display like a blackboard or a handout.
|||
Second, you'll want to be prepared to re-resolve names in order to get fresh addresses each time your connectivity changes. For the time being, you should do this by calling libc's <tt>[http://linux.die.net/man/3/getaddrinfo getaddrinfo()]</tt> function. |
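From the shell, the same habit can be sketched with <tt>getent ahosts</tt>, which resolves through <tt>getaddrinfo()</tt>: resolve the name immediately before every connection attempt instead of caching an address. The function names <tt>resolve_addrs</tt> and <tt>connect_fresh</tt> are hypothetical, and the default connector <tt>nc6</tt> is just the tool used elsewhere on this page.

```shell
# Re-resolve the name before every attempt so that each attempt sees fresh addresses.
resolve_addrs() {
    # getent ahosts resolves through getaddrinfo(); print each address once,
    # keeping the resolver's preference order.
    getent ahosts "$1" | awk '!seen[$1]++ { print $1 }'
}

connect_fresh() {
    # $1 = hostname, $2 = port, $3 = command that tries one address (default: nc6)
    name=$1; port=$2; try=${3:-nc6}
    for addr in $(resolve_addrs "$name"); do
        if "$try" "$addr" "$port"; then
            return 0
        fi
    done
    return 1
}
```

Calling <tt>connect_fresh</tt> again after a connectivity change picks up whatever addresses the name maps to at that moment.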
|||
Finally, as a further suggestion, go check out [http://tools.ietf.org/html/rfc4960 SCTP] ([http://en.wikipedia.org/wiki/Stream_Control_Transmission_Protocol wikipedia], [http://linux.die.net/man/7/sctp man page]). Its support for multi-homing, for multi-streaming with and without ordering guarantees, and for updating the addresses you're using to talk to your peer on the fly seems particularly serendipitous.
|||
== Advice for Deployers == |
|||
Ask your ISPs to provide IPv6 prefixes or tunnel endpoints. After all -- if none of their customers ask, then what incentive will they ever have to upgrade? |
|||
Failing that, see if you (or a local university?) can afford a public IPv4 address -- even if it's dynamic. If so, you can be many sorts of tunnel endpoint. |
|||
Regardless, if you manage to get a globally reachable IPv6 address by any means, then you can provide a DNS server for your kids and it can direct them to one another and to any other services that you feel like pointing them at. |
|||
= Experiments = |
|||
== Link-local configuration == |
|||
Try out [[dnshash]] on an isolated access point, ad-hoc network, switch, or hub. |
|||
Observations: very pleasant! |
|||
== VPN server configuration == |
|||
In this experiment, we're going to configure openvpn and radvd on a machine (teach.laptop.org) with a public IPv4 address. Truthfully, this combination is probably overkill, but the task of constructing it seemed like it might offer valuable experience, e.g. for someone who wants to bridge multiple kinds of tunnel endpoint or who wants to load-balance lots of peers between a couple of endpoints.
|||
# Install our VPN and route advertisement software. |
|||
apt-get install openvpn radvd |
|||
# yum -y install openvpn radvd |
|||
# add nobody:nobody |
|||
groupadd nobody |
|||
useradd nobody |
|||
usermod -a -G nobody nobody |
|||
# Configure radvd |
|||
cat > /etc/radvd.conf <<EOF
interface tap0
{
AdvSendAdvert on;
MinRtrAdvInterval 30;
MaxRtrAdvInterval 100;
prefix 1234:db8:1:0::/64
{
AdvOnLink on;
};
};
EOF
|||
# enable forwarding everywhere |
|||
sysctl -w net.ipv6.conf.all.forwarding=1 |
|||
# flush the forwarding table |
|||
ip6tables -F FORWARD |
|||
# really, I /want/ a multi-user version of |
|||
# openvpn --dev tap --user nobody --group nobody --verb 6 |
|||
# but I'm not sure how to get that. instead, I'll use some fake keys and no ciphers. |
|||
mkdir -p keys && cd keys
|||
wget http://teach.laptop.org/~mstone/sample-keys.tar.bz2 |
|||
tar xf sample-keys.tar.bz2 && cd sample-keys |
|||
# create a multi-user tunnel |
|||
openvpn --mode server --client-to-client --dev tap --user nobody --group nobody --verb 6 --opt-verify --tls-server --client-connect /bin/true --auth-user-pass-optional --duplicate-cn --auth-user-pass-verify /bin/true via-env --dh ./dh1024.pem --ca ./ca.crt --cert client.crt --key client.key --script-security 3 --auth none --cipher none & |
|||
# at any rate, bring up the interface so that we get link-local addresses |
|||
ip link set tap0 up |
|||
# turn on the route advertisement daemon |
|||
radvd -d 5 -m stderr & |
|||
== VPN client configuration == |
|||
The purpose of this experiment was to test the VPN configuration described immediately above. |
|||
# install vpn client |
|||
apt-get install openvpn |
|||
# yum -y install openvpn |
|||
# add nobody:nobody |
|||
groupadd nobody |
|||
useradd nobody |
|||
usermod -a -G nobody nobody |
|||
# download fake keys. |
|||
mkdir -p keys && cd keys
|||
wget http://teach.laptop.org/~mstone/sample-keys.tar.bz2 |
|||
tar xf sample-keys.tar.bz2 && cd sample-keys |
|||
# connect to the vpn |
|||
openvpn --user nobody --group nobody --dev tap --remote teach.laptop.org --tls-client --ca ca.crt --cert ./client.crt --key client.key --auth none --cipher none & |
|||
# bring up the interface |
|||
ip link set tap0 up |
|||
# find other people |
|||
ping6 -I tap0 ff02::1 |
|||
# if using dnshash, attach |
|||
dnshash attach <your>.<domain>.<name> |
|||
# ... test, as described above ... |
|||
Observations: |
|||
* TLS imposes a high latency cost, even with null algorithms. |
|||
* TAP devices work rather nicely, at least for tiny networks. |
|||
* Be careful of firewall rules! |
|||
* radvd is ''perhaps'' unnecessary with a single virtual ethernet -- dnshash "suffices" -- though it might be useful for routing between several load-balanced ethernets. |
|||
* The default IP sorting rules and route priorities mean that it may take a long time for a connecting app like ssh or nc6 to connect to the /correct/ dnshash address. |
|||
= Credits = |
|||
''(If you've contributed and don't see your name, don't fret -- just add yourself with a word or two explaining your contribution!)'' |
|||
* {{credit|[[Profiles/mstone|Michael Stone]]|none|writing}} |
|||
* {{credit|[[Profiles/cscott|C. Scott Ananian]]|OLPC|architecture,teaching}} |
|||
* {{credit|[[Profiles/wad|John Watlington]]|OLPC|architecture}} |
|||
* {{credit|[[Profiles/robot101|Robert McQueen]]|Collabora|prior work,critique}} |
|||
* {{credit|[[Profiles/daf|Dafydd Harries]]|Collabora|prior work,critique}} |
|||
* {{credit|[[Profiles/ypod|Polychronis Ypodimatopoulos]]|MIT|prior work,critique}}
|||
* {{credit|[[Profiles/csetlow|Cortland Setlow]]|Tower Research Capital|testing}} |
|||
* {{credit|[[Profiles/aa|Andres Ambrois]]||design,testing}} |
|||
* {{credit|[[Profiles/bemasc|Benjamin Schwartz]]|Harvard|critique,publicity}} |
|||
* {{credit|[[Profiles/tabitha|Tabitha Roder]]||testing}} |
|||
= Future Work = |
|||
* Per-host networks and per-app IPs and names. |
|||
* Sample code. |
|||
* Designs for [[User:Mstone/Higher protocols|higher protocols]] like discovery, presence, and health. |
Subpages
- Network2/Advice
- Network2/Architecture
- Network2/Audience
- Network2/Concept/Address
- Network2/Concept/Bandwidth
- Network2/Concept/Bridge
- Network2/Concept/Capacity
- Network2/Concept/Interface
- Network2/Concept/Internetwork
- Network2/Concept/Jitter
- Network2/Concept/Latency
- Network2/Concept/Layer
- Network2/Concept/Link
- Network2/Concept/Medium
- Network2/Concept/Name
- Network2/Concept/Network
- Network2/Concept/Protocol
- Network2/Concept/Router
- Network2/Concept/Scenario
- Network2/Concept/Tunnel
- Network2/Credits
- Network2/Design
- Network2/Diagnosis
- Network2/Dynamics
- Network2/Experiments
- Network2/Experiments/Dnshash
- Network2/Experiments/HE
- Network2/Experiments/OpenWRT
- Network2/Experiments/Openvpn
- Network2/Experiments/Simulation
- Network2/Experiments/Sixxs
- Network2/Experiments/tinydns
- Network2/Future work
- Network2/Paper
- Network2/Purpose
- Network2/Scenarios
- Network2/Security
- Network2/Self-test