Mesh Network Details: Difference between revisions

(Replaced FAQ with Mesh Technical Overview. Moved FAQ.)
Line 1: Line 1:
Details of the [[Networking|mesh networking]] provided by the XO laptop are described here.
Details of the [[Networking|mesh networking]] provided by the XO laptop are described here. See also the [[Mesh_Network_FAQ|Mesh Network FAQ]].


Right now this has the form of an FAQ. Hopefully it will later be rewritten into a more coherent description.


=== Background ===
===How many servers per school?===


The Mesh Routing Protocol used in the OLPC laptop (OLPC-Mesh) is based on the 802.11s standard being developed by the 802.11 Task Group S [http://www.ieee802.org/11/Reports/tgs_update.htm].
How will the [[School server|school servers]] in one school be connected, and how many users each are they expected to support?


OLPC-Mesh was based on the first draft produced by TGs, version 0.1. At the time of this writing, TGs is working on version 1.0 of the draft.
:The [[XS_Server_Specification#Scalability|ratio of students to school servers]] is currently planned to be no more than 100:1. The interconnection between servers will be best possible: Cat3/5 or powerline is strongly recommended, with a fallback to wireless if necessary.


=== Design goals ===
:Each server will support at least two WiFi access points (the Marvell modules), with up to five or six access points possible. The optimum deployment scenario probably provides two or three meshes per school (on channels 11, 1, and 6). Each server provides access points on two or more meshes (i.e. there are multiple servers/access points on each mesh.)


These were some of the design requirements/constraints for the project:
===How is the mesh channel for a laptop to join chosen?===
:You have to spend some time on every channel and estimate how heavy the traffic is from the RREQ and RREP packets.


* Simultaneously act as a Mesh Point as well as an infrastructure node.
::This is tricky, as these packets are not sent to the host. One way to determine this indirectly is to examine the forwarding table: when RREQs are received, reverse routes are created. Alternatively, one could monitor the forwarding statistics of the mesh interface (ethtool -S msh0). These are all "passive" detection methods: will not work if all the hosts in the mesh are silent.
* Capable of acting as a standalone mesh node when main CPU is off.
* Support asymmetric links/paths.
* Support for power save mode.
* Small enough to be running on Marvell's 88W8388 802.11 wireless module.
* Incremental releases. All releases must be usable and include more functionality than the previous one.
* Follow 802.11s draft when possible.


=== Mesh Path Selection and Forwarding ===
::Yet another alternative would be to run daemons attached to the mesh interface, such as [http://www.cozybit.com/projects/lsmesh http://www.cozybit.com/projects/lsmesh]


The path selection mechanism is based on a simplified version of the Hybrid Wireless Mesh Protocol (HWMP) proposed in the 802.11s draft. HWMP combines on-demand route discovery with support for proactive routing.
::The mesh channel selection algorithm needs to be debated. A proposal is below--[[User:Wad|Wad]]


Proactive routing requires the formation of a tree topology under a root node. OLPC-Mesh does not support proactive routing at this time.
====Mesh channel selection algorithm====


On-demand path discovery is largely based on Ad hoc On-Demand Distance Vector (AODV) routing.
The device will scan for activity on all three channels being proposed for OLPC meshes (1, 6, and 11) before making any decision.


Paths are built using route request / route reply management frames. When a source node needs to transmit a frame to a destination for which no path exists, a route request (RREQ) is broadcast through the mesh. As these requests are propagated, nodes receiving them create routes to the source node in their routing tables. These routes are termed '''reverse routes''' and are only used to forward mesh management frames. When a node receives an RREQ destined to itself, it responds with a unicast route reply (RREP), which is sent back to the source via the reverse routes. The intermediate nodes that forward RREPs back to the source create routes to the destination node. These routes are termed '''forward routes''', and are the routes used to forward data frames.
# If there is only activity on a single channel, it will be selected.
# If there is activity on multiple channels, the active channel with the least activity will be selected.
# If all channels show the same activity, then channel 11 will be selected.


=== Route Teardown and Recovery ===
This algorithm should be modified to take the number of hops to a gateway on a particular mesh into account.


If a frame cannot be transmitted to the next hop (i.e. when the maximum number of retries is exceeded), the route that was used for the frame is marked as invalid. If the failed route has predecessors, route error (RERR) management frames are transmitted to the source of the route. This improves the route recovery time after a mesh node leaves the coverage area of a neighbor.
===How can we determine whether a channel has other active mesh users?===


=== Limited Broadcast ===
:There are no beacons currently (although they are in the implementation plan) so you really have to listen for mesh traffic.


The RREQ/RREP mechanism only works for unicast traffic. Broadcast traffic is propagated through the mesh through limited flooding. Each mesh data frame contains a unique end-to-end sequence number that is set at the source. Intermediate nodes maintain a list of '''recently broadcast''' frames indexed by this sequence number and the address of the source. This table ensures that broadcast frames are retransmitted only once.
===How many radios can one mesh channel support?===


=== Limitations ===
How will a laptop decide to join another mesh channel if the current one gets too busy?


Under HWMP, a Mesh Point (MP) uses active or passive scanning to discover neighbors and establish peer links. OLPC-Mesh does not use this mechanism. Neighbors are discovered only via the RREQ/RREP cycle, and no neighbor authentication is performed. This means that the mesh is not protected against route disruption or node isolation attacks.
:Is there an answer ?


As we are using hardware that was designed prior to the 802.11s draft, we cannot use the new mesh frame type, identified by type = 0x3 in the Frame Control field. Instead we are using WDS frames extended with mesh specific header fields.
===Will physically adjacent laptops be on the same mesh ?===


There is no multicast support at this time. It is planned for the coming weeks.
Is it possible that two children sitting next to one another are on different channels and therefore cannot "see" each other on the net?
:One of the roles of the [[School server]] will be to bridge between mesh clouds running on different channels. How do you decide what mesh (channel) to join was a previous question.


===What about WiFi encryption ?===
=== Userspace Controls ===


There are several system calls available to examine and modify the behavior of OLPC-Mesh. These calls are implemented as ioctls, and can be invoked via iwpriv commands.
Does the mesh part of the firmware use the same encryption settings as the b interface? Do we care about making a 'private mesh' with WPA-PSK or WEP or something like that?


The first of such tools are the <code>iwpriv fwt_*</code> family of commands. With these commands one can examine and modify the routing table. See the README file in the libertas driver directory for details.
:Yes, the mesh uses the same encryption.


Another useful feature for debugging and testing is the blinding table. Incoming traffic from any address that exists in the blinding table will be silently discarded by the firmware. This is useful to test specific mesh topologies that would otherwise be hard to set up. The blinding table can be accessed using <code>iwpriv bt_{add,del,reset,list}</code>.
:Down the road, we care, and will probably use WPA-PSK.--Mihalis


There is also one ioctl call that will change the maximum TTL of outgoing mesh traffic. The TTL determines the maximum number of hops that a frame will cross before being dropped. This is used to minimize the consequences of routing loops but it also limits the number of neighbors that can be reached in the mesh. The mesh TTL can be modified via <code>iwpriv mesh_{get,set}_ttl</code>.
::I will argue that link layer encryption is the wrong place to protect secrets. If an application handles private or sensitive data, it should apply encryption at that time (e.g. HTTPS). My concern is the management overhead of the authentication server for WPA-PSK. The ability of devices other than XO laptops to join the school network will be supported. --[[User:Wad|Wad]] 00:21, 22 February 2007 (EST)


Finally, there are mesh-specific statistics available through <code>ethtool -S</code>.
===How are school servers (network gateways) discovered?===
Currently the following counters are implemented:


drop_duplicate_bcast
Will servers send out some sort of announcements to allow the laptops to find them automatically, or must we cache a DNS name or IP address for the server?
drop_ttl_zero
drop_no_fwd_route
:They will act as gateways and respond to RREQs for a reserved anycast address.--Mihalis
drop_no_buffers
fwded_unicast_cnt
fwded_bcast_cnt
drop_blind_table
tx_failed_cnt


=== Mesh Portals ===
:This is two different questions. At the networking level, the laptop is looking for a default gateway. This should be supplied either by the anycast mechanism or by DHCP (discussion?). At the server level, it is still unclear how a laptop will be associated with the school server that contains a student's journal and backups.--[[User:Wad|Wad]] 00:21, 22 February 2007 (EST)

Up to now we have described the operation of Mesh Points. Mesh Points that are connected to an external network, and that forward traffic in and out of the mesh are referred to as Mesh Portals (MPP).

Mesh Points must find paths to a Mesh Portal in order to access the Internet. When multiple Mesh Portals exist in the mesh, the Mesh Point must select one of them. The way OLPC Mesh resolves this problem is by defining a layer 2 anycast address that will be claimed by all the MPPs in the mesh. When a Mesh Point needs to find an MPP, a RREQ is sent for that special anycast address. Each Mesh Portal receiving the RREQ will generate a RREP. The path selection method will assign higher precedence to those MPPs that can be reached through lower cost routes.

Mesh Portals must listen for configuration requests sent by Mesh Points. In reply to these requests, Mesh Portals will send to the Mesh Points all the information required to access outside the mesh network. At this time this configuration information is comprised of the IP address of the selected Mesh Portal and the addresses of DNS servers. More information about this configuration daemon can be found here: http://www.cozybit.com/projects/mpp-utils

=== Mesh Interface ===
The wireless driver creates a virtual network interface just for mesh traffic (msh0). The main interface (eth0) is used for infrastructure traffic when the laptop is associated to an AP. Traffic forwarding in and out of the mesh is done at layer 3 via Network Address Translation (NAT) at the host. This gives the flexibility to use any other network connection to connect the mesh to the world (e.g. ppp, GPRS, etc.).

=== Footnotes ===
[1] Note that although the frame is discarded, it will still be acknowledged by the MAC layer.

--[[User:Jcardona|Jcardona]] 14:03, 23 February 2007 (EST)

Revision as of 19:03, 23 February 2007

Details of the mesh networking provided by the XO laptop are described here. See also the Mesh Network FAQ.


Background

The Mesh Routing Protocol used in the OLPC laptop (OLPC-Mesh) is based on the 802.11s standard being developed by the 802.11 Task Group S (TGs) [http://www.ieee802.org/11/Reports/tgs_update.htm].

OLPC-Mesh was based on the first draft produced by TGs, version 0.1. At the time of this writing, TGs is working on version 1.0 of the draft.

Design goals

These were some of the design requirements/constraints for the project:

  • Simultaneously act as a Mesh Point as well as an infrastructure node.
  • Capable of acting as a standalone mesh node when main CPU is off.
  • Support asymmetric links/paths.
  • Support for power save mode.
  • Small enough to run on Marvell's 88W8388 802.11 wireless module.
  • Incremental releases. All releases must be usable and include more functionality than the previous one.
  • Follow 802.11s draft when possible.

Mesh Path Selection and Forwarding

The path selection mechanism is based on a simplified version of the Hybrid Wireless Mesh Protocol (HWMP) proposed in the 802.11s draft. HWMP combines on-demand route discovery with support for proactive routing.

Proactive routing requires the formation of a tree topology under a root node. OLPC-Mesh does not support proactive routing at this time.

On-demand path discovery is largely based on Ad hoc On-Demand Distance Vector (AODV) routing.

Paths are built using route request / route reply management frames. When a source node needs to transmit a frame to a destination for which no path exists, a route request (RREQ) is broadcast through the mesh. As these requests are propagated, nodes receiving them create routes to the source node in their routing tables. These routes are termed reverse routes and are only used to forward mesh management frames. When a node receives an RREQ destined to itself, it responds with a unicast route reply (RREP), which is sent back to the source via the reverse routes. The intermediate nodes that forward RREPs back to the source create routes to the destination node. These routes are termed forward routes, and are the routes used to forward data frames.
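
The discovery cycle described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the firmware's implementation: links are treated as symmetric, the RREQ flood is modeled as a breadth-first search (so the first copy to arrive wins), and the function name discover_route and the topology are hypothetical.

```python
from collections import deque

def discover_route(links, source, destination):
    """Flood an RREQ from source, then unicast an RREP back from destination.

    links maps each node to the set of neighbors it can hear.
    Returns (reverse, forward): per-node next-hop tables.
    """
    # reverse[node] = next hop toward the source, learned from the RREQ flood
    reverse = {}
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(links[node]):
            if neighbor in seen:
                continue  # a node handles a given RREQ only once
            seen.add(neighbor)
            reverse[neighbor] = node
            queue.append(neighbor)

    # forward[node] = next hop toward the destination, learned from the RREP
    forward = {}
    if destination not in reverse:
        return reverse, forward  # destination never saw the RREQ: no RREP
    node = destination
    while node != source:
        previous_hop = reverse[node]   # the RREP follows the reverse route
        forward[previous_hop] = node   # its receiver learns a forward route
        node = previous_hop
    return reverse, forward

# Hypothetical four-node chain: A - B - C - D
links = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
reverse, forward = discover_route(links, "A", "D")
print("reverse routes (toward A):", reverse)
print("forward routes (toward D):", forward)
```

After the exchange, data frames from A to D follow the forward table hop by hop, while the reverse table was only needed to carry the RREP back.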

Route Teardown and Recovery

If a frame cannot be transmitted to the next hop (i.e. when the maximum number of retries is exceeded), the route that was used for the frame is marked as invalid. If the failed route has predecessors, route error (RERR) management frames are transmitted to the source of the route. This improves the route recovery time after a mesh node leaves the coverage area of a neighbor.
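
A rough sketch of this teardown logic follows. The Route record, its predecessors field, and handle_tx_failure are hypothetical names chosen for illustration; how the firmware actually tracks predecessors is not described here.

```python
from dataclasses import dataclass, field

@dataclass
class Route:
    next_hop: str
    valid: bool = True
    # Assumed bookkeeping: sources whose frames were forwarded over this route.
    predecessors: set = field(default_factory=set)

def handle_tx_failure(table, destination):
    """Invalidate the route to destination after retries are exhausted.

    Returns the sources that should be sent an RERR, in sorted order.
    """
    route = table.get(destination)
    if route is None or not route.valid:
        return []  # nothing to tear down
    route.valid = False
    return sorted(route.predecessors)

# Node C forwards traffic from A and B toward D via next hop "D".
table = {"D": Route(next_hop="D", predecessors={"A", "B"})}
rerr_targets = handle_tx_failure(table, "D")
print("send RERR to:", rerr_targets)
```

A source that receives the RERR can start a new RREQ cycle immediately instead of waiting for its own transmissions to time out.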

Limited Broadcast

The RREQ/RREP mechanism only works for unicast traffic. Broadcast traffic is propagated through the mesh through limited flooding. Each mesh data frame contains a unique end-to-end sequence number that is set at the source. Intermediate nodes maintain a list of recently broadcast frames indexed by this sequence number and the address of the source. This table ensures that broadcast frames are retransmitted only once.
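
The duplicate-suppression table can be sketched as a bounded cache keyed by source address and sequence number. The class name BroadcastFilter, the capacity, and the oldest-first eviction policy are assumptions made for this example; the firmware's actual table size and aging rules are not stated here.

```python
from collections import OrderedDict

class BroadcastFilter:
    """Remember recently seen broadcasts so each is retransmitted only once."""

    def __init__(self, capacity=64):
        self.capacity = capacity      # assumed bound on the table size
        self.recent = OrderedDict()   # (source, seq) -> True, oldest first

    def should_forward(self, source, seq):
        key = (source, seq)
        if key in self.recent:
            return False              # duplicate: drop instead of re-flooding
        self.recent[key] = True
        if len(self.recent) > self.capacity:
            self.recent.popitem(last=False)  # forget the oldest entry
        return True

bcast_filter = BroadcastFilter(capacity=2)
decisions = [
    bcast_filter.should_forward("00:17:c4:00:00:01", 7),  # first copy
    bcast_filter.should_forward("00:17:c4:00:00:01", 7),  # copy via another neighbor
    bcast_filter.should_forward("00:17:c4:00:00:02", 7),  # same seq, other source
]
print(decisions)
```

Indexing by both fields matters: two different sources may legitimately use the same sequence number.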

Limitations

Under HWMP, a Mesh Point (MP) uses active or passive scanning to discover neighbors and establish peer links. OLPC-Mesh does not use this mechanism. Neighbors are discovered only via the RREQ/RREP cycle, and no neighbor authentication is performed. This means that the mesh is not protected against route disruption or node isolation attacks.

As we are using hardware that was designed prior to the 802.11s draft, we cannot use the new mesh frame type, identified by type = 0x3 in the Frame Control field. Instead we are using WDS frames extended with mesh specific header fields.

There is no multicast support at this time. It is planned for the coming weeks.

Userspace Controls

There are several system calls available to examine and modify the behavior of OLPC-Mesh. These calls are implemented as ioctls, and can be invoked via iwpriv commands.

The first of such tools are the iwpriv fwt_* family of commands. With these commands one can examine and modify the routing table. See the README file in the libertas driver directory for details.

Another useful feature for debugging and testing is the blinding table. Incoming traffic from any address that exists in the blinding table will be silently discarded by the firmware. This is useful to test specific mesh topologies that would otherwise be hard to set up. The blinding table can be accessed using iwpriv bt_{add,del,reset,list}.

There is also one ioctl call that will change the maximum TTL of outgoing mesh traffic. The TTL determines the maximum number of hops that a frame will cross before being dropped. This is used to minimize the consequences of routing loops but it also limits the number of neighbors that can be reached in the mesh. The mesh TTL can be modified via iwpriv mesh_{get,set}_ttl.
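
To see how the TTL bounds reach, the sketch below counts the nodes within a given number of hops of a source. It assumes the TTL maps directly to hop distance and uses a hypothetical topology; the exact point at which the firmware decrements and checks the field is not modeled.

```python
from collections import deque

def reachable_within_ttl(links, source, ttl):
    """Return the set of nodes whose hop distance from source is at most ttl."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if hops[node] == ttl:
            continue  # frames are not forwarded beyond the hop limit
        for neighbor in links[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return set(hops) - {source}

# Hypothetical five-node chain: A - B - C - D - E
chain = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
for ttl in (1, 2, 4):
    print(ttl, sorted(reachable_within_ttl(chain, "A", ttl)))
```

A small TTL limits the damage a routing loop can do, at the price of making distant nodes unreachable.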

Finally, there are mesh-specific statistics available through ethtool -S. Currently the following counters are implemented:

 drop_duplicate_bcast
 drop_ttl_zero
 drop_no_fwd_route
 drop_no_buffers
 fwded_unicast_cnt
 fwded_bcast_cnt
 drop_blind_table
 tx_failed_cnt
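
For scripted monitoring, these counters can be read by parsing the text that ethtool -S prints as "name: value" lines. The following is a minimal sketch: the sample values are invented for illustration, and the helper names are hypothetical.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines from ethtool -S output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():   # skips the 'NIC statistics:' header
            stats[name.strip()] = int(value)
    return stats

def total_drops(stats):
    """Sum every counter whose name starts with 'drop_'."""
    return sum(v for k, v in stats.items() if k.startswith("drop_"))

# Invented sample; on a laptop the text could come from something like
# subprocess.run(["ethtool", "-S", "msh0"], capture_output=True, text=True).stdout
SAMPLE = """NIC statistics:
     drop_duplicate_bcast: 12
     drop_ttl_zero: 0
     drop_no_fwd_route: 3
     drop_no_buffers: 0
     fwded_unicast_cnt: 250
     fwded_bcast_cnt: 40
     drop_blind_table: 5
     tx_failed_cnt: 1
"""

stats = parse_ethtool_stats(SAMPLE)
print(stats["fwded_unicast_cnt"], total_drops(stats))
```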

Mesh Portals

Up to now we have described the operation of Mesh Points. Mesh Points that are connected to an external network and that forward traffic in and out of the mesh are referred to as Mesh Portals (MPPs).

Mesh Points must find paths to a Mesh Portal in order to access the Internet. When multiple Mesh Portals exist in the mesh, the Mesh Point must select one of them. The way OLPC Mesh resolves this problem is by defining a layer 2 anycast address that will be claimed by all the MPPs in the mesh. When a Mesh Point needs to find an MPP, a RREQ is sent for that special anycast address. Each Mesh Portal receiving the RREQ will generate a RREP. The path selection method will assign higher precedence to those MPPs that can be reached through lower cost routes.
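
The portal selection step can be sketched as picking the lowest-cost reply among the RREPs received for the anycast address. The cost values, the addresses, and the function name select_portal are hypothetical; the actual metric used by the path selection method is not specified here.

```python
def select_portal(replies):
    """Pick the portal with the lowest path cost.

    replies is a list of (portal_address, path_cost) tuples, one per RREP
    received for the anycast address. Returns None if no portal answered.
    Ties go to the reply that arrived first.
    """
    if not replies:
        return None
    address, _cost = min(replies, key=lambda reply: reply[1])
    return address

# Hypothetical RREPs from three portals claiming the anycast address.
replies = [
    ("00:17:c4:00:00:0a", 5),
    ("00:17:c4:00:00:0b", 2),
    ("00:17:c4:00:00:0c", 2),
]
print(select_portal(replies))
```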

Mesh Portals must listen for configuration requests sent by Mesh Points. In reply to these requests, Mesh Portals send the Mesh Points all the information required to reach networks outside the mesh. At this time this configuration information consists of the IP address of the selected Mesh Portal and the addresses of DNS servers. More information about this configuration daemon can be found here: http://www.cozybit.com/projects/mpp-utils

Mesh Interface

The wireless driver creates a virtual network interface just for mesh traffic (msh0). The main interface (eth0) is used for infrastructure traffic when the laptop is associated with an AP. Traffic forwarding in and out of the mesh is done at layer 3 via Network Address Translation (NAT) at the host. This gives the flexibility to use any other network connection to connect the mesh to the world (e.g. ppp, GPRS, etc.).

Footnotes

[1] Note that although the frame is discarded, it will still be acknowledged by the MAC layer.

--Jcardona 14:03, 23 February 2007 (EST)