Talk:Network principles

From OLPC
Revision as of 17:03, 22 November 2008 by CScott (talk | contribs) (→‎Comments from Morgan: Respond to Morgan.)

I think it is very important to emphasize that relying on IPv4 or IPv6 routability means that deployments can improve network conditions independent of changes to the software on the XO. --Michael Stone 15:47, 28 April 2008 (EDT)

Added, thanks. CScott 17:40, 28 April 2008 (EDT)

"fake friends" == "strangers"?

I believe that the term "strangers" better describes what you mean by "fake friends"; the latter incorporates a sense of "deception" where someone pretends to be a friend. Instead you probably refer to users that happen to be physically around and I would assume those to be plain strangers, unless otherwise decided by the user. --Ypod 00:26, 29 April 2008 (EDT)

You're totally right. The only problem with "strangers" is that it doesn't quite capture the idea that these may be people you know, they're just not guaranteed to be. But "strangers" is much better than "fake friends"; I've made the change. CScott 12:22, 9 May 2008 (EDT)

First comments from Ben

Network Principles: "Additional servers may be used as aides or proxies, but the fundamental means to query the state of an XO or to collaborate with its user is to directly connect to it." What does "fundamental" mean? Perhaps you mean "most direct"?

I've changed it to "canonical". Other means are optimizations only; they don't need to work. CScott 12:22, 9 May 2008 (EDT)

"By direct communication we mean the standard socket API and IP protocols on which the internet is built." You seem to be implying that the standard protocols are preferred, but mesh multicast (especially cerebro's implementation) is a clear example in which the standard protocols are not preferred.

Again, mesh multicast should be viewed as an optimization, not a core feature. Everything should work (albeit at reduced efficiency) even without such tricks. Mesh multicast doesn't actually work across the broader internet, so it's not a good primitive on which to base the design. ("Standard" multicast also doesn't really work across the broader internet -- at least not w/o manual configuration.) CScott 12:22, 9 May 2008 (EDT)

Direct presence interrogation: "The fundamental presence mechanism is direct: one XO connects directly to a service running on the other and queries for its status" Again, I think you want "simplest", as your subsequent algorithm makes clear that this is not the recommended method (piggybacking and lazy presence being preferred, and active interrogation being used only when presence info times out).

"canonical", again. Other mechanisms are optimizations. Sure, use an optimization if you can, but you should be able to work w/o them. CScott 12:22, 9 May 2008 (EDT)

"most users have 20 or so friends" I have about 200 buddies on my AIM buddy list. I glanced through my friends on facebook; they often have 600-900 friends (at 98, I am considered an ultraminimalist). Facebook provides constant high-bandwidth presence for all of them, which is only possible due to its centralized, lazy, aggregating architecture. On myspace, the numbers run into the many thousands. We should expect users to behave similarly with our presence service. We should also remember that the presence service bandwidth will only increase, due to strongly desired features like photo buddy icons and live previews of shared activities. The bandwidth will be much lower than Facebook's, but also much larger than the current Presence Service.

good discussion below. i'll respond there. CScott 12:22, 9 May 2008 (EDT)

"The key point is that all hosts should support direct interrogation for presence, even if other efficient mechanisms are used for partial aggregate presence in some situations." OK, though I think you have your emphasis backwards.

"Premature optimization is the root of all evil", etc, etc. Many of our networking scenarios don't actually need heroic measures. CScott 12:22, 9 May 2008 (EDT)

"Our principles above dictate that collaboration mechanisms are built using direct peer-to-peer communication." Umm... except in the case of talking to a user in Google Chat, or any other legacy IM tunneled over Jabber. Now you're using peer-to-peer to mean client-to-server, just like you were complaining about before. Also, "principles" sounds like this is a moral issue; it's not. Finally, what about a wiki?

"Principles" is meant to suggest these are not absolute rules, but rather guidelines (ie, that it's not a moral issue). Communicating with non-XOs is at the mercy of the legacy bits, I can't force Google Chat to decentralize. Wikis are really interesting cases. MikMik is an example of a true peer-to-peer wiki, consistent with this document's recommendations. Note that, as this document recommends, it *can* use a server to help it scale, but doesn't *need* a server. CScott 12:22, 9 May 2008 (EDT)

"Friends are represented internally using the domain name only; there is no "user@" portion." This would make it impossible to be friends with someone on Google Chat. There is no need for this restriction. The domain name can only be "unnecessary" if the user name happens to be "xo", and I hardly see the value of saving 3 characters ("xo@") in an internal representation.

You are totally right! I've changed this. CScott 12:22, 9 May 2008 (EDT)

"Additional presence information ... is exported using a simple service at a well-known port." I read "well-known port" as meaning "fixed standardized port". Having a fixed isomorphism between ports and services has been a disaster from the beginning, and every such daemon (http, ssh, bittorrent, ...) has eventually gained a feature to allow it to run on a user-specified port. This is in response to conflicts (two services want the same port) as well as NAT. Please don't do this. It would be much better to let each computer's XMPP daemon serve up a list of services and ports on request. The DNS entry can specify the port on which the XMPP daemon is listening.

Ben 00:30, 29 April 2008 (EDT)
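Ben's alternative to fixed ports can be illustrated with a minimal sketch. It assumes each service binds to whatever free port the operating system hands out and records it in a per-machine directory that answers "which services, on which ports?" on request. The names `bind_ephemeral` and `ServiceDirectory`, the service names, and the JSON reply format are all hypothetical; a real deployment would presumably answer through the XMPP daemon rather than through this stand-in.

```python
import json
import socket
from typing import Dict


def bind_ephemeral(host: str = "127.0.0.1") -> socket.socket:
    """Bind a TCP listener to a free port chosen by the OS (port 0)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS for any unused port
    sock.listen()
    return sock


class ServiceDirectory:
    """Hypothetical per-machine table of service name -> current port."""

    def __init__(self) -> None:
        self._ports: Dict[str, int] = {}

    def register(self, name: str, listener: socket.socket) -> int:
        """Record the port the listener actually received and return it."""
        port = listener.getsockname()[1]
        self._ports[name] = port
        return port

    def describe(self) -> str:
        """Serialize the table; this is the reply a peer would get on request."""
        return json.dumps(self._ports, sort_keys=True)
```

With this arrangement two services can never collide on a port, and only the directory itself needs a discoverable address (for example, via the DNS entry mentioned above).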

Friending (Poly)

Scott wrote:

"Although the number of possible links in a network grows according to the square of the number of nodes, the interconnectedness of real social networks is quite limited: most users have 20 or so friends, with a few "super nodes" having 100 or so. Directly querying "real friends" should not be expensive in bandwidth or time."


I think there is a distinct difference between "friending" in the context of social networking sites (facebook, myspace) and the network created by the XOs: the former by definition urges users to increase their social connectivity by adding new "friends", whereas the latter does not provide a "friend" recommendation mechanism and there is no third party (like facebook) that facilitates fast "friending" growth. As a result, friending done using the XOs might be a better representation of the actual friend networks among children, with relatively fewer edges.

My point is, Facebook actually _does_ represent an accurate model of friend networks, because most people really do know over 1000 other people well enough to want to know what they're doing. Also, IM is a closer analogy to friending in Sugar, and the number of people on AIM buddy lists is often in the hundreds as well. This is despite the fact that adding someone to your buddy list requires typing in a unique textual identifier (their screen name), and the system provides no discovery mechanism. This is even true on ICQ, where the unique identifier is not human-readable. If our discovery mechanisms are nonexistent, we can expect users to have dozens of friends. If our discovery mechanisms are good, we can expect hundreds. Ben 10:34, 29 April 2008 (EDT)


hmmm this makes sense: the better presence/discovery we offer, the more friends they may end up having ;-). The degree of friending may actually be an indication of how successful our presence service is. --Ypod 11:33, 29 April 2008 (EDT)


However, I still think that relying on a child having few friends, so as not to overload the network with presence queries, is a sub-optimal approach. Not only may children actually end up having as many "friends" as they would on Facebook, but maintaining up-to-date information only about your friends, and having no information about "strangers" (what their profile is, what activities they're sharing), will be boring. If you actually have such information about strangers (assuming that no internet connection/xmpp server is available), then why do you need to query your friends on a different basis? Again, I will elaborate more on this on a separate stub.


In response to Ben's comment: "Additional presence information ... is exported using a simple service at a well-known port." I read "well-known port" as meaning "fixed standardized port". Having a fixed isomorphism between ports and services has been a disaster from the beginning, and every such daemon (http, ssh, bittorrent, ...) has eventually gained a feature to allow it to run on a user-specified port. This is in response to conflicts (two services want the same port) as well as NAT. Please don't do this. It would be much better to let each computer's XMPP daemon serve up a list of services and ports on request. The DNS entry can specify the port on which the XMPP daemon is listening.

I think your concern of services corresponding to well-known ports is valid. I would like to generalize the problem though to the case where activities need to be identified and need to communicate from one XO to another: How do activities get identified? By some unique (per activity) id? By some string name? I will write my thoughts on this on a separate stub.

--Ypod 01:24, 29 April 2008 (EDT)

On Presence updates/User Profiles/Collaboration

On_Presence_updates/User_Profiles/Collaboration

--Ypod 03:37, 29 April 2008 (EDT)

Anything unique to a communications path (e.g., the mesh) ?

Reading the topic 'Direct XO-to-XO peer communication', I saw that you are NOT making any distinction among the paths (e.g., direct mesh vs. through school relay_server vs through internet) used to access the other XO. The target XO either is present, or is not. If the target XO can be reached via multiple routes, a suitable one will be chosen.

Aside from those "under the covers" protocols which handle the mesh communication itself, is there any need for Activities (or Users) to be cognizant that they are interacting over the "mesh" rather than the "internet" ?

[And apart from a possible role in "discovering" the names of potential correspondents, is the 'Jabber server' needed any more?]

Daf's thoughts

  • I agree with Ypod's suggestion of saying "strangers" rather than "fake friends".
    Me, too. CScott 12:39, 9 May 2008 (EDT)
  • Things I like:
    • Maintaining a distinction between OLPC and deployment responsibilities.
    • The separation of discovery, presence and collaboration, and the acknowledgement that protocols sometimes do discovery and presence simultaneously.
    Me, too! CScott 12:39, 9 May 2008 (EDT)
  • "Human-readable names promote compatibility with other network hosts". I don't understand this phrase.
    I've edited this paragraph. I'm talking about interoperation with non-XOs and with legacy software on an XO.
We should say why we want names to be concise. Is it so that they can be typed in? So that they don't take a lot of bandwidth when transmitted?
I'm not convinced by the "logical" part. If we really want names to encode particular information (name, country, school etc.) then we should say up front that the name should encode that information in the principles document. I think "meaningful" is a better name than "logical" for this property.
I think it's more clearly desirable that names are memorable. By "memorable", I mean something like: having seen the name, one can later input it from memory.
You're right: I cited Marc Stiegler's "memorable" term in my revision. CScott 12:39, 9 May 2008 (EDT)
Even allowing for the name part to be input/displayed as unicode rather than punycode, I don't think the proposed DNS scheme will be memorable due to the inclusion of the key hash. I don't think it will be particularly concise.
I clarified that this is only three characters of the hash, and that it's optional. CScott 12:39, 9 May 2008 (EDT)
Zooko's triangle applies; I think it's worth citing it. We already acknowledge that picking secure/meaningful implies centralisation.
Yes, I cited it; no, centralization is not necessary. I provide an option: use centralized delegation of namespace to schoolservers to allow localized uniqueness checks; or else use "enough" randomness in the name chosen to ensure decentralized probabilistic uniqueness. We recommend that schoolservers be "centrally" given good names, which makes our names more "memorable". CScott 12:39, 9 May 2008 (EDT)
  • "By direct communication we mean the standard socket API and IP protocols on which the internet is built." When we say "IP protocols", are we talking about IPv4 and IPv6, or TCP and UDP, or application protocols?
    I more-or-less mean that an application should be able to use the standard IPv4/IPv6 socket API and get something that works. We will, of course, build higher-layer abstractions on top of that, but you shouldn't *need* to use the abstractions in order to connect to an XO. Help clarifying the wording is welcome. CScott 12:39, 9 May 2008 (EDT)
  • "Our principles above dictate that collaboration mechanisms are built using direct peer-to-peer communication." I don't understand which principles this statement follows from.
    I clarified "peer-to-peer" to "serverless" in the principle; does that help? CScott 12:39, 9 May 2008 (EDT)
  • "To manage this case, we strictly limit the rate and size of presence queries." Have we considered also limiting the number of strangers we try to talk to? If we try to poll everybody possible, then the poll interval for each contact becomes long.
    I considered this, but I think slowing down is vastly preferable to selecting an arbitrary subset of the strangers. If I know I'm sitting next to someone, how do I guarantee I'll see that person in my subset? The only reasonable subsetting mechanism I can think of is tied to distance: if my friend doesn't show up, I just need to move closer to them. That seems reasonably intuitive -- but the distance metrics we get from the RF hardware bear a poor relationship to real-world distances, so the intuition breaks down. CScott 12:39, 9 May 2008 (EDT)
    This is a problem that troubled me a lot in the past: long delays vs. the presence of large numbers of nodes. First, I think we should do our best to minimize the cost of adding information about more nodes. At the very least the total cost should be linear, and each node should account for a minimal cost (forget about colors, nicks, etc). I agree that distance is the first way to filter out nodes once we reach a limit in delays. What is more interesting, though, would be to give our users the ability to make their own budget for the timeshare they have available. If we hit an upper limit at 500 nodes, for example, we should allow the user to make a presence budget like:
    • 400 nodes: choose the shortest physical distance
    • 80 nodes: choose any of my friends (starting from the closest ones), irrespective of distance
    • 20 nodes: choose any of my friends' friends (two hops in social distance)
I envision that in the long run much of the social network rules ("I hate this guy" etc) should be imposed on the mesh network itself.

--Ypod 17:22, 12 May 2008 (EDT)
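The presence budget described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Node` fields (`distance`, `social_hops`) and the function `select_for_presence` are hypothetical, the distance estimate is taken as given even though the thread notes that RF-derived distances are unreliable, and "starting from the closest ones" is read here as physically closest.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Node:
    name: str
    distance: float   # hypothetical physical-distance estimate
    social_hops: int  # 1 = friend, 2 = friend of a friend, 0 = stranger


def select_for_presence(nodes: Iterable[Node], nearest: int = 400,
                        friends: int = 80,
                        friends_of_friends: int = 20) -> List[Node]:
    """Pick the nodes to track, filling each budget category in turn."""
    by_distance = sorted(nodes, key=lambda n: n.distance)
    chosen: List[Node] = []
    seen: Set[str] = set()

    def take(candidates: List[Node], quota: int) -> None:
        # Add up to `quota` not-yet-chosen candidates, preserving their order.
        for node in candidates:
            if quota == 0:
                break
            if node.name not in seen:
                seen.add(node.name)
                chosen.append(node)
                quota -= 1

    take(by_distance, nearest)                                   # closest nodes
    take([n for n in by_distance if n.social_hops == 1], friends)
    take([n for n in by_distance if n.social_hops == 2], friends_of_friends)
    return chosen
```

A friend who already falls inside the "nearest" quota does not consume a friend slot, so the friend budget is spent only on friends who are farther away.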

  • "There are three main ways in which direct XO-to-XO communication fails". I think this is better stated as two ways: because a NAT is present, or because a firewall is present. NATs block inbound connections as a (desired or undesired) side-effect of the job they perform. Blocking of connections by a firewall is not a side-effect but by design. Many NATs provide (standard or de facto standard) means of opening ports to the outside world, and we should consider using these means so as to be able to help deployments where the use of a NAT is outside of their control. Of course, if more than one person behind a NAT wishes to export the same service, then they need to use different ports, which necessitates some sort of out-of-band means of communicating those ports. Also, actively communicating with NATs to negotiate port forwarding (as opposed to circumventing the NAT without its cooperation) tends to not work when there are multiple layers of NAT.
    Wad suggested the separation by *use* of NAT/firewall, which I think is useful. The NAT/firewall is present for a variety of reasons, and it's those reasons which ultimately dictate whether "workarounds" of various kinds are possible/desirable. CScott 12:39, 9 May 2008 (EDT)
  • Why are "Direct XO-to-XO peer communication" and "Direct presence interrogation" separate principles? The latter seems to me to be a subset of the former.
    I should probably clarify this. The section on presence mostly talks about separating discovery, presence, and collaboration, and rate-limiting presence, so the "principle" is probably mistitled. CScott 12:39, 9 May 2008 (EDT)
  • I don't see a rationale for why communications should be direct. There are lots of reasons why we might want to communicate directly, but we should state which ones are important to us.
  • "Although the name is a direct reference to the machine, the mapping from name to routable address is indirect." I find this statement very confusing. It's making some distinction between direct and indirect that I can't fathom. The statement "The key property is that there is a single name by which anyone anywhere on the network uses to refer to a particular XO -- the names do not depend on the means by which the name is mapped to an address, route or service" later in the same paragraph is quite clear, but I don't understand its connection to the direct/indirect distinction.
  • The architecture proposal doesn't describe a presence interrogation mechanism.
    Working on it. CScott 12:39, 9 May 2008 (EDT)
  • We should consider having a push mechanism as well as a pull mechanism. This saves unnecessary presence polls. A simple way to do this would be to maintain a connection to everybody you want presence notifications from. Presence notifications can be rate-limited on the sending side. Assuming the overhead of maintaining a connection is low, it shouldn't be any more expensive than polling. This is somewhat similar to how Jabber servers implement presence updates.
    Yes, this is (part of) how an on-laptop XMPP server would work. I'll describe this more fully after I've proof-of-concept-implemented it. CScott 12:39, 9 May 2008 (EDT)
    I disagree with maintaining long-lived connections. The cost of a single connection may be low, but having tens or hundreds of them may impose significant overhead. Not to mention the cost of re-establishing TCP connections (did you have TCP in mind?) in a mobile mesh network. I do agree though that there should also be a push mechanism, but I see no reason why it should be connection-based.

--Ypod 17:37, 12 May 2008 (EDT) — Daf 18:30, 7 May 2008 (EDT)
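Both sides of the push discussion above agree on rate-limiting at the sender, and that part can be sketched independently of the transport. The class below is a hypothetical illustration, not the design of any existing presence service: `send` stands in for whatever delivers an update (a held-open connection, as Daf suggests, or a connectionless message, as Ypod prefers), and `min_interval` is an assumed tuning parameter. Updates that arrive too quickly are coalesced so that only the latest status is pushed.

```python
import time
from typing import Callable, Optional


class RateLimitedPusher:
    """Pushes presence updates, at most one per `min_interval` seconds."""

    def __init__(self, send: Callable[[str], None], min_interval: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send                  # transport-agnostic delivery hook
        self._min_interval = min_interval
        self._clock = clock
        self._last_sent: Optional[float] = None
        self._pending: Optional[str] = None

    def update(self, status: str) -> bool:
        """Record a new status; push it now if the rate limit allows."""
        self._pending = status             # newer status replaces older one
        return self.flush()

    def flush(self) -> bool:
        """Push the latest pending status if allowed; call periodically."""
        if self._pending is None:
            return False
        now = self._clock()
        if (self._last_sent is not None
                and now - self._last_sent < self._min_interval):
            return False                   # too soon; keep it pending
        self._send(self._pending)
        self._pending = None
        self._last_sent = now
        return True
```

Because the limiter only decides when to call `send`, the question of connection-based versus connectionless delivery can be settled separately.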

Comments from Morgan

  • What about centrally defined groups - e.g. created by a teacher? How could those be created, and updated? Does that require a server, or can we do it with xmpp: links?
    The relevant XMPP spec has a way to add a friend in a certain group, but omits mention of creating a group with multiple friends at once. At the moment, I'd say the scope of this proposal is primarily concerned with defining how these buddies are represented; centrally-managed groups have been discussed via Moodle and a Jabber Server, and both seem to be reasonable. CScott 17:03, 22 November 2008 (UTC)
  • With rate-limited presence updates on the scale proposed, I think showing the active activity in mesh view becomes less useful as it becomes less accurate. This clustering around the active shared activity was part of the original design, according to Eben.
    If the network has capacity, the rate-limiting won't be visible. This proposal is primarily about separating out the pieces so that collaboration and presence can be implemented and improved separately, and so failure or degradation in the presence service doesn't prevent collaboration. CScott 17:03, 22 November 2008 (UTC)
  • "xo@" -- not to optimise prematurely, but what about non-XO Sugar instances? While I can't quickly think of an alternative short name, something more generic than "xo" might be more appropriate.
    I chose 'xo' primarily because it was short and semi-meaningful. I'm not terribly attached to it. Non-XO instances should be free to use 'sugar@' or 'fedora@' or 'classmate@' (but their users will have to type a bit more) or just 'a@'; it shouldn't matter at all to interop. If someone can think of an alternative short pithy name, I'm all for it. CScott 17:03, 22 November 2008 (UTC)

--morgs 16:58, 12 May 2008 (EDT)

XO-to-XO security

+1 to this approach.

Not only do I agree with an approach that resembles ssh as much as possible, but it also allows us to encapsulate traditional IP-based communication over a network where IP addresses were never assigned or used! This is very important for communication _within_ a mesh network. To explain this further, I could access http://myfriend.someschool.xs.laptop.org by capturing all IP traffic locally and encapsulating it into frames that are routed using the underlying mesh routing protocol. If we're not going to use IP for routing within the mesh network while we remain backwards compatible with IP-based applications, I no longer see a reason why we need IP addresses within the mesh (except maybe for the MPP that acts as an internet gateway).

--Ypod 20:05, 28 June 2008 (UTC)