Talk:Network principles

From OLPC
Revision as of 18:08, 7 May 2008 by Daf (talk | contribs) (Daf's thoughts)
Jump to: navigation, search

I think it is very important to emphasize that relying on IPv4 or IPv6 routability means that deployments can improve network conditions independent of changes to the software on the XO. --Michael Stone 15:47, 28 April 2008 (EDT)

Added, thanks. CScott 17:40, 28 April 2008 (EDT)

"fake friends" == "strangers"?

I believe that the term "strangers" better describes what you mean by "fake friends"; the latter incorporates a sense of "deception" where someone pretends to be a friend. Instead you probably refer to users that happen to be physically around and I would assume those to be plain strangers, unless otherwise decided by the user. --Ypod 00:26, 29 April 2008 (EDT)

First comments

Network Principles: "Additional servers may be used as aides or proxies, but the fundamental means to query the state of an XO or to collaborate with its user is to directly connect to it." What does "fundamental" mean? Perhaps you mean "most direct"?

"By direct communication we mean the standard socket API and IP protocols on which the internet is built." You seem to be implying that the standard protocols are preferred, but mesh multicast (especially cerebro's implementation) is a clear example in which the standard protocols are not preferred.

Direct presence interrogation: "The fundamental presence mechanism is direct: one XO connects directly to a service running on the other and queries for its status" Again, I think you want "simplest", as your subsequent algorithm makes clear that this is not the recommended method (piggybacking and lazy presence being preferred, and active interrogation being used only when presence info times out).

"most users have 20 or so friends" I have about 200 buddies on my AIM buddy list. I glanced through my friends on facebook; they often have 600-900 friends (at 98, I am considered an ultraminimalist). Facebook provides constant high-bandwidth presence for all of them, which is only possible due to its centralized, lazy, aggregating architecture. On myspace, the numbers run into the many thousands. We should expect users to behave similarly with our presence service. We should also remember that the presence service bandwidth will only increase, due to strongly desired features like photo buddy icons and live previews of shared activities. The bandwidth will be much lower than Facebook's, but also much larger than the current Presence Service.

"The key point is that all hosts should support direct interrogation for presence, even if other efficient mechanisms are used for partial aggregate presence in some situations." OK, though I think you have your emphasis backwards.

"Our principles above dictate that collaboration mechanisms are built using direct peer-to-peer communication." Umm... except in the case of talking to a user in Google Chat, or any other legacy IM tunneled over Jabber. Now you're using peer-to-peer to mean client-to-server, just like you were complaining about before. Also, "principles" sounds like this is a moral issue; it's not. Finally, what about a wiki?

"Friends are represented internally using the domain name only; there is no "user@" portion." This would make it impossible to be friends with someone on Google Chat. There is no need for this restriction. The domain name can only be "unnecessary" if the user name happens to be "xo", and I hardly see the value of saving 3 characters ("xo@") in an internal representation.

"Additional presence information ... is exported using a simple service at a well-known port." I read "well-known port" as meaning "fixed standardized port". Having a fixed isomorphism between ports and services has been a disaster from the beginning, and every such daemon (http, ssh, bittorrent, ...) has eventually gained a feature to allow it to run on a user-specified port. This is in response to conflicts (two services want the same port) as well as NAT. Please don't do this. It would be much better to let each computer's XMPP daemon serve up a list of services and ports on request. The DNS entry can specify the port on which the XMPP daemon is listening.

Ben 00:30, 29 April 2008 (EDT)

Comments to Scott and Ben

Scott wrote:

"Although the number of possible links in a network grows according to the square of the number of nodes, the interconnectedness of real social networks is quite limited: most users have 20 or so friends, with a few "super nodes" having 100 or so. Directly querying "real friends" should not be expensive in bandwidth or time."


I think there is a distinct difference between "friending" in the context of social networking sites (facebook, myspace) and the network created by the XOs: the former by definition urges users to increasing their social connectivity by adding new "friends", whereas the latter does not provide a "friend" recommendation mechanism and there is no third party (like facebook) that facilitates fast "friending" growth. As a result, friending done using the XOs might be better representation of the actual friend networks among children with relatively fewer edges.

My point is, Facebook actually _does_ represent an accurate model of friend networks, because most people really do know over 1000 other people well enough to want to know what they're doing. Also, IM is a closer analogy to friending in Sugar, and the number of people on AIM buddy lists is often in hundreds as well. This is despite the fact that adding someone in your buddy list requires typing in a unique textual identifier (their screen name), and the system provides no discovery mechanism. This is even true on ICQ, where the unique identifier is not human-readable. If our discovery mechanisms are nonexistent, we can expect users to have dozens of friends. If our discovery mechanisms are good, we can expect hundreds. Ben 10:34, 29 April 2008 (EDT)


hmmm this makes sense: the better presence/discovery we offer, the more friends they may end up having ;-). The degree of friending may actually be an indication of how successful our presence service is. --Ypod 11:33, 29 April 2008 (EDT)


However, I still think that relying on the fact that a child will have few friends so as not to overload the network with presence queries is a sub-optimal approach, not only because may actually end-up having as many "friends" as they would on Facebook, but also because maintaining up-to-date information only about your friends and having no information (what their profile is, what activities they're sharing) about "strangers" will be boring. If you actually have such information about strangers (assuming that no internet connection/xmpp server is available), then why do you need to query your friends on a different basis? Again, I will elaborate more on this on a separate stub.


In response to Ben's comment: "Additional presence information ... is exported using a simple service at a well-known port." I read "well-known port" as meaning "fixed standardized port". Having a fixed isomorphism between ports and services has been a disaster from the beginning, and every such daemon (http, ssh, bittorrent, ...) has eventually gained a feature to allow it to run on a user-specified port. This is in response to conflicts (two services want the same port) as well as NAT. Please don't do this. It would be much better to let each computer's XMPP daemon serve up a list of services and ports on request. The DNS entry can specify the port on which the XMPP daemon is listening.

I think your concern of services corresponding to well-known ports is valid. I would like to generalize the problem though to the case where activities need to be identified and need to communicate from one XO to another: How do activities get identified? By some unique (per activity) id? By some string name? I will write my thoughts on this on a separate stub.

--Ypod 01:24, 29 April 2008 (EDT)

On Presence updates/User Profiles/Collaboration

On_Presence_updates/User_Profiles/Collaboration

--Ypod 03:37, 29 April 2008 (EDT)

Anything unique to a communications path (e.g., the mesh) ?

Reading the topic 'Direct XO-to-XO peer communication', I saw that you are NOT making any distinction among the paths (e.g., direct mesh vs. through school relay_server vs through internet) used to access the other XO. The target XO either is present, or is not. If the target XO can be reached via multiple routes, a suitable one will be chosen.

Aside from those "under the covers" protocols which handle the mesh communication itself, is there any need for Activities (or Users) to be cognizant that they are interacting over the "mesh" rather than the "internet" ?

[And apart from a possible role in "discovering" the names of potential correspondents, is the 'Jabber server' needed any more?]

Daf's thoughts

  • I agree with Ypod's suggestion of saying "strangers" rather than "fake friends".
  • Things I like:
    • Maintaining a distinction between OLPC and deployment responsibilities.
    • The separation of discovery, presence and collaboration, and the acknowledgement that protocols sometimes do discovery and presence simultaneously.
  • "Human-readable names promote compatibility with other network hosts". I don't understand this phrase.
We should say why we want names to be concise. Is it so that they can be typed in? So that they don't take a lot of bandwidth when transmitted?
I'm not convinced by the "logical" part. If we really want names to encode particular information (name, country, school etc.) then we should say up front that the name should encode that information in the principles document. I think "meaningful" is a better name than "logical" for this property.
I think it's more clearly desirable that names are memorable. By "memorable", I mean something like: having seen the name, one can later input it from memory.
Even allowing for allowing the name part to be input/displayed as unicode rather than punycode, I don't think the proposed DNS scheme will be memorable due to the inclusion of the key hash. I don't think it will be particularly concise.
Zooko's triangle applies; I think it's worth citing it. We already acknowledge that picking secure/meaningful implies centralisation.
  • "By direct communication we mean the standard socket API and IP protocols on which the internet is built." When we say "IP protocols", are we talking about IPv4 and IPv6, or TCP and UDP, or application protocols?
  • "Our principles above dictate that collaboration mechanisms are built using direct peer-to-peer communication." I don't understand which principles this statement follows from.
  • "To manage this case, we strictly limit the rate and size of presence queries." Have we considered also limiting the number of strangers we try to talk to? If we try to poll everybody possible, then the poll interval for each contact becomes long.
  • "There are three main ways in which direct XO-to-XO communication fails". I think this is better stated as two ways: because a NAT is present, or because a firewall is present.
  • NATs block inbound connections as a (desired or undesired) side-effect of the job they perform. Blocking of connections by a firewall is not a side-effect but by design. Many NATs provide (standard or de facto standard) means of opening ports to the outside world, and we should consider using these means so as to be able to help deployments where the use of a NAT is outside of their control. Of course, if more than one person behind a NAT wishes to export the same service, then they need to use a different port which necessitates some sort of out-of-band means of communicating that port. Also, actively communicating with NATs to negotiate port forwarding (as opposed to circumventing the NAT without it cooperation) tends to not work when there are multiple layers of NAT.
  • Why are "Direct XO-to-XO peer communication" and "Direct presence interrogation" separate principles? The latter seems to me to be a subset of the former.
  • I don't see a rationale for why communciations should be are direct. There are lots of reasons why we might want to communicate directly, but we should state which ones are important to us.
  • "Although the name is a direct reference to the machine, the mapping from name to routable address is indirect." I find this statement very confusing. It's making some distinction between direct and indirect that I can't fathom. The statement "The key property is that there is a single name by which anyone anywhere on the network uses to refer to a particular XO -- the names do not depend on the means by which the name is mapped to an address, route or service" later in the same paragraph is quite clear, but I don't understand its connection to the direct/indirect distinction.
  • The architecture proposal doesn't describe a presence interrogation mechanism.
  • We should consider having a push mechanism as well as a pull mechanism. This saves unnecessary presence polls. A simple way to do this would be to maintain a connection to everybody you want presence notifications from. Presence notifications can be rate-limited on the sending side. Assuming the overhead of maintaining a connection is low, it shouldn't be any more expensive than polling. This is somewhat similar to how Jabber servers implement presence updates.