Instant messaging challenges
Use Cases
To begin with, let's try to list the use cases for the OLPC IM. They may not be the same as any existing IM system.
Talk to my neighbor
In this case, the child wants to send a message to someone who is physically nearby, either in the same class or next door. Since the sender and recipient are both within wireless range, the message should be sent directly to the recipient. This has been done before on Cybiko messaging.
- RetroShare (retroshare.sf.net) is a serverless instant messenger, written in C++ and available for Linux, that delivers offline messages later. It also supports group chat, which could be used to address a whole class or any group of people.
Talk to a classmate later
In this case, the child wishes to send a message to a classmate or a friend who is not in the vicinity right now. They would like the message to be stored and automatically delivered the next time that the OLPC is powered up in the vicinity of the recipient. The message will be sent directly to the recipient. Perhaps the recipient is a classmate and will receive the message tomorrow morning when they arrive at school. This means that the OLPC IM system should have the concept of Inbox, Outbox, Draft and Waiting to be Sent, similar to European GSM mobiles.
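The store-and-deliver behaviour described above could be sketched roughly as follows. This is only an illustration: the `Mailbox` class, the folder meanings (Draft = still being written, Waiting = queued until the recipient is seen, Outbox = delivered copies) and the `on_peer_seen` hook are assumptions made for the example, not part of any existing OLPC code.

```python
from dataclasses import dataclass
from enum import Enum


class Folder(Enum):
    DRAFT = "draft"                 # still being written
    WAITING = "waiting to be sent"  # queued until the recipient is in range
    OUTBOX = "outbox"               # copy of a message that has been delivered
    INBOX = "inbox"                 # message received from someone else


@dataclass
class Message:
    sender: str
    recipient: str
    body: str
    folder: Folder = Folder.DRAFT


class Mailbox:
    """Hypothetical per-laptop message store."""

    def __init__(self, owner):
        self.owner = owner
        self.messages = []

    def compose(self, recipient, body):
        msg = Message(self.owner, recipient, body)
        self.messages.append(msg)
        return msg

    def queue(self, msg):
        # The child pressed "send", but the recipient may not be nearby yet.
        msg.folder = Folder.WAITING

    def receive(self, msg):
        self.messages.append(
            Message(msg.sender, msg.recipient, msg.body, Folder.INBOX))

    def on_peer_seen(self, peer):
        """Deliver every waiting message addressed to a peer that just appeared."""
        delivered = 0
        for msg in self.messages:
            if msg.folder is Folder.WAITING and msg.recipient == peer.owner:
                peer.receive(msg)
                msg.folder = Folder.OUTBOX
                delivered += 1
        return delivered

    def in_folder(self, folder):
        return [m for m in self.messages if m.folder is folder]
```

In this sketch the wireless layer would call `on_peer_seen` whenever another laptop is detected, for example when the classmate arrives at school the next morning.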
Talk to someone in another village
In this case, the child will never be within wireless range of the recipient. However, there is a relay point in the village that will collect such messages for delivery. This is much like a Pony Express model of email, such as the one implemented in Cambodia's Motoman project. Some people will argue that this is email, not IM, and that the two should be kept separate. However, if you research the historical use of postal mail in the late 1800s, when many cities had twice-a-day delivery, you will see that even the foundation of the email model had IM-like characteristics. The mathematical model known as a small-world network will help in designing this.
Talk to someone with an Internet email address
This is where a message is relayed to a gateway with the Internet email system. This is intended for use by teachers but could also be used by older children in some circumstances. It will not cause a spam problem because use of the Internet email gateway will be highly regulated. This has been done before, complete with the heavy hand regulating the service, in FidoNET-Internet gateways.
Find a friend
The IM system will have an address book or buddy list. Since the primary use of these systems is in a village or school and all the users are expected to know one another, the buddy list should automatically be populated by the wi-fi networking system when another OLPC is detected nearby. The address information should also be saved to facilitate the sending of messages to someone who will be connected at a later point in time.
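A minimal sketch of such an automatically populated buddy list is shown below. The `BuddyList` class, the `on_detected` callback and the JSON storage format are hypothetical choices made for illustration; the real wireless layer and address format are not specified here.

```python
import json


class BuddyList:
    """Hypothetical address book keyed by a laptop's network address."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})  # address -> nickname

    def on_detected(self, address, nickname):
        # Called by the (assumed) wireless layer when another OLPC is seen.
        # Re-detecting a known laptop simply refreshes its nickname.
        self.entries[address] = nickname

    def dumps(self):
        # Saved so that messages can later be addressed to absent buddies.
        return json.dumps(self.entries, sort_keys=True)

    @classmethod
    def loads(cls, text):
        return cls(json.loads(text))
```

Saving with `dumps` and restoring with `loads` keeps the addresses of classmates who are not currently in range, which is what later delivery depends on.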
Relay a message to someone who is a few hops away
In this case, the recipient is not in wireless range, but another OLPC is in range and can reach the recipient within some maximum number of hops. In order to determine that a message can be relayed, the system must use discovery algorithms similar to those used by peer-to-peer file transfer systems when they identify what files are available for download. The discovery and transfer should normally happen automatically without asking the user's permission, but the message should be mildly encrypted and should not be deleted until delivery is confirmed. This has been implemented before by Cybiko.
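One way to picture hop-limited discovery is a breadth-first search over the laptops that can currently hear each other. The sketch below assumes the neighbour map is already known, which a real mesh would have to learn through discovery messages; `find_route` and `RelayOutbox` are hypothetical names, and encryption is left out.

```python
from collections import deque


def find_route(neighbors, source, recipient, max_hops):
    """Return the shortest path from source to recipient, or None if the
    recipient cannot be reached within max_hops wireless hops."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == recipient:
            return path
        if len(path) - 1 >= max_hops:
            continue  # hop budget used up on this branch
        for nxt in neighbors.get(node, ()):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


class RelayOutbox:
    """Keeps each relayed message until its delivery is confirmed."""

    def __init__(self):
        self.pending = {}

    def send(self, msg_id, body, path):
        self.pending[msg_id] = (body, path)

    def confirm(self, msg_id):
        # Only now is it safe to delete the stored copy.
        self.pending.pop(msg_id, None)
```

For example, with `mesh = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}`, a call to `find_route(mesh, "a", "d", 3)` finds a route through `b` and `c`, while a limit of 2 hops finds none.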
Relay a message to someone who is far away
This is really the Motoman use case. In this case, no discovery is performed for the recipient. Instead, discovery is performed for the nearest relay point and the message is transferred there either directly or through a few wireless hops. The relay point will deliver all messages to another relay point that may be able to get closer to the recipient. In order for this to function, an addressing system needs to be developed to assist the relay points in making routing decisions. This is reminiscent of UUCP email relaying but does not mean that UUCP addressing should be used. Since OLPCs are issued in restricted geographical areas, it is possible for all relay points to contain a complete list of every other relay point and the path to it. This may require borrowing some ideas from early routing protocols like RIP.
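To illustrate what borrowing from RIP might look like, here is a rough distance-vector sketch in which each relay point keeps a table of every other relay point, the neighbour to forward through, and the hop count. The function name and the relay point names in the usage note are invented for the example; the only detail taken from RIP itself is that a hop count of 16 means unreachable.

```python
INFINITY = 16  # as in RIP, 16 hops is treated as unreachable


def merge_advertisement(table, neighbor, advertised):
    """Merge a neighbour's advertised distances into our routing table.

    table:      destination -> (next_hop, hops)
    advertised: destination -> hops as seen from `neighbor`
    Returns True if the table changed.
    """
    changed = False
    for dest, hops in advertised.items():
        cost = min(hops + 1, INFINITY)  # one extra hop to reach the neighbour
        current = table.get(dest)
        if current is None:
            better = cost < INFINITY
        else:
            next_hop, known = current
            # Accept a shorter path, or any change reported by the relay
            # point we already route through for this destination.
            better = cost < known or (next_hop == neighbor and cost != known)
        if better:
            table[dest] = (neighbor, cost)
            changed = True
    return changed
```

A relay point at a hypothetical `school-A` that hears `{"market-B": 0, "clinic-C": 1}` from `market-B` would record `market-B` at one hop and `clinic-C` at two hops via `market-B`. Because the number of relay points in a deployment area is small, every table can hold the complete list.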
State of The Art
Instant messaging applications are traditionally designed to require continuous access to a centralized server. The level of dependence on that central connection varies, with some protocols (e.g. AOL's Oscar and Yahoo's YMSG protocol) allowing clients to establish peer-to-peer connections for message and file exchange, although an established connection with the centralized server is still a requirement in most of those scenarios. And even in protocols that managed to separate presence establishment from a continuous TCP connection (e.g. SIP/SIMPLE), access to the centralized server is still required to authenticate users, store and retrieve buddy lists, subscribe to other users' presence, and provide message routing between the different connected clients, without which none of the basic IM functionality is usable.
This approach was very beneficial for the development of instant messaging applications and protocols. It allowed instant messaging providers to provide a relatively high quality of service (you always expect your messages to go through, and get angry when your buddy appears online and doesn't answer those messages), while at the same time offering a large set of features that made instant messaging much more usable and allowed it to move from the home PCs of teenagers to the busy desktops of Wall Street brokers. Features made possible through centralization include reliable immediate presence, offline messaging (Yahoo, ICQ, Sonork), and roaming buddy lists (a problem email clients and servers are still struggling to solve), to name a few.
Instant Messaging And The $100 Laptop
First of all, the challenge is not to create an instant messaging application from scratch. The OLPC already includes an IM application that works over wireless and allows for the exchange of drawings in SVG format. Rather than create something new, it would be more productive to enhance and support what already exists.
The challenges become very clear when you start comparing the networking environments in which the $100 laptops will be operating to the average reliability of an internet connection in a community in Maine or France. The $100 laptops will typically be used in environments where internet connectivity is very scarce if available at all. Laptops will also often be connected to a set of other laptops in a neighborhood through a mesh network, but with no access to a central school server that could help broker instant messaging communication.
To make the situation even more challenging, the expected quality of service from an instant messaging application is typically higher than that of other disconnected means of communication (e.g. email) due to its synchronous and conversational nature. Presence also loses its usefulness fairly quickly once it becomes inaccurate. The whole value of presence is that it lets a user optimize communication based on pushed presence information. You wouldn't try to send someone a message asking them a quick question, asking them to pass by, or asking them to quickly execute a transaction if they appear as away or offline.
The list of challenges that will need to be solved to allow instant messaging to operate in such an environment is simply the list of basic features that are offered by centralized servers today, and to which peer to peer or distributed alternatives will need to be developed.
One important thing to note before listing those features is that any solutions developed are not supposed to replace a centralized instant messaging server. A centralized server is still required, at a minimum, to provide centralized-by-design services such as provisioning and trust establishment. It is also important to note that an end solution would probably work best if it utilizes a centralized server when one is available, and utilizes new techniques for connectivity and presence propagation when one is not available.
Also, an end solution does not necessarily require protocol modifications or the development of a new set of tools. Solutions could be found by rearranging the tools that exist today in terms of function or the way they are deployed. This approach is deemed preferable, given the need for reliability and immediate scalability (given the timelines), although the end solution will most likely consist of a mix of those different approaches. I will not delve into what the solution would look like in this write-up, leaving it to another write-up and to further discussion, as similar problems will need to be solved for other types of communication applications. This write-up is intended to stimulate the discussion and provide a generic framework for thinking about the problems and their solutions.
Challenges
- Authentication: This is one of the most fundamental features a centralized instant messaging server provides. Even in the most distributed of peer-to-peer application servers, authentication is almost always done through a centralized server (think Skype). This is because authentication is closely tied to the provisioning functionality, for which the server is a natural and secure host. This doesn't mean, though, that end user authentication cannot be temporarily delegated to end points through distributed trust mechanisms (think public keys and Kerberos as approximate models) and cached for the periods between connections to the centralized server.
Another option, simpler but at least one level less secure, is to associate user identities with uniquely identifiable hardware or network attributes. This option becomes much simpler if the solution is to be built on top of an IPv6 infrastructure. Security, as noted, is a big risk though. Given the state of email today, such a system will not necessarily be much less secure than email, but it will be a step backwards on a challenge that IM solved much more effectively than email, namely spam.
- Buddy List Management and Storage: Roaming buddy lists are one of the features that almost all instant messaging protocols implement today. They allow a user to have the same view regardless of which machine or client they log in from. In addition to storing a list of buddies whose presence the user is interested in monitoring, buddy lists store more and more information related to users' capabilities, preferences and privacy controls. They could be overloaded to store additional information that would help solve some of the other challenges, such as public keys.
One additional challenge that needs to be addressed is offline buddy list management. What happens if you try to add a new user when you are in disconnected mode? One possible and totally justifiable approach would be to not allow offline editing of buddy lists, especially since no authentication facilities exist while the user is disconnected from the server. If security is less of a concern though, then it would be possible to allow for disconnected addition of newly discovered users, whose credentials could be verified through the central server the next time the user is connected. Some form of versioning would need to be used or re-used if offline editing is to be enabled.
- Message and Presence Routing: What used to be a presence and message routing challenge when implemented by the central server turns into a discovery problem in a disconnected peer-to-peer environment. The client will now be responsible for polling (discovering) the presence of other users on the buddy list, and thus for gathering enough information to establish peer-to-peer communication for presence updates and message delivery.
Multicast-DNS, presence broadcasting, and network id association (targeted polling) are some of the possible approaches that could be taken to address this challenge. Which one is to be used will depend to a great degree on the capabilities made available through the chosen networking infrastructure. Network id association might be the best approach when the ids are unique and easily discoverable in an IPv6 network, while multicast-DNS might be the best approach in a NATed IPv4 network.
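As a rough sketch of the second option, a client could journal its offline edits and replay them against the server on reconnection, with the server verifying each newly added user and handing back a new version number. The `ServerRoster` and `OfflineRoster` classes and the single integer version are assumptions made for illustration, not a description of any existing protocol.

```python
class ServerRoster:
    """Hypothetical authoritative buddy list held by the central server."""

    def __init__(self, known_users):
        self.known_users = set(known_users)  # users the server can verify
        self.buddies = set()
        self.version = 0

    def apply(self, pending):
        rejected = []
        for op, user in pending:
            if op == "add":
                if user in self.known_users:
                    self.buddies.add(user)
                else:
                    rejected.append(user)  # credentials could not be verified
            elif op == "remove":
                self.buddies.discard(user)
        self.version += 1
        return self.version, set(self.buddies), rejected


class OfflineRoster:
    """Client-side copy that journals edits made while disconnected."""

    def __init__(self):
        self.version = 0
        self.buddies = set()
        self.pending = []

    def add(self, user):
        self.buddies.add(user)  # provisional until the server confirms
        self.pending.append(("add", user))

    def remove(self, user):
        self.buddies.discard(user)
        self.pending.append(("remove", user))

    def sync(self, server):
        """Replay offline edits; adopt the server's verified list and version."""
        self.version, self.buddies, rejected = server.apply(self.pending)
        self.pending = []
        return rejected
```

Buddies added offline stay provisional until `sync` runs; anyone the server cannot verify is dropped from the local list and reported back to the caller.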
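Whichever discovery mechanism is chosen, the client has to decide when presence information has become too old to trust. The sketch below assumes peers periodically broadcast an announcement and that silence longer than a time-to-live means offline; the `PresenceCache` class and the 30-second default are illustrative assumptions, and the network transport is deliberately left out.

```python
class PresenceCache:
    """Tracks when each peer was last heard and ages out stale presence."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl            # seconds a peer stays "online" after an announcement
        self._last_heard = {}     # peer id -> timestamp of the last announcement

    def on_announcement(self, peer_id, now):
        # Called whenever a presence broadcast (or poll reply) arrives.
        self._last_heard[peer_id] = now

    def status(self, peer_id, now):
        heard = self._last_heard.get(peer_id)
        if heard is None:
            return "unknown"
        return "online" if now - heard <= self.ttl else "offline"
```

A shorter time-to-live keeps presence more accurate at the cost of more frequent broadcasts, which matters on a battery-powered mesh.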
Proposed approach
- Serverless use must be possible
- XMPP / Jabber is the protocol of choice
- Existing clients should be able to work
- Must be targeted to the goals of educating 3rd world kids
Technical Details & Discussion
All of this is being discussed in further detail on Talk:Instant Messaging Challenges, in particular the issue of distributed servers or serverless usage.
- I think this whole discussion needs to be refactored into several parts. One page on the existing IM application and how it might be enhanced or leveraged by other apps. Another page on a different and new IM infrastructure that some feel could be better by leveraging existing systems such as Jabber, XMPP and GAIM. And a third on messaging services in general and how they can be integrated into the educational environment. This last would be non-technical and focused on education.