Projects/xomail

Revision as of 06:20, 25 March 2008 by Shikhar (talk | contribs) (Thoughts on Search, Filtering and 'Smart tags')

This page is geared towards Summer of Code 2008 work on an email activity. I am convinced it is possible to develop a functional and usable email client that implements a core feature set in the 10-week period that Google kindly sponsors.

Introduction

Currently there is a Gmail activity but no real email client that can be used in Sugar. The possibility of accessing/composing emails offline does not exist. An email client with mesh integration like direct sending to mesh buddies and other fancy features would be great, but the basic groundwork of a usable email activity is needed.

Collaboration tools are a very important part of the OLPC software bundle and an activity which brings email to the XO desktop and ties in with the environment would be a very useful addition.

Background

  • Dead Email project and related discussion Talk:Email
  • Notes from a former OLPC intern examining different email clients and recommendations. [1]
  • Tinymail has an email client that could do with better Sugar integration

Deliverables

  • A lightweight, functional email client with a child-friendly GUI
  • A daemon should be developed for sending of unsent messages and receiving of new email. [The rationale behind this is we can't assume the child will open the email activity when internet access is available.]
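The daemon's sending half could be sketched roughly as follows. This is only an illustration: `flush_outbox` and `smtp_sender` are hypothetical names, and it assumes unsent messages are held as `email.message.EmailMessage` objects. Messages that fail to send simply stay queued for the next attempt.

```python
import smtplib


def smtp_sender(host, port, username, password):
    """Return a function that sends one message over SMTP with STARTTLS."""
    def send(msg):
        with smtplib.SMTP(host, port) as server:
            server.starttls()
            server.login(username, password)
            server.send_message(msg)
    return send


def flush_outbox(outbox, send):
    """Try to send every queued message; return the ones that are still unsent."""
    still_unsent = []
    for msg in outbox:
        try:
            send(msg)
        except OSError:
            # Covers connection failures and smtplib errors: keep for a later retry.
            still_unsent.append(msg)
    return still_unsent
```

The daemon would call `flush_outbox` whenever connectivity appears and store the returned list as the new outbox.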

Some other requisites

  • POP, SMTP, and IMAP support, each with Transport Layer Security (TLS)
  • Should support plain-text ASCII and MIME-encoded Unicode, with a sane choice between the two made during composition
  • Easy configuration on first run and later
  • Search should be central and helpful
  • Should have at least a basic address book
  • Should be able to handle large volumes of email and generally perform well

Approach/Ideas

I am very willing to adapt to feedback so please see this section as a place for me to thrash out approaches and ideas that I think make some sense.

Email organization

I would like to center email organization around tags and not folders. The Journal already uses tags, and for this activity I would like to extend them to have a visual representation as a GTK widget. They should be easily managed visually, for example dragging-and-dropping a tag onto a message should apply it.

Email sending/receiving, MIME-parsing and message construction

A pure-Python client is possible

  • The email module can be used for MIME parsing of incoming email and for message construction.
  • smtplib, poplib, imaplib can be utilized for email sending/receiving.
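As a rough illustration of the pure-Python route, the sketch below builds and parses a message with the standard email package. It assumes the current Python 3 `EmailMessage` API; the helper names `build_message` and `parse_message` and the example addresses are made up for this page.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Construct a plain-text message that could be handed to smtplib."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Non-ASCII text is given a MIME charset and transfer encoding automatically.
    msg.set_content(body)
    return msg


def parse_message(raw_bytes):
    """Parse raw bytes (as fetched via poplib or imaplib); return subject and text."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    text = body_part.get_content() if body_part is not None else ""
    return str(msg["Subject"]), text


raw = build_message("kid@example.org", "dad@example.org",
                    "Hola", "¿Cómo estás?\n").as_bytes()
subject, text = parse_message(raw)
```

Composition would then not need to ask the child about encodings: ASCII text stays plain, and anything else is MIME-encoded by the library.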

An alternative is to use Python/pygtk for just the GUI and rely on libcamel or Tinymail (which also builds upon libcamel). Tinymail already has Python bindings.

Storage

Develop an abstraction layer for storage-related requests.

It seems to me that traditional mailbox formats like mbox and maildir are not very suitable if email is organized around tags.

sqlite can be used for storage in a database. Using a database for email storage is not a new idea; here is an account of someone's successful experiment for his own purposes: http://www.sqlite.org/cvstrac/wiki?p=ExperimentalMailUserAgent.

The database schema would of course have to be very well thought out. Splitting the data across several tables can help keep performance good. Attachments can be detached and stored separately instead of being kept in BLOBs.
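One possible starting point for a tag-oriented schema is sketched below with Python's sqlite3 module. All table and column names here are assumptions for discussion, not a settled design. Attachments are represented only by a file path, matching the idea of storing them outside the database.

```python
import sqlite3

# Hypothetical schema: messages and tags joined through a many-to-many table.
SCHEMA = """
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    message_id TEXT UNIQUE,
    sender TEXT,
    subject TEXT,
    received TEXT,
    body TEXT
);
CREATE TABLE tags (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE message_tags (
    message_id INTEGER NOT NULL REFERENCES messages(id),
    tag_id INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (message_id, tag_id)
);
CREATE TABLE attachments (
    id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL REFERENCES messages(id),
    filename TEXT,
    path TEXT NOT NULL  -- location of the detached file on disk
);
"""


def open_store(path=":memory:"):
    """Create a new store with the schema above and return the connection."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def apply_tag(db, message_rowid, tag_name):
    """Attach a tag to a message, creating the tag if it does not exist yet."""
    db.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag_name,))
    tag_id = db.execute(
        "SELECT id FROM tags WHERE name = ?", (tag_name,)).fetchone()[0]
    db.execute(
        "INSERT OR IGNORE INTO message_tags (message_id, tag_id) VALUES (?, ?)",
        (message_rowid, tag_id))


def subjects_with_tag(db, tag_name):
    """Return the subjects of all messages carrying the given tag."""
    rows = db.execute(
        "SELECT m.subject FROM messages m "
        "JOIN message_tags mt ON mt.message_id = m.id "
        "JOIN tags t ON t.id = mt.tag_id "
        "WHERE t.name = ? ORDER BY m.id", (tag_name,))
    return [row[0] for row in rows]


db = open_store()
cur = db.execute(
    "INSERT INTO messages (message_id, sender, subject) VALUES (?, ?, ?)",
    ("<1@example.org>", "dad@example.org", "Hello"))
apply_tag(db, cur.lastrowid, "papa")
```

Because a message can carry any number of tags, nothing needs to be copied or moved when a tag is applied or removed, which is the main advantage over folder-based formats.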

Service descriptors

To make it easy and extensible to configure on first run for services such as Gmail, a file format for a service descriptor can be formalized.

The service descriptor would contain details about servers, protocols, junk-headers provided by the service, etc. Thus the only information required upon selection of a service should be username and password.

It should be possible to specify certain details in the service descriptor, such as whether the service sets SpamAssassin headers and which IMAP folders are not to be downloaded. For example, since Gmail provides IMAP, the Gmail service descriptor could specify that email in the 'All Mail', 'Spam' and 'Trash' folders is not to be downloaded, and that other folder names are to be interpreted as tags.
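To make the idea concrete, here is a sketch of what a descriptor and its loader might look like. The use of JSON and every field name (`incoming`, `skip_folders`, `folders_are_tags`, and so on) are assumptions, since the file format has not been formalized yet.

```python
import json

# Hypothetical descriptor for Gmail; the field names are illustrative only.
GMAIL_DESCRIPTOR = """
{
  "name": "Gmail",
  "incoming": {"protocol": "imap", "host": "imap.gmail.com", "port": 993, "tls": true},
  "outgoing": {"protocol": "smtp", "host": "smtp.gmail.com", "port": 587, "tls": true},
  "spamassassin_headers": false,
  "skip_folders": ["[Gmail]/All Mail", "[Gmail]/Spam", "[Gmail]/Trash"],
  "folders_are_tags": true
}
"""


def load_descriptor(text):
    """Parse a descriptor and check that the essential sections are present."""
    descriptor = json.loads(text)
    for key in ("name", "incoming", "outgoing"):
        if key not in descriptor:
            raise ValueError("service descriptor is missing %r" % key)
    return descriptor


def folders_to_download(descriptor, server_folders):
    """Drop the folders that the descriptor says should not be downloaded."""
    skipped = set(descriptor.get("skip_folders", []))
    return [folder for folder in server_folders if folder not in skipped]


gmail = load_descriptor(GMAIL_DESCRIPTOR)
wanted = folders_to_download(gmail, ["INBOX", "[Gmail]/Spam", "school"])
```

With a descriptor like this selected, the first-run dialog would only have to ask for a username and password.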

Thoughts on Search, Filtering and 'Smart tags'

  • For full text search, an option is to index incoming email with the sqlite fts module (Could be expensive in terms of flash space?)
  • A big use of filters is mailing lists and to that extent there should be automatic tagging based on mailing list headers.
  • Smart tags would be first-class tags, except that they can't be applied to messages: they are evaluated dynamically from the query they represent, and in that sense are like a saved search.
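Automatic tagging of mailing-list traffic could, for instance, key off the List-Id header (RFC 2919). The sketch below is one simple way to derive a tag name from it; the function name `list_tag` and the choice of the first label of the list identifier as the tag are assumptions.

```python
import re
from email import message_from_string


def list_tag(msg):
    """Return a tag derived from the List-Id header, or None if there is none."""
    list_id = msg.get("List-Id")
    if not list_id:
        return None
    # The identifier normally sits in angle brackets after an optional description.
    match = re.search(r"<([^>]+)>", list_id)
    identifier = match.group(1) if match else list_id.strip()
    # Use the first label of the identifier as a short tag name.
    return identifier.split(".")[0].lower()


sample = message_from_string(
    "List-Id: Sugar devel <sugar.lists.example.org>\nSubject: hi\n\nbody\n")
tag = list_tag(sample)
```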

Something interesting would be to formalize a common grammar for searches, filters and smart tags. It is probably not beyond kids to pick up a simple domain-specific language ;) Having separate GUIs for 'advanced search', configuring filters, etc. is clunky.

Examples: I can search for "received:today", and also as easily create a smart tag called "today's email" using that string, or create a filter that applies tag "papa" to all emails I receive with "from:dad at smthn.org". The expressiveness could get a lot richer.
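A very small version of such a grammar might be prototyped like this. The sketch assumes only two field names (`from` and `received`) plus bare words for text search, and assumes messages are plain dictionaries; a real grammar would be richer. The same query string serves as a search, a smart tag definition, or the condition of a filter.

```python
import datetime

KNOWN_FIELDS = ("from", "received")


def parse_query(query):
    """Split a query into (field, value) pairs; anything else is a text search."""
    terms = []
    for token in query.split():
        field, sep, value = token.partition(":")
        if sep and value and field.lower() in KNOWN_FIELDS:
            terms.append((field.lower(), value.lower()))
        else:
            terms.append(("text", token.lower()))
    return terms


def matches(message, terms, today=None):
    """Return True if the message satisfies every term of the parsed query."""
    today = today or datetime.date.today()
    for field, value in terms:
        if field == "received":
            wanted = today if value == "today" else datetime.date.fromisoformat(value)
            if message["received"] != wanted:
                return False
        elif field == "from":
            if value not in message["from"].lower():
                return False
        else:
            haystack = (message["subject"] + " " + message["body"]).lower()
            if value not in haystack:
                return False
    return True


# A smart tag is a named query; a filter pairs a query with a tag to apply.
SMART_TAGS = {"today's email": "received:today"}
FILTERS = [("from:dad@example.org", "papa")]


def tags_from_filters(message, filters=FILTERS):
    """Return the tags that the filters would apply to an incoming message."""
    return [tag for query, tag in filters if matches(message, parse_query(query))]
```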

Message Threading

jwz's threading algorithm [2] can be used. It was proposed in the imapext-thread Internet Draft. There is also some Python code implementing it. [3]

It should be possible for the user to manually thread by drag-and-drop where the algorithm gets it wrong.
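The sketch below shows only the core linking idea, not jwz's full algorithm: it omits subject-based grouping and the pruning of empty containers. Messages are assumed to be dictionaries with hypothetical keys `id`, `in_reply_to` and `references`, and `manual_links` stands in for corrections made by drag-and-drop.

```python
def parent_id(msg):
    """Return the Message-ID this message replies to, or None."""
    references = msg.get("references", [])
    if references:
        return references[-1]  # the last References entry is the direct parent
    return msg.get("in_reply_to")


def build_threads(messages, manual_links=None):
    """Group message ids into threads keyed by the id of each thread's root."""
    parents = {m["id"]: parent_id(m) for m in messages}
    # Links set by the user override whatever the headers say.
    parents.update(manual_links or {})

    def root_of(message_id):
        seen = set()
        # Walk up the parent chain; 'seen' guards against reference loops.
        while parents.get(message_id) is not None and message_id not in seen:
            seen.add(message_id)
            message_id = parents[message_id]
        return message_id

    threads = {}
    for m in messages:
        threads.setdefault(root_of(m["id"]), []).append(m["id"])
    return threads
```

If a parent message was never downloaded, its id still serves as the thread root, so replies to it stay grouped together.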

Spam filtering

At this stage of development I think it would be best to 'outsource' the spam filtering, so SpamAssassin headers can be supported. When using POP/IMAP with Gmail, spam is already filtered out by Gmail.
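Honouring SpamAssassin's verdict could be as simple as the following sketch, which checks the X-Spam-Flag and X-Spam-Status headers that SpamAssassin adds by default. The function name `is_spam` is made up, and servers with customized header settings would need a different check (something the service descriptor could describe).

```python
from email import message_from_string


def is_spam(msg):
    """Return True if SpamAssassin headers mark the message as spam."""
    flag = (msg.get("X-Spam-Flag") or "").strip().lower()
    if flag == "yes":
        return True
    # X-Spam-Status typically begins with "Yes," or "No," followed by the score.
    status = (msg.get("X-Spam-Status") or "").strip().lower()
    return status.startswith("yes")


spam = message_from_string("X-Spam-Flag: YES\nSubject: win\n\nbody\n")
ham = message_from_string("X-Spam-Status: No, score=0.1 required=5.0\n\nbody\n")
```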

Contacts

A simple address book should be implemented for address auto-completion. It can later be made more of a real address book, or share data with a (future?) contacts activity.
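A minimal in-memory sketch of auto-completion follows, assuming contacts are harvested from the address headers of messages the child sends or receives. The `AddressBook` class and its method names are hypothetical; persistent storage is left out.

```python
from email.utils import getaddresses


class AddressBook:
    """Remembers addresses seen in headers and completes them by prefix."""

    def __init__(self):
        self._contacts = {}  # lowercased address -> display name

    def learn(self, header_value):
        """Record every address found in a From/To/Cc header value."""
        for name, address in getaddresses([header_value]):
            if address:
                self._contacts.setdefault(address.lower(), name)

    def complete(self, prefix):
        """Return (name, address) pairs whose name or address starts with prefix."""
        prefix = prefix.lower()
        hits = []
        for address, name in sorted(self._contacts.items()):
            if address.startswith(prefix) or name.lower().startswith(prefix):
                hits.append((name, address))
        return hits


book = AddressBook()
book.learn("Dad <dad@example.org>, aunt@example.org")
```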

UI

TODO: mockup

Other comments

Language of choice

A request on the Email page for a recursive name got an interesting reply: GOBOP Underperforms Because Of Python. While this might be an apt comment, I have a feeling that a well-designed Python activity can perform decently even in the context of an email client.

I favor Python since the functionality would be relatively easier to implement thanks to its RFC-compliant email libraries and bindings for gtk and sqlite. This would enable me to focus on usability as well. If my mentor is in agreement with this approach, I would definitely make clean abstractions so that if performance does indeed turn out to be an issue, the code can be adapted.

But I have much more experience working with C/C++, so I am also open to relying on the great work of open-source email projects.

UI design

I am happy to have the volunteer help of my brother, a student of Information and Digital Design, who will contribute SVG icons and usability suggestions.

Documentation

Especially important with regard to

  • service descriptor file format
  • the database schema, to enable exporting to other formats

Documentation may not be completed during the 10-week SoC period, but I commit to delivering it.

Availability

I can spend 8+ hours every day on this project, and communication with my mentor would not be a problem even if we are in different time zones because I am flexible in that regard ;-)

A timeframe

TODO