Shared revisions

There are a number of different ways to share the state of an activity with a group of collaborators / editors.

One is an unordered set of actions, which don't depend on one another. Any action can be undone independently of the others. (imagine a group of people adding bigts of colored light to a large prismatic collage; you can add or subtract light and color independent of all other lights/colors)

Another is a set of causal actions, which are partially ordered and do depend on one another. For instance a Write or Paint session, some of whose actions change actions that have come before.

There is also a punctuated equilibrium model which is fairly general...

punctuated equilibrium

Punctuated equilibrium supports a set of actions:

 sync
 undo to last sync
 soft-undo last action [of mine]
 soft-undo last action [of anyone's]
 save state
 fork
 soft-undo last action (of other) would be the last action before the sync, right?
 attempt merge [can happen while forked]
 push/suggest merge to others

<homunq> what is difference between merge and sync?
<_sj_> in between syncs, you have soft updates and attempted undo's
<_sj_> if it's obvious how to undo and there is nothing else that depends on the last action
<_sj_> you can undo it, whoever else did it; 
<_sj_> you can also try to undo your own action even if other changes hav eoccerred elsewhere, say if you are working in one part of a shared system and other changes are independent of it
<_sj_> (it = the part you are working on)
<_sj_> sync is basically a 'save state + push merge' where the merge seems automatic
<_sj_> you can also fork and make changes on your own, no longer accepting changes made by others, and later come back and attempt/suggest a merge to those being shared with
<_sj_> you need UI indicators and elements that lat you know when you are forked / out of sync
<_sj_> and when someone is pushing a sync
<_sj_> and that show you the expected diff b/t your current state and a suggested new state

punctuated soup (homunq's proposal)

A proposal for how to use a set of primitives that are implemented in an application-specific and data-structure-specific way, to give self-syncing power to arbitrary data structures.

required primitives

underlying state data
diffs: a two-way map from a pair of (old data state, new data state) to (old data state, diff)

diffs will generally be reversible (inverse of A is ~A); if not, you can store both directions for a reversible diff.
(optional) the ability to decompose a diff into disconnected subsets

merges: a map from (old data state, diff1, diff2) to (compatible changes, subdiff1, subdiff2) where (compatible changes) is independent of order of diffs, subdiff1 and subdiff2 are "minimal conflicting subsets" of diff1 and diff2. Note that a stupid algorithm is always available when all else fails: the identity mapping.

When a merge is applied, there is a rule to decide which one is given precedence. If diff1 wins, the new state is the result of (compatible changes, subdiff1), but the diff from that to (compatible changes, subdiff2) is also computed and placed on the redo (not undo) "stack".
If

        state1   ---diffB-->   state3
        \     ^                ^
         \     \              /
        diffA  diffD         diffC
           \     \          /
            V     \        /
             ------===state2

and (state2, diffC, diffD) gives (something, <empty>, <empty>), then (state1, diffA, diffB) should give (state2, <empty>, diffC).

how this gives self-syncing

With the above and causal ordering, it is possible to make a data structure which synchronizes itself, follows a generalized undo semantics, and may have desirable efficiency.

All participants in an activity have a "rank", and the highest ranking one visible is treated as the current hub. Assume that all instances start out aware that they are synchronized in "best benchmark" state A, which to start out with is also the "latest confirmed benchmark" between the hub and any other instance. Any actions by a non-hub instance are sent to the hub as a single compound diff from the instances latest possible benchmark with the hub. The hub performs a merge with any other actions (or compunded series of actions) posterior to that benchmark, and then broadcasts the diff from that benchmark to the new state, which becomes the new best benchmark. (...would be easier to code than to write in English. Sharing enough knowledge about who knows which benchmark so that you can, if space is short, forget benchmarks which are known to be superceded by something for every pair of instances; starting out by sending diffs from the latest possibly-shared-knowledge benchmark and falling back to working from an older, confirmed-common-knowledge benchmark...)

If an instance gets a broadcast merge result after it has performed new changes, it merges the new changes with the result, giving precedence to the result. The diff from the result to the merge is now sent back to the hub as a new action, superceding more-intelligent merges that the hub may have done meantime (this can only happen when communication is one-way for some reason). Again, changes that are inapplicable due to conflict are pushed on the local redo stack.

The redo stack is cleared whenever the local instance performs a new, incompatible action. However, a new action or merge result newdiff coming in from another instance will cause the redo to be recalculated from the current state (that is, the diff to the merge of the redo).

So, at any time, any instance will have a set of document histories and/or the equivalent chain of diffs leading to its current state. It can, if desired, forget (and/or compound bracketing diffs for) histories which are not possibly fork points (last common benchmarks) between it and another instance. Undo and redo are both stacks, but you can try to (un)apply them out-of-order: if you have 1 -A-> 2 -B-> 3 (now), then trying to undo A without undoing B gives the merge between B and ~A, with B given precedence. This is treated as a new action and placed on the stack. I can imagine some cool UIs for exploiting this possibility.

differences from other proposals

This is different from bemasc's proposal, as I understand it, because:

arbitrary data structures can be made to self-sync
Some negotiation is occasionally required to choose a common basis for a merge. If communication is good, this should not happen. Results can depend, even dramatically, on which basis is chosen.
In cases where communication is intermittent between more than two participants, and where the merge algorithm gets pathological for some reason, this could lead to changes getting lost or applied twice.
Except in cases with many participants and poor communication, this scheme allows a lot more space optimization by forgetting intermediate steps. This also allows the diff algorithm to be optimized for truly 3-way diffs, instead of trying to combine entire step-by-step series of actions. (Since this is a way in which the solution is more general, not optimizing in this way is still possible).

Homunq 13:03, 20 November 2008 (UTC) 05:20, 20 November 2008 (UTC) [[category:software]

Shared revisions

Contents

punctuated equilibrium

punctuated soup (homunq's proposal)

required primitives

how this gives self-syncing

differences from other proposals

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

About OLPC

About the laptop

About the tablet

Projects

OLPC wiki

Tools