Operational Transformation

Algorithms for writing together

by Paul Curzon, Queen Mary University of London

How do online word processing programs manage to allow two or more people to change the same document at the same time without getting in a complete muddle? One of the really key ideas that makes collaborative writing possible was developed by computer scientists, Clarence Ellis and Simon Gibbs. They called their idea ‘Operational transformation’.

Let’s look at a simple example to illustrate the problem. Suppose Alice and Bob share a document that starts:

"MEETING AT 10AM"

First of all one computer, called the ‘server’, holds the actual ‘master’ document. If the network goes down or computers crash then its that ‘master’ copy that is the real version everyone sees as the definitive version.

Both Alice and Bob’s computers can connect to that server and get copies to view on their own machines. They can both read the document without problem – they both see the same thing. But what happens if they both start to change it at once? That’s when things can get mixed up.

Let’s suppose Alice notices that the time in the document should be PM not AM. She puts her cursor at position 14 and replaces the letter there with P. As far as the copy she is looking at is concerned, that is where the faulty A is. Her computer sends a command to the server to change the master version accordingly, saying

CHANGE the character at POSITION 14 to P.

The new version at some point later will be sent to everyone viewing. However, suppose that at the same time as Alice was making her change, Bob notices that the meeting is at 1 not 10. He moves his cursor to position 13, so over the 0 in the version he is looking at, and deletes it. A command is sent to the server computer:

DELETE the character at POSITION 13.

Now if the server receives the instructions in that order then all is ok. The document ends up as both Bob and Alice intended. When they are sent the updated version it will have done both their changes correctly:

"MEETING AT 1PM"

However, as both Bob and Alice are editing at the same time, their commands could arrive at the server in either order. If the delete command arrives first then the document ends up in a muddle as first the 13th position is deleted giving.

"MEETING AT 1AM"

Then, when Alice’s command is processed the 14th character is changed to a P as it asks. Unfortunately, the 14th character is now the M because the deleted character has gone. We end up with

"MEETING AT 1AP"

Somehow the program has to avoid this happening. That is where the operational transformation algorithm comes in. It changes each instruction, as needed, to take other delete or insert instructions into account. Before the server follows them they are changed to ones so that they give the right result whatever order they came in.

So in the above example if the delete is done first, then any other instructions that arrive that apply to the same initial version of the document are changed to take account of the way the positions have changed due to the already applied deletion. We would get and so apply the new instructions:

STARTING FROM "MEETING AT 10AM"
DELETE the character at POSITION 13.
CHANGE the character at POSITION (14-1) to P.

Without Operational Transformation two people trying to write a document together would just be frustrating chaos. Online editing would have to be done the old way of taking it in turns, or one person making suggestions for the other to carry out. With the algorithm, thanks to Clarence Ellis and Simon Gibbs, people who are anywhere in the world can work on one document together. Group writing has changed forever.


This article was originally published on the CS4FN website.

More on …


This blog is funded through EPSRC grant EP/W033615/1.

The original version of this article was funded by the Institute of Coding.

Understanding Parties

Three glasses of lemonade in a huddle as if talking

Image by Susanne Jutzeler, Schweiz 🇨🇭 💕Thanks for Likes from Pixabay
Image by Susanne Jutzeler, Schweiz 🇨🇭 💕Thanks for Likes from Pixabay 

by Paul Curzon, Queen Mary University of London

(First appeared in Issue 23 of the CS4FN magazine “The women are (still) here”)

The stereotype of a computer scientist is someone who doesn’t understand people. For many, how people behave is exactly what they are experts in. Kavin Narasimhan is one. When a student at QMUL she studied how people move and form groups at parties, creating realistic computer models of what is going on.

We humans are very good at subtle behaviour, and do much of it without even realising it. One example is the way we stand when we form small groups to talk. We naturally adjust our positions and the way we face each other so we can see and hear clearly, while not making others feel uncomfortable by getting too close. The positions we take as we stand to talk are fairly universal. If we understand what is going on we can create computational models that behave the same way. Most previous models simulated the way we adjust positions as others arrive or leave by assuming everyone tries to both face, and keep the same distance from, the midpoint of the group. However, there is no evidence that that is what we actually do. There are several alternatives. Rather than pointing ourselves at some invisible centre point, we could be subconsciously maximising our view of the people around. We could be adjusting our positions and the direction we face based on the position only of the people next to us, or instead based on the positions of everyone in the group.

Kavin videoed real parties where lots of people formed small groups to find out more of the precise detail of how we position and reposition ourselves. This gave her a bird’s eye view of the positions people actually took. She also created simulations with virtual 2D characters that move around, forming groups then moving on to join other groups. This allowed her to try out different rules of how the characters behaved, and compare them to the real party situations.

She found that her alternate rules were more realistic than rules based on facing a central point. For example, the latter generates regular shapes like triangular and square formations, but the positions real humans take are less regular. They are better modelled by assuming people focus on getting the best view of others. The simulations showed that this was also a more accurate way to predict the sizes of groups that formed, how long they formed for, and how they were spread across the room. Kavin’s rules therefore appear to give a realistic way to describe how we form groups.

Being able to create models like this has all sorts of applications. It is useful for controlling the precise movement of avatars, whether in virtual worlds or teleconferencing. They can be used to control how computer-generated (CGI) characters in films behave, without needing to copy the movements from actors first. It can make the characters in computer games more realistic as they react to whatever movements the real people, and each other, make. In the future we are likely to interact more and more with robots in everyday life, and it will be important that they follow appropriate rules too, so as not to seem alien.

So you shouldn’t assume computer scientists don’t understand people. Many understand them far better than the average person. That is how they are able to create avatars, robots and CGI characters that behave exactly like real people. Virtual parties are set to be that little bit more realistic.

More on …

Related Magazines …


EPSRC supports this blog through research grant EP/W033615/1.