by Paul Curzon, Queen Mary University of London

Mary Ann Horton was transitioning to female at the time that she made one of her biggest contributions to our lives with a simple computer science idea with a big impact: a program that allowed binary email attachments.
Now we take the idea of sending each other multimedia files – images, video, sound clips, programs, etc for granted, whether by email or social networks, Back in the 1970s, before even the web had been invented, people were able to communicate by email, but it was all text. Email programs worked on the basis that people were just sending words, or more specifically streams of characters, to each other. An email message was just a long sequence of characters sent over the Internet. Everything in computers is coded as binary: 1s and 0s, but text has a special representation. Each character has its own code of 1s and 0s, that can also be thought of as a binary number, but that can be displayed as the character by programs that process it. Today, computers use a universally accepted code called Unicode, but originally most adopted a standard code called ASCII. All these codes are just allocations of patterns of 1s and 0s to each character. In ASCII, ‘a’ is represented by 1100001 or the number 97, whereas A is 1000001 or number 65, for example. These are only 7 bits long and as computers store data in bytes of 8 bits at a time this means that not all patterns of binary (so representable numbers) correspond to one of the displayable characters that email messages were expected to contain by the programs that processed them.
That is fine if all you have done is used programs like text editors, that output characters so you are guaranteed to be sending printable characters. The problem was other kinds of data whether images or runnable programs, are not stored as sequences of characters. They are more general binary files, meaning the data is long sequences of byte-sized patterns of 1s and 0s and what those 1s and 0s meant depended on the kind of data and representation used. If email programs were given such data to send, pass on or receive, they would reject or more likely mangle it as not corresponding to characters they could display. The files didn’t even have to be non-character formats, as at the time some computer systems used a completely different code for characters. This meant text emails could also be mangled just because they passed through a computer using a different format of character.
Mary Ann realised that this was all too restrictive for what people would be needing computers to do. Email needed to be more flexible. However, she saw that there was a really easy solution. She wrote a program, called uuencode that could take any binary file and convert it to one that was slightly longer but contained only characters. A second program she wrote, called uudecode converted these files of characters back to the original binary file to be saved by the receiving email program exactly it was originally on the source program.
All the uuencode program did was take 3 bytes (24 bits) of the binary file at a time, split them into groups of 6 bits so effectively representing a number from 0 to 63, add 32 to this number so the numbers are now in the range 32 to 95 and those are the numbers so binary patterns of the printable characters that the email programs expected. Each three bytes were now 4 printable characters. These could be added to the text of an email, though with a special start and end sequence included to identify it as something to decode. uudecode just did this conversion backwards, turning each group of 4 characters back into the orginal three bytes of binary.
Email attachments had been born, and ever since communication programs, whether email, chat or social media, have allowed binary files, so multimedia, to be shared in similar ways. By seeing a coming problem, inventing a simple way to solve it and then writing the programs, Mary Ann Horton had made computers far more flexible and useful.
More on …
Related Magazines …
Subscribe to be notified whenever we publish a new post to the CS4FN blog.
This blog is funded through EPSRC grant EP/W033615/1.

