Photogrammetry for fun, preservation and research

Digitally stitching together 2D photographs to visualise the 3D world

Composite image of one green glass bottle made from three photographs. Image by Jo Brodie

Imagine you’re the costume designer for a major new film about a historical event that happened 400 years ago. You’d need to dress the actors so that they look like they’ve come from that time (no digital watches!) and might want to take inspiration from some historical clothing that’s being preserved in a museum. If you live near the museum and can get permission to see (or even handle) the material, that makes things a bit easier – but perhaps the ideal item is in another country, or too fragile to handle.

This is where 3D imaging can help. Photographs are nice but don’t show you what an object looks like from different angles, and they don’t really give a sense of texture. Video can be helpful, but you don’t get to control the view. One way around that is to take lots of photographs from different angles, then ‘stitch’ them together to form a three-dimensional (3D) image that can be moved around on a computer screen – photogrammetry is an example of this.

In the (2D) example above I’ve manually combined three overlapping close-up photos of a green glass bottle to show what the full-size bottle actually looks like. Photogrammetry does more or less the same thing, but in a more advanced way: computer software lines up the points that overlap and can produce a more faithful 3D representation of the object.
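If you’re wondering what ‘lining up the points that overlap’ actually involves, the sketch below uses the free OpenCV library in Python to find distinctive points in two overlapping photos and match them up. The filenames are just placeholders, and a real photogrammetry app like Polycam goes much further, triangulating thousands of matched points into a 3D model – but this is the basic idea.

```python
# A minimal sketch of the point-matching step in photogrammetry, using OpenCV.
# The filenames are placeholders for two overlapping photos of the same object
# taken from slightly different angles.
import cv2

img1 = cv2.imread("bottle_view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("bottle_view2.jpg", cv2.IMREAD_GRAYSCALE)

# Detect distinctive points (corners, speckles, edges of labels) in each photo
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Match each point in the first photo to its most similar point in the second
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Draw the 50 strongest matches so you can see the overlap the software found
overlap = cv2.drawMatches(img1, kp1, img2, kp2, matches[:50], None)
cv2.imwrite("matched_points.jpg", overlap)
```

A full photogrammetry pipeline would then work out where the camera must have been for each photo, and use that to place the matched points in 3D space – that’s the part apps like Polycam automate for you.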

In the media below you can see a looping gif of the glass bottle being rotated first in one direction and then the other. It’s the result of a 3D ‘scan’ made from only 29 photographs using the free software app Polycam – with more photographs you could end up with an even more impressive result. You can interact with the original scan here, zooming in and turning the bottle to view it from any angle you choose.

A looping gif of the 3D Polycam file being rotated one way then the other. Image by Jo Brodie

You might walk around your object and take many tens of images from slightly different viewpoints with your camera. Once your photogrammetry software has lined the images up, you can share the result and someone else can walk around the same object – but virtually!

Photogrammetry is used by hobbyists (it’s fun!) but also in lots of different ways by researchers. One example is the field of ‘restoration ecology’, in particular monitoring damage to coral reefs over time and checking whether particular reef recovery strategies are working. Reef researchers can use several cameras at once to take lots of overlapping photographs, from which they can then create three-dimensional maps of the area. A new project recently funded by NERC* called “Photogrammetry as a tool to improve reef restoration” will investigate the technique further.

Photogrammetry is also being used to preserve our understanding of delicate historic items such as the Stuart embroideries at The Holburne Museum in Bath. These beautiful craft pieces were made in the 1600s using another type of 3D technique: ‘stumpwork’, or ‘raised embroidery’, which used threads and other materials to create pieces with a layered, three-dimensional effect. Here’s an example of someone playing a lute to a peacock and a deer.

“Satin worked with silk, chenille threads, purl, shells, wood, beads, mica, bird feathers, bone or coral; detached buttonhole variations, long-and-short, satin, couching, and knot stitches; wood frame, mirror glass, plush”, 1600s. Photo CC0 from Metropolitan Museum of Art uploaded by Pharos on Wikimedia.

A project funded by the AHRC* (“An investigation of 3D technologies applied to historic textiles for improved understanding, conservation and engagement”) is investigating a variety of 3D tools, including photogrammetry, to recreate digital copies of the Stuart embroideries so that people can experience a version of them without the glass cases that the real ones are safely stored in.

Using photogrammetry (and other 3D techniques) means that many more people can enjoy, interact with and learn about all sorts of things, without having to travel and without risking damage to delicate fabrics or corals.

*NERC (Natural Environment Research Council) and AHRC (Arts and Humanities Research Council) are two organisations that fund academic research in universities. They are part of UKRI (UK Research & Innovation), the wider umbrella group that includes several research funding bodies.

Other uses of photogrammetry

This post highlights examples from cultural heritage and ecology, but photogrammetry is also used in interactive games (particularly virtual reality), engineering, crime scene forensics and the film industry – Mad Max: Fury Road, for example, used the technique to create a number of its visual effects. Hobbyists also create 3D versions (called ‘3D assets’) of all sorts of objects and sell them to games designers to include in their games for players to interact with.

Jo Brodie, Queen Mary University of London

More on …

Careers

Here is a past example of a job advert in this area (since closed): a photogrammetry role in virtual reality.

Also see our collection of Computer Science & Research posts.


Subscribe to be notified whenever we publish a new post to the CS4FN blog.


This blog is funded by EPSRC on research agreement EP/W033615/1.


3D models in motion

by Paul Curzon, Queen Mary University of London
based on a 2016 talk by Lourdes Agapito

The cave paintings in Lascaux, France, are early examples of human culture, from around 15,000 BC. There are images of running animals and even primitive stop-motion sequences – a single animal painted over and over as it moves. Even then, humans were intrigued by the idea of capturing the world in motion! Computer scientist Lourdes Agapito is also captivated by moving images. She is investigating whether it’s possible to create algorithms that allow machines to make sense of the moving world around them just like we do. Over the last 10 years her team have shown, rather spectacularly, that the answer is yes.

People have been working on this problem for years, not least because the techniques are behind the amazing realism of CGI characters in blockbuster movies. When we see the world, somehow our brain turns all that information about colour and intensity of light hitting our eyes into a scene we make sense of – we can pick out different objects and tell which are in front and which behind, for example. In the 1970s psychophysics* researcher Gunnar Johansson showed how our brain does this. He dressed people in black with lightbulbs fastened around their bodies. He then filmed them walking, cycling, doing press-ups, climbing a ladder, all in the dark … with only the lightbulbs visible. He found that people watching the films could still tell exactly what they were seeing, despite the limited information. They could even tell apart two people dancing together, including who was in front and who behind. This showed that we can reconstruct 3D objects from even the most limited of 2D information when it involves motion. We can keep track of a knee, and see it as the same point as it moves around. It also shows that we use lots of ‘prior’ information – knowledge of how the world works – to fill in the gaps.

Shortcuts

Film-makers already create 3D versions of actors, but they use shortcuts. The first shortcut makes it easier to track specific points on an actor over time. You fix highly visible stickers (equivalent to Johansson’s light bulbs) all over the actor. These give the algorithms clear points to track. This is a bit of a pain for the actors, though. It also could never be used to make sense of random YouTube or CCTV footage, or whatever a robot is looking at.

The second shortcut is to surround the action with cameras so it’s seen from lots of angles. That makes it easier to track motion in 3D space, by linking up the points. Again this is fine for a movie set, but in other situations it’s impractical.

A third shortcut is to create a computer model of an object in advance. If you are going to be filming an elephant, then hand-create a 3D model of a generic elephant first, giving the algorithms something to match. Need to track a banana? Then create a model of a banana instead. This is fine when you have time to create models for anything you might want your computer to spot.

It is all possible for big-budget film studios, if a bit inconvenient, but it’s totally impractical anywhere else.

No Shortcuts

Lourdes took on a bigger challenge than the film industry faces. She decided to do it without the shortcuts: to create moving 3D models from a single camera, applied to any traditional 2D footage, with no pre-placed stickers and no fixed models created in advance.

When she started, a dozen or so years ago, making any progress looked incredibly difficult. Now she has largely solved the problem. Her team’s algorithms are even close to doing it all in real time, making sense of the world as it happens, just like us. They can build really accurate models, down to details like the subtle movements of a person’s face as they talk and change expression.

There are several secrets to their success, but Johansson’s revelation that we rely on prior knowledge is key. One of the first breakthroughs was to come up with ways that individual points in the scene, like the tip of a person’s nose, could be tracked from one frame of video to the next. Doing this well relies on making good use of prior information about the world. For example, points on a surface are usually well-behaved in that they move together. That can be used to guess where a point might be in the next frame, given where the others are.
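To give a flavour of what frame-to-frame point tracking looks like in code, here is a rough sketch using off-the-shelf OpenCV tools (pyramidal Lucas-Kanade optical flow), plus a crude version of the ‘nearby points move together’ prior. This is a standard textbook tracker, not Lourdes’s own algorithm, and the video filename is just a placeholder.

```python
# A rough sketch of tracking identifiable points (like the tip of a nose)
# from one video frame to the next with OpenCV. Not Lourdes Agapito's
# algorithm - just a standard optical-flow tracker to illustrate the idea.
import cv2

cap = cv2.VideoCapture("face_video.mp4")   # placeholder filename
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Pick a few hundred distinctive points to follow
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=300,
                                 qualityLevel=0.01, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Estimate each point's new position by searching a small window
    # around where it was in the previous frame
    new_points, found, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                    points, None)

    # A crude "points move together" prior: any point the tracker lost is
    # assumed to have moved like the average of the points it did find
    if found.any():
        shift = (new_points[found == 1] - points[found == 1]).mean(axis=0)
        new_points[found == 0] = points[found == 0] + shift

    prev_gray, points = gray, new_points

cap.release()
```

Research systems use far more sophisticated priors than the crude average used here, but the principle is the same: knowledge of how surfaces behave fills in what a single camera cannot see directly.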

The next challenge was to reconstruct all the pixels rather than just a few easy-to-identify points like the tip of a nose. This takes more processing power but can be done by lots of processors working on different parts of the problem. Key to this was to take account of the smoothness of objects. Essentially a virtual fine 3D mesh is stuck over the object – like a mask over a face – and the mesh is tracked. You can then even stick new stuff on top of the mesh so that it moves too – adding a moustache, or painting the face with a flag, for example, in a way that changes naturally in the video as the face moves.
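As a hands-on illustration of the ‘mesh stuck over a face’ idea, the sketch below uses MediaPipe’s ready-made face mesh tracker to draw a fine mesh of tracked points over a webcam image. Note that MediaPipe relies on a pre-trained model of a face – exactly the kind of shortcut Lourdes’s research avoids – so this only shows what a tracked mesh looks like, not how her model-free method builds one.

```python
# A sketch of tracking a fine mesh over a face, using the off-the-shelf
# MediaPipe face mesh (which relies on a pre-built face model, unlike the
# model-free research described above).
import cv2
import mediapipe as mp

mp_face_mesh = mp.solutions.face_mesh
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)  # webcam
with mp_face_mesh.FaceMesh(max_num_faces=1) as face_mesh:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB images; OpenCV reads frames as BGR
        results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_face_landmarks:
            # Draw the fine triangular mesh over the face. Anything "stuck on"
            # the mesh (a moustache, a painted flag) would move with these
            # landmark points from frame to frame.
            mp_drawing.draw_landmarks(
                frame,
                results.multi_face_landmarks[0],
                mp_face_mesh.FACEMESH_TESSELATION)
        cv2.imshow("tracked mesh", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # press Esc to quit
            break
cap.release()
cv2.destroyAllWindows()
```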

Once all this could be done, if slowly, the challenge was to increase the speed and accuracy. Using the right prior information was again what mattered. For example, rather than assuming points keep a constant brightness, the algorithms take account of the fact that brightness changes, especially on flexible things like mouths. Another innovation was to separate the effect of an object’s colour from the effects of light and shade.

There is lots more to do, but already the moving 3D models created from YouTube videos are very realistic, and they are being processed almost as they happen. This opens up amazing opportunities for robots; for augmented reality that mixes reality with the virtual world; and for games, telemedicine, security applications and lots more. It’s all been done a little at a time, taking an impossible-seeming problem and, instead of tackling it all at once, solving simpler versions of it. All the small improvements, combined with using the right information about how the world works, have built up over the years into something really special.

*psychophysics is the “subfield of psychology devoted to the study of physical stimuli and their interaction with sensory systems.”


This article was first published on the original CS4FN website and a copy appears on pages 14 and 15 of “The women are (still) here”, the 23rd issue of the CS4FN magazine. You can download a free PDF copy, along with all of our free material, by clicking on the magazine’s cover below.

Another article on 3D research is Making sense of squishiness – 3D modelling the natural world (21 November 2022).


Related Magazine …


EPSRC supports this blog through research grant EP/W033615/1.