by Paul Curzon, Queen Mary University of London
Look out the window at the human-made world. It’s full of hard, geometric shapes – our buildings, the roads, our cars. They are made of solid things like tarmac, brick and metal that are designed to be rigid and stay that way. The natural world is nothing like that though. Things bend, stretch and squish in response to the forces around them. That provides a whole bunch of fascinating problems for computer scientists like Lourdes Agapito of Queen Mary, University of London to solve.
Computer scientists interested in creating 3-dimensional models of the world have so far mainly concentrated on modelling the hard things. Why? Because they are easier! You can see the results in computer-animated films like Toy Story, and the 3D worlds like Second Life your avatar inhabits. Even the soft things tend to be rigid.
Lourdes works in this general area creating 3D computer models, but she wants to solve the problems of creating them automatically just from the flat images in videos and is specifically interested in things that deform – the squishy things.
Look out the window and watch the world go by. As you watch a woman walk past you have no problem knowing that you are looking at the same person as you were a second ago – even if she becomes partially hidden as she walks behind the post box and turns to post a letter. The sun goes behind a cloud and the scene is suddenly darker. It starts to rain and she opens an umbrella. You can still recognise her as the same object. Your brain is pulling some amazing tricks to make this seem so mundane. Essentially it is creating a model of the world – identifying all the 3-dimensional objects that you see and tracking them over time. If we can do it, why can’t a computer?
Unlike hard surfaces, deformable ones don’t look the same from one still to the next. You don’t have to just worry about changes in lighting, them being partially hidden, and that they appear different from a different angle. The object itself will be a different shape from one still to the next. That makes it far harder to work out which bits of one image are actually the same as the ones in the next. Lourdes has taken on a seriously hard problem.
Existing vision systems that create 3D objects have made things easier for themselves by using existing models. If a computer already has a model of a cube to compare what it sees with, then spotting a cube in the image stream is much easier than working it out from scratch. That doesn’t really generalise to deformable objects though because they vary too much. Another approach, used by the film industry, is to put highly visible markers on objects so that those markers can be tracked. That doesn’t help if you just want to point a camera out the window at whatever passes by though.
Lourdes aim is to be able to point a camera at a deformable object and have a computer vision system be able to create a 3D model simply by analysing the images. No markers, no existing models of what might be there, not even previous films to train it with, just the video itself. So far her team have created a system that can do this in some situations such as with faces as a person changes their expression. Their next goal is to be able to make their system work for a whole person as they are filmed doing arbitrary things. It’s the technical challenge that inspires Lourdes the most, though once the problems of deformable objects are solved there are applications of course. One immediately obvious area is in operating theatres. Keyhole surgery is now very common. It involves a surgeon operating remotely, seeing what they are doing by looking at flat video images from a fibre optic probe inside the body of the person being operated on. The image is flat but the inside of the person that the surgeon is trying to make cuts in is 3-dimensional. It would be far less error prone if what the surgeon was looking at was an accurate 3D model of the video feed rather than just a flat picture. Of course the inside of your body is made of exactly the kind of squishy deformable surfaces that Lourdes is interested in. Get the computer science right and technologies like this will save lives.
At the same time as tackling seriously hard if squishy computer science problems, Lourdes is also a mother of three. A major reason she can fit it all in, as she points out, is that she has a very supportive partner who shares in the childcare. Without him it would be impossible to balance all the work involved in leading a top European research team. It’s also important to get away from work sometimes. Running regularly helps Lourdes cope with the pressures and as we write she is about to run her first half marathon.
Lourdes may or may not be the person who turns her team’s solutions into the applications that in the future save lives in operating theatres, spot suspicious behaviour in CCTV footage or allow film-makers to quickly animate the actions of actors. Whoever does create the applications, we still need people like Lourdes who are just excited about solving the fundamental problems in the first place.
This blog is funded through EPSRC grant EP/W033615/1.