Testing AIs in Minecraft

by Paul Curzon, Queen Mary University of London

What makes a good environment for an AI's learning development? Possibly the same as for a human child's: Minecraft.

[Image: a complex Minecraft world with a lake, grasslands, mountains and a building. Image by allinonemovie from Pixabay]

Lego is one of the best games for children's learning development. The name Lego comes from the Danish phrase "leg godt", meaning "play well". In the virtual world, Minecraft has of course taken up the mantle. A large part of why they are wonderful games is that they are open-ended and flexible: there are endless possibilities for what you can build and do. Rather than focusing the player on one narrow thing to learn, as many other games do, they support open-ended creativity and so educational development. Given how positive it can be for children, it shouldn't be surprising that Minecraft is now being used to help AIs develop too.

Games have long been used to train and test Artificial Intelligence programs. Early programs were developed to play, and ultimately beat, humans at specific games like Checkers, Chess and, later, Go. That mastered, they moved on to learning individual arcade games as a way to extend their abilities. A key part of our intelligence, though, is flexibility: we can learn new games. Aiming to copy this, AIs were trained to follow suit, and showed they could learn to play multiple arcade games well.

This is still missing a vital part of our flexibility, though. The thing about all these games is that the whole game experience is designed: everything is there for a reason, as an integral part of the task the player has to complete. There are no pieces in a chess game that are just there to look nice and will never, ever play a part in winning or losing. Likewise, all the rules matter. When problem solving in real life, though, most of the world (the objects around you, the way things behave, and so on) is not there to help you solve the problem. It is not even there as a designed distractor. The real world also doesn't have just a few distractors, it has lots and lots. Looking round my living room, for example, there are thousands of objects, but only one will help me turn on the TV.

AIs that are trained on games may, therefore, just become good at working in such unreal environments. They may need to be told what matters and what to ignore in order to solve problems. Real problems are much messier, so put them in the real world, or even a more realistic virtual world, to problem-solve and they may turn out to be not very clever at all. Tests of their skills that are based on game-like tasks may not really test them at all.

Researchers at the University of the Witwatersrand in South Africa decided to tackle this issue using yet another game: Minecraft. Because Minecraft is an open-ended virtual world, tackling challenges created in it means working in a world that contains far more than just the problem itself. The Witwatersrand team's resulting MinePlanner system is a collection of 45 challenges, some easy, some harder. They include gathering tasks (like finding and gathering wood) and building tasks (like building a log cabin), as well as tasks that combine the two. Each comes in three versions. In the easy version nothing is irrelevant. The medium version contains a variety of extraneous things that are of no use to the task. The hard version is set in a full Minecraft world where there are thousands of objects that might be used.

To tackle these challenges an AI (or human) needs not just to solve the problem set, but also to work out for themselves what in the Minecraft world is relevant to the task they are trying to perform and what isn't. What matters and what doesn't?
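To get a feel for the difference the three versions make, here is a minimal sketch, in Python, of how such a task might be represented. It is purely illustrative: the names (Task, goal_blocks, irrelevant_blocks and so on) are assumptions made for this example, not MinePlanner's actual format.

from dataclasses import dataclass

@dataclass
class Task:
    name: str           # e.g. "gather wood" or "build a log cabin"
    difficulty: str     # "easy", "medium" or "hard"
    goal_blocks: dict   # block type -> where it needs to end up
    world_blocks: dict  # (x, y, z) position -> block type in the world

    def irrelevant_blocks(self):
        # Blocks that play no part in the goal. In the easy version
        # this is empty; in the hard version it is almost everything,
        # and the planner has to work that out for itself.
        return {pos: block for pos, block in self.world_blocks.items()
                if block not in self.goal_blocks}

# An easy task: everything in the world matters.
easy = Task("gather wood", "easy",
            goal_blocks={"oak_log": "inventory"},
            world_blocks={(0, 64, 0): "oak_log"})

# A hard task: the same goal, but the world is full of distractors.
hard = Task("gather wood", "hard",
            goal_blocks={"oak_log": "inventory"},
            world_blocks={(0, 64, 0): "oak_log",
                          (1, 64, 0): "stone",
                          (2, 63, 0): "flower"})

print(len(easy.irrelevant_blocks()))  # 0 - nothing to ignore
print(len(hard.irrelevant_blocks()))  # 2 - distractors to filter out

The goal never changes between the three versions; only how much of the world is noise does, and that filtering is exactly what the benchmark is probing.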

The team hope that by setting such tests they will encourage researchers to develop more flexible intelligences, taking us closer to real artificial intelligence. The problems are proposed as a benchmark for others to test their AIs against. The Witwatersrand team have already put existing state-of-the-art AI planning systems to the test: they weren't actually that great at solving the problems, and even the best could not complete the harder tasks.

So it is back to school for the AIs but hopefully now they will get a much better, flexible and fun education playing games like Minecraft. Let’s just hope the robots get to play with Lego too, so they don’t get left behind educationally.


EPSRC supports this blog through research grant EP/W033615/1.

Free event for families and schools: the Christmas Lectures from the Royal Institution

Come and watch a TV programme being made!

[Image: Prof Michael Wooldridge in the Faraday Lecture Theatre at the Royal Institution – photo credit: Paul Wilkinson]

Every year two things happen in December: (1) someone gives a series of Christmas Lectures for young people at the Royal Institution in central London which the BBC film… and then (2) the programme is broadcast on telly over the Christmas holidays.

This year there’s an extra bit!

You can come and watch a livestream of the lectures being filmed, at one of 20 venues around the UK – and QMUL is one of them.

Thanks to a videolink from the Faraday Lecture Theatre in the Royal Institution we’ll be streaming the lectures as they are taking place, just a few miles away.

The Truth About AI

This year Prof Mike Wooldridge will be giving the series of lectures on artificial intelligence, definitely a hot topic!

We have FREE tickets for you and your family, school, Scout group or community group to come along and watch. You can attend one, two or all three talks if you like (but you don't need to have attended an earlier talk to enjoy a later one). Tell your friends 🙂 We have a maximum of 100 spaces for each of the three livestreaming events.

When?

  • Tuesday 12th December (6pm – 8.30pm, doors 5.30pm)
  • Thursday 14th December (as above)
  • Saturday 16th December (as above)

Each talk itself is probably more like an hour long, but because demos need to be re-set and things might need to be filmed from a different angle, there will be some stopping and re-starting. We'll have some activities to do during the breaks in recording.

Where?

We’ll be in Room ‘PP1’ at the People’s Palace on Mile End Road. It’s slightly nearer Stepney Green station than Mile End station and is on the 25 and 205 bus routes. It’s also wheelchair accessible (including loos). [Map link]

Who’s it for?

The lectures are aimed at 11–17 year olds, but we're looking forward to welcoming younger and older siblings too.

Tickets

Our tickets at the People's Palace are free; click or tap these links to secure your place. Please note that everyone in your group will need their own ticket for each event.

Our colleagues at the Neuron Pod in the Centre of the Cell (Whitechapel) are also livestreaming the lectures with a small fee (£3.50 per ticket) to cover costs.

All of the other venues screening the Christmas Lectures are listed here and the price of their tickets varies depending on local costs. Do get in touch with the Royal Institution (xmaslecs@ri.ac.uk) if cost is a barrier as there are some free tickets and discounts available.

More information

Note that these pages will take you to our Teaching London Computing website for teachers.

Main page | About the three lectures | About the livestream | FAQ | Watch on TV

Download a flyer

Questions?

Ask Jo, or see our Frequently Asked Questions page.


EPSRC supports this blog through research grant EP/W033615/1.

Chatbot or Cheatbot?

by Paul Curzon, Queen Mary University of London

[Image: speech bubbles. Image by Clker-Free-Vector-Images from Pixabay]

The chatbots have suddenly got everyone talking, though about them as much as with them. Why? Because one, ChatGPT, has (amongst other things) reached the level of being able to fool us into thinking that it is a pretty good student.

It's not exactly what Alan Turing was thinking about when he broached his idea of a test for intelligence for machines: if we cannot tell a machine apart from a human, then we must accept it is intelligent. His test involved having a conversation with the machine over an extended period before making the decision, and that is subtly different from just asking questions.

ChatGPT may be pretty close to passing an actual Turing Test, but it probably isn't there yet. Ask the right questions and it behaves differently to a human. For example, ask it to prove that the square root of 2 is irrational and it can do it easily, and looks amazingly smart: there are lots of versions of the proof out there that it has absorbed. It isn't actually good at maths, though. Ask it simply to count or add things and it can get them wrong. Essentially, it is just good at picking out the right information from the vast store it has been trained on and then presenting it in a human-like way. It is arguably the way it can present things "in its own words" that makes it seem especially impressive.
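For reference, one standard version of that proof, the sort of thing ChatGPT has absorbed many copies of, runs as follows (sketched here in LaTeX):

\begin{proof}
Suppose, for contradiction, that $\sqrt{2} = p/q$ for whole numbers $p$ and $q$
with no common factor. Squaring both sides gives $p^2 = 2q^2$, so $p^2$ is even,
which means $p$ itself must be even: $p = 2k$. Substituting back gives
$4k^2 = 2q^2$, so $q^2 = 2k^2$ and $q$ must be even too. But then $p$ and $q$
share the factor 2, contradicting our assumption. So no such fraction exists:
$\sqrt{2}$ is irrational.
\end{proof}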

Will we accept that it is "intelligent"? Once it was said that if a machine could beat humans at chess it would be intelligent. When one beat the best human, we just said "it's not really intelligent – it can only play chess". Perhaps ChatGPT is just good at answering questions (amongst other things), but we won't accept that as "intelligent" either, even though it is how we judge humans. What it can do is impressive and a step forward, though. Also, it is worth noting that other AIs are better at some of the things it is weak at: logical thinking, counting, doing arithmetic, and so on. It likely won't be long before the different AIs' mistakes and weaknesses are ironed out and we have ones that can do it all.

Rather than asking whether it is intelligent, what has got everyone talking (in universities and schools at least) is that ChatGPT has shown it can answer all sorts of questions we traditionally use for tests well enough to pass exams. The issue is that students can now use it instead of their own brains. The cry has gone out that we must abandon setting humans essays, that we should no longer ask them to explain things, nor for that matter write (small) programs. These are all things ChatGPT can now do well enough to pass such tests for any student unable to do them themselves. Others say we should be preparing students for the future, so it's OK: from now on we should just test what a human and ChatGPT can do together.

It certainly means assessment needs to be rethought to some extent, and of course this is just the start: the chatbots are only going to get better, so we had better do the thinking fast. The situation is very like the advent of calculators, though. Yes, we need everyone to learn to use calculators. But calculators didn't mean we had to stop learning how to do maths ourselves. Essay writing, explaining, writing simple programs, analytical skills and so on, just like arithmetic, are all core skills: the foundations to then build on. The fact that a chatbot can do them too doesn't mean we should stop learning and practising those skills (and assessing them, both as an inducement to learn and as a check on whether the learning has been successful). So the question should not be about what we should stop doing, but about how we make sure students do carry on learning. A big, bad thing about cheating (aside from the unfairness) is that the person who decides to cheat loses the opportunity to learn. Chatbots should not stop humans learning either.

The most valuable thing we can give a student is to teach them how to learn, so now we have to work out how to make sure they continue to learn in this new world, rather than just hand all their learning tasks over to the chatbot. As many people have pointed out, there are not just bad ways to use a chatbot; there are also ways chatbots can be used as teaching tools. Used well by an autonomous learner, they can act as a personal tutor, immediately explaining things the learner realises they don't understand, and so become a basis for very effective deliberate learning: fixing understanding before moving on.

Of course, there is a bigger problem: if a chatbot can do things at least as well as we can, then why would a company employ a person rather than just hire an AI? The AIs can now do a lot of jobs we assumed were ours. It could be yet another way of technology concentrating vast wealth on the few and taking from the many. Unless our intent is a dystopian science fiction future where most humans have no role and no point (see, for example, E.M. Forster's classic, The Machine Stops), then we still ought to learn skills in any case. If we are to keep ahead of the AIs and use them as a tool, not be replaced by them, we need the basic skills to build on to gain the more advanced ones needed for the future. Learning skills is also, of course, a powerful way for humans (if not yet chatbots) to gain self-fulfilment and so happiness.

Right now, an issue is that the current generation of chatbots is still very capable of being wrong. ChatGPT is like an overconfident student. It will answer anything you ask, but it gives wrong answers just as confidently as right ones. Tell it it is wrong and it will give you a new answer, just as confidently and possibly just as wrong. If people are to use it in place of thinking for themselves then, in the short term at least, they still need the skill it doesn't have: judging when it is right or wrong.

So what should we do about assessment? Formal exams come back to the fore, so that conditions are controlled and it is clear you have to be able to do it yourself. Open-book online tests, which became popular in the pandemic, are unlikely to be fair assessments any more, but arguably they never were: chatbots or not, they were always too easy to cheat in. They may still be good for learning, though. Perhaps in future, if the chatbots are so clever, we could turn the Turing Test around: we just ask an artificial intelligence to decide whether particular humans (our students) are "intelligent" or not…

Alternatively, if we don't like the solutions being suggested for the problems these new chatbots are raising, there is another way forward. If they are so clever, we could just ask a chatbot to tell us what we should do about chatbots…


This blog is funded through EPSRC grant EP/W033615/1.