The Hidden Code in Your Toy Adverts

A boy in blue and girl in pink playing on a beach
Image by Ben Kerckx from Pixabay

The music in a toy commercial isn’t just background noise. It tells you who the advert is for, and a machine learning model can hear it (even when you barely notice the difference). Luca Marinelli tells us more.

Next time you’re watching TV, try muting the adverts and then turning the sound back on. You’ll probably notice something odd. The music in adverts for dolls and playsets sounds completely different from the music in adverts for action figures and toy cars. One sounds smooth and tuneful. The other sounds loud and chaotic. But here’s the question: is that just your imagination, or is the difference real and measurable? For my PhD research at Queen Mary University of London I decided to find out using machine learning.

I collected over 600 toy commercials from a UK retailer’s YouTube channel, split into three groups: ads aimed at girls, ads aimed at boys, and ads aimed at mixed audiences. Then I fed the soundtracks into a computer program and had it extract dozens of measurements from each one. Not “does this sound nice?” (computers can’t answer that) but more precise numerical values like “how rough does the sound spectrum look?”, “how regular is the beat?” or “how clearly does this audio sit in a musical key?”. Think of it as turning every piece of music into a long list of numbers, each describing one property of the sound.
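To give a flavour of what that looks like in practice, here is a minimal sketch using the free librosa audio analysis library. The features shown are illustrative examples of audio measurements, not the exact set used in the research:

# A sketch of turning a soundtrack into a list of numbers using the
# librosa audio library. Illustrative features only; the study used
# a much larger set of its own measurements.
import librosa
import numpy as np

y, sr = librosa.load("advert.mp3")  # audio samples and samples-per-second

features = {
    # average energy: roughly, how loud the clip is overall
    "loudness": float(np.sqrt(np.mean(y ** 2))),
    # where the "centre of mass" of the frequencies sits (brightness)
    "spectral_centroid": float(np.mean(librosa.feature.spectral_centroid(y=y, sr=sr))),
    # how noise-like (vs tone-like) the spectrum is
    "spectral_flatness": float(np.mean(librosa.feature.spectral_flatness(y=y))),
    # estimated tempo, in beats per minute
    "tempo": float(librosa.beat.beat_track(y=y, sr=sr)[0]),
}
print(features)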

Then I trained a type of machine learning model called a classifier to look at those numbers and predict: is this intended as a girls’ ad, a boys’ ad, or a mixed one? The classifier got it right a remarkable 91% of the time when comparing girls-only and boys-only ads. That’s not luck. That’s a genuine, detectable pattern hidden in the sound. But which measurements were actually doing the work? This is where the research gets interesting, and where a technique called SHAP (SHapley Additive exPlanations) comes in. SHAP is a way of asking a machine learning model to explain its own decisions. Instead of just getting a yes/no answer, you can ask: “which features pushed you towards saying this was a girls’ ad, and which ones pushed you the other way?” It’s a bit like asking a judge not just for a verdict, but for their full reasoning.
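In code, that classify-then-explain pipeline looks something like this sketch, using the scikit-learn and shap libraries. The data here is random stand-in data just to show the shape of the approach; the real study used the audio measurements described above and its own choice of model:

# A sketch of the classify-then-explain approach. X would really hold
# one row of audio features per advert; here it is random stand-in data.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 12))        # stand-in for per-advert audio features
y = rng.integers(0, 2, size=600)      # stand-in labels: 0 = boys, 1 = girls

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))  # ~50% here: the data is random!

# SHAP gives every feature a score for how much it pushed each prediction
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)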

What SHAP revealed was striking. Ads targeting girls consistently had higher harmonicity, meaning the sounds fit together into clear, pleasant musical patterns, and more rhythmic regularity, meaning the beat was steady and predictable. Their audio spectrum (a kind of fingerprint of all the frequencies present) was also broader and smoother. Boys’ ads, by contrast, scored higher on spectral roughness (how harsh and abrasive the sounds are) and spectral entropy (a measure of how chaotic or unpredictable the sound is). They were also simply louder. In plain terms: girls’ ads sound harmonious and organised. Boys’ ads sound noisy, aggressive, and jagged. And a machine learning model can tell the difference with 91% accuracy just from the audio alone, without seeing a single frame of video. These patterns almost certainly aren’t accidental. Marketers are making deliberate choices about music to signal who a product is “for”. The sound itself carries a hidden message.
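As an aside, a measure like spectral entropy sounds exotic but is simple to compute. One common version (the study’s exact definitions may differ) treats the energy spectrum of the sound as a probability distribution and asks how spread out it is. A pure tone, with all its energy at one frequency, scores low; noise, with energy everywhere, scores high:

# One common way to compute "spectral entropy": treat the power
# spectrum as a probability distribution and measure how spread
# out (unpredictable) it is.
import numpy as np

def spectral_entropy(signal):
    spectrum = np.abs(np.fft.rfft(signal)) ** 2   # energy at each frequency
    p = spectrum / spectrum.sum()                 # normalise to sum to 1
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

t = np.linspace(0, 1, 44100)
pure_tone = np.sin(2 * np.pi * 440 * t)               # one clear frequency
noise = np.random.default_rng(0).normal(size=44100)   # chaotic sound
print(spectral_entropy(pure_tone), "<", spectral_entropy(noise))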

We showed how AI can be used to hold up a mirror to human behaviour. When we use explainable AI, we can spot patterns in the world that are so familiar we’ve stopped noticing them. The music in a toy advert might seem trivial, but if an algorithm can reliably predict the intended audience just from the soundtrack, that tells us something important: gender stereotypes aren’t just visible, they’re audible too.

Luca Marinelli, Queen Mary University of London



Humanity’s Last Exam

Generative Artificial Intelligences (GenAI) can now pass exams we set for humans and even do better than many humans. They can do that without being able to think in the way a human does, and certainly without being conscious. They are learning to reason, and they combine that with having hoovered up all the knowledge we have generated and recorded, whether on the web or elsewhere. In effect, they use it to predict what comes next. In an exam, what comes next after a question is the answer, so that is what they generate. But how good are they at doing that, really? As good as a good school student? As good as a university student? A PhD student? A professor? Better than any human? Is there any question we could come up with, as examiners representing the human race, that a GenAI couldn’t answer? The SafeAI Benchmark Competition “Humanity’s Last Exam” is an attempt to find out.

Computer systems, including AI-based ones, are typically evaluated using benchmark questions that assess their intelligence and performance. They are the equivalent of big standardised exams. However, as AI models have rapidly advanced, existing benchmarks have become too easy. The “Humanity’s Last Exam” competition aimed to change this by collecting a new benchmark set of exceptionally difficult questions. The aim was to push artificial intelligence to its limits by challenging it with truly expert-level questions. To stack the deck in our favour, any AI aiming to pass needed to be an expert in every subject, not just one or two!

Experts from across the disciplines were challenged to come up with questions in their area that they thought an AI would not be able to answer. The competition was a big success. It attracted more than 1,000 researchers and other experts. They submitted questions (with the correct answers) spanning over 100 different subjects. From all these suggested questions, a solid set was selected in three stages.

First came AI evaluation: five of the best AI models of late 2024 attempted each question. If all failed it, the question advanced to the next stage. Second came expert review: human experts refined and assessed the questions and answers. They had to make sure that each question had a known answer that they were sure was correct. The questions also had to be clear: they couldn’t be ambiguous, with more than one answer that might be considered correct. Finally came the final selection: a panel of experts and organisers made the final call on which questions would actually be used.

Out of the over 70,000 questions submitted to stage 1, only 2,500 made it into the final benchmark. The top 50 were declared winners, with the person submitting each earning a prize. In addition, they were invited to become co-authors of the research paper accompanying the competition.

Two computer scientists from QMUL, Søren Riis and Marc Roth, contributed multiple questions to the competition, and despite how many questions failed to make the grade, both were joint winners. Moreover, one of Marc’s questions was selected to be featured in the Nature paper about the results.

But what does a good question look like? To see, let’s look at one of Marc’s selected questions. It concerned the process of “discovering” a network, meaning visiting all the nodes of an unknown network. What does this involve? Imagine a mouse is placed in a maze and starts to explore it. The maze is a kind of network with nodes (the junctions) and edges (the paths between them). The mouse, as it explores, is discovering that network. Suppose it explores randomly: whenever it reaches a junction, it chooses one of the outgoing directions totally at random and continues exploring in that direction. We are interested in several things: how long will it take the mouse, on average, to explore the entire maze? How often will any specific location be visited? And how likely is the mouse to be at any specific location at the end of its exploration?
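You can get a feel for such questions by simulating the naive strategy yourself. Here is a small Python sketch, with a made-up four-junction maze, that estimates the average number of steps the mouse needs to visit every junction (the “cover time”):

# Simulate the "mouse in a maze" random walk: at each junction pick a
# neighbouring junction uniformly at random. The maze is a made-up
# example network; we estimate the steps needed to visit every node.
import random

maze = {  # junction -> neighbouring junctions
    0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2],
}

def cover_time(maze, start=0):
    visited, node, steps = {start}, start, 0
    while len(visited) < len(maze):
        node = random.choice(maze[node])   # the naive random step
        visited.add(node)
        steps += 1
    return steps

runs = [cover_time(maze) for _ in range(10000)]
print("average steps to explore everything:", sum(runs) / len(runs))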

The AIs were asked about a variation of this in which the mouse uses a specific but cleverer random strategy, as given in the question, rather than just choosing a direction totally at random at each junction. The AIs had to predict the behaviour of a mouse following this new strategy on different types of mazes. Surprisingly perhaps, even the best AIs at the time of the competition (2024) were unable to solve the problem correctly. They all claimed that the updated strategy makes no difference to the overall behaviour compared to the original naive random strategy, in terms of the things of interest (like time taken). This is wrong, as there are actually clear differences in the behaviour resulting from the two strategies. That was something Marc himself was able to work out correctly. Humans: 1 (well, at least if you are Marc), AIs: 0.

The first version of the overall benchmark (the AI exam) was set and finalised in early 2025. The two best AIs (OpenAI o1 and DeepSeek R1) got about 8% of the questions right. One year later, Gemini 3 Pro achieved a staggering 38.3%! Its true performance might be even better, since the benchmark set might still contain some ambiguous questions with no clear right answer, and some questions where the given expert answers are partially incorrect or incomplete. This is mainly believed to be a possibility for the text-only chemistry and biology questions: so more work for the chemists and biologists!

Because of the need to keep working on the questions to make sure they are definitely correct and unambiguous, the “Humanity’s Last Exam” team has now switched to working on the questions on a rolling basis, aiming to improve them over the coming years. The AIs are not going to be free from taking exams for some time to come! But it may not be long before humanity runs out of questions. In the meantime, anyone thinking that human examiners just need to come up with better questions, to avoid the problem of students asking AIs to answer questions for them, had better think again. Even the best experts in the world are struggling to find questions no AI can answer. And if they can’t answer them this year, there is always next year, or the year after…

Marc Roth and Paul Curzon, Queen Mary University of London



No pause for breath

A robot playing a keyboard
AI generated Image by Gerd Altmann from Pixabay

Before you read the article you should have a listen to this piece of music: “Walk My Walk” by Breaking Rust [EXTERNAL, YouTube]

In November 2025, this catchy new country music song received lots of media attention. There’s nothing very unusual about that, but what made this song unusual was that the whole thing (the words, the tune, even the singer) was created entirely by an artificial intelligence. There is no ‘Breaking Rust’: it’s all computer-generated. Now that you know that, does it make a difference to what you think of the song?

Lots of people are uneasy about a piece of music that had almost no direct human input into its creation. Music is a creative thing, designed and created by people, and it feels unsettling to have computers doing that: for many it feels a bit like cheating. This song sounds human, but if you listen carefully the singer seems to be performing the super-human feat of singing long stretches of the tune without taking a breath! A computer can do that, but people need oxygen!

And what is the future if we are happy to listen to machine-created things that can be cheaply generated? Far less work, and so less livelihood, for human creatives. This is already happening in the world of illustration, where it is harder than ever for newly graduated illustrators to get a foot on the ladder. Is that what we want for songwriters and musicians too? Eventually, even the people running the programs to initiate the creation won’t be needed. If you want to listen to a new country song, or a new band, you will be able to click a button (pay some cash) and get one tailored for you. The money will go direct to a tech billionaire, of course.

Another thing people are very uneasy about is how the AI learned to write in that style of music in the first place. Music AI tools have been trained on vast amounts of other people’s music and, not surprisingly, many of those musicians are angry that their hard work has been re-used without permission or payment. Some musicians and music companies are now fighting back. They’ve asked lawyers to help them work with the AI companies so that they won’t lose out: they can instead opt in to allow their music to be used to train AI tools, and this time they’ll be paid. This is basically what happens when musicians use the ideas of other musicians. Famously, “I’ll Be Missing You” by American rapper Puff Daddy and American singer Faith Evans used a sample from the Police song “Every Breath You Take” without asking. Sting sued and as a result gets all the royalties from the song (though he then had similar disputes with the other members of the Police!).

A share of royalties might be a win for some of the musicians, and for the people who own the AI tools… but it still doesn’t settle how we might feel about music created by machines, or what happens to future human musicians who might never get a break because new songwriters can’t get a foot in the door. If you value people, you need to show it in what you watch, read and listen to!

Jo Brodie and Paul Curzon, Queen Mary University of London




The Music and AI pages are sponsored by the EPSRC (UKRI3024: DA EPSRC university doctoral landscape award additional funding 2025 – Queen Mary University of London).



Music AI Kriss Kross Puzzle

A Kriss Kross Puzzle 
Puzzle design credit: https://puzzlemaker.discoveryeducation.com/criss-cross/


Download and print the puzzle

Answers are at the bottom of https://cs4fn.blog/bitof6 where you can also read a copy of the magazine articles about Music and Artificial Intelligence.

Clues

  • 1. _ _ _ _ _ a piece of text with musical symbols instead of letters that tells a performer which
    notes to play, also a piece of music that accompanies a film (5 letters)
  • 2. and 10. _ _ _ _ _ _ (6 letters) separation is when computer scientists use AI to take a piece of music
    and split it into its _ _ _ _ _ (5 letters) – read more about this in ‘Separate your stems’
  • 3. The _ _ _ _ _ _ is the main part of the tune you might sing along to (6 letters)
  • 4. A piece of music is made up of lots of different _ _ _ _ _ (5 letters)
  • 5. We measure how loud something is in _ _ _ _ _ _ _ _ (8 letters)
  • 6. A sequence of instructions that tell a computer what to do _ _ _ _ _ _ _ _ _ (9 letters)
  • 7. If you halve the length of a guitar string the note is an _ _ _ _ _ _ (6 letters)
  • 8. A guitar-like harp-lute from Ghana _ _ _ _ _ _ _ _ (8 letters) – read more about this in ‘The day the music didn’t die’
  • 9. How high or how low a musical note is _ _ _ _ _ (5 letters)
  • 10. (see 2.)

Jo Brodie, Queen Mary University of London




How machines “hear” music

Listen to a song and you might tap your feet. Computers can “listen” to music but they don’t have feet to tap! They don’t have ears or a brain either so they don’t “listen” in the way that you do. They use maths.

Turning sound into numbers

A computer is just a machine that does calculations on numbers. It doesn’t really “hear” music. To it everything is just numbers. Its programs convert sounds into numbers that it can do maths with.

When someone plucks a guitar, the string vibrates (wobbles back and forth). That sends a pulse of energy (a sound wave) through the air. Our ears detect that pulse. A computer measures the sound wave. A song has lots of different sound waves mixed together, and they can all be described with numbers that a computer measures.

One measurement is pitch – how high and squeaky or how low and rumbly the sound is. A guitar string playing a higher note vibrates faster than one playing a lower note, sending its energy pulses into the air more quickly. We measure that as the number of sound waves arriving each second (called the frequency).

A wave that starts red then becomes blue as the waves squash together
If we could see a sound wave it might look a bit like this. The red sound wave has a lower frequency than the blue sound wave, where the distance between each ‘wobble’ narrows. Image by CS4FN

The red and blue wavy line shows what a sound wave might look like if we could see it. The blue part of the wave is vibrating faster than the red so has a higher frequency. Humans hear it as a higher note, computers ‘hear’ it by sensing more soundwaves each second.

A wave that starts red then becomes blue as the waves squash together. A black wave matches it exactly aside from being taller.
Image by CS4FN

Another measurement is the volume, or how loud the sound is. That relates to how hard the guitarist plucked the string, and so how ‘tall’ the sound wave is. The wavy black line has the same frequency as the red and blue wave, but the black sound wave is bigger: it has a larger amplitude. Humans hear it as louder; computers record bigger numbers.

Once a computer has recorded the measurements as numbers, it can then do maths on the numbers. That is where things get interesting. Programs can then change the numbers to make new and different sounds. Or they can use algorithms to generate their own numbers, then play them as music!
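Here is a sketch of that last idea: a short Python program that generates the numbers for one second of a 440 Hz sine wave (the note A) and saves them as a playable sound file:

# Making sound from numbers: build a one-second 440 Hz sine wave
# (the note A) and save it as a WAV file you can play.
import math
import struct
import wave

sample_rate = 44100    # measurements (samples) per second
frequency = 440.0      # vibrations per second: the pitch
amplitude = 0.5        # wave "height": the volume (0 to 1)

with wave.open("note.wav", "w") as f:
    f.setnchannels(1)             # mono
    f.setsampwidth(2)             # 16-bit samples
    f.setframerate(sample_rate)
    for i in range(sample_rate):  # one second of sound
        value = amplitude * math.sin(2 * math.pi * frequency * i / sample_rate)
        f.writeframes(struct.pack("<h", int(value * 32767)))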

How loud?

Volume is measured in decibels (dB for short). A lower number means the sound is quieter, a higher number means it is louder. The loudest a UK car is allowed to be is 70 dB.
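Decibels compare a sound with the quietest sound we can hear: every extra 10 dB means roughly ten times as much sound energy. A tiny sketch of the maths:

# Convert a sound-energy (intensity) ratio into decibels. The ratio
# compares a sound with the quietest sound we can hear.
import math

def to_decibels(intensity_ratio):
    return 10 * math.log10(intensity_ratio)

print(to_decibels(10))          # 10 dB: ten times the energy
print(to_decibels(10_000_000))  # 70 dB: as loud as that car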

How loud do you think these sounds are?

How loud?
Sound - Volume
Car - 70 dB
Doorbell - ?
Jet plane taking off - ?
Breathing - ?
Vacuum cleaner - ?
Balloon popping - ?
Whispering - ?
Rainfall - ?
A robin singing - ?
Loudest shout ever by a teacher - ?

Answers at https://cs4fn.blog/bitof6/

Jo Brodie and Paul Curzon, Queen Mary University of London




All the notes?

A boy with headphones surrounded by swirling music
Boy listening to music image by Olena from Pixabay

There are infinitely many musical notes, just like there are infinitely many colours. That matters if you are designing a new digital musical instrument. You have a lot more choice than on a piano!

Octaves 

Most Western music is built from repeating groups of 12 notes (‘octaves’) that musicians use. The gap between any two adjacent notes sounds the same. This is known as equal temperament tuning.

Activity: Play the 12 notes 

You can play the 12 notes of an octave on the online piano https://bit.ly/pianoCS4FN. Play Middle C (marked with a red dot), then press each key in turn including the black keys. Play 12 notes and you have played the 12 notes of an octave.

Music as colour

The rainbow picture (below) shows there are many colours to pick from not just red, orange, yellow… A set of crayons would be enormous if it included every possible colour! Instead you get a selection just as in the picture: we picked 3 colours equally spaced apart: red, yellow and blue. Western music does the same thing with sound, picking 12 notes that sound equally spaced.

A spectrum of colour running from red to blue with red, yellow and blue selected equal distances apart
Image by CS4FN

There are lots of other notes that you could sing within an octave. Traditional music often uses different sets of notes. The Arabic system divides an octave into 24 notes, for example. They have more ‘sound crayons’ to play with! You could even start singing on a low note and continually raise your pitch until you reached the higher note, like sweeping through every colour in a musical rainbow.

If you sing a note, then sing the same note an octave higher (e.g. Middle C then the next C up), your vocal cords are now vibrating twice as fast! The frequency of the top note is twice that of the lower one.
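You can calculate all the equal temperament frequencies yourself. Each of the 12 steps in an octave multiplies the frequency by the same amount, the 12th root of 2, so 12 steps exactly double it. Here is a sketch, starting from the usual 261.63 Hz for Middle C:

# Print the 12 equal temperament notes of an octave starting at
# Middle C. Each step multiplies the frequency by 2**(1/12).
middle_c = 261.63  # frequency of Middle C in Hz (vibrations per second)
note_names = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B", "C"]

for step, name in enumerate(note_names):
    frequency = middle_c * 2 ** (step / 12)
    print(f"{name:2} {frequency:7.2f} Hz")
# The final C comes out at 523.26 Hz: exactly double where we started.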

Jo Brodie and Paul Curzon, Queen Mary University of London




Musical Algorithms

An octave on a piano marked as from C to the next C labelled as C1 and C2
Image (edited) by OpenClipart-Vectors from Pixabay

How can a machine generate music? It needs an algorithm to follow: instructions to tell it what to do, step by step. Here are two simple games to play that compose a random tune by algorithm.

Writing Notes

We need a way to write notes down. We use the letters A to G, as on a piano. They repeat all the way up the white keys, so after G come higher versions of A, B and C again. We will use the notes running from what is called Middle C, in the middle of the piano, up to the next C. This is called an octave. We will call the two Cs C1 and C2.

Game 1: Random Jumps

Roll two dice and add the numbers. Write down the note given in the table for Game 1: if they add to 2 or 3 write down C1, if 4 write down D, and so on. If they add to 7, you get to roll again. Keep going until you have written down 15 notes, giving a 15-note tune.

Table for Game 1 showing dice rolls and notes
2 or 3 - C1
4 - D
5 - E
6 - F
7 - Roll again
8 - G
9 - A
10 - B
11 or 12 - C2
Game 1 by CS4FN
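Game 1 is simple enough to hand straight to a computer. Here is one way to write the algorithm in Python:

# Game 1: roll two dice, look the total up in the table, and repeat
# until the tune is 15 notes long. A total of 7 means "roll again".
import random

GAME1 = {2: "C1", 3: "C1", 4: "D", 5: "E", 6: "F",
         8: "G", 9: "A", 10: "B", 11: "C2", 12: "C2"}

tune = []
while len(tune) < 15:
    total = random.randint(1, 6) + random.randint(1, 6)
    if total != 7:              # on a 7, just roll again
        tune.append(GAME1[total])
print(" ".join(tune))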

Game 2: Up and Down

The second algorithm uses one die. First write down C1, then roll the die and do what it says in the Game 2 table. Each new note is based on the last note. If you roll a 1, write down D (the next note UP from C1). Rolling a 6 means add a pause to the tune (write a dash). If a roll would take you beyond either C then you bounce back: so rolling a 4 when you last wrote C1 means you write C1 again, and rolling a 5 from C1 bounces you up to E. Continue until you have 15 notes.

Table for Game 2 showing die rolls and action
1 - UP 1 note
2 - UP 2 notes
3 - REPEAT note
4 - DOWN 1 note
5 - DOWN 2 notes
6 - PAUSE
Game 2 by CS4FN
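Game 2 can be programmed too. The one judgment call is exactly how a bounce works: the sketch below reflects moves off the ends of the octave, which matches the example of rolling a 5 from C1 giving E:

# Game 2: each new note moves up or down from the last one, bouncing
# back off the ends of the octave. Here a bounce reflects the move
# (one reasonable reading of the rule): two below C1 lands on E.
import random

NOTES = ["C1", "D", "E", "F", "G", "A", "B", "C2"]
MOVES = {1: +1, 2: +2, 3: 0, 4: -1, 5: -2}  # a 6 means pause

tune, position = ["C1"], 0
while len(tune) < 15:           # until the tune is 15 symbols long
    roll = random.randint(1, 6)
    if roll == 6:
        tune.append("-")        # a pause, written as a dash
        continue
    position += MOVES[roll]
    if position < 0:            # bounced off the bottom C
        position = -position
    elif position > 7:          # bounced off the top C
        position = 14 - position
    tune.append(NOTES[position])
print(" ".join(tune))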

Play your tunes

Play your tunes on any instrument or use a free online piano (see https://bit.ly/pianoCS4FN).

Are they any good? Does either game give better tunes? 

Good music isn’t just random notes. That is why we pay composers to come up with the really good stuff! Both human and machine composers learn more complicated patterns of what makes good music.

What do you think of our musical masterpiece?

In Game 1 we rolled 6 4 8 8 8 | 5 9 4 9 6 | 5 6 9 9 10 so our tune is F D G G G | E A D A F | E F A A B

Make your tunes special!

See how on the Bach Google Doodle page.

A cloud of stars
Starburst by CS4FN

Here’s what our tune sounds like once harmonies have been added.

Could you improve your tunes by tweaking the notes? Some people use simple algorithms to spark human creativity like that. Rock legend David Bowie helped write a program he then used to write songs. It took random sentences from different places, split them in half and swapped the parts over to give him ideas for interesting lyrics. It was possibly the first algorithm to help write hit songs.
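Here is a sketch of a Bowie-style cut-up algorithm (the sentences are made up): it chops each sentence in half and stitches the halves back together in a new random order:

# A cut-up lyric generator: split sentences in half and swap the
# halves between random pairs of sentences.
import random

sentences = [
    "the city sleeps under electric rain",
    "she dances through the static on the radio",
    "tomorrow never asks where the money went",
    "a stranger smiles in the window of a train",
]

halves = []
for s in sentences:
    words = s.split()
    halves.append((words[:len(words) // 2], words[len(words) // 2:]))
random.shuffle(halves)

# Join the first half of each sentence to the second half of the next
for (first, _), (_, second) in zip(halves, halves[1:] + halves[:1]):
    print(" ".join(first + second))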

A ‘note’ on bias

Think about the numbers that are rolled and the number of different ways each number can be produced. For example, with two dice (let’s call them ‘left’ and ‘right’) you can make the number 9 in two ways: by rolling a 5 with the left and a 4 with the right, or a 4 with the left and a 5 with the right. The same goes for a 6 and a 3. But there is only one way to roll a 2 (both dice have to show 1) and only two ways to roll a 3 (a 1 and a 2, or a 2 and a 1). The middle totals can be made in the most ways (a 7 can be made in six ways, which is why ‘roll again’ comes up so often). This is baked into the process and so will affect the notes that appear most often.
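You can check this by listing all 36 possible rolls of the two dice and counting the totals:

# Count how many of the 36 possible two-dice rolls make each total.
from collections import Counter

ways = Counter(left + right
               for left in range(1, 7)
               for right in range(1, 7))
for total in range(2, 13):
    print(total, "can be made in", ways[total], "ways")
# 2 and 12 can each be made in only one way; 7 can be made in six,
# so "roll again" is the most common result of all.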

Jo Brodie and Paul Curzon, Queen Mary University of London




The day the music didn’t die

Computer Scientists are working to support traditional music from around the world.

A seperewa is a traditional “harp-lute” musical instrument of the Akan people in Ghana. It has strings that are plucked, a bit like a guitar. It is dying out because of the rise of western music. Researchers are now testing AIs that were trained on western music to see if they still work on music as different as the seperewa’s. They are also trying to understand exactly how this traditional music is different.

Protecting traditional instruments

Colonisers introduced European guitars to Ghana in the late 1800s and their sound began to influence and even replace seperewa music. Worried by this, in the mid-1900s people made recordings to preserve endangered seperewa music and to remind people what it sounds like. Ghanaian musicians are now reviving the seperewa, so we might continue to hear more of its lovely sound in future.

A view of a historical seperewa instrument side-on showing a large sounding box with strings attached to a neck, and stretched taut for playing.
A seperewa, adapted from a public domain image on Wikipedia.

AI to the rescue

A team of computer scientists and music experts have investigated recordings of seperewa music to see how well western AI tools can analyse that style of music, given it is tuned in a completely different way, so plays different notes to a western instrument.

First the team used one AI tool to separate the sounds of the seperewa from the singing. It struggled a bit, leaving some of the singing in the seperewa track and vice versa, but overall it did a good job.

They then used a different AI to analyse the sounds of the seperewa. They found that the seperewa music had its own unique musical fingerprint, revealing a rich tapestry of sound that was clearly different from western music.

The research is helping to preserve a vital part of Ghanaian culture. It has shown in detail how this music differs from anything western, and so that something unique and precious would be lost if it died out.

Jo Brodie and Paul Curzon, Queen Mary University of London


Watch …

Hear what a seperewa / seprewa sounds like at this YouTube video: The seprewa – the original African guitar [EXTERNAL]



Composing ancient Korean music

600 years ago King Sejong the Great of Korea published ‘Hangul’, a new and improved writing system for his people. To celebrate he asked his court scholars to write an epic poem in Hangul, then asked his musicians to compose music to accompany it. The result was Yongbieocheonga, or ‘Songs of the Dragon Flying to Heaven’.

It was performed by musicians playing wind and stringed instruments: the Daegeum and Piri (wind instruments), the Haegeum and Ajaeng (bowed string instruments), and the Geomungo and Gayageum (plucked string instruments). Each instrument had its own melody written out for the musician to follow. Only one piece of the written music survives fully intact (it is still performed!). Melodies of other pieces have survived, but only for a single instrument. That means those pieces can’t be played by a group of musicians because all the other harmonies are missing.

A team of computer scientists decided to recreate the missing 15th century Korean harmonies from just the single melodies (in the way the Bach Google Doodle does, see You’ll Be Bach!). They wanted to expand the ability of their AI tools to make sense of music beyond western music.

They first taught their AI musician to recognise Korean music written in Hangul. Then, it learnt which notes sound best played together by different instruments. Finally, to generate music that could be played, it matched melodies and rhythms. 

It created a melody for each different instrument. The researchers then asked Korean musicians to perform the whole piece and to judge how well the AI musician had done. Happily, they thought that the music worked well and sounded correct. They could perform it with only a few small tweaks. 

You can listen to one of the performances online.

Jo Brodie and Paul Curzon, Queen Mary University of London




Separate your stems

Two cartoon faces, both purple, but the one on the left is a bluer purple and the one on the right is a redder purple. Two speech bubbles say "I have more blue" for the bluer purple and "I have more red" for the redder purple.
Image by CS4FN

AI can unmix music and isolate vocals

Purple can be created by mixing together red and blue paint. You can probably tell which of the faces in the image has more blue and which has more red. Does music work the same way?

Your brain can recognise the red and blue in purple while still seeing it as a whole colour. Music is similar. When you listen to a song your ears and brain hear all the sounds at once. The singing, guitars, drums and keyboard parts are mixed together, but you can also focus on just the singing, or the keyboards, or…

Computer scientists have gone a step further with Artificial Intelligence. By training AI tools on lots of different songs they have taught them to do “source separation” – unmixing a recorded song back into its separate bits. Those separate bits are called stems. It is like taking purple paint and unmixing it to give blue and red again!
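You can try this yourself. Here is a sketch using Spleeter, one popular open-source source separation tool (the file name is made up):

# Split a song into two stems (vocals and everything else) using the
# open-source Spleeter library. "song.mp3" is a made-up file name.
from spleeter.separator import Separator

separator = Separator("spleeter:2stems")   # a pre-trained 2-stem model
separator.separate_to_file("song.mp3", "output/")
# Writes output/song/vocals.wav and output/song/accompaniment.wav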

A wide grey vase with two flowers in it (one red, one blue) at opposite ends of the vase with their stems definitely very separated.
Stems adapted from a plant pot image by HASSAN DYB from Pixabay.

“Not that kind of stem!”

Did you know?

Photographer Todd McLellan photographs gadgets he’s carefully taken apart, to show all the pieces (search the web for his “Things Come Apart”). When a piece of music is blended together and an AI separates it again, it’s a bit more like trying to un-bake a cake!

Jo Brodie and Paul Curzon, Queen Mary University of London

