by Paul Curzon, Queen Mary University of London
Based on a talk by Jiayu Song at QMUL, March 2023
Social media is full of people’s opinions, whether about politics, movies, things they bought, celebrities or just something in the news. However, sometimes there is just too much of it. Sometimes, you just want an overview without having to read all the separate comments yourself. That is where programs that can summarise text come in. The idea is that they take lots of separate opinions about a topic and automatically give you a summary. It is not an easy problem, however, and while systems exist, researchers continue to look for better ways.
That is what Queen Mary PhD student Jiayu Song is working on with her supervisor, Professor Maria Liakata. Some sources of opinions are easier to work with than others. For example reviews, whether of movies, restaurants or gadgets, tend to be more structured so more alike in the way they are written. Social media posts on the other hand are unlikely to have any common structure. What is written is much more ‘noisy’ and that makes it harder to summarise different opinions. Jiayu is particularly interested in summarising these noisy social media posts, so has set herself the harder problem of the two.
What does distance of meaning mean?
Think of posts to be summarised as points scattered on a piece of paper. Her work is based on the idea that there is a hypothetical point (so hypothetical social media post) that is in the middle of those other points (a kind of average point) and the task is to find that point so summary post. If they were points on paper then we could use geometry to find a central point that minimises the total distance to all of them. For written text we need first to decide what we actually mean by ‘distance’ as it is no longer something we can measure with a ruler! For text we want some idea of distance in meaning – we want a post that is as close as possible to those it is summarising but by “close” here we mean close in meaning. What does distance of meaning mean? King and Queen for example might be the same distance apart as boy and girl in meaning whereas tree is further away in meaning.
Jiayu’s approach is based on finding a middle point for posts using a novel (for this purpose) way of determining distance called the Wasserstein distance. It gives a way of calculating distances between distributions of probabilities. Imagine you collected the marks people scored in a test and plotted a graph of how many got each mark. That would give a distribution of marks (likely it would give a hump-like curve known as normal distribution.). This could be used to estimate the distribution of marks you would get from a different class. If we did that for lots of different classes each would actually have a slightly different distribution (so curve when plotted). A summary of the different distributions would be a curve as similar (so as “close”) as possible to all of them so a better predictor of what new classes might score.
From distance to distribution
You could do a similar thing to find the distribution of words in a book, counting how often each word arises and then plotting a curve of how common the different words are. That distribution gives the probability of different words appearing so could be used to predict how likely a given word was in some new book. For summarising, though it’s not words that are of interest but the meanings of words or phrases, as we want to summarise the meaning whatever the words that were actually used. If the same thing is expressed using different words, then it should count as the same thing. “The Queen of the UK died today.” and “Today, the British monarch passed away.” are both expressing the same meaning. It is not the distance apart of individual word meanings we want though, but of distributions of those meanings. Jiayu’s method is therefore first based on extracting the meanings of the words and working out the distribution of those meanings in the posts. However, it turns out it is useful to create two separate representations, one of these distributions of meanings but also another representing the syntax, so the structure of the words actually used too, to help put together the actual final written summary.
Once that decoding stage has been done, creating new versions of the texts to be summarised as distributions, Jiayu’s system uses that special Wasserstein distance to calculate a new distribution of meanings that represents the central point of all those that are being summarised. Even given a way to calculate distances there are different versions of what is meant by “central” point and Jiayu uses a version that helps with the next stage. That involves, a neural network based system, like those used for machine learning systems more generally, is used to convert the summary distributions back into readable text. That summary is the final output of the program.
Does it work?
She has run experiments to compare the summaries from her approach to existing systems. To do this she took three existing datasets of opinions, one from Twitter combining opinions about politics and covid, a second containing posts from Reddit about covid, and a final one of reviews posted on Amazon. A panel of three experts then individually rated the summaries from Jiayu’s system with those from two existing summarising systems. The experts were also given “gold standard” summaries written by humans to judge all the summaries against. They had to rate which system produced the best and worst summary for each of a long series of summaries produced from the datasets. The expert’s ratings suggested that Jiayu’s system preserved meaning better than the others, though did less well in other categories such as how fluent the output was. Jiayu also found that there was a difference when rating the more structured Amazon reviews compared to the other more noisy social media posts and in these two cases a different approach was needed to decode the summary generated back into actual text based on the extra syntax representation created.
Systems like Jiayu’s, once perfected, could have lots of uses: they could help journalists quickly get on top of opinions being posted of a breaking story, help politicians judge the mood of the people about their policies or just help the rest of us decide which movie to watch or whether we should buy some new gadget or not.
Perhaps you have an opinion of whether that would be useful or not?
More on …
- The women are still here [PORTAL]
- Natural Language Processing [PORTAL]
- Let buttons be buttons
Related Magazines …
EPSRC supports this blog through research grant EP/W033615/1.