How toxic content affects social media user engagement - new study

Thursday 20 Feb 2025 – A new study finds that reducing users’ exposure to toxic content on major social media platforms reduces their engagement across a variety of metrics, creating a dilemma for platforms, which need engagement to survive.
- Users whose feeds were filtered to reduce exposure to toxic content increased engagement with other unfiltered platforms
- Toxicity is contagious – users reading toxic posts are more likely to post their own hateful, profane or harassing content
- Profanity and hate speech have different effects on user welfare
- The study also offers insights into the benefits and limitations of automated toxicity detection.
With 5.24 billion social media accounts active around the world, decisions on what content is allowed and what should be restricted have global significance.
Social media providers are often accused of prioritising controversial content to maximise user engagement and increase their profits, without regard to the welfare of individuals or society as a whole. But the relationship between toxic content and engagement has not – until now – been demonstrated.
In the first study of its kind, researchers from the University of Warwick, the University of Chicago and Columbia University recruited 742 volunteers to take part in a live experiment to explore how toxic content impacts user engagement on three major social media platforms.
Over six weeks in 2022, the volunteers used a custom-built browser plug-in to curate their social media feeds. During the experiment the volunteers consumed 11 million pieces of social media content across 30,000 hours of social media use.
Half the volunteers received whatever the Facebook, YouTube and Twitter algorithms served up for them. The other half received a filtered feed in which toxic content was hidden in real time. The volunteers did not know the specific way in which the browser plug-in curated their social media content.
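As a rough illustration of how this kind of real-time filtering can work in a browser extension, the TypeScript sketch below hides posts whose classifier score exceeds a cutoff as they appear in the feed. It is not the study’s plug-in: the post selector, the scoreToxicity() helper, its endpoint and the cutoff value are all hypothetical placeholders.

```typescript
// Illustrative, content-script-style filter; not the study's actual plug-in.
// Assumptions (all hypothetical): a POST_SELECTOR matching the platform's post
// elements, and a scoreToxicity() helper backed by some scoring service.

const POST_SELECTOR = '[data-post]'; // placeholder; real selectors differ per platform
const TOXICITY_CUTOFF = 0.3;         // hide content the classifier scores above this

// Hypothetical scorer: in practice this could call a local model or a scoring API.
async function scoreToxicity(text: string): Promise<number> {
  const res = await fetch('https://scoring.example/toxicity', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  const { toxicity } = await res.json(); // expected to be a value in [0, 1]
  return toxicity;
}

// Hide a single post if its text scores above the cutoff.
async function filterPost(post: HTMLElement): Promise<void> {
  const text = post.innerText.trim();
  if (!text) return;
  const score = await scoreToxicity(text);
  if (score > TOXICITY_CUTOFF) {
    post.style.display = 'none'; // hidden in real time, before the user reads it
  }
}

// Watch the feed for new posts as the user scrolls and filter each one.
const observer = new MutationObserver((mutations) => {
  for (const mutation of mutations) {
    mutation.addedNodes.forEach((node) => {
      if (node instanceof HTMLElement) {
        node.querySelectorAll<HTMLElement>(POST_SELECTOR).forEach(filterPost);
      }
    });
  }
});
observer.observe(document.body, { childList: true, subtree: true });
```

In practice the scoring step has to be fast enough to keep pace with scrolling, which is one reason this kind of real-time moderation relies on automated classifiers rather than human review.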
The key findings, presented in Toxic Content and User Engagement on Social Media: Evidence from a Field Experiment, are:
- The average toxicity of text content seen by users in the moderated group was 73 per cent lower than in the unfiltered group.
- Users in the moderated group engaged less across a basket of measures including time spent on the platform, ads clicked, and content consumed. For example:
  - Active time spent fell by 9 per cent on Facebook and 7 per cent on YouTube
  - Content consumed on Facebook fell by 23 per cent
  - Adverts consumed fell by 27 per cent on Facebook and by 6 per cent on Twitter
  - Ad clicks and post clicks decreased on both Facebook and Twitter
- Users in the moderated group spent on average 22 per cent more time each day on 38 other websites that were not moderated by the experiment, such as Reddit, Discord, Tumblr and Telegram.
- Reducing exposure to toxic content reduced the average toxicity of content posted by the volunteers themselves on Facebook (by 30 per cent) and Twitter (by 25 per cent).
Dr Mateusz Stalinski, one of the study’s authors, explains:
While this experiment shows very clearly that exposure to toxic content is a strong driver of social media engagement, the mechanisms behind this were less obvious.
The impact on welfare was also unclear from this first experiment. It is important not to assume that the welfare of the group who had reduced exposure to toxic content automatically improved.
To shed further light on these questions, a second online experiment asked 4,000 people to transcribe posts which varied both in their level of toxicity and the reason they had been so classified. Some posts were hateful but not profane, while others were profane but not hateful.
Dr Stalinski added:
The results of this larger study, using a different experimental design, were in alignment with our field experiment. Encountering more toxic posts increased the likelihood of clicking to see the comment sections by 18 per cent, even though the comments were not part of the transcription task.
We also found that participants’ welfare was more adversely affected by hateful posts than profane ones. Overall, toxic posts triggered their curiosity, while profane posts specifically were also seen as more entertaining.
For the live experiment, toxic content was automatically hidden by a machine-learning algorithm trained to assess whether the content was likely to have been classified as toxic by more than 3 out of 10 human moderators. Over the six weeks of the experiment, 7 per cent of posts, comments and replies met this threshold and were hidden.
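In practical terms, a classifier of this kind outputs a score that can be read as the predicted share of moderators who would label the content toxic, so the hiding rule reduces to a simple numeric cutoff. A minimal sketch, assuming a score already scaled to the 0–1 range (the names and values are illustrative, not taken from the paper):

```typescript
// Minimal sketch of the hiding rule described above, assuming the classifier's
// output is the predicted fraction (0 to 1) of human moderators who would label
// the content toxic. Names and values are illustrative only.
const MODERATOR_SHARE_CUTOFF = 3 / 10; // "more than 3 out of 10 moderators"

function shouldHide(predictedToxicShare: number): boolean {
  return predictedToxicShare > MODERATOR_SHARE_CUTOFF;
}

// Example: a post scored at 0.42 is hidden, one scored at 0.12 is shown.
console.log(shouldHide(0.42)); // true
console.log(shouldHide(0.12)); // false
```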
The researchers note that their findings create a dilemma for social media providers. If the same approach were rolled out across the board, exposure to toxic content would fall – people would see fewer toxic posts and create fewer themselves – but engagement, ad clicks and content views would also fall, decreasing platform revenue; and users might migrate to less-moderated sites. It also does not necessarily follow that reduced toxicity enhances welfare – the welfare effects are more subtle and depend on the type of toxic content.
Dr Stalinski said:
Our evidence suggests that social media platforms’ private incentives to curtail toxicity may not be in alignment with social needs.
We therefore hope that our results will be useful to platforms, policymakers and regulators as they seek the right balance between freedom of speech and protection from harm.
We also hope that the tools to automatically detect and moderate toxic content that we experimentally assessed will be of interest to stakeholders such as social media platforms, online forums, and news sites who may wish to detect and hide toxic content in real time.
- Toxic Content and User Engagement on Social Media: Evidence from a Field Experiment. George Beknazar-Yuzbashev, Rafael Jiménez-Durán, Jesse McCrosky and Mateusz Stalinski. CAGE Working Paper 741/2025
- Download the full paper here.
- Read more about the study on the CAGE website here.