Benedict Russell

Summary

Hello, I am a PhD student at the MathSys II CDT, supervised by Dr Paolo Turrini and working with Dr Chin-wing Leung. My interests are in reinforcement learning, interacting particle systems, mean-field dynamics and convergence of evolving networks.

Publications, Preprints, & Past Projects

Mean-field imitation dynamics on fast assortative networks (preprint)

Andrew Nugent & Jacques Bara

We study imitation dynamics in a population of self-interested agents playing a continuous-strategy Prisoner's Dilemma on a dynamically evolving weighted network. In the fast-network regime, we incorporate the edge weights into the strategy evolution before deriving and analysing the large-population mean-field limit. Without noise, we establish well-posedness and show the solution collapses to a single Dirac mass. For initially separated clusters, we identify a payoff threshold and sufficient conditions for the overall level of cooperation to increase. We then introduce stochastic strategy updates, and obtain a non-local Fokker-Planck equation in the mean-field limit. We rigorously prove existence and uniqueness of stationary distributions, and show linear stability under sufficient noise. Numerics illustrate that noise can transform the deterministic consensus into stable cooperative stationary behaviour. These findings show that fast adaptive interactions and stochastic exploration can jointly support the emergence of stable cooperation at a population level.

The Dynamics of Policy Gradient in Social Dilemmas with Partner Selection (preprint)

Paolo Turrini & Chin-wing Leung

We provide an analytical solution to the problem of policy-gradient dynamics in a multi-agent environment with partner selection. We show how partner selection changes the opponent distribution and hence the reward landscape, and prove this promotes cooperation under simple rules known from the literature. In particular, we find that population variance is a necessary condition for cooperation to emerge. Using a two-dimensional Wiener process, we extend the dynamics to capture the stochastic effects of partner selection and the resulting opponent distribution. We derive a sufficient condition for the population to be cooperation-promoting and prove the existence of a stationary distribution. Simulations confirm that the stochastic model accurately captures the policy-gradient dynamics and clarifies how the learning rate affects the emergence of cooperation.

Convergence of Replicator Dynamics in the Repeated Prisoner's Dilemma with Restarts (preprint)

Paolo Turrini & Chin-wing Leung

We investigate a population of self-interested agents playing a repeated Prisoner's Dilemma under the trigger-restart mechanism. Under such a mechanism, agents play a sequence of symmetric games with their partner, and restart the interaction if their actions disagree. Our work focuses on the convergence of replicator dynamics in a well-mixed population of agents, where the emergence of cooperation is challenged by the individual incentive for exploitation. Formulating the corresponding parametrised normal-form game, with agents each adopting a length-m strategy sequence, we show that increasing the strategy length enables cooperation to emerge and stabilise. We provide exact convergence guarantees for restricted strategy lengths and, in the general payoff configuration, provide the necessary parametric conditions for the stability of cooperative strategies. By deriving an exact formula for the number of stable sequences, we find structural properties necessary for stability, as agents must learn to initially defect (the so-called "hazing period") before cooperating indefinitely. Our analysis shows that, while optimal cooperative sequences exist, agents favour less-optimal sequences with a longer hazing period, which possess larger basins of attraction.

Learning partner selection in optional social dilemmas without prior information (AAMAS 2026*)

Paolo Turrini & Chin-wing Leung

We study repeated Prisoner’s Dilemma interactions where self-interested agents can opt out and be randomly rematched, but lack information about non-partners’ previous actions. Using multi-agent reinforcement learning, we show that cooperation can emerge without hard-wired partner selection: agents first learn to defect during a “hazing period,” then adopt reciprocal strategies such as Tit-for-Tat. They also learn to stay unconditionally in early interactions before using cooperation-promoting partner-selection rules, such as leaving defectors and staying with cooperators, with these behaviours scaling to longer, interaction-length-dependent policies.

* Best paper nominee

Collective Dynamics of Bounded ABPs

Supervised by: Professor Matthew Turner, Dr Gareth Alexander, Dr Michael Riedl
Collaborators: Luke Meredith, Luisa Estrada

We present two models for “weaselball” dynamics in confined environments. The first captures collective clockwise and counter-clockwise motion with a minimal set of interaction equations; the second applies Newtonian mechanics to a single weaselball on a circular boundary, yielding a closed-form expression for its steady-state propagation angle. Stability analysis of this latter model led to a novel experimental design, whose results closely match our theoretical predictions.

Email: benedict.i.russell@warwick.ac.uk

Office: D1.04

Google Scholar

SIAM-IMA

I am President of the Warwick SIAM-IMA Student Chapter, which organises the weekly Statistics, Probability, Analysis and Applied Maths (SPAAM) seminar, held on Thursdays from 3pm to 4pm. If you'd like to speak, please get in touch!

Conferences and Talks

BMC-BAMC | University of Exeter | Invited Speaker for 'Dynamics on Complex Networks' mini-symposium
AMP25 | University of Oxford | Talk
MARL Workshop | King's College London | Talk
MathSys Retreat | University of Warwick, April, 2024 | Poster on Collective Dynamics
SPAAM Seminar | University of Warwick, Dec 5th 2024 | Talk on 'Multi-Agent Manipulation of STV Elections'
Generative AI in Action: Building Production-Ready Solutions with Azure | Warwick, 28th May | Workshop by Microsoft
MathSys Retreat | University of Warwick, May 2026 | Poster on mean-field imitation
AAMAS 2026 | Cyprus, May 2026 | Talk
SPAAM Seminar | University of Warwick, Jun 18th 2026 | Talk
AMP26 | University of Warwick | Talk

Teaching Experience

Senior Graduate Teaching Assistant for

- MA3K1 Mathematics of Machine Learning (2025)
- CS404 Agent-Based Systems (2025)
- CS130 Mathematics for Computer Science (2024)
- MA146, MA139, MA145 (Marking)

Education

PhD Mathematics of Systems | University of Warwick
MSc Mathematics of Systems | Distinction | University of Warwick
BSc Mathematics | First-Class (Hons) | University of Edinburgh

Other Activities

President of Warwick SIAM-IMA Student Chapter, 2025/26

Vice-President of Warwick SIAM-IMA Student Chapter, 2024/25
- Organised the AMP 2025 Conference with University of Oxford
SSLC Chairman for MathSys, 2024 - Current