Richard Fox

Research Interests

My interests lie in modelling the emergent behaviour of humans. This involves using reinforcement learning (RL) techniques to model the individual decision-making process behind actions and its effect on emergent behaviour. In particular, I combine elements from social science, game theory and multi-agent RL to look for new solutions to, or descriptions of, observed phenomena in traffic scenarios involving pedestrians and autonomous vehicles, i.e. the human-AI interface.

Supervised by Prof. Elliot Ludvig

Masters Project

Model Comparison for the Two Stage Decision Task - Presented here are several reinforcement learning models that aim to model sequential action choice behaviour, based on explanatory theory from neuroscience and psychology. The focus is on three recent models, each of which builds on the others. However, none of these models performs particularly well, with likelihoods of ∼ O(10^−80), nor do they outperform each other with any significance. A multitude of factors can contribute to their performance; these are explored, from the computational underpinnings to the form of the likelihood estimation function.
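As a rough illustration of why likelihoods of this magnitude arise, the likelihood of a choice sequence is the product of the per-trial choice probabilities, so it shrinks geometrically with the number of trials. The sketch below works in log space; the trial count and per-trial probability are hypothetical values chosen for illustration and are not taken from the project.

```python
import math


def log10_likelihood(choice_probabilities):
    """Sum of log10 probabilities the model assigned to the observed choices."""
    return sum(math.log10(p) for p in choice_probabilities)


def mean_choice_probability(choice_probabilities):
    """Geometric mean of the per-trial probabilities (easier to interpret)."""
    n = len(choice_probabilities)
    return 10 ** (log10_likelihood(choice_probabilities) / n)


# Hypothetical example: 200 trials, each observed choice given probability 0.4.
probs = [0.4] * 200
print(log10_likelihood(probs))
print(mean_choice_probability(probs))
```

Under these assumed numbers the total likelihood is roughly 10^−80 even though each individual choice is predicted with probability 0.4, which is why a per-trial measure can be a more informative basis for comparing models.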

PhD Projects

Implicit Agent Modelling for Explicit Others - There are several ways to induce the other-agent inference we are looking for, but there are three main concepts:

- Introduce rest/idle time penalties, as opposed to time penalties for taking long routes. This can be used to incentivise agents to fill their time productively until all agents have completed their tasks.
- Minimise conflicts/collisions. In particular, sending an agent back to its previous co-ordinates after a collision event, in an environment with heavy time penalties, should incentivise its policy to avoid collisions.
- Introduce a reputation, so that actions with negative effects on other agents have a small but cumulative effect on how much an agent is assisted or considered by the other agents.
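The three concepts above can be sketched as a single per-step shaping function. This is a minimal sketch under stated assumptions, not the project's implementation: the names `ShapingConfig` and `shaped_step`, the penalty magnitudes and the flag-based inputs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShapingConfig:
    # All magnitudes are illustrative assumptions, not tuned values.
    step_penalty: float = 0.5       # time cost paid on every step
    idle_penalty: float = 0.1       # extra cost for a step spent idle
    reputation_loss: float = 0.05   # reputation lost per harmful action


def shaped_step(base_reward, position, previous_position, is_idle,
                collided, harmed_other, reputation, cfg=ShapingConfig()):
    """Return (reward, position, reputation) after applying the three ideas."""
    reward = base_reward - cfg.step_penalty
    if is_idle:
        # Concept 1: idling is penalised in its own right.
        reward -= cfg.idle_penalty
    if collided:
        # Concept 2: the move is undone, so the step (and its time cost) is wasted.
        position = previous_position
    if harmed_other:
        # Concept 3: small but cumulative reputation loss, floored at zero.
        reputation = max(0.0, reputation - cfg.reputation_loss)
    return reward, position, reputation


reward, position, reputation = shaped_step(
    base_reward=0.0, position=(2, 3), previous_position=(1, 3),
    is_idle=False, collided=True, harmed_other=True, reputation=1.0)
print(reward, position, reputation)
```

In this sketch the collision carries no explicit penalty; its cost comes entirely from the step penalty paid for a move that made no progress, which mirrors the "heavy time penalties" framing. How other agents would use the reputation value is left open here.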

Agents with goals that may conflict ad hoc

Agents in MuJoCo simulator interacting with objects

SUMO network definition of a simple pedestrian crossing

Example of a pedestrian crossing in the SMARTS simulator

PaAVI - Pedestrian and Autonomous Vehicle Interactions - Much work has been done on interactions between autonomous vehicles and human drivers, and intrinsic models, or at least categorisations, of human behaviour have been shown to improve performance. We propose instead to look at the human-autonomous vehicle (AV) interface through interactions with pedestrians in uncontrolled settings. This takes the form of an AV controlled by an algorithm that has used a form of reinforcement learning to develop a method for navigating pedestrian-road intersections.

We surveyed 136 participants on the behaviour and safety of a Deep Q-Network model trained in the SMARTS simulator and then given slight variations of a reward filter that reduced the reward based on proximity to the pedestrian.
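One possible shape for such a proximity-based reward filter is sketched below. The linear falloff, the function name `proximity_filtered_reward` and the `radius` and `weight` parameters are assumptions made for illustration; the filter variants actually shown to participants are not specified here.

```python
import math


def proximity_filtered_reward(reward, vehicle_xy, pedestrian_xy,
                              radius=5.0, weight=1.0):
    """Reduce the reward when the vehicle is within `radius` of the pedestrian.

    The deduction grows linearly from 0 at the edge of the radius to
    `weight` at zero distance (an assumed falloff).
    """
    distance = math.dist(vehicle_xy, pedestrian_xy)
    if distance >= radius:
        return reward
    return reward - weight * (1.0 - distance / radius)


# Hypothetical variants: the same filter with different radii and weights.
for radius, weight in [(5.0, 1.0), (10.0, 1.0), (5.0, 2.0)]:
    print(radius, weight,
          proximity_filtered_reward(1.0, (0.0, 0.0), (3.0, 4.0), radius, weight))
```

Varying `radius` and `weight` is one simple way to produce slight variations of the same filter, trading off how early and how strongly the agent is discouraged from approaching the pedestrian.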

Analysis of the results is currently ongoing.

In uncontrolled situations there is an interplay between strategy and safety-constrained optimisation. That is, an algorithm that puts the safety of others first is vulnerable to strategic exploitation: if it is guaranteed to give way or halt at a certain risk factor, then a pedestrian can exploit this by raising the risk factor, as perceived by the AV, and consistently gaining right of way, while incurring no increase in risk to themselves because the AV is guaranteed to let them pass. We propose to model this as a Stackelberg game in which the AV is the leader and "announces" its action, assuming that the pedestrian will respond in the way that benefits them most. Within this paradigm there are still a couple of problems to be addressed.
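The leader-follower structure can be illustrated with a small pure-strategy sketch in which the AV commits to an action and the pedestrian best-responds. The action names and all payoff values below are hypothetical and chosen only to make the example concrete; they are not derived from the study.

```python
def best_response(follower_payoff, leader_action):
    """Follower's payoff-maximising reply to an announced leader action."""
    replies = follower_payoff[leader_action]
    return max(replies, key=replies.get)


def stackelberg_pure(leader_payoff, follower_payoff):
    """Return (leader_action, follower_action, leader_value) for the leader's
    best pure commitment, assuming the follower best-responds."""
    best = None
    for action in leader_payoff:
        reply = best_response(follower_payoff, action)
        value = leader_payoff[action][reply]
        if best is None or value > best[2]:
            best = (action, reply, value)
    return best


# Hypothetical payoffs, indexed as payoff[av_action][pedestrian_action].
av_payoff = {
    "yield":   {"wait": -1, "cross": -1},
    "proceed": {"wait": 2,  "cross": -10},
}
pedestrian_payoff = {
    "yield":   {"wait": -1, "cross": 2},
    "proceed": {"wait": 0,  "cross": -10},
}

print(stackelberg_pure(av_payoff, pedestrian_payoff))
```

With these assumed payoffs, an AV that always yields is always crossed in front of, whereas committing to proceed makes waiting the pedestrian's best reply. The sketch breaks follower ties by dictionary order and ignores mixed commitments, both of which a fuller treatment would need to address.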

By extending the simulator to allow participants to interact in the scenario as the pedestrian, we hope to gather data on the degree of exploitative behaviour and on discrepancies between reported actions and actions actually taken.

Future Work

Head Shot

Contact

LinkedIn

richard.fox@warwick.ac.uk
B1.29 Zeeman Building

Research Interests

Decision making behaviour

Reinforcement Learning

Undergraduate

MPHYS - University of Salford