
Exploring the role of genetic variants on the onset and pathogenesis of immune-mediated diseases in diverse ethnic groups

Principal Supervisor: Dr Archana Sharma-Oates

PhD project title: Exploring the role of genetic variants on the onset and pathogenesis of immune-mediated diseases in diverse ethnic groups

University of Registration: University of Birmingham

Project outline:

The prevalence of immune-mediated diseases (IMDs) is increasing worldwide, together with the associated high mortality and morbidity rates. IMDs are a group of diseases in which immune responses to self-antigens damage tissues and organs [1]. Recent data from the UK support previous systematic reviews reporting the highest incidence of systemic lupus erythematosus (SLE) in the African-Caribbean population and the highest incidence of rheumatoid arthritis (RA) in the South Asian population [2,3]. Other studies have reported that IMDs such as vitiligo and autoimmune thyroid disease are also more common in these ethnic groups.

Importantly, my previous work reported for the first time that South Asian, African-Caribbean and Mixed-race/Other ethnic groups have an earlier age at onset for all the common IMDs compared with the White ethnic group (manuscript accepted in BMC Medicine). The number of years by which onset is earlier differs by specific IMD and by ethnic group, ranging from 2 to 27 years. We were able to validate the earlier age at diagnosis of specific IMDs in UK Biobank (UKB). If the onset of IMDs is earlier in certain ethnic groups, this would not only result in longer disease duration but also increase the risk of long-term disease complications, with implications for healthcare utilisation.

Genetic factors influencing earlier onset of IMDs in non-White ethnic groups

If minority ethnic populations are developing immune-mediated diseases at an earlier age, this may suggest that ethnicity influences immune responses as well as disease pathogenesis. For example, a polymorphism in the inflammasome component NLRP3 is associated with susceptibility to SLE and with IMDs in Latin American individuals [4]. However, we are not aware of any large-scale studies investigating polymorphisms associated with IMDs in minority ethnic groups that could explain the earlier onset and severe disease. It is likely that certain risk alleles related to immune responses are more prevalent in specific ethnic groups, causing increased susceptibility and an earlier onset of IMDs.

Aim: The overarching aim is to explore the UKB and other data sources to determine the role of genetics in the earlier onset of IMDs in minority ethnic groups.

Objectives: Use data from the UKB database and other sources to:

  1. Use statistical and machine learning approaches to integrate clinical, biochemical and lifestyle factors with omics data sets for patients from non-White ethnic groups diagnosed with RA and psoriasis, in order to identify variants associated with specific clinical parameters.
  2. Integrate multimorbidity information. Chronic conditions characterised by pathological inflammation, such as IMDs, appear to be regulated by common networks of genes and similar underlying pathways despite involving multiple organ systems. By clustering prevalent diseases together and mapping common molecular pathways, we can identify underlying canonical pathways to help guide biological understanding and treatment.


  1. Brinkworth JF et al. Curr Opin Immunol. 2014;31:66-78.
  2. Subramanian A et al. Clin Exp Allergy. 2020 Sep 18. doi: 10.1111/cea.13741.
  3. Maningding E et al. The California Lupus Surveillance Project.
  4. Lee YH, Bae SC. Lupus. 2016;25:1558-1566.


BBSRC Strategic Research Priority: Integrated Understanding of Health - Ageing

Techniques that will be undertaken during the project:

Command line shell scripts to run analyses on a high-performance computing cluster

R statistical programming language for statistical analyses

Python programming language for machine learning methods

Analysis of whole genome sequencing data and variant prioritisation

Various statistical and machine learning methods will be used to integrate diverse data sets to identify biomarkers for diagnostic and therapeutic intervention.


Contact: Dr Archana Sharma-Oates