Skip to main content Skip to navigation

Model for MSI Protein Expression in CRA Microscopy Images


New bioimaging techniques capable of visualising the co-location of numerous proteins within individual cells have been proposed to study tumour heterogeneity of neighbouring cells within the same tissue specimen. These techniques have highlighted the need to better understand the interplay between proteins in terms of their colocalisation.


We recently proposed a cellular-level model of the healthy and cancerous colonic crypt microenvironments. Here, we extend the model to include detailed models of protein expression to generate synthetic multiplex fluorescence data. As a first step, we present models for various cell organelles learned from real immunofluorescence data from the Human Protein Atlas. Comparison between the distribution of various features obtained from the real and synthetic organelles has shown very good agreement. This has included both features that have been used as part of the model input and ones that have not been explicitly considered. We then develop models for six proteins which are important colorectal cancer biomarkers and are associated with microsatellite instability, namely MLH1, PMS2, MSH2, MSH6, P53 and PTEN. The protein models include their complex expression patterns and which cell phenotypes express them. The models have been validated by comparing distributions of real and synthesised parameters and by application of frameworks for analysing multiplex immunofluorescence image data.


The six proteins have been chosen as a case study to illustrate how the model can be used to generate synthetic multiplex immunofluorescence data. Further proteins could be included within the model in a similar manner to enable the study of a larger set of proteins of interest and their interactions. To the best of our knowledge, this is the first model for expression of multiple proteins in anatomically intact tissue, rather than within cells in culture.


Download software and example data for healthy and moderately differentiated images.