Extended Colorectal Cancer Grading Dataset


Dataset Details

This dataset is an extension of our existing CRC dataset. It comprises 300 non-overlapping images, each larger than 4548×7548 pixels, extracted at 20× magnification. Each image is labelled by expert pathologists as normal tissue (Grade1), low-grade tumour (Grade2) or high-grade tumour (Grade3). The images were extracted from more than 100 digitised WSIs of CRA tissue slides stained with H&E. All WSIs were taken from different patients and were scanned using the Omnyx VL120 scanner at 0.275 μm/pixel (40× magnification). The 300 images comprise 120 normal, 120 low-grade and 60 high-grade cancer images.
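The figures above can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes that pixel size scales inversely with magnification (so 20× corresponds to roughly twice the 40× pixel size), which the description does not state explicitly, and the function names are hypothetical.

```python
# Figures quoted in the dataset description above.
SCAN_UM_PER_PIXEL = 0.275          # scanner resolution at 40x
MIN_IMAGE_SIZE_PX = (4548, 7548)   # lower bound on image size, in pixels
CLASS_COUNTS = {"normal": 120, "low grade": 120, "high grade": 60}


def um_per_pixel_at(magnification, base_um=SCAN_UM_PER_PIXEL, base_mag=40.0):
    """Estimate pixel size at another magnification.

    Assumption (not stated on the page): pixel size is inversely
    proportional to magnification.
    """
    return base_um * base_mag / magnification


def physical_size_mm(size_px, um_per_pixel):
    """Convert a size in pixels to millimetres."""
    return tuple(n * um_per_pixel / 1000.0 for n in size_px)


def class_fractions(counts):
    """Return each class's share of the dataset."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


um_20x = um_per_pixel_at(20)
side_a, side_b = physical_size_mm(MIN_IMAGE_SIZE_PX, um_20x)
print(f"approx. pixel size at 20x: {um_20x:.2f} um/pixel")
print(f"approx. minimum tissue area: {side_a:.2f} mm x {side_b:.2f} mm")
print(class_fractions(CLASS_COUNTS))
```

Under that assumption, the smallest images cover a few millimetres of tissue per side. The high-grade class makes up one fifth of the dataset, which is worth keeping in mind when choosing evaluation metrics or splitting the data.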


M. Shaban, R. Awan, M.M. Fraz, A. Azam, Y. Tsang, D. Snead, and N.M. Rajpoot
"Context-Aware Convolutional Neural Network for Grading of Colorectal Cancer Histology Images."
IEEE Transactions on Medical Imaging (2019). DOI: 10.1109/TMI.2020.2971006.


Digital histology images are amenable to the application of convolutional neural networks (CNNs) for analysis due to the sheer size of pixel data present in them. CNNs are generally used for representation learning from small image patches (e.g. 224 × 224) extracted from digital histology images due to computational and memory constraints. However, this approach does not incorporate high-resolution contextual information in histology images. We propose a novel way to incorporate a larger context by a context-aware neural network based on images with a dimension of 1792 × 1792 pixels. The proposed framework first encodes the local representation of a histology image into high dimensional features then aggregates the features by considering their spatial organization to make a final prediction. We evaluated the proposed method on two colorectal cancer datasets for the task of cancer grading. Our method outperformed the traditional patch-based approaches, problem-specific methods, and existing context-based methods. We also presented a comprehensive analysis of different variants of the proposed method.
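To make the patch sizes in the abstract concrete, the sketch below lays out a 1792 × 1792 image as a grid of 224 × 224 patches while keeping each patch's row and column position, which is the spatial organisation an aggregation step would need. It assumes non-overlapping patches and uses a hypothetical helper name. It illustrates the tiling only and is not the authors' implementation, whose patch layout, encoder and aggregator are described in the paper.

```python
from typing import List, Tuple


def patch_grid(image_size: int = 1792,
               patch_size: int = 224) -> List[List[Tuple[int, int]]]:
    """Return the top-left (y, x) corner of each patch as a 2D grid.

    Assumption: patches are square, non-overlapping, and tile the image
    exactly. grid[r][c] is the corner of the patch in row r, column c.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    n = image_size // patch_size
    return [[(r * patch_size, c * patch_size) for c in range(n)]
            for r in range(n)]


grid = patch_grid()
print(len(grid), len(grid[0]))   # 8 8
print(grid[0][0], grid[7][7])    # (0, 0) (1568, 1568)
```

With these sizes each large image yields an 8 × 8 grid, so a per-patch encoder would produce 64 feature vectors whose grid positions can be preserved for the aggregation stage.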

Dataset Usage Rules

  1. The dataset provided here is for research purposes only. Commercial uses are not allowed.
  2. If you intend to publish research work that uses this dataset, you must cite our paper (as mentioned above), wherein the same dataset was first used.


Please download the dataset from this link.

Note: If you are unable to extract the zip file on Mac or Linux, try extracting it on a Windows machine.

Please send all comments, questions, and feedback related to this dataset to Nasir Rajpoot.