Machine Learning

Datasets

University of Warwick machine learning datasets are available via a publicly accessible server. The server is in a pre-production test phase, so dataset links will change in the future and may be temporarily unavailable. For instance, we expect the service to be unavailable between 6th Dec 2019 and 10th Dec 2019 due to electrical maintenance.

The TEM and STEM image and crop datasets were collected by hundreds of Warwick scientists working on dozens of projects, so their contents are diverse. Each dataset can be downloaded by following its download link, entering the password and clicking the "Download all files" button at the top-right of the download page.

Exit Wavefunctions

Multiple datasets containing 98340 wavefunctions simulated with clTEM. In addition, there are 1000 experimental focal series.

Info: 81.9 GB. Wavefunctions are stored as 64-bit complex NumPy arrays of shape (320, 320) in .npy files that can be opened with np.load(). Focal series images are in TIFF format. Subdirectories can be downloaded together or individually, and include:

  • wavefunctions_multiple_hq: n=3, multiple materials - 27.8 GB.
  • wavefunctions_multiple_unseen_train_hq: n=3, multiple materials, materials in training set - 1.2 GB.
  • wavefunctions_single_hq: n=3, single material - 3.7 GB.
  • wavefunctions_multiple_forth_hq: n=3, multiple materials, simulation hyperparameter ranges reduced by a factor close to 1/4 - 9.1 GB.
  • wavefunctions: n=1, multiple materials - 28.6 GB. See dataset_info.txt for partitioning into training, validation and test sets.
  • unseen_train: n=1, multiple materials, materials in training set - 1.1 GB.
  • wavefunctions_single: n=1, single material - 3.7 GB.
  • experimental_focal_series: 1000 experimental focal series - 13.7 GB. Each series has a quadratically increasing defocus sequence; however, the series are at different spatial scales.
  • cifs: Crystallographic information files (CIFs) downloaded from the Crystallography Open Database (COD) and used for clTEM simulations - 203.9 MB.
  • url_lists: COD URLs that the cifs were downloaded from.
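
As a rough illustration of opening one wavefunction file, the sketch below loads a .npy file with np.load() and splits it into amplitude and phase. The helper name load_wavefunction and the file name example.npy are hypothetical, and the code assumes that "64-bit complex" corresponds to NumPy's complex64 dtype. A synthetic array stands in for a downloaded file so that the example runs on its own.

```python
import os
import tempfile

import numpy as np


def load_wavefunction(path):
    """Load one exit wavefunction and return (wavefunction, amplitude, phase).

    Hypothetical helper, not part of the dataset or of clTEM.
    """
    psi = np.load(path)
    # The dataset description states complex arrays of shape (320, 320).
    if psi.shape != (320, 320) or not np.iscomplexobj(psi):
        raise ValueError(f"unexpected array: shape={psi.shape}, dtype={psi.dtype}")
    amplitude = np.abs(psi)  # modulus of the complex wave
    phase = np.angle(psi)    # phase in radians, in (-pi, pi]
    return psi, amplitude, phase


# Synthetic stand-in for a downloaded file (assumption: complex64 storage).
rng = np.random.default_rng(0)
fake = rng.standard_normal((320, 320)) + 1j * rng.standard_normal((320, 320))
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.npy")  # hypothetical file name
    np.save(path, fake.astype(np.complex64))
    psi, amplitude, phase = load_wavefunction(path)

print(psi.dtype, amplitude.shape, phase.shape)
```

To use it on real data, pass the path of any .npy file from one of the wavefunction subdirectories in place of the synthetic file.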

Download link: https://mycloud-test.warwick.ac.uk/s/BLmdcXYArZXsJaw

Password: W4rw1ck3m!

TEM 96x96

Full TEM images downsampled to 96x96 (see the code). Intended for rapid development.

Info: 607 MB. Images are in a (17266, 96, 96, 1) numpy array file (.npy) that can be opened with np.load().
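
Because the whole dataset is a single array file, it can be convenient to memory-map it and read only the batches that are needed. The following is a minimal sketch of that approach using the mmap_mode argument of np.load(). The helper names open_image_stack and get_batch and the file name stack.npy are hypothetical, and a small synthetic stack stands in for the downloaded file.

```python
import os
import tempfile

import numpy as np


def open_image_stack(path):
    """Memory-map an (N, 96, 96, 1) image stack without reading it all into RAM."""
    stack = np.load(path, mmap_mode="r")
    if stack.ndim != 4 or stack.shape[1:] != (96, 96, 1):
        raise ValueError(f"unexpected shape {stack.shape}")
    return stack


def get_batch(stack, start, size):
    """Copy up to `size` images starting at `start` into memory as float32."""
    batch = np.array(stack[start:start + size], dtype=np.float32)
    return batch[..., 0]  # drop the trailing channel axis -> (n, 96, 96)


# Synthetic stand-in for the downloaded array (8 images instead of 17266).
rng = np.random.default_rng(0)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stack.npy")  # hypothetical file name
    np.save(path, rng.random((8, 96, 96, 1), dtype=np.float32))
    stack = open_image_stack(path)
    batch = get_batch(stack, start=2, size=4)
    del stack  # release the memory map before the temporary file is removed

print(batch.shape, batch.dtype)
```

The same helpers should apply to any array file with a trailing (96, 96, 1) shape.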

Download link: https://mycloud-test.warwick.ac.uk/s/tGWA6H9dY3zs3GS

Password: W4rw1ck3m!

STEM 96x96

Full STEM images downsampled to 96x96 (see the code). Intended for rapid development.

Info: 695 MB. Images are in a (19769, 96, 96, 1) numpy array file (.npy) that can be opened with np.load().

Download link: https://mycloud-test.warwick.ac.uk/s/oZnAsLmrko6keE4

Password: W4rw1ck3m!

STEM Full Images

Full STEM images. Most are 2048x2048. Featured in this paper.

Info: 159.4 GB. 16227 images.

Download link: https://mycloud-test.warwick.ac.uk/s/kfytRgfMLw6kzzS

Password: W4rw1ck3m!

STEM Crops

Non-overlapping 512x512 crops from images in the STEM full images dataset. Featured in this paper.

Info: 157.3 GB. 110933 training, 21259 validation and 28877 test set crops, totalling 161069 crops.
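
For readers who want to tile their own images in a similar way, the sketch below shows one way to split a 2D image into non-overlapping square crops with NumPy. It is an illustration only and not necessarily the procedure used to build this dataset. The function name non_overlapping_crops is hypothetical, and discarding edge regions that do not fill a whole crop is an assumption.

```python
import numpy as np


def non_overlapping_crops(image, size=512):
    """Split a 2D image into size x size tiles that do not overlap.

    Assumption: rows and columns left over at the bottom and right edges,
    which cannot fill a whole tile, are discarded.
    """
    rows, cols = image.shape[0] // size, image.shape[1] // size
    trimmed = image[:rows * size, :cols * size]
    # Axes after reshape: (tile row, y in tile, tile column, x in tile).
    tiles = trimmed.reshape(rows, size, cols, size).swapaxes(1, 2)
    # Flatten the tile grid in row-major order: index = row * cols + column.
    return tiles.reshape(rows * cols, size, size)


# A blank 2048x2048 image stands in for a full STEM image.
image = np.zeros((2048, 2048), dtype=np.float32)
crops = non_overlapping_crops(image)
print(crops.shape)  # (16, 512, 512)
```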

Download link: https://mycloud-test.warwick.ac.uk/s/3dq93SPCnJ8RHkA

Password: W4rw1ck3m!

TEM Full Images

Full TEM images. Featured in this paper.

Info: 269.8 GB. 11350 training, 2431 validation and 3486 test images, totalling 17267 images.

Download link: https://mycloud-test.warwick.ac.uk/s/AqFN7tq2REz6GMd

Password: W4rw1ck3m!

Contributors

A list of scientists who may have contributed may be added here...

Contacts

Jeffrey M. Ede:

j.m.ede@warwick.ac.uk

Richard Beanland:

r.beanland@warwick.ac.uk