Events @ Warwick Chemistry

'Decentralized Materials Research Data Management, Curation and Dissemination for Accelerated Discovery', Dr Matthew Evans

Location: A1.01 (Zeeman building) and via Teams

Decentralized Materials Research Data Management, Curation and Dissemination for Accelerated Discovery

Dr Matthew Evans

Université Catholique de Louvain, Belgium; Matgenix SRL, Belgium; datalab industries ltd., United Kingdom

The primary barrier to widespread adoption of AI-accelerated materials science is the availability and quality of data. Researchers lack frictionless tooling and have limited incentive to record their data in a form that is immediately amenable to machine learning, whether by them or by others. This talk introduces two data projects in the materials space that aim to lower the barrier to data access and curation by both humans and machines: the OPTIMADE federation of materials databases, and the open-source datalab materials data management platform.

OPTIMADE is an international consortium of databases that has designed, over many years, a common application programming interface (API) format, which now allows 30+ databases across 20+ providers to be queried seamlessly. Such federated data unification enables decentralized data-driven workflows in materials informatics and beyond, from materials selection up to materials discovery. OPTIMADE is supported by several community-oriented tools that allow others to easily contribute their data to this growing ecosystem. This talk will introduce the OPTIMADE ecosystem, discuss the process of consensus-forming amongst providers, and outline how OPTIMADE could be extended to other domains.

The second project primarily concerns experimental data: datalab is an open-source data management platform that can be customized and adopted by materials research groups to allow for straightforward provenance tracking of samples, devices and raw data. It integrates with the broad open-source community of file format parsers (from the datatractor initiative and other popular packages) to allow for data normalization and simple analysis in the browser for many characterisation techniques (XRD, NMR, Raman, electrochemistry, etc.). This platform provides the traditional benefits of having a digital system of record (e.g., an electronic lab notebook), whilst also enabling programmatic re-use of data across a research group via its API, with the aim of enabling end-user programming. Giving labs control over their data platform lets them build their own AI-driven tools, as well as selectively share data and collaborate with others on shared workflows and samples. This talk will summarize the ongoing developments of datalab, including the integration of AI-based agents, and motivate future use cases of a federation of such datalab deployments.

 

Teams link: https://teams.microsoft.com/l/meetup-join/19%3ac7f48fb5df954f58949b248d309eecce%40thread.tacv2/1740581683921?context=%7b%22Tid%22%3a%2209bacfbd-47ef-4465-9265-3546f2eaf6bc%22%2c%22Oid%22%3a%2277336936-0500-449b-9db0-6331a0c4e368%22%7d

 

Following the seminar we will have a reception in G-block at 4pm; all are welcome to join!

 
