
14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon

Project Overview

The document explores the transformative impact of large language models (LLMs) in materials science and chemistry, with a particular focus on education, showcasing projects developed during a hackathon that leveraged LLMs for predictive modeling, automation, and knowledge extraction. It highlights the role of generative AI in improving research efficiency and data accessibility and in personalizing educational content. Key applications include tools for automated summarization, content generation, and personalized learning assessment, with the i-Digest tool serving as a notable example: it summarizes video lectures and generates relevant questions to promote active learning and student engagement. Overall, the findings suggest that integrating generative AI into education has significant potential to enrich learning experiences and tailor instruction to individual needs.

Key Applications

i-Digest

Context: An educational tool for students and lifelong learners built around online content such as video lectures and podcasts. It enhances learning by summarizing lecture recordings and audio materials and by generating questions based on their content.

Implementation: The tool utilizes OpenAI's Whisper model for audio transcription and GPT-3.5-turbo for summarization and question generation. Transcripts of video lectures are generated and used to create personalized learning experiences and assessment questions tailored to the content.
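A minimal sketch of such a pipeline, using the openai Python client, is shown below; the model names follow those mentioned above, but the file name, prompt wording, and overall structure are illustrative assumptions rather than the actual i-Digest implementation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Transcribe the lecture audio with Whisper.
with open("lecture_01.mp3", "rb") as audio_file:  # hypothetical file name
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Summarize the transcript and generate comprehension questions.
prompt = (
    "Summarize the following lecture transcript, list the key technical "
    "terms, and write three comprehension questions for students:\n\n"
    + transcript.text
)
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

Keeping transcription and generation as separate steps also makes it easy to cache transcripts and regenerate summaries or questions as prompts are refined.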

Outcomes: The implementation generates summaries, technical keywords, and comprehension questions that enhance the learning experience. It offers personalized learning and adapts over time based on student feedback.

Challenges: Challenges include the need for effective integration and continuous updating of content to ensure relevance, as well as difficulties in evaluating system performance due to the lack of suitable benchmarks. Further extensions may include conditioning on additional materials.

Implementation Barriers

Access and Cost

The usage costs of LLMs can become prohibitive, especially for larger datasets, making exploratory research economically infeasible.

Proposed Solutions: Develop a free and open-source framework for fine-tuning LLMs to perform predictive modeling tasks.
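As a rough illustration of what such an open-source route could look like (this is not the framework described in the paper), the sketch below fine-tunes a small, openly available causal language model on property-prediction examples phrased as text, using the Hugging Face transformers and datasets libraries; the model choice, toy records, and hyperparameters are placeholders.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Toy examples framing a prediction task as prompt/completion text.
records = [
    {"text": "What is the band gap class of SiO2? Answer: insulator"},
    {"text": "What is the band gap class of Si? Answer: semiconductor"},
]
dataset = Dataset.from_list(records)

model_name = "gpt2"  # placeholder for any open-source base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

def tokenize(batch):
    # Pad/truncate to a fixed length and reuse the inputs as labels
    # (standard causal language-model fine-tuning).
    tokens = tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=64
    )
    tokens["labels"] = [ids.copy() for ids in tokens["input_ids"]]
    return tokens

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="finetuned-predictor",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    train_dataset=tokenized,
)
trainer.train()
```

A real framework would mask padding tokens out of the loss, train on far larger domain datasets, and wrap prompt construction and evaluation, but even this basic loop avoids per-token API charges entirely.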

Data Limitations

LLMs currently have limited interpretability and robustness, impacting their reliability in educational contexts.

Proposed Solutions: Systematic exploration of methods to improve robustness and interpretability.

Evaluation Barrier

Systematically evaluating the quality of the generated summaries and questions is difficult because established benchmarks are lacking.

Proposed Solutions: Develop suitable benchmarks for assessment and consider extending the system to condition on additional learning materials.

Project Team

All team members contributed as researchers: Kevin Maik Jablonka, Qianxiang Ai, Alexander Al-Feghali, Shruti Badhwar, Joshua D. Bocarsly, Andres M Bran, Stefan Bringuier, L. Catherine Brinson, Kamal Choudhary, Defne Circi, Sam Cox, Wibe A. de Jong, Matthew L. Evans, Nicolas Gastellu, Jerome Genzling, María Victoria Gil, Ankur K. Gupta, Zhi Hong, Alishba Imran, Sabine Kruschwitz, Anne Labarre, Jakub Lála, Tao Liu, Steven Ma, Sauradeep Majumdar, Garrett W. Merz, Nicolas Moitessier, Elias Moubarak, Beatriz Mouriño, Brenden Pelkie, Michael Pieler, Mayk Caldas Ramos, Bojana Ranković, Samuel G. Rodriques, Jacob N. Sanders, Philippe Schwaller, Marcus Schwarting, Jiale Shi, Berend Smit, Ben E. Smith, Joren Van Herck, Christoph Völker, Logan Ward, Sean Warren, Benjamin Weiser, Sylvester Zhang, Xiaoqi Zhang, Ghezal Ahmad Zia, Aristana Scourtas, KJ Schmidt, Ian Foster, Andrew D. White, Ben Blaiszik.

Contact Information

For information about the paper, please contact the authors listed in the Project Team section above.


Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
