14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon
Project Overview
The document reflects on a large language model (LLM) hackathon in materials science and chemistry, showcasing projects that leveraged LLMs for predictive modeling, automation, knowledge extraction, and education. It emphasizes the role of generative AI in enhancing research efficiency, improving data accessibility, and personalizing educational content. Key applications include tools for automated summarization, content generation, and personalized learning assessment, with the i-Digest tool serving as a notable example: it summarizes video lectures and generates relevant questions to promote active learning and student engagement. Overall, the findings suggest that integrating generative AI into education holds significant potential for enriching learning experiences and tailoring educational approaches to individual needs.
Key Applications
i-Digest
Context: An educational tool for students and lifelong learners that works with online content such as video lectures and podcasts. It enhances learning by summarizing the material and generating questions from lecture recordings and other audio sources.
Implementation: The tool uses OpenAI's Whisper model to transcribe video lectures and audio material, and GPT-3.5-turbo to summarize the resulting transcripts and generate assessment questions tailored to the content, which together form the basis of a personalized learning experience (a minimal pipeline sketch follows this subsection).
Outcomes: The implementation generates summaries, technical keywords, and comprehension questions that enhance the learning experience. It offers personalized learning and adapts over time based on student feedback.
Challenges: Challenges include integrating the tool effectively and continuously updating content to keep it relevant, as well as evaluating system performance in the absence of suitable benchmarks. Possible extensions include conditioning the system on additional learning materials.
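The implementation described above can be sketched in a few lines. The following is a minimal, illustrative pipeline assuming the OpenAI Python client (openai>=1.0); the file name, prompt wording, and helper function are placeholders, not details from the i-Digest codebase.

```python
# Minimal sketch of an i-Digest-style pipeline: transcribe a lecture with
# Whisper, then ask GPT-3.5-turbo for a summary, keywords, and questions.
# File name and prompt text are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def digest_lecture(audio_path: str) -> str:
    # Step 1: transcribe the lecture audio with the Whisper API.
    with open(audio_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )

    # Step 2: summarize the transcript and generate study questions.
    prompt = (
        "You are a teaching assistant. Given the lecture transcript below, "
        "produce (1) a short summary, (2) a list of technical keywords, and "
        "(3) three comprehension questions with answers.\n\n"
        f"Transcript:\n{transcript.text}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(digest_lecture("lecture_01.mp3"))
```

In practice, long transcripts would need to be split into chunks that fit the model's context window before summarization and question generation.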
Implementation Barriers
Access and Cost
The usage costs of commercial LLM APIs can become prohibitive, especially for larger datasets, making exploratory research economically infeasible.
Proposed Solutions: Develop a free and open-source framework for fine-tuning LLMs to perform predictive modeling tasks.
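As a rough illustration of what such an open-source fine-tuning framework involves, the sketch below casts a property-prediction task as text completion and fine-tunes a small open model with the Hugging Face transformers and datasets libraries; the toy dataset, prompt template, and choice of gpt2 are illustrative assumptions, not details from the paper.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Hypothetical toy data: molecules (as SMILES strings) with a solubility class.
records = [
    {"smiles": "CCO", "label": "soluble"},
    {"smiles": "c1ccccc1", "label": "insoluble"},
]

def to_text(record):
    # Cast the prediction task as a text-completion problem.
    return {"text": f"What is the solubility of {record['smiles']}? Answer: {record['label']}"}

dataset = Dataset.from_list(records).map(to_text)

model_name = "gpt2"  # small open model used as a stand-in
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llm-property-predictor",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    train_dataset=tokenized,
    # mlm=False -> causal language modeling; labels are copied from input_ids.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

At inference time, the fine-tuned model is prompted with the same template minus the answer, and its completion is parsed back into a prediction.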
Data Limitations
LLMs currently have limited interpretability and robustness, impacting their reliability in educational contexts.
Proposed Solutions: Systematic exploration of methods to improve robustness and interpretability.
Evaluation
Systematically evaluating the quality of the generated summaries and questions is difficult because no established benchmarks exist for this task.
Proposed Solutions: Develop suitable benchmarks for assessment and consider extending the system to condition on additional learning materials.
Project Team
Kevin Maik Jablonka
Researcher
Qianxiang Ai
Researcher
Alexander Al-Feghali
Researcher
Shruti Badhwar
Researcher
Joshua D. Bocarsly
Researcher
Andres M Bran
Researcher
Stefan Bringuier
Researcher
L. Catherine Brinson
Researcher
Kamal Choudhary
Researcher
Defne Circi
Researcher
Sam Cox
Researcher
Wibe A. de Jong
Researcher
Matthew L. Evans
Researcher
Nicolas Gastellu
Researcher
Jerome Genzling
Researcher
María Victoria Gil
Researcher
Ankur K. Gupta
Researcher
Zhi Hong
Researcher
Alishba Imran
Researcher
Sabine Kruschwitz
Researcher
Anne Labarre
Researcher
Jakub Lála
Researcher
Tao Liu
Researcher
Steven Ma
Researcher
Sauradeep Majumdar
Researcher
Garrett W. Merz
Researcher
Nicolas Moitessier
Researcher
Elias Moubarak
Researcher
Beatriz Mouriño
Researcher
Brenden Pelkie
Researcher
Michael Pieler
Researcher
Mayk Caldas Ramos
Researcher
Bojana Ranković
Researcher
Samuel G. Rodriques
Researcher
Jacob N. Sanders
Researcher
Philippe Schwaller
Researcher
Marcus Schwarting
Researcher
Jiale Shi
Researcher
Berend Smit
Researcher
Ben E. Smith
Researcher
Joren Van Herck
Researcher
Christoph Völker
Researcher
Logan Ward
Researcher
Sean Warren
Researcher
Benjamin Weiser
Researcher
Sylvester Zhang
Researcher
Xiaoqi Zhang
Researcher
Ghezal Ahmad Zia
Researcher
Aristana Scourtas
Researcher
KJ Schmidt
Researcher
Ian Foster
Researcher
Andrew D. White
Researcher
Ben Blaiszik
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Kevin Maik Jablonka, Qianxiang Ai, Alexander Al-Feghali, Shruti Badhwar, Joshua D. Bocarsly, Andres M Bran, Stefan Bringuier, L. Catherine Brinson, Kamal Choudhary, Defne Circi, Sam Cox, Wibe A. de Jong, Matthew L. Evans, Nicolas Gastellu, Jerome Genzling, María Victoria Gil, Ankur K. Gupta, Zhi Hong, Alishba Imran, Sabine Kruschwitz, Anne Labarre, Jakub Lála, Tao Liu, Steven Ma, Sauradeep Majumdar, Garrett W. Merz, Nicolas Moitessier, Elias Moubarak, Beatriz Mouriño, Brenden Pelkie, Michael Pieler, Mayk Caldas Ramos, Bojana Ranković, Samuel G. Rodriques, Jacob N. Sanders, Philippe Schwaller, Marcus Schwarting, Jiale Shi, Berend Smit, Ben E. Smith, Joren Van Herck, Christoph Völker, Logan Ward, Sean Warren, Benjamin Weiser, Sylvester Zhang, Xiaoqi Zhang, Ghezal Ahmad Zia, Aristana Scourtas, KJ Schmidt, Ian Foster, Andrew D. White, Ben Blaiszik
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI