Could ChatGPT get an Engineering Degree? Evaluating Higher Education Vulnerability to AI Assistants
Project Overview
The document explores the use of generative AI, specifically large language models (LLMs) such as GPT-3.5 and GPT-4, in higher education, highlighting both potential benefits and significant challenges. It shows that these tools can correctly answer a substantial percentage of standard university-level STEM assessment questions, which raises concerns about academic integrity and points to a need for revised assessment strategies. Performance evaluations show that generative AI performs well across academic fields such as Biology, Chemistry, Computer Science, and Physics, but that its effectiveness varies with the prompting strategy used and the language in which a question is posed. The findings suggest that generative AI could improve educational outcomes, provided that its inherent challenges are addressed, particularly those concerning assessment integrity and the nature of the questions posed to the models. Overall, the document calls for careful consideration of how to harness the advantages of generative AI while mitigating risks to academic standards.
Key Applications
Assessment of generative AI performance in educational settings
Context: Higher education institutions across STEM fields, including engineering, computer science, life sciences, physics, mathematics, and chemistry courses. The focus is on evaluating AI-assisted responses to various types of assessment questions, including multiple-choice, open-ended, and complex problem-solving tasks.
Implementation: Used various generative AI models, including GPT-3.5 and GPT-4, to answer assessment questions across disciplines. This involved compiling datasets of assessment questions, applying different prompting strategies, and analyzing performance by difficulty level and by the cognitive skills outlined in Bloom's taxonomy.
Outcomes: AI models demonstrated a high capability in answering basic questions, achieving notable accuracy rates, and improving student engagement. However, they performed worse on complex and higher-order problem-solving tasks. The results raise concerns about academic integrity and about whether students who rely on these tools will understand critical concepts.
Challenges: AI struggles with complex question types, mathematical derivations, and maintaining high accuracy across different question formats. Variability in performance often depends on the prompting strategies used and the structure of the questions.
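The analysis described above (grouping graded model answers by cognitive level and by prompting strategy) can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the record fields, the strategy names `zero_shot` and `chain_of_thought`, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical graded records: one row per (question, prompting strategy).
# Field names and values are illustrative, not taken from the study's data.
RECORDS = [
    {"question_id": "q1", "bloom": "remember", "strategy": "zero_shot", "correct": True},
    {"question_id": "q1", "bloom": "remember", "strategy": "chain_of_thought", "correct": True},
    {"question_id": "q2", "bloom": "apply", "strategy": "zero_shot", "correct": False},
    {"question_id": "q2", "bloom": "apply", "strategy": "chain_of_thought", "correct": True},
    {"question_id": "q3", "bloom": "analyze", "strategy": "zero_shot", "correct": False},
    {"question_id": "q3", "bloom": "analyze", "strategy": "chain_of_thought", "correct": False},
]


def accuracy_by(records, key):
    """Return {group: fraction correct}, grouping records by the field `key`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record in records:
        totals[record[key]] += 1
        hits[record[key]] += int(record["correct"])
    return {group: hits[group] / totals[group] for group in totals}


def solved_by_any_strategy(records):
    """Fraction of questions answered correctly by at least one strategy."""
    solved = defaultdict(bool)
    for record in records:
        qid = record["question_id"]
        solved[qid] = solved[qid] or record["correct"]
    return sum(solved.values()) / len(solved)


if __name__ == "__main__":
    print(accuracy_by(RECORDS, "bloom"))
    print(accuracy_by(RECORDS, "strategy"))
    print(solved_by_any_strategy(RECORDS))
```

With these made-up records, accuracy is 1.0 for `remember`, 0.5 for `apply`, and 0.0 for `analyze`, which mirrors the qualitative pattern reported above without reproducing any real figures.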
Implementation Barriers
Technical
Generative AI struggles with more complex question types and open-ended questions that require nuanced understanding or complex reasoning, leading to variability in performance.
Proposed Solutions: Revise assessment designs to incorporate more complex problem-solving that AI systems find challenging. Develop targeted training and fine-tuning of AI models on specific academic content.
Ethical
Concerns regarding academic integrity and potential misuse of AI tools for cheating.
Proposed Solutions: Implement AI-adversarial evaluation methods and emphasize ethical training for students.
Implementation
Difficulty in integrating AI tools within existing educational frameworks without compromising learning outcomes.
Proposed Solutions: Adapt educational assessments to include project-based evaluations, promoting the application of knowledge over rote memorization.
Performance Variability
Performance of AI models varies significantly based on the prompting strategy used and question complexity.
Proposed Solutions: Experiment with multiple prompting strategies to find optimal configurations for different question types.
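One way to compare prompting strategies per question type is sketched below. It is an illustration under stated assumptions only: the question types, strategy names, and results are invented for the example, and ties are resolved in favor of whichever strategy appears first in the data.

```python
from collections import defaultdict

# Hypothetical graded results as (question_type, strategy, correct) tuples.
# All names and outcomes are illustrative assumptions.
RESULTS = [
    ("multiple_choice", "zero_shot", True),
    ("multiple_choice", "zero_shot", True),
    ("multiple_choice", "chain_of_thought", True),
    ("multiple_choice", "chain_of_thought", False),
    ("open_ended", "zero_shot", False),
    ("open_ended", "zero_shot", False),
    ("open_ended", "chain_of_thought", True),
    ("open_ended", "chain_of_thought", False),
]


def best_strategy_per_type(results):
    """Return {question_type: (strategy, accuracy)} for the top strategy."""
    # (question_type, strategy) -> [number correct, number attempted]
    counts = defaultdict(lambda: [0, 0])
    for qtype, strategy, correct in results:
        counts[(qtype, strategy)][0] += int(correct)
        counts[(qtype, strategy)][1] += 1

    best = {}
    for (qtype, strategy), (hits, total) in counts.items():
        accuracy = hits / total
        # Strict comparison keeps the first-seen strategy on ties.
        if qtype not in best or accuracy > best[qtype][1]:
            best[qtype] = (strategy, accuracy)
    return best


if __name__ == "__main__":
    for qtype, (strategy, accuracy) in best_strategy_per_type(RESULTS).items():
        print(f"{qtype}: {strategy} ({accuracy:.2f})")
```

In practice the comparison would need many more questions per group than this toy sample before any strategy could be called optimal for a question type.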
Language Limitations
AI performance can be hindered by the language in which questions are posed, showing reduced effectiveness in non-English contexts.
Proposed Solutions: Enhance multilingual capabilities of models and use language-specific training datasets.
Project Team
Beatriz Borges
Researcher
Negar Foroutan
Researcher
Deniz Bayazit
Researcher
Anna Sotnikova
Researcher
Syrielle Montariol
Researcher
Tanya Nazaretzky
Researcher
Mohammadreza Banaei
Researcher
Alireza Sakhaeirad
Researcher
Philippe Servant
Researcher
Seyed Parsa Neshaei
Researcher
Jibril Frej
Researcher
Angelika Romanou
Researcher
Gail Weiss
Researcher
Sepideh Mamooler
Researcher
Zeming Chen
Researcher
Simin Fan
Researcher
Silin Gao
Researcher
Mete Ismayilzada
Researcher
Debjit Paul
Researcher
Alexandre Schöpfer
Researcher
Andrej Janchevski
Researcher
Anja Tiede
Researcher
Clarence Linden
Researcher
Emanuele Troiani
Researcher
Francesco Salvi
Researcher
Freya Behrens
Researcher
Giacomo Orsi
Researcher
Giovanni Piccioli
Researcher
Hadrien Sevel
Researcher
Louis Coulon
Researcher
Manuela Pineros-Rodriguez
Researcher
Marin Bonnassies
Researcher
Pierre Hellich
Researcher
Puck van Gerwen
Researcher
Sankalp Gambhir
Researcher
Solal Pirelli
Researcher
Thomas Blanchard
Researcher
Timothée Callens
Researcher
Toni Abi Aoun
Researcher
Yannick Calvino Alonso
Researcher
Yuri Cho
Researcher
Alberto Chiappa
Researcher
Antonio Sclocchi
Researcher
Étienne Bruno
Researcher
Florian Hofhammer
Researcher
Gabriel Pescia
Researcher
Geovani Rizk
Researcher
Leello Dadi
Researcher
Lucas Stoffl
Researcher
Manoel Horta Ribeiro
Researcher
Matthieu Bovel
Researcher
Yueyang Pan
Researcher
Aleksandra Radenovic
Researcher
Alexandre Alahi
Researcher
Alexander Mathis
Researcher
Anne-Florence Bitbol
Researcher
Boi Faltings
Researcher
Cécile Hébert
Researcher
Devis Tuia
Researcher
François Maréchal
Researcher
George Candea
Researcher
Giuseppe Carleo
Researcher
Jean-Cédric Chappelier
Researcher
Nicolas Flammarion
Researcher
Jean-Marie Fürbringer
Researcher
Jean-Philippe Pellet
Researcher
Karl Aberer
Researcher
Lenka Zdeborová
Researcher
Marcel Salathé
Researcher
Martin Jaggi
Researcher
Martin Rajman
Researcher
Mathias Payer
Researcher
Matthieu Wyart
Researcher
Michael Gastpar
Researcher
Michele Ceriotti
Researcher
Ola Svensson
Researcher
Olivier Lévêque
Researcher
Paolo Ienne
Researcher
Rachid Guerraoui
Researcher
Robert West
Researcher
Sanidhya Kashyap
Researcher
Valerio Piazza
Researcher
Viesturs Simanis
Researcher
Viktor Kuncak
Researcher
Volkan Cevher
Researcher
Philippe Schwaller
Researcher
Sacha Friedli
Researcher
Patrick Jermann
Researcher
Tanja Käser
Researcher
Antoine Bosselut
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Beatriz Borges, Negar Foroutan, Deniz Bayazit, Anna Sotnikova, Syrielle Montariol, Tanya Nazaretzky, Mohammadreza Banaei, Alireza Sakhaeirad, Philippe Servant, Seyed Parsa Neshaei, Jibril Frej, Angelika Romanou, Gail Weiss, Sepideh Mamooler, Zeming Chen, Simin Fan, Silin Gao, Mete Ismayilzada, Debjit Paul, Alexandre Schöpfer, Andrej Janchevski, Anja Tiede, Clarence Linden, Emanuele Troiani, Francesco Salvi, Freya Behrens, Giacomo Orsi, Giovanni Piccioli, Hadrien Sevel, Louis Coulon, Manuela Pineros-Rodriguez, Marin Bonnassies, Pierre Hellich, Puck van Gerwen, Sankalp Gambhir, Solal Pirelli, Thomas Blanchard, Timothée Callens, Toni Abi Aoun, Yannick Calvino Alonso, Yuri Cho, Alberto Chiappa, Antonio Sclocchi, Étienne Bruno, Florian Hofhammer, Gabriel Pescia, Geovani Rizk, Leello Dadi, Lucas Stoffl, Manoel Horta Ribeiro, Matthieu Bovel, Yueyang Pan, Aleksandra Radenovic, Alexandre Alahi, Alexander Mathis, Anne-Florence Bitbol, Boi Faltings, Cécile Hébert, Devis Tuia, François Maréchal, George Candea, Giuseppe Carleo, Jean-Cédric Chappelier, Nicolas Flammarion, Jean-Marie Fürbringer, Jean-Philippe Pellet, Karl Aberer, Lenka Zdeborová, Marcel Salathé, Martin Jaggi, Martin Rajman, Mathias Payer, Matthieu Wyart, Michael Gastpar, Michele Ceriotti, Ola Svensson, Olivier Lévêque, Paolo Ienne, Rachid Guerraoui, Robert West, Sanidhya Kashyap, Valerio Piazza, Viesturs Simanis, Viktor Kuncak, Volkan Cevher, Philippe Schwaller, Sacha Friedli, Patrick Jermann, Tanja Käser, Antoine Bosselut
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI