SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation
Project Overview
The Schema-Based Instruction Retrieval-Augmented Generation (SBI-RAG) framework combines a large language model with schema-based instruction, which categorizes math word problems by their underlying structure to support clearer reasoning. The approach outperforms existing models such as GPT-4 and GPT-3.5 Turbo, particularly in structured problem solving and the quality of reasoning. Its main limitations are a dependence on the relevance of retrieved documents and the need for human evaluation to validate performance. Overall, the work illustrates the potential of generative AI in education, particularly in mathematics, to improve educational outcomes and reasoning skills in problem solving.
Key Applications
Schema-Based Instruction Retrieval-Augmented Generation (SBI-RAG)
Context: Math word problems in educational settings, aimed at middle school students
Implementation: Utilizes a schema classifier trained on a custom dataset to predict schemas for word problems, then generates schema-specific prompts and retrieves relevant context for LLMs to generate structured solutions.
Outcomes: Clearer reasoning, a more structured problem-solving process, and higher reasoning scores than GPT-4 and GPT-3.5 Turbo.
Challenges: Dependence on the relevance and quality of retrieved documents, lack of direct human evaluation, and potential limitations in generalizability across different subjects and educational levels.
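The pipeline described above can be sketched in miniature. This is a hedged illustration only: the paper's schema classifier is a trained model, whereas the keyword rules, schema names (change/group/compare), and prompt templates below are illustrative stand-ins.

```python
# Minimal sketch of the SBI-RAG stages: classify the problem's schema,
# then build a schema-specific prompt that includes retrieved context.
# The classifier here is a toy keyword heuristic, not the trained model.

SCHEMA_PROMPTS = {
    "change": "This is a 'change' problem: a start quantity is increased "
              "or decreased. Identify the start, change, and result.",
    "group": "This is a 'group' problem: parts combine into a whole. "
             "Identify each part and the total.",
    "compare": "This is a 'compare' problem: two quantities differ. "
               "Identify the larger, smaller, and difference.",
}

def classify_schema(problem: str) -> str:
    """Stand-in for the trained schema classifier (keyword heuristics)."""
    text = problem.lower()
    if any(w in text for w in ("gave", "lost", "received", "spent")):
        return "change"
    if any(w in text for w in ("more than", "fewer than", "less than")):
        return "compare"
    return "group"

def build_prompt(problem: str, retrieved_context: list[str]) -> str:
    """Combine the schema-specific instructions with retrieved context."""
    schema = classify_schema(problem)
    context = "\n".join(retrieved_context)
    return (
        f"Schema: {schema}\n"
        f"{SCHEMA_PROMPTS[schema]}\n\n"
        f"Context:\n{context}\n\n"
        f"Problem: {problem}\n"
        f"Solve step by step, labeling each schema component."
    )

prompt = build_prompt(
    "Maya had 12 apples and gave 5 to a friend. How many are left?",
    ["A 'change' schema has a start quantity, a change, and a result."],
)
print(prompt.splitlines()[0])  # Schema: change
```

In the full framework the resulting prompt would be passed to the LLM, which generates a structured, schema-labeled solution.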
Implementation Barriers
Technical Barrier
The effectiveness of the SBI-RAG framework relies heavily on the quality and relevance of the documents retrieved for context, which can vary.
Proposed Solutions: Future work includes refining the document retrieval process and ensuring that the context provided is high-quality and relevant.
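One way to refine the retrieval step, as a hedged sketch, is to score candidate documents against the problem and drop those below a relevance threshold before they reach the LLM. The bag-of-words cosine similarity and the threshold value below are illustrative assumptions; a production system would use dense embeddings.

```python
# Sketch: filter retrieved documents by relevance to the query before
# they are included in the prompt context. Toy bag-of-words cosine
# similarity stands in for a real embedding-based retriever.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def filter_relevant(query: str, docs: list[str],
                    threshold: float = 0.2) -> list[str]:
    """Keep only documents scoring above the relevance threshold,
    ordered from most to least relevant."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(d.lower().split())), d) for d in docs]
    return [d for s, d in sorted(scored, reverse=True) if s >= threshold]

docs = [
    "A change schema tracks a start quantity, a change, and a result.",
    "Photosynthesis converts light energy into chemical energy.",
]
kept = filter_relevant("change schema start result", docs)
```

Filtering in this way keeps low-relevance text out of the prompt, which directly addresses the variability in retrieved-document quality noted above.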
Human Evaluation Barrier
The current evaluation method relies on LLMs for judging reasoning quality, lacking direct feedback from educators or students.
Proposed Solutions: Incorporating human evaluations to gather more informative feedback and enhance the system's adaptability.
Generalizability Barrier
The current framework has primarily been evaluated on arithmetic word problems, raising questions about its applicability to more complex problems.
Proposed Solutions: Extending the framework to include a wider variety of problem types and educational contexts.
Project Team
Prakhar Dixit
Researcher
Tim Oates
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Prakhar Dixit, Tim Oates
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI