
A LoRA-Based Approach to Fine-Tuning LLMs for Educational Guidance in Resource-Constrained Settings

Project Overview

This project explores the application of generative AI, specifically large language models (LLMs), to academic guidance in resource-constrained educational settings. It presents a cost-effective approach that uses Low-Rank Adaptation (LoRA) and quantization to fine-tune an LLM so it can deliver accurate, contextualized advising to students, especially those considering study-abroad opportunities. The study reports notable improvements in performance metrics from this adaptation, while also addressing challenges around model generalizability and reliance on synthetic training data. Overall, the findings underscore the potential of generative AI to make educational support systems more accessible and effective in diverse settings.

Key Applications

Fine-tuning of Mistral-7B-Instruct for academic advising

Context: Educational guidance for students looking to study abroad, particularly in low-resource settings

Implementation: Utilized a two-phase fine-tuning method with synthetic and real-world datasets, employing LoRA for parameter-efficient adaptation and 4-bit quantization for memory efficiency; a configuration sketch is shown after this list.

Outcomes: Achieved a 52.7% reduction in training loss, 92% accuracy in domain-specific recommendations, and 95% adherence to Markdown formatting, demonstrating the viability of deployment in limited-resource environments.

Challenges: Reduced generalizability due to reliance on a synthetic dataset, and potential performance degradation on vague or incomplete queries.
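
To make the adaptation recipe concrete, here is a minimal sketch of how LoRA adapters can be attached to a 4-bit-quantized Mistral-7B-Instruct using the Hugging Face transformers, bitsandbytes, and peft libraries. The model revision, LoRA rank, alpha, and target modules are illustrative assumptions; the paper's exact configuration is not reproduced here.

```python
# Minimal sketch: LoRA fine-tuning setup on a 4-bit-quantized Mistral-7B-Instruct.
# All hyperparameters below are assumptions for illustration, not the paper's values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed model revision

# 4-bit NF4 quantization keeps the frozen base weights small in memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA trains small low-rank update matrices on the attention projections
# while the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=16,                     # assumed rank
    lora_alpha=32,            # assumed scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

Under this setup, the two-phase method the paper describes would run training twice: first over the synthetic advising dataset, then over the real-world one, resuming from the first phase's adapter weights.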

Implementation Barriers

Technical Barrier

Operational complexity arising from energy consumption, model memory footprints, and response latency in low-resource environments.

Proposed Solutions: Implementing parameter-efficient strategies such as LoRA and 4-bit quantization to reduce resource requirements; a minimal inference sketch follows.
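
To illustrate how these strategies cut the deployment footprint, the sketch below loads the quantized base model together with a saved LoRA adapter for inference. The adapter path and the example query are hypothetical; the rough memory figures assume 7 billion parameters stored at 4 bits versus 16 bits.

```python
# Minimal inference sketch: 4-bit base model plus a saved LoRA adapter.
# At 4 bits, 7B weights occupy roughly 3.5 GB (plus overhead), versus ~14 GB at 16 bits.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed model revision
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
base = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(base, "./advising-adapter")  # hypothetical path

# Mistral-Instruct expects [INST] ... [/INST] turn markers.
prompt = "[INST] Which standardized tests do I need for a master's degree abroad? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Greedy decoding (do_sample=False) is one way to keep advising answers deterministic; capping max_new_tokens also bounds per-query latency on constrained hardware.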

Content Barrier

The synthetic dataset may not reflect the complexity and variability of actual user interactions, limiting the model's effectiveness.

Proposed Solutions: Future work includes integrating real-time academic databases and more diverse datasets to improve generalization.

Project Team

Md Millat Hosen

Researcher

Contact Information

For information about the paper, please contact the author.

Author: Md Millat Hosen

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
