Formal Report
Exploring the Use of AI in Mathematics and Statistics Assessments: At a Glance
1. Introduction to the Project
The project explores the impact of advanced AI technologies, specifically Large Language Models (LLMs) such as GPT-4o, on mathematics and statistics assessments in higher education. It aims to understand how AI can assist in education without compromising academic integrity.
2. ChatGPT and Limitations
The launch of ChatGPT in November 2022 significantly transformed academic environments, making it easier for students to incorporate AI into their assignments. This shift also raised concerns about academic integrity and fostered a general distrust of AI models like GPT-3.5, particularly because of their unreliability in solving advanced mathematical problems.
3. LLMs and Evolution
LLMs such as GPT-3.5 and GPT-4 rely on pattern prediction rather than mathematical computation, which leads to errors in complex tasks. Over time, scaling and algorithmic advances, reflected in releases such as GPT-4o, have improved their capabilities.
4. Capabilities and System 2 Thinking
ChatGPT o1 marked a significant advance by integrating “System 2” thinking, in which the model reasons through a complex problem before responding. This improves performance on more difficult tasks such as advanced mathematics and coding.
5. A Step Change in AI Capabilities and Key Findings
The introduction of ChatGPT o1 represents a pivotal moment in AI development, especially for mathematical and technical tasks. By incorporating advanced reasoning capabilities, o1 overcomes limitations of prior models, enabling complex, multi-step reasoning processes. This shift necessitates further re-evaluation of assessment methods to maintain academic integrity while leveraging AI's benefits.
The study's key findings concern students' understanding of AI capabilities and their ethical concerns. While many students use AI tools, there is widespread apprehension about academic integrity. The findings highlight the need for clear guidelines, education on ethical AI use, and open dialogue within academic institutions.
6. Performance, Urgency, and Action
AI integration requires universities to re-evaluate assessment methods. Traditional exams may not be enough to prevent misuse. Institutions need to shift from avoidance to thoughtful incorporation of AI in learning environments, promoting AI literacy and ethical awareness.
7. The Problem and Regulations
Institutions must recognise AI’s capabilities and develop clear policies. It's essential to implement guidelines around AI use, define what is permissible, and educate students and faculty on ethical AI practices. A collaborative approach is necessary to address the challenges effectively.
8. Recommendations and Conclusion
The report provides recommendations around AI in education, focusing on three core principles: verification, transparency, and ownership. Clear regulations and proactive measures can help balance AI's benefits with maintaining academic integrity.