
Empirical Study of Symmetrical Reasoning in Conversational Chatbots

Project Overview

This study explores the application of generative AI, specifically conversational chatbots powered by large language models (LLMs), in educational contexts, focusing on their ability to understand and evaluate predicate symmetry, a cognitive linguistic function. Using in-context learning (ICL), the study assesses the reasoning skills of several AI chatbots, including ChatGPT 4, Huggingface chat AI, Microsoft's Copilot AI, LLaMA through Perplexity, and Gemini Advanced, on the Symmetry Inference Sentence (SIS) dataset. The findings reveal that while some chatbots demonstrate reasoning capabilities approaching those of humans, performance varies across models, highlighting both the potential for integrating LLMs into educational settings and the challenges that remain in achieving a consistent understanding of complex linguistic concepts. The study underscores the promise of generative AI for educational tools and methodologies, while also pointing to the need for further research and refinement before their effectiveness in learning environments can be fully realized.

Key Applications

Conversational chatbots for evaluating predicate symmetry

Context: Educational research and development, targeting researchers and educators in linguistics and AI.

Implementation: Chatbots were prompted with pairs of sentences to evaluate symmetry using a specifically designed scoring system.

Outcomes: Some chatbots, particularly Gemini, performed competitively with human evaluators, showing potential for ICL in understanding linguistic symmetry.

Challenges: Variations in performance among chatbots indicate an uneven grasp of linguistic nuances and symmetry.
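The ICL setup described above can be sketched in code. This is a minimal illustration only: the few-shot examples, the 1-5 scale, and the prompt wording are assumptions for the sake of the sketch, not the authors' exact protocol or the SIS dataset's actual annotation scheme.

```python
# Illustrative few-shot examples: (original, argument-swapped, symmetry score).
# These sentences and scores are hypothetical, not drawn from the SIS dataset.
FEW_SHOT = [
    ("Anna married Ben.", "Ben married Anna.", 5),   # "marry" behaves symmetrically
    ("Anna admires Ben.", "Ben admires Anna.", 1),   # "admire" does not
]

def build_symmetry_prompt(sentence: str, swapped: str) -> str:
    """Assemble a few-shot ICL prompt asking a chatbot to rate, on an
    assumed 1-5 scale, how well meaning survives swapping the predicate's
    arguments."""
    lines = [
        "Rate from 1 (meaning changes completely) to 5 (meaning is preserved)",
        "how well the second sentence follows from the first when the",
        "arguments of the predicate are swapped.",
        "",
    ]
    for original, swap, score in FEW_SHOT:
        lines += [f"Original: {original}", f"Swapped: {swap}", f"Score: {score}", ""]
    # The target pair is appended last; the chatbot completes the final score.
    lines += [f"Original: {sentence}", f"Swapped: {swapped}", "Score:"]
    return "\n".join(lines)

prompt = build_symmetry_prompt("Mary met John.", "John met Mary.")
print(prompt)
```

The resulting prompt string would then be sent to each chatbot under evaluation; the study's own scoring system and sentence pairs are described in the original paper.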

Implementation Barriers

Technological

The stochastic nature of chatbots leads to variability in responses, which may affect reliability. Additionally, chatbots may struggle with nuanced linguistic properties and context-dependent meanings.

Proposed Solutions: Further research into the randomness of responses to improve consistency, and development of training methods that enhance chatbots' understanding of linguistic features and context.
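One common way to mitigate the response variability described above is to query a chatbot several times and aggregate the scores. The sketch below is an illustration of that idea, not a method from the paper; `query_chatbot` is a hypothetical stand-in that simulates a noisy 1-5 symmetry score rather than calling a real model.

```python
import random
import statistics

def query_chatbot(sentence_pair, rng):
    """Hypothetical stand-in for a chatbot call: returns a noisy 1-5
    symmetry score around an assumed true score of 4."""
    true_score = 4
    noise = rng.choice([-1, 0, 0, 0, 1])
    return max(1, min(5, true_score + noise))

def aggregate_score(sentence_pair, n_samples=7, seed=0):
    """Taking the median over repeated queries damps run-to-run
    variability caused by stochastic decoding."""
    rng = random.Random(seed)
    scores = [query_chatbot(sentence_pair, rng) for _ in range(n_samples)]
    return statistics.median(scores)

print(aggregate_score(("Anna met Ben.", "Ben met Anna.")))
```

With a real chatbot, the same aggregation would be applied over repeated API calls at a fixed prompt; how far such averaging restores reliability is exactly the kind of question the proposed follow-up research would address.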

Project Team

Daniela N. Rim

Researcher

Heeyoul Choi

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Daniela N. Rim, Heeyoul Choi

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
