Chat2VIS: Fine-Tuning Data Visualisations using Multilingual Natural Language Text and Pre-Trained Large Language Models
Project Overview
This document examines the role of generative AI in education through Chat2VIS, a natural language interface that uses large language models (LLMs) such as GPT-3, Codex, and ChatGPT to generate data visualizations from natural language requests. Chat2VIS interprets multilingual queries and iteratively refines visualizations in response to follow-up prompts, improving accessibility and usability for diverse learners. The paper also discusses the difficulty of establishing benchmarks for evaluating natural language to visualization (NL2VIS) systems and reports quantitative comparisons of Chat2VIS against existing standards, illustrating its effectiveness in educational contexts. Overall, the findings suggest that generative AI tools like Chat2VIS can significantly change how students and educators interact with data, fostering a more intuitive learning environment and enabling deeper engagement with complex information through visual representation.
Key Applications
Chat2VIS
Context: Generates data visualizations from natural language queries for users with varying technical skills.
Implementation: Chat2VIS uses LLMs to translate natural language requests into plotting code, which is then executed to render the visualization.
Outcomes: Enables users to create visualizations without programming knowledge, supports multilingual input, and allows for iterative refinements.
Challenges: Performance accuracy varies based on the clarity of the natural language input and the capability of the LLMs.
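The pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual implementation: the function names (`build_prompt`, `render`) and the prompt wording are hypothetical, and a real deployment would sandbox the generated code before running it.

```python
# Sketch of a Chat2VIS-style NL2VIS pipeline: describe the dataset in a
# prompt, ask an LLM to answer with plotting code, then execute that code.

def build_prompt(columns, query):
    """Describe the dataset schema, then append the user's request,
    asking the model to reply with plotting code only."""
    schema = ", ".join(f"{name} ({dtype})" for name, dtype in columns)
    return (
        f"Use a dataframe called df with columns: {schema}.\n"
        f"{query}\n"
        "Respond only with Python matplotlib code that draws the chart."
    )

def render(df, generated_code):
    """Run the code returned by the LLM with 'df' bound to the user's
    data. In practice the code must be sandboxed before execution."""
    exec(generated_code, {"df": df})

# The same build_prompt call works whatever language the query arrives
# in, since the query text is passed through to the LLM verbatim.
prompt = build_prompt(
    [("country", "str"), ("gdp", "float")],
    "Show GDP by country as a bar chart.",
)
```

Iterative refinement then amounts to appending a follow-up request (e.g. "now sort the bars by GDP") and regenerating the code.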
Implementation Barriers
Technical Limitations
The accuracy of visualizations depends on the LLM's ability to interpret natural language queries effectively.
Proposed Solutions: Improving the training datasets and refining prompt engineering techniques to enhance understanding.
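As a rough illustration of the prompt-engineering point (a hypothetical sketch, not a technique from the paper): appending explicit constraints to an ambiguous request narrows the space of interpretations the LLM must resolve on its own.

```python
# Hypothetical prompt refinement: the same request, with explicit
# constraints appended so the model has less left to guess.

def refine(prompt, constraints):
    """Append bullet-point constraints to an ambiguous request."""
    return prompt + "\n" + "\n".join(f"- {c}" for c in constraints)

refined = refine(
    "Plot sales over time.",
    [
        "Treat the 'date' column as a datetime axis.",
        "Aggregate sales by month before plotting.",
        "Label both axes and add a title.",
    ],
)
```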
Benchmarking Challenges
Existing benchmarks for evaluating NL2VIS systems are limited and may not accurately reflect the capabilities of different systems.
Proposed Solutions: Developing comprehensive methodologies for benchmarking that account for the diverse nature of visualizations.
Project Team
Paula Maddigan
Researcher
Teo Susnjak
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Paula Maddigan, Teo Susnjak
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI