Chat2VIS: Fine-Tuning Data Visualisations using Multilingual Natural Language Text and Pre-Trained Large Language Models
Project Overview
This document examines the role of generative AI in education through Chat2VIS, a natural language interface that uses large language models (LLMs) such as GPT-3, Codex, and ChatGPT to generate data visualizations from natural language requests. Chat2VIS interprets multilingual queries and iteratively refines visualizations in response to follow-up prompts, improving accessibility and usability for diverse learners. The paper also discusses the difficulty of establishing benchmarks for evaluating natural language to visualization (NL2VIS) systems and reports quantitative comparisons of Chat2VIS against existing standards, illustrating its effectiveness in educational contexts. Overall, the findings suggest that generative AI tools like Chat2VIS can significantly change how students and educators interact with data, fostering a more intuitive learning environment and enabling deeper engagement with complex information through visual representation.
Key Applications
Chat2VIS
Context: Generates data visualizations from natural language queries for users with varying technical skills.
Implementation: Chat2VIS uses LLMs to translate natural language requests into plotting code, which is then executed to render the visualization.
Outcomes: Enables users to create visualizations without programming knowledge, supports multilingual input, and allows for iterative refinements.
Challenges: Performance accuracy varies based on the clarity of the natural language input and the capability of the LLMs.
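The pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual implementation: the function names (`build_prompt`, `render`) and the prompt wording are hypothetical, and a real deployment would sandbox the generated code before running it.

```python
# Sketch of a Chat2VIS-style NL2VIS pipeline: describe the dataset in a
# prompt, ask an LLM to answer with plotting code, then execute that code.

def build_prompt(columns, query):
    """Describe the dataset schema, then append the user's request,
    asking the model to reply with plotting code only."""
    schema = ", ".join(f"{name} ({dtype})" for name, dtype in columns)
    return (
        f"Use a dataframe called df with columns: {schema}.\n"
        f"{query}\n"
        "Respond only with Python matplotlib code that draws the chart."
    )

def render(df, generated_code):
    """Run the code returned by the LLM with 'df' bound to the user's
    data. In practice the code must be sandboxed before execution."""
    exec(generated_code, {"df": df})

# The same build_prompt call works whatever language the query arrives
# in, since the query text is passed through to the LLM verbatim.
prompt = build_prompt(
    [("country", "str"), ("gdp", "float")],
    "Show GDP by country as a bar chart.",
)
```

Iterative refinement then amounts to appending a follow-up request (e.g. "now sort the bars by GDP") and regenerating the code.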
Implementation Barriers
Technical Limitations
The accuracy of visualizations depends on the LLM's ability to interpret natural language queries effectively.
Proposed Solutions: Improving the training datasets and refining prompt engineering techniques to enhance understanding.
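As a rough illustration of the prompt-engineering point (a hypothetical sketch, not a technique from the paper): appending explicit constraints to an ambiguous request narrows the space of interpretations the LLM must resolve on its own.

```python
# Hypothetical prompt refinement: the same request, with explicit
# constraints appended so the model has less left to guess.

def refine(prompt, constraints):
    """Append bullet-point constraints to an ambiguous request."""
    return prompt + "\n" + "\n".join(f"- {c}" for c in constraints)

refined = refine(
    "Plot sales over time.",
    [
        "Treat the 'date' column as a datetime axis.",
        "Aggregate sales by month before plotting.",
        "Label both axes and add a title.",
    ],
)
```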
Benchmarking Challenges
Existing benchmarks for evaluating NL2VIS systems are limited and may not accurately reflect the capabilities of different systems.
Proposed Solutions: Developing comprehensive methodologies for benchmarking that account for the diverse nature of visualizations.
Project Team
Paula Maddigan
Researcher
Teo Susnjak
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Paula Maddigan, Teo Susnjak
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI