
Knowledge AI: Fine-tuning NLP Models for Facilitating Scientific Knowledge Extraction and Understanding

Project Overview

This page summarizes Knowledge AI, a deep learning framework that uses Large Language Models (LLMs) to enhance Natural Language Processing (NLP) capabilities in the scientific domain. By fine-tuning these models for tasks including summarization, text generation, question answering, and named entity recognition, the framework aims to make scientific knowledge more accessible to non-experts, bridging the communication gap between researchers and the general public. The findings indicate that domain-specific fine-tuning substantially improves model performance across these tasks, highlighting the efficacy of generative AI in educational contexts. Overall, Knowledge AI serves as a tool for democratizing scientific information, helping those outside the expert community understand and engage with complex topics.

Key Applications

Knowledge AI framework for scientific text processing

Context: Scientific research aimed at making complex findings accessible to a general audience.

Implementation: Fine-tuning pre-trained LLMs on specific scientific datasets for tasks like summarization, QA, text generation, and NER.

Outcomes: Improved model performance in summarization, text generation, QA, and NER, enabling non-experts to query and understand scientific content.

Challenges: Limited accessibility tools for scientific communication and the need for domain-specific adaptations.
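The fine-tuning approach described above casts each task (summarization, QA, NER) as text-to-text pairs for a pre-trained model. The sketch below illustrates one plausible way to prepare such multi-task training pairs using task prefixes in the T5 style; the field names, prefixes, and NER target encoding are illustrative assumptions, not the paper's exact format.

```python
# Hypothetical sketch of preparing multi-task fine-tuning pairs with
# task prefixes (T5 style). Field names and the "entity|type" target
# scheme are assumptions for illustration, not the paper's format.

def make_training_pair(task, record):
    """Turn a raw scientific record into an (input, target) text pair."""
    if task == "summarization":
        return (f"summarize: {record['abstract']}", record["summary"])
    if task == "qa":
        return (f"question: {record['question']} context: {record['context']}",
                record["answer"])
    if task == "ner":
        # Encode entities as "entity|type" pairs; one scheme of many.
        target = "; ".join(f"{e}|{t}" for e, t in record["entities"])
        return (f"extract entities: {record['text']}", target)
    raise ValueError(f"unknown task: {task}")

pair = make_training_pair("qa", {
    "question": "What does PEFT reduce?",
    "context": "PEFT lowers the compute needed for fine-tuning.",
    "answer": "Compute requirements",
})
```

Framing every task as text-to-text lets a single fine-tuned model serve all four capabilities, with the prefix telling it which behavior to produce.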

Implementation Barriers

Technical Barrier

Challenges in effectively processing long scientific documents due to input token constraints of certain models.

Proposed Solutions: Utilization of models like LED (Longformer Encoder-Decoder) designed for longer documents.
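Beyond switching to long-context architectures like LED, a common complementary workaround for fixed-window models is to split long documents into overlapping token windows so no content is lost at chunk boundaries. The sketch below is a minimal illustration of that idea; the `max_tokens` and `stride` values are illustrative, not taken from the paper.

```python
# Minimal sliding-window sketch for models with a fixed input limit.
# max_tokens and stride values here are illustrative assumptions.

def chunk_tokens(tokens, max_tokens=1024, stride=256):
    """Split a token list into overlapping windows of at most max_tokens."""
    if len(tokens) <= max_tokens:
        return [tokens]
    chunks = []
    step = max_tokens - stride  # overlap preserves cross-boundary context
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break
    return chunks

windows = chunk_tokens(list(range(3000)), max_tokens=1024, stride=256)
```

Each window's outputs can then be merged downstream (e.g., summaries concatenated and re-summarized), whereas LED's sparse attention avoids chunking entirely by accepting the full document at once.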

Resource Barrier

High computational resource requirements for training and fine-tuning large models.

Proposed Solutions: Implementation of Parameter-Efficient Fine-Tuning (PEFT) to reduce resource usage.
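A back-of-the-envelope calculation shows why PEFT methods such as LoRA cut resource usage: instead of updating a full d_out × d_in weight matrix, only two low-rank factors B (d_out × r) and A (r × d_in) are trained. The dimensions below are illustrative, not the paper's actual model sizes.

```python
# Sketch of the parameter savings from LoRA-style PEFT: train low-rank
# factors B (d_out x r) and A (r x d_in) instead of the full matrix.
# Dimensions are illustrative assumptions, not the paper's model sizes.

def lora_trainable_params(d_out, d_in, rank):
    full = d_out * d_in            # full fine-tuning updates every weight
    lora = rank * (d_out + d_in)   # only the low-rank factors are trained
    return full, lora

full, lora = lora_trainable_params(4096, 4096, rank=8)
reduction = full / lora  # ~256x fewer trainable parameters at rank 8
```

At rank 8 on a 4096 × 4096 layer, the trainable parameter count drops from about 16.8M to 65K, which is the kind of reduction that makes fine-tuning large models feasible on modest hardware.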

Project Team

Balaji Muralidharan

Researcher

Hayden Beadles

Researcher

Reza Marzban

Researcher

Kalyan Sashank Mupparaju

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Balaji Muralidharan, Hayden Beadles, Reza Marzban, Kalyan Sashank Mupparaju

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
