Descriptions of GenAI vary and shift depending on context. The following provides a brief overview:
AI is underpinned by algorithms and models. GenAI builds on new models and increasing computational power.
Some GenAI tools, such as ChatGPT, Claude, Copilot and Gemini, are built on Large Language Models (LLMs) that have been trained on vast amounts of textual data. They use patterns in that data to predict which words are likely to follow in a sequence, generating new sentences and paragraphs. These systems are based on Transformer models, which learn patterns from data and then use context, assigning different weights to different parts of the input, to predict the likely next item. The T in GPT (Generative Pre-trained Transformer) stands for Transformer.
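The idea of "assigning different weights to different parts of the input" is the attention mechanism at the heart of Transformers. A minimal, simplified sketch in Python (toy numbers, not a real model; the function and values here are illustrative assumptions, not drawn from any actual LLM):

```python
import numpy as np

def attention(query, keys, values):
    """Scaled dot-product attention: score each input against the query,
    turn the scores into weights that sum to 1 (softmax), then blend the
    inputs' values according to those weights."""
    scores = keys @ query / np.sqrt(query.size)   # similarity of each key to the query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax: weights sum to 1
    return weights, weights @ values

# Toy example: three "input tokens", each with a 2-dimensional representation.
keys = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]])
query = np.array([1.0, 0.0])

weights, mixed = attention(query, keys, values)
print(weights)  # the inputs most similar to the query receive the largest weights
```

In a real Transformer, the queries, keys and values are learned from data and this weighting happens many times in parallel across many layers; the sketch only shows the core arithmetic.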
For more detail, watch Simon Angus’ presentation on Large Language Models and ChatGPT.
Think of it this way: if the current auto-completion on our smartphones can predict the next word and auto-completion in our email client can predict a sentence, then LLMs can predict words to fill paragraphs and pages.
The capacities of GenAI are shaped by the amounts and kinds of data that they are trained on. Therefore it is important to be cognisant of the data sources and potential biases within the particular training data used.
Many interactions with LLMs continue to expand the training data and capacities of the models. Therefore, be careful to check the data and privacy provisions of the AI tools that you are using and be very cautious not to provide any personal data when using such systems.
Warwick recommends the use of Microsoft Copilot, which protects material that is uploaded or entered and does not use that data for model training.
LLMs with broad knowledge of language can produce seemingly coherent sentences and information. To date, the patterns these neural networks have learned are based on language syntax (how we put words together: nouns, verbs, adjectives, etc.), with some grasp of pragmatic tone and style, but with very limited understanding of semantic meaning.
Therefore, an LLM can produce grammatically correct sentences that are not necessarily accurate or unbiased, and can recombine accurate information to generate inaccurate content. False information is currently referred to as a "hallucination" and includes invented names of people, facts or events that have no proof of existence. Although interacting with an LLM through a chatbot interface may feel like knowledgeable human communication, the content the AI tool generates needs to be carefully evaluated and critically considered.
LLM interfaces are primarily text-based, like a dialogue or a chat. Making a request for something is called prompting. Formulating the question/request and the precise syntax of the request guides the responses/outputs, which can be iteratively revised through further prompting. Like any conversation or collaboration, we have to frame, phrase and rephrase things to negotiate the outcomes with our partner(s).
In addition to using text prompts to generate more text (prose, poetry, computer code, etc.), text prompts can also be used to generate images, sounds, videos, 3D objects, even virtual worlds. Midjourney and DALL-E are well-known examples of GenAI that produce images. There are text-to-video generators such as Runway, Sora and Hour One, audio generators such as ElevenLabs, and transcribers such as Otter.ai, to name a few. Interfaces are also evolving with the ability to upload reference files. Unlike the transformer models that predominate in LLMs and text generation, diffusion models are typically used for creating images: they start from random noise and iteratively refine the pixels, step by step, until they form a coherent image that matches an interpretation of the prompt.
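The step-by-step refinement behind diffusion models can be caricatured in a few lines. The sketch below is a loose illustrative assumption, not a real diffusion model: a real model uses a trained neural network to predict and remove noise, whereas here we "cheat" by using a known target purely to show the iterative denoising loop.

```python
import random

random.seed(0)

# A "target image": a tiny 1-D row of pixel intensities to be produced.
target = [0.0, 0.5, 1.0, 0.5, 0.0]

# Start from pure random noise, as a diffusion sampler does.
image = [random.random() for _ in target]

# Iteratively refine: each step removes a fraction of the remaining "noise".
# (A real model predicts the noise with a trained network conditioned on the
# prompt; here the known target stands in for that prediction.)
for step in range(50):
    image = [px + 0.2 * (tgt - px) for px, tgt in zip(image, target)]

print([round(px, 3) for px in image])  # very close to the target after many small steps
```

The key point the sketch preserves is that a coherent result emerges gradually from noise through many small corrections, rather than being produced in one pass.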
Collaborating with machine intelligences to generate a range of media expands creative possibilities but it is important to use these technologies responsibly (with critical awareness of the data, privacy, biases, limitations, inaccuracies etc.) and to acknowledge the contributions of AI partners in the collaborative production of work.