The world was still discovering the wonders of OpenAI’s ChatGPT (Chat Generative Pre-trained Transformer) 3.5 when the artificial intelligence company released its successor, touting it as “the latest milestone in OpenAI’s effort in scaling up deep learning.”
GPT-4 can process both image and text inputs and generate text outputs, a notable advancement over its predecessor. OpenAI reports that GPT-4 achieves "human-level" performance on various professional and academic evaluations.
“GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks,” OpenAI stated.
OpenAI claims that GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5.
The company completely redesigned its deep learning stack to reach this level of language processing capability, and, in collaboration with Microsoft Azure, co-designed a supercomputer from the ground up to accommodate the workload.
Let’s dive in and understand what makes the latest version of ChatGPT a notch above its predecessor, as well as the limitations of the AI-based chatbot.
What Are The Features of ChatGPT-4?
A major boost in the latest version is its ability to interpret image and text inputs together and generate human-like responses, which can assist users with a wide range of tasks.
For instance, if a user sends a picture of the inside of a cupboard, GPT-4 will not only recognize the different clothes available but also suggest a list of pairings in which the clothes can be worn.
The company has illustrated the image-processing prowess of GPT-4 on its website.
“GPT-4 can accept a prompt of text and images, which—parallel to the text-only setting—lets the user specify any vision or language task. Specifically, it generates text outputs (natural language, code, etc.) given inputs consisting of interspersed text and images. Over a range of domains—including documents with text and photographs, diagrams, or screenshots—GPT-4 exhibits similar capabilities as it does on text-only inputs,” the company stated.
However, it also mentioned that image inputs are still a research preview and not publicly available.

Most AI chatbots can understand and generate responses only in English, but GPT-4 has broken the language barrier, with the ability to produce results in 26 languages.
“Many existing ML benchmarks are written in English. To get an initial sense of capability in other languages, we translated the MMLU benchmark—a suite of 14,000 multiple-choice problems spanning 57 subjects—into a variety of languages using Azure Translate. In the 24 of 26 languages tested, GPT-4 outperforms the English-language performance of GPT-3.5 and other LLMs (Chinchilla, PaLM), including for low-resource languages such as Latvian, Welsh and Swahili,” the company stated.
The company also highlighted a feature called steerability, which refers to ChatGPT’s ability to adhere to and behave according to the user’s commands and directions.
“Rather than the classic ChatGPT personality with a fixed verbosity, tone, and style, developers (and soon ChatGPT users) can now prescribe their AI’s style and task by describing those directions in the “system” message. System messages allow API users to significantly customize their users’ experience within bounds,” OpenAI said.
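In practice, this means the first message in an API request can fix the assistant’s persona before the user’s prompt arrives. A minimal sketch of how a developer might assemble such a request (the helper function `build_request` and the prompts are illustrative, not from OpenAI’s documentation):

```python
# Sketch: steering GPT-4 via the "system" message in a Chat Completions
# request. Only the payload is built here; no network call is made.
def build_request(style_instructions: str, user_prompt: str) -> dict:
    """Assemble a chat request whose first message steers the model."""
    return {
        "model": "gpt-4",
        "messages": [
            # The system message prescribes verbosity, tone, and style.
            {"role": "system", "content": style_instructions},
            # The user message carries the actual task.
            {"role": "user", "content": user_prompt},
        ],
    }

request = build_request(
    "You are a Socratic tutor. Never state the answer directly; "
    "always reply with a guiding question.",
    "How do I solve 3x + 5 = 14?",
)
# Sending it requires the OpenAI client and an API key, e.g.:
#   import openai
#   response = openai.ChatCompletion.create(**request)
```

The same user prompt produces very different replies depending on the system message, which is the customization “within bounds” that OpenAI describes.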
What Are The Limitations Of ChatGPT-4?
Although the advanced version boasts a plethora of enhanced features, certain limitations still need to be worked on.
- Hallucinations – Despite its capabilities, GPT-4 has similar limitations to earlier GPT models. Most importantly, it is still not fully reliable: it “hallucinates” facts and makes reasoning errors. While hallucination remains a real issue, GPT-4 significantly reduces it relative to previous models (which have themselves been improving with each iteration). OpenAI reports that GPT-4 scores 40% higher than its latest GPT-3.5 on its internal adversarial factuality evaluations.
- Lack of current knowledge – GPT-4 generally lacks knowledge of events that occurred after the vast majority of its training data cuts off (September 2021).
- Limited availability – Users may be eager to get their hands on the latest version of ChatGPT, but GPT-4 is currently available only through the API (with a waitlist) and to subscribers of the paid ChatGPT Plus service.
- GPT-4 with an 8K context window (about 13 pages of text) will cost $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens.
- GPT-4-32k with a 32K context window (about 52 pages of text) will cost $0.06 per 1K prompt tokens and $0.12 per 1K completion tokens.
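Because prompt and completion tokens are billed at different rates, estimating the cost of a request takes a small calculation. A quick sketch using the rates listed above (the function and variable names are illustrative):

```python
# Cost per 1K tokens at the published GPT-4 launch rates, charged
# separately for prompt (input) and completion (output) tokens.
PRICING = {
    "gpt-4":     {"prompt": 0.03, "completion": 0.06},  # 8K context
    "gpt-4-32k": {"prompt": 0.06, "completion": 0.12},  # 32K context
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the dollar cost of one request at the listed rates."""
    rates = PRICING[model]
    return (prompt_tokens / 1000) * rates["prompt"] \
         + (completion_tokens / 1000) * rates["completion"]

# Example: a 2,000-token prompt with a 500-token reply on the 8K model.
cost = request_cost("gpt-4", 2000, 500)
print(f"${cost:.2f}")  # $0.09
```

As the example shows, a fairly long exchange on the 8K model still costs well under a dime, while the 32K model doubles every rate.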
Despite all the benefits of the latest version of ChatGPT, OpenAI co-founder Sam Altman acknowledged that GPT-4 "is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.”