ChatGPT vs. GPT: How are they different?
Although the terms ChatGPT and GPT are both used to talk about generative pre-trained transformers, there are significant technical differences to consider.
ChatGPT and GPT are both natural language processing tools introduced by OpenAI, but they differ in their technological capabilities and pricing. Making matters more complicated, the term GPT is also being used to refer to any product that uses any kind of generative pre-trained transformers, not just the versions that come from OpenAI. Here, we explore the differences between GPT and ChatGPT. But first, some background on their development.
History of GPT and ChatGPT
OpenAI revealed the first generative pre-trained transformer (GPT) in 2018. More capable versions followed quickly, including GPT-2 in 2019, GPT-3 in 2020 and GPT-4 in 2023. Newer versions are only accessible via an API, while GPT-2 is available as open source software.
OpenAI introduced ChatGPT as a consumer-facing service in late 2022, which attracted over 50 million users in a month, making it one of the fastest-growing consumer services in history.
"While technological improvements have been advancing steadily for a few years now, OpenAI and the release of ChatGPT specifically made large language models widely available," said Donncha Carroll, a partner at Lotis Blue Consulting who leads the firm's data science center of excellence. "The real genius was providing a simple-to-use interface making the technology accessible to the average person."
The first version of ChatGPT was built on GPT-3.5, although paying subscribers can also gain access to GPT-4 through the same ChatGPT interface. OpenAI said its Chat API can write emails and code, answer questions about documents, create conversational agents, tutor and translate. GPT can do many of these same things, including completing text, writing code, summarizing text and generating new content.
OpenAI claims that GPT-4 can answer questions more accurately than prior versions, as measured by scores on tests such as the SAT, LSAT and uniform bar exam. It's also more expensive to use than prior versions.
In the early days, OpenAI reported the number of parameters in its GPT models as a proxy for capability. For example, GPT had 117 million parameters, GPT-2 had up to 1.5 billion and GPT-3 had up to 175 billion. However, bigger isn't always better; a larger model can be slower and more expensive to run. OpenAI decided not to publicly report parameter counts for the GPT-3.5 or GPT-4 models. It also charges more for larger models.
There is also a wide range of GPT-based models and services for processing text, transcribing audio (Whisper) and generating images (Dall-E). Initially, ChatGPT only answered typed questions entered into a prompt, but OpenAI is starting to add support for images as well. API access to both tools opens new opportunities for enterprises to customize and enhance their offerings.
"Enabling API access will enable enterprises to more easily use ChatGPT in their own products and environments," said Lori Witzel, director of thought leadership at Tibco.
In addition, fine-tuning can let models learn more about business-specific domains.
What is the difference between ChatGPT and GPT?
Clarifying the buzzword bingo. The term ChatGPT is often used to describe the process of adding chat capabilities to a product. In its public-facing communications, for example, OpenAI describes a ChatGPT API that powers new services from Instagram, Snap, Quizlet and Instacart.
"The ChatGPT app for Slack combines the knowledge found in Slack with the intelligence of ChatGPT, making AI more accessible in a place we're already working," said Jackie Rocca, senior director of product at Slack. The app will be integrated into the natural flow of work and provide instant conversation summaries, research tools and writing assistance directly into Slack.
The term GPT is also being appended to product names to advertise new AI capabilities. In these cases, it may refer to generative pre-trained transformers in the generic sense rather than to a specific model from OpenAI. Google first developed the transformer technology that underlies GPT but has been less consistent in branding its various implementations, including BERT, LaMDA and PaLM. Google's first ChatGPT-like service, Bard, was pushed to market only after the runaway success of ChatGPT.
Likewise, Microsoft has adopted the term copilot to signify the embedding of GPT-powered code completion capabilities into GitHub and task completion into its productivity tools.
ChatGPT vs. GPT: Unwinding the technical differences. The way OpenAI characterizes the technical differences between ChatGPT and GPT is also messy. There is no specific ChatGPT model; OpenAI notes that the GPT-3.5 Turbo and GPT-4 models power ChatGPT. Its pricing page breaks out separate service categories for GPT-4, Chat (with only GPT-3.5 Turbo listed), InstructGPT (various flavors of GPT-3), embedding models, image models and audio models.
There are technical differences among OpenAI's many models, some of which apply to ChatGPT and some of which don't. First, companies can only fine-tune the GPT-3 models, which means developers can't customize GPT-3.5 Turbo or GPT-4 to work more effectively with their own data. Fine-tuning is the process of submitting pairs of queries and responses to improve the model's accuracy on the kinds of questions it is likely to encounter, such as those fielded by a call center.
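In the fine-tuning workflow described above, training data for the GPT-3 models is submitted as a file of prompt/completion pairs in JSONL format, one JSON object per line. A minimal sketch of what such a file looks like, with hypothetical call-center examples (the separator convention and the example content are illustrative, not from the article):

```python
import json

# Hypothetical call-center Q&A pairs. Fine-tuning data for the GPT-3
# models is supplied as JSONL: one {"prompt", "completion"} object per line.
examples = [
    {"prompt": "What are your support hours?\n\n###\n\n",
     "completion": " Our support line is open 9am-5pm ET, Monday through Friday.\n"},
    {"prompt": "How do I reset my password?\n\n###\n\n",
     "completion": " Use the 'Forgot password' link on the login page.\n"},
]

# Serialize to JSONL -- in practice this string would be written to a
# file and uploaded to the fine-tuning endpoint.
jsonl = "\n".join(json.dumps(e) for e in examples)
print(len(jsonl.splitlines()), "training examples")
```

The more representative pairs a company submits, the better the fine-tuned model tends to perform on that narrow domain.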
Another big difference is the age of the training data. GPT-4 and GPT-3.5 Turbo were trained on data last updated in September 2021. Davinci, the most expensive and capable GPT-3 model, was last updated in June 2021, while other GPT-3 models were last refreshed in October 2019. This matters because models trained on more recent data are better equipped to understand and respond to queries about current events.
Codex, a special model trained on code samples, is currently free while it's in beta. The same model is also surfaced through GitHub Copilot as a code completion service that starts at $10 per month for individuals.
The last big difference is that OpenAI introduced a new API query method for its ChatGPT-capable models, called ChatML. In the older API, an app sends a text string to the API. This also allows hackers to send malformed queries for various kinds of attacks. In the new ChatML, each message contains a header identifying the source of each string and the contents. Today, the contents are only text, but down the road, this could include other data such as images, 3D objects or audio.
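The structural difference between the two query styles can be sketched as follows. Instead of one undifferentiated text string, a chat-style request carries a list of messages, each with a header (the role) identifying its source. This is a rough sketch of the request shape from the era the article describes; payload details may have changed since:

```python
import json

# Old style: a single raw text string, with no way to distinguish
# system instructions from user input -- easier to attack with
# malformed or injected content.
legacy_prompt = "You are a helpful assistant. Summarize this document."

# ChatML style: each message carries a role header identifying its
# source, so the model can treat system and user content differently.
request = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this document in one sentence."},
    ],
}
print(json.dumps(request, indent=2))
```

Because the source of each string is explicit, an application can keep untrusted user text from masquerading as system instructions, which is the attack surface the article alludes to.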
Pricing for ChatGPT vs. GPT. There are also some noteworthy pricing differences between the services behind ChatGPT and GPT models. First, the ChatGPT service is offered across two tiers, including a free version and a paid one that costs $20 per month. The premium version adds improved performance and access to the more recent GPT-4 models as an option. For the moment, this paid service is trained on data that is as old as the free version.
In addition, the GPT-3.5 Turbo model powering ChatGPT is one of the most cost-effective offerings, costing a tenth of the previous state-of-the-art model at $0.002 per thousand tokens. One thousand tokens are about 750 words. In contrast, Davinci, the most powerful GPT-3 model, costs $0.02 per thousand tokens, while Ada, the fastest, costs $0.0004 per thousand. GPT-4 charges for both prompts and completions, ranging from $0.03 per thousand tokens for shorter prompts to $0.12 per thousand for longer completions.
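The per-token rates above lend themselves to a back-of-the-envelope calculation. This sketch uses the per-1,000-token prices quoted in this article (which have since changed) to compare what processing the same document costs on different models:

```python
# Per-1,000-token rates quoted in the article (historical; prices change).
RATES_PER_1K = {"gpt-3.5-turbo": 0.002, "davinci": 0.02, "ada": 0.0004}

def cost(model: str, tokens: int) -> float:
    """Dollar cost for a token count; ~1,000 tokens is about 750 words."""
    return RATES_PER_1K[model] / 1000 * tokens

# A 7,500-word document is roughly 10,000 tokens.
print(round(cost("gpt-3.5-turbo", 10_000), 4))  # → 0.02
print(round(cost("davinci", 10_000), 2))        # → 0.2
```

The tenfold gap between GPT-3.5 Turbo and Davinci is why the article calls the ChatGPT model one of the most cost-effective offerings.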
OpenAI also charges enterprises for fine-tuning models plus a higher pricing tier for using them. For example, Davinci costs $0.03 per thousand tokens for training, compared to $0.12 per thousand tokens for using them -- comparable to GPT-4. At the bottom end, Ada is only $0.0004 per thousand tokens for training and $0.0016 per thousand for use, making it slightly cheaper than the GPT-3.5 turbo used in ChatGPT.
Other OpenAI services include text embeddings, which represent words and documents as vectors for tasks such as classification, at $0.0004 per thousand tokens; image generation at $0.02 per large image; and audio transcription at $0.006 per minute.
What's the future of GPT?
The ChatGPT and GPT ecosystem of development tools and services continues to evolve. Software vendors are excited about the road ahead.
"We expect GPT and ChatGPT to keep getting smarter, more efficient and have fewer issues as more people embrace AI and the technology learns more," said Slack's Rocca.
Tony Jiang, engineering manager at Hippo Insurance, is excited about the growing support for voice. He said, "Even though today ChatGPT is accessible in the browser, commonly used as an internet search interface, it's likely a matter of time before voice interfaces are implemented on top of ChatGPT, making it significantly more accessible. This should also reduce user friction and speed up the AI-human interaction. This has the potential to greatly improve the learning experience for everybody, from kindergarteners to experienced professionals, and other important uses are likely, such as providing entertainment or even supporting psychological well-being."
But it's also essential to keep the evolution of these offerings in context.
"It will become ever more seemingly intelligent but still will not actually understand the prompt nor what it's saying," said Aaron Kalb, chief strategy officer and co-founder of Alation, a data management company.
These models are a bit like a person answering questions in a social setting without thinking. Building more capable services might require more sophisticated models of meaning, informed by business-specific data, to eliminate inaccuracies.