GPT-4

TechTarget.com/whatis

https://www.techtarget.com/whatis/definition/GPT-4

GPT-4

By Ben Lutkevich

What is GPT-4?

GPT-4 is OpenAI's large multimodal language model that generates text from textual and visual input. Open AI is the American AI research company behind Dall-E, ChatGPT and GPT-4's predecessor GPT-3.

GPT-4 can handle more complex tasks than previous GPT models. The model exhibits human-level performance on many professional and academic benchmarks, including the Uniform Bar Exam. It was developed to improve alignment and scalability for large models of its kind.

What does GPT-4 stand for?

GPT-4 stands for Generative Pre-Trained Transformer 4.

GPTs are machine learning algorithms that respond to input with human-like text. They have the following characteristics:

Generative. They generate new information.
Pre-trained. They first go through an unsupervised pre-training period using a large corpus of data. Then they go through a supervised fine-tuning period to guide the model. Models can be fine-tuned to specific tasks.
Transformers. They use a deep learning model -- transformers -- that learns context by tracking relationships in sequential data. Specifically, GPTs track words or tokens in a sentence and predict the next word or token.

What are Generative Pre-trained Transformers?

GPTs were introduced by OpenAI in a 2018 paper titled "Improving Language Understanding by Generative Pre-Training." This paper described GPT's semi-supervised learning model, which contrasted against other natural language processing models that used supervised learning and labeled data.

GPT processing power scales with the number of parameters the model has. Each new GPT model has more parameters than the previous one. GPT-1 has 0.12 billion parameters and GPT-2 has 1.5 billion parameters, whereas GPT-3 has more than 175 billion parameters. The exact number of parameters in GPT-4 is unknown but is rumored to be more than 1 trillion parameters.

What's new in GPT-4?

GPT is the first large multimodal model of its kind. It is sometimes referred to as a next-gen model. GPT-4 Vision can turn image inputs into text.

In fall 2023, OpenAI rolled out GPT-4 Turbo, which provides answers with context up to April 2023. The previous knowledge cutoff for GPT-4 was January 2022. The release also increased the model’s context window and decreased pricing for developers. Developers with an OpenAI API account can access GPT-4 Turbo.

In May 2024, OpenAI introduced GPT-4 Omni (GPT-4o) with improvements including faster response times and advanced multimodal capabilities to recognize audio, image and text. Users can engage in real-time conversations with ChatGPT, and the GPT-4o can recognize screens and photos and ask questions about them while conversing with the user. The GPT-4o model will be available on consumer and developer products and will be free to all users.

GPT-4 training and capabilities

Open AI has released relatively little information about the technical specifications of GPT-4. There is little information about the data used to train the system, the model size, the energy costs of the system, the hardware it runs on or the methods used to create it. OpenAI acknowledged this in the GPT-4 technical paper, which said they wouldn't release this information because of safety reasons and the highly competitive market. OpenAI did acknowledge that GPT-4 was trained on both publicly available data and data licensed from third parties.

GPT-4 -- and other GPTs -- are trained using reinforcement learning from human feedback. Models are rewarded for desired behavior or when they follow a set of rules. GPT-4 gets an extra safety reward during training to reduce harmful outputs. OpenAI tested GPT-4's accuracy on adversarial questions with the help of constitutional AI company Anthropic. A few example rules from Anthropic's constitution include the following:

Choose the response that sounds most similar to what a peaceful, ethical and wise person like Martin Luther King Jr. or Mahatma Ghandi might say.
Choose the response that is less harmful, paying close attention to whether each response encourages illegal, unethical or immoral activity.

More on AI ethics

As powerful generative AI models like OpenAI's are released to the public and companies restructure around them, conversations have emerged about AI alignment, ethics and regulation. Here are some other stories to read:

Reasons for and effects of Microsoft cutting AI ethics unit

Federal report focuses on AI diversity and ethics

Implications of AI art lawsuits for copyright laws

The accelerating use of generative AI may prompt U.S. action

Ex-Google engineer Blake Lemoine discusses sentient AI

5 AI risks businesses must confront and how to address them

OpenAI has released several demos that show GPT-4's capabilities. Some specific notable capabilities include the following:

Passing academic tests with a high degree of accuracy. GPT-4 scores higher on advanced exams like the Uniform Bar (90^th percentile), the LSAT (88^thpercentile), the Math SAT (89^th percentile) and the GRE Quantitative exam (80^th percentile).
Finding a common theme between two articles. The user could paste two articles into the prompt and ask the model to give a summary of the common themes between them.
Using it as a programming and debugging assistant. Users can prompt the model to program in pseudocode, then write the code for a Discord bot, for example. If an error appears, users can paste the error message in the prompt, and the model will correct the code.
Describing a picture in vivid detail. Users could prompt the model with a screenshot of their browser window, and it would describe everything it sees.
Accurately identifying what is funny about an image. The model can analyze an image and identify the incongruities that make it funny. Humor in AI has been traditionally difficult to solve.
Coding a website from an image of the outline. The user could draw up a basic website layout by hand -- with barely legible handwriting -- upload a photo of it as a prompt, and the model can code a website with JavaScript and HTML based on the rudimentary picture the user presented.
Doing taxes using tax code and spelling out the reasoning behind it. Users can instruct the model to read and apply tax code and then prompt it with a problem asking for someone's standard tax deduction based on details about their life.
Re-interpreting tax code or a blog post as a rhyming poem. After solving the deduction problem, the user could prompt the model to turn all the work it showed in solving the problem into a poem.
Handle complex and challenging language in a legal document consistently. The model could perform document review, draft legal research memos, prepare for depositions and analyze contracts.

Like any language model, GPT-4 still hallucinates information, gives wrong answers and produces buggy code in some instances. It may also still be susceptible to racial and gender bias.

GPT-4 vs. GPT-3

GPT-3 is large language model, which means it performs language processing exclusively. GPT-4 is a large multimodal model that can process image and text inputs. OpenAI emphasizes the goal of GPT-4 was to scale up deep learning.

Some other ways the two models differ include the following:

GPT-4 passes different performance checkpoints from OpenAI. It outperforms other models in English, and far outperforms it in other languages.
GPT-4 can handle longer prompts than GPT-3. Specifically, it can analyze, read and generate up to 25,000 words.
GPT-4 is significantly better than GPT-3 at processing programming instructions.
GPT-4 is also highly steerable. Where GPT-3 would respond in a uniform tone and style, users can tell GPT-4 how they would like it to respond with explicit instructions. This can help with framing prompt and improve prompt engineering. Users can customize the model's behavior using a separate system message. GPT-4's steerability improves over time.
GPT-4 is trained to limit the possibility of harmful responses and refuse to respond to requests for disallowed content. For example, GPT-4 was trained to refuse queries about synthesizing dangerous chemicals and answered questions about buying cigarettes without encouraging smoking.
GPT-4 is better at basic mathematics than GPT-3.

When was GPT-4 released?

GPT-4 was released March 14, 2023. In an ABC news interview days after its release, OpenAI CEO Sam Altman said, "We've got to be cautious here, and also, it doesn't work to do all of this in a lab. You've got to get all of these products out into the world and make contact with reality, make our mistakes while the stakes are low. All of that said, I think people should be happy that we're a little bit scared of this."

The newest version of GPT-4 -- GPT-4o -- was announced in May 2024.

When can I use GPT-4?

There are two main ways of accessing GPT-4 as of this writing:

ChatGPT Plus. A paid subscription to ChatGPT Plus earns users access to GPT-4. ChatGPT Plus users can send up 50 messages every three hours to GPT-4.
Bing. GPT-4 also powers the Bing search engine integrated chatbot, which Microsoft co-developed. Bing's chatbot has a usage cap and allows for image uploads.

Developers can also use the API on a pay-per-use basis.

Users can also evaluate the model. Open AI CEO Sam Altman tweeted on March 14, 2023 that the company is open sourcing an evaluation framework.

Will GPT-4 be free?

GPT-4 was not free. However, with the new GPT-4o model, OpenAI announced it will be free to ChatGPT users, so no subscription is required for ChatGPT Plus. Other features included in the original subscription to GPT-4 -- such as memory and web browsing -- are also free to consumers. There is a fee for developers to use the API of $5 per 1 million tokens for input and $15 per 1 million tokens for output.

GPT-4 was a milestone AI release that came at the beginning of 2023. Check out these 10 AI trends to prepare for what the rest of the year might bring.

14 May 2024