
OpenAI introduces GPT-4o image generation

The new image generation function, which is directly integrated into the GPT-4o model, enables users to creatively and precisely generate images from text descriptions.

A candid paparazzi-style photo of Karl Marx hurriedly walking through the parking lot of the Mall of America,
Image generated with GPT-4o from OpenAI (source: openai.com)

OpenAI has introduced a ground-breaking feature: image generation built directly into the GPT-4o model. It lets users generate images creatively and precisely from text descriptions. Nothing new at first glance. So where is the innovation? While previous models struggled to accurately represent details, graphics or text, GPT-4o can render legible, largely accurate text within images and graphics.

At OpenAI, we have long believed image generation should be a primary capability of our language models. That's why we've built our most advanced image generator yet into GPT-4o. The result: image generation that is not only beautiful, but useful. ~ OpenAI

According to OpenAI and media reports, GPT-4o can both create and edit images directly from conversational input. Here are the key features:

  1. Precise text creation: The system can not only generate images, but also accurately render text within those images.
  2. Interactive image editing: Users can make specific changes in dialog with the AI, creating a dynamic adjustment process.
  3. Complex prompts: GPT-4o can handle up to 20 different objects in a single image, making it a powerful tool for designers and marketers.
  4. Image references: Users can upload existing images to serve as inspiration or the basis for new creations.

Make me a professionally shot photorealistic diagram of the top selling cocktails in my bar with recipes labeled on each drink. put the recipes on handwritten cards in front of each drink. the cards are brown, and the text is black. background is white. Title is "4 most popular cocktails"
Image generated by GPT-4o with text (source: OpenAI)
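
Prompts like the one above pack several constraints (subject, individual objects, background, title) into one block of text. As a rough illustration, the following Python sketch assembles such a prompt from structured parts. The helper `build_image_prompt`, its parameters, and the `MAX_OBJECTS` constant are hypothetical names chosen for this example. The limit of 20 objects is simply the figure cited in this article, not an enforced API limit, and the code only builds a string without calling any OpenAI service.

```python
# Guideline taken from the article's feature list; an assumption, not an API constraint.
MAX_OBJECTS = 20


def build_image_prompt(subject, objects, background=None, title=None):
    """Assemble a plain-text image prompt from structured parts (hypothetical helper)."""
    if len(objects) > MAX_OBJECTS:
        raise ValueError(
            f"{len(objects)} objects requested; the article cites up to {MAX_OBJECTS}"
        )
    # Start with the overall subject, normalized to end in a single period.
    parts = [subject.strip().rstrip(".") + "."]
    # Add one sentence per object so each constraint stays explicit.
    for name, detail in objects:
        parts.append(f"Include {name}: {detail.strip().rstrip('.')}.")
    if background:
        parts.append(f"Background is {background}.")
    if title:
        parts.append(f'Title is "{title}".')
    return " ".join(parts)


prompt = build_image_prompt(
    "A professionally shot photorealistic diagram of the top selling cocktails in my bar",
    [("recipe cards", "handwritten, brown cards with black text, one in front of each drink")],
    background="white",
    title="4 most popular cocktails",
)
print(prompt)
```

The resulting string can be pasted into the chat as-is. Keeping each object in its own sentence makes it easier to adjust a single detail in a follow-up message, which matches the interactive editing workflow described in the feature list.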

There are, of course, technical limitations. OpenAI has already announced that the model has difficulty rendering non-Latin characters consistently. In addition, OpenAI is still having problems with the rollout of the feature, which is why it is only available to Pro users so far.

To summarize, the introduction of the tool represents a significant step forward in AI-supported image generation. It combines text creation and image generation in a single, user-friendly tool. The question remains as to how this technology will be used in the future and what guidelines are needed to ensure responsible development.


Sources:

https://openai.com/index/introducing-4o-image-generation/

OpenAI Rolls Out GPT-4o Image Creation To Everyone: OpenAI integrates free image generation into GPT-4o, creating context-aware visuals with text capabilities for all users.

'Insane': OpenAI introduces GPT-4o native image generation and it's already wowing users: As AI-generated images become more precise and accessible, GPT-4o represents a significant step forward in the space.