In a Reddit AMA, OpenAI CEO Sam Altman admitted that a lack of compute capacity is one major factor preventing the company from shipping products as often as it’d like.
“All of these models have gotten quite complex,” he wrote in response to a question about why OpenAI’s next AI models were taking so long. “We also face a lot of limitations and hard decisions about [how] we allocate our compute towards many great ideas.”
Many reports suggest that OpenAI has struggled to secure enough compute infrastructure to run and train its generative models. Just this week, Reuters, citing sources, said that OpenAI has for months been working with Broadcom to create an AI chip for running models, which could arrive as soon as 2026.
Partly as a result of that strained capacity, Altman said, Advanced Voice Mode, OpenAI’s realistic-sounding conversational feature for ChatGPT, won’t be getting the vision capabilities first teased in May anytime soon. At its May press event, OpenAI showed the ChatGPT app running on a smartphone and responding to things within view of the phone’s camera, such as the clothes someone was wearing.
Reporting later revealed the demo was rushed to steal attention away from Google’s I/O developer conference, which was taking place the same week. Many within OpenAI didn’t think GPT-4o was ready to be revealed — tellingly, the voice-only version of Advanced Voice Mode was delayed for months.
In the AMA, Altman indicated that the next major release of OpenAI’s image generator, DALL-E, has no launch timeline. (“We don’t have a release plan yet,” he said.) Meanwhile, Sora, OpenAI’s video-generating tool, has been held back by the “need to perfect the model, get safety/impersonation/other things right, and scale compute,” wrote Kevin Weil, OpenAI’s chief product officer, who also participated in the AMA.
Sora has reportedly suffered from technical setbacks that position it poorly against rival systems from Luma, Runway, and others. Per The Information, the original system, revealed in February, took more than 10 minutes of processing time to make a 1-minute video clip.
In October, one of the co-leads on Sora, Tim Brooks, left for Google.
Later in the AMA, Altman said that OpenAI is still considering allowing “NSFW” content in ChatGPT “someday” (“we totally believe in treating adult users like adults,” he wrote), and that the company’s top priority is improving its o1 series of “reasoning” models and their successors. OpenAI previewed a number of features coming to o1 at its DevDay conference in London this week, including image understanding.
“We have some very good releases coming later this year,” Altman wrote. “Nothing that we are going to call GPT-5, though.”