Tech

OpenAI launches preview of Codex AI SWE agent for builders

May 16, 2025

998

[ad_1]

Join our every day and weekly newsletters for the newest updates and unique content material on industry-leading AI protection. Learn More

Surprise! Just days after reviews emerged suggesting OpenAI was shopping for white-hot coding startup Windsurf, the previous firm seems to be launching its personal competitor service as a analysis preview underneath its model identify Codex, going head-to-head towards Windsurf, Cursor, and the rising record of AI coding instruments supplied by startups and enormous tech firms together with Microsoft and Amazon.

Unlike OpenAI’s earlier Codex code completion AI mannequin, the brand new model is a full cloud-based AI software program engineering (SWE) agent constructed atop a fine-tuned model of OpenAI’s o3 reasoning mannequin that may execute a number of improvement duties in parallel.

Starting immediately it is going to be accessible for ChatGPT Pro, Enterprise, and Team customers, with assist for Plus and Edu customers anticipated quickly.

Codex’s evolution: from mannequin to autonomous AI coding agent

This launch marks a major step ahead in Codex’s improvement. The unique Codex debuted in 2021 as a mannequin for translating pure language into code accessible by means of OpenAI’s nascent utility programming interface.

It was the engine behind GitHub Copilot, the favored autocomplete-style coding assistant designed to work inside IDEs like Visual Studio Code.

That preliminary iteration targeted on code technology and completion, skilled on billions of strains of public supply code.

However, the early model got here with limitations. It was vulnerable to syntactic errors, insecure code strategies, and biases embedded in its coaching knowledge. Codex sometimes proposed superficially appropriate code that failed functionally, and in some circumstances, made problematic associations primarily based on prompts.

Despite these flaws, it confirmed sufficient promise to ascertain AI coding instruments as a quickly rising product class. That unique mannequin has since been deprecated and became the identify of a brand new suite of merchandise, in keeping with an OpenAI spokesperson.

GitHub Copilot formally transitioned off OpenAI’s Codex mannequin in March 2023, adopting GPT-4 as a part of its Copilot X improve to allow deeper IDE integration, chat capabilities, and extra context-aware code strategies.

Agentic visions

The new Codex goes far past its predecessor. Now constructed to behave autonomously over longer durations, Codex can write options, repair bugs, reply codebase-specific questions, run assessments, and suggest pull requests—every process operating in a safe, remoted cloud sandbox.

The design displays OpenAI’s broader ambition to maneuver past fast solutions and into collaborative work.

Josh Tobin, who leads the Agents Research Team at OpenAI, mentioned throughout a current briefing: “We think of agents as AI systems that can operate on your behalf for a longer period of time to accomplish big chunks of work by interacting with the real world.” Codex suits squarely into this definition. “Our vision is that ChatGPT will become almost like a virtual coworker—not just answering quick questions, but collaborating on substantial work across a range of tasks,” he added.

Figures launched by OpenAI present that the brand new Codex-1 SWE agent outperforms all of OpenAI’s newest reasoning fashions on inside SWE duties.

New capabilities, new interface, new workflows

Codex duties are initiated by means of a sidebar interface in ChatGPT, permitting customers to immediate the agent with duties or questions.

The agent processes every request in an air-gapped surroundings loaded with the consumer’s repository and configured to reflect the event setup. It logs its actions, cites take a look at outputs, and summarizes modifications—making its work traceable and reviewable.

Alexander Embiricos, head of OpenAI’s Desktop & Agents crew (and the previous CEO and co-founder of screenshare collaboration startup Multi that OpenAI acquired for an undisclosed sum final 12 months) mentioned in a briefing with journalists that “the Codex agent is a cloud-based software engineering agent that can work on many tasks in parallel, with its own computer to run safely and independently.”

Internally, he mentioned, engineers already use it “like a morning to-do list—fire off tasks to Codex and return to a batch of draft solutions ready to review or merge.”

Codex additionally helps configuration by means of AGENTS.md information—project-level guides that train the agent the best way to navigate a codebase, run particular assessments, and observe home coding kinds.

“We trained our model to read code and infer style—like whether or not to use an Oxford comma—because code style matters as much as correctness,” Embiricos mentioned.

Security and sensible use

Codex executes duties with out web entry, drawing solely on user-provided code and dependencies. This design ensures safe operation and minimizes potential misuse.

“This is more than just a model API,” mentioned Embiricos. “Because it runs in an air-gapped environment with human review, we can give the model a lot more freedom safely.”

OpenAI additionally reviews early exterior use circumstances. Cisco is evaluating Codex for accelerating engineering work throughout its product strains. Temporal makes use of it to run background duties like debugging and take a look at writing. Superhuman leverages Codex to enhance take a look at protection and allow non-engineers to counsel light-weight code modifications. Kodiak, an autonomous car agency, applies it to enhance code reliability and acquire insights into unfamiliar stack elements.

OpenAI can also be rolling out updates to Codex CLI, its light-weight terminal agent for native improvement. The CLI now makes use of a smaller mannequin—codex-mini-latest—optimized for low-latency modifying and Q&A.

The pricing is ready at $1.50 per million enter tokens and $6 per million output tokens, with a 75% caching low cost. Codex is at the moment free to make use of throughout the rollout interval, with fee limits and on-demand pricing choices deliberate.

Does this imply OpenAI IS NOT shopping for Windsurf? Thinking face emoji

The launch of Codex comes amid elevated competitors within the AI coding instruments house—and indicators that OpenAI is intent on constructing, relatively than shopping for, its subsequent part of merchandise.

According to current knowledge from RelatedWeb, visitors to developer-focused AI instruments has surged by 75% over the previous 12 weeks, underscoring the rising demand for coding assistants as important infrastructure relatively than experimental add-ons.

Reports from TechCrunch and Bloomberg counsel OpenAI held acquisition talks with fast-growing AI dev device startups Cursor and Windsurf. Cursor allegedly walked away from the desk; Windsurf reportedly agreed in precept to be acquired by OpenAI for a worth of $3 billion, although no deal has been formally confirmed by both OpenAI or Windsurf.

Just yesterday, the truth is, Windsurf debuted its family of coding-focused basis fashions, SWE-1, purpose-built to assist the complete software program engineering lifecycle, from debugging to long-running undertaking upkeep. SWE-1 fashions have been reported customized made, skilled fully in-house utilizing a brand new sequential knowledge mannequin tailor-made to real-world improvement workflows.

Many issues could also be occurring behind the scenes between the 2 firms, however to me, the timing of Windsurf launching its personal coding basis mannequin — as an alternative of its technique to-date of utilizing Llama variants and giving customers the choice to fit in OpenAI and Anthropic fashions — adopted someday later by OpenAI releasing its personal Windsurf competitor, appears to counsel the 2 will not be aligning quickly.

But alternatively, the truth that this new Codex AI SWE agent is in “research preview” to start out could also be a type of OpenAI pressuring Windsurf or Cursor or anybody else to return to the bargaining desk and strike a deal. Asked in regards to the potential for a Windsurf acquisition and reviews of 1 thereof, an OpenAI spokesperson advised VentureBeat that they had nothing to share on that entrance.

In both case, Embiricos frames Codex as excess of a mere code device or assistant.

“We’re about to undergo a seismic shift in how developers work with agents—not just pairing with them in real time, but fully delegating tasks,” he mentioned. “The first experiments were just reasoning models with terminal access. The experience was magical—they started doing things for us.”

Built for dev groups, not merely solo devs

Codex is designed with skilled builders in thoughts, however Embiricos famous that even product managers have discovered it useful for suggesting or validating modifications earlier than pulling in human SWEs. This versatility displays OpenAI’s technique of constructing instruments that increase productiveness throughout technical groups.

Trini, an engineering lead on the undertaking, summarized the broader ambition behind Codex: “This is a transformative change in how software engineers interface with AI and computers in general. It amplifies each person’s potential.”

OpenAI envisions Codex because the centerpiece of a brand new improvement workflow the place engineers assign high-level duties to brokers and collaborate with them asynchronously. The firm is constructing towards deeper integrations throughout GitHub, ChatGPT Desktop, difficulty trackers, and CI techniques. The long-term aim is to mix real-time pairing and long-horizon process delegation right into a seamless improvement expertise.

As Josh Tobin put it, “Coding underpins so many useful things across the economy. Accelerating coding is a particularly high-leverage way to distribute the benefits of AI to humanity, including ourselves.”

Whether or not OpenAI closes offers for opponents, the message is evident: Codex is right here, and OpenAI is betting by itself brokers to steer the following chapter in developer productiveness.

Daily insights on enterprise use circumstances with VB Daily

If you need to impress your boss, VB Daily has you lined. We provide the inside scoop on what firms are doing with generative AI, from regulatory shifts to sensible deployments, so you possibly can share insights for optimum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out extra VB newsletters right here.

An error occured.

[ad_2]

OpenAI launches preview of Codex AI SWE agent for builders

Codex’s evolution: from mannequin to autonomous AI coding agent

Agentic visions

New capabilities, new interface, new workflows

Security and sensible use

Does this imply OpenAI IS NOT shopping for Windsurf? Thinking face emoji

Built for dev groups, not merely solo devs

LEAVE A REPLY Cancel reply

ABOUT US

POPULAR POSTS

The Singularity Is Here! (According to Sam Altman, Who Was Definitely Not Just Showing Off)

Apophis is Back: A Once-in-a-Millennium Encounter

AI-Designed Miniproteins: A New Era of Precision Medicine for Hard-to-Treat Diseases

POPULAR CATEGORY

Codex’s evolution: from mannequin to autonomous AI coding agent

Agentic visions

New capabilities, new interface, new workflows

Security and sensible use

Does this imply OpenAI IS NOT shopping for Windsurf? *Thinking face emoji*

Built for dev groups, not merely solo devs

LEAVE A REPLY Cancel reply

ABOUT US

POPULAR POSTS

The Singularity Is Here! (According to Sam Altman, Who Was Definitely Not Just Showing Off)

Apophis is Back: A Once-in-a-Millennium Encounter

AI-Designed Miniproteins: A New Era of Precision Medicine for Hard-to-Treat Diseases

POPULAR CATEGORY

Does this imply OpenAI IS NOT shopping for Windsurf? Thinking face emoji