
The latest in OpenAI's growing lineup of AI agents has officially arrived, and this one's aimed squarely at developers. Codex, launched today as a research preview, brings the company's AI prowess to software engineering by running multiple coding tasks in parallel cloud environments. The system is powered by codex-1, a version of OpenAI's reasoning-focused o3 model that's been specifically optimized to produce code that more closely mirrors human style.
Key Points:
- Codex runs entirely in the cloud with each task in its own sandbox environment
- It was trained with reinforcement learning on real-world coding tasks to match human code style and pull-request preferences
- Available today for ChatGPT Pro, Enterprise, and Team users, with Plus and Edu access coming soon
"Software engineering is changing, and by the end of 2025, it's going to look fundamentally different," said Greg Brockman, OpenAI's President and co-founder, during the announcement. "This is a step towards where we think software engineering is going."
Unlike other coding assistants that typically function as autocomplete tools, Codex operates more like an asynchronous colleague. You can assign tasks by clicking "Code" or ask questions about your codebase with "Ask." Each task runs independently in its own isolated environment preloaded with the relevant repository. The agent can read and edit files and run commands, including test harnesses, linters, and type checkers; tasks typically take between one and 30 minutes to complete.
Perhaps most impressive is how Codex provides verifiable evidence of its actions through citations of terminal logs and test outputs, allowing users to trace each step taken during task completion. This transparency becomes increasingly important as AI models handle more complex coding tasks independently.
The system can also be guided by AGENTS.md files within repositories. Similar to README files, these provide instructions on how to navigate the codebase, which commands to run for testing, and how to adhere to specific project standards. According to OpenAI, codex-1 shows strong performance on coding evaluations and internal benchmarks even without these guidance files.
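OpenAI hasn't published a fixed schema for these files, so the following is a hypothetical sketch of what a minimal AGENTS.md might contain; the section names, commands, and tools shown here are illustrative assumptions, not a prescribed format:

```markdown
# AGENTS.md (hypothetical example)

## Project layout
- Application code lives in `src/`; tests live in `tests/`.

## How to test
- Install dev dependencies with `pip install -e ".[dev]"`.
- Run `pytest -q` and make sure it passes before proposing changes.

## Style
- Follow PEP 8; run `ruff check src tests` and fix any findings.
- Keep pull requests small, with a one-paragraph summary of the change.
```

The idea is that anything a human contributor would learn from a CONTRIBUTING guide, the agent can pick up from here before it starts editing.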
Early adopters like Cisco, Temporal, Superhuman, and Kodiak have already been testing the system. Temporal reported using Codex to accelerate feature development, debug issues, write and execute tests, and refactor large codebases, while Superhuman found it useful for speeding up repetitive tasks and for enabling product managers to contribute lightweight code changes, with an engineer needed only for review.
OpenAI has also updated its Codex CLI, the lightweight open-source coding agent launched last month that runs in local terminals. Today's update includes a smaller version of codex-1, built on o4-mini and designed specifically for CLI use, optimized for low-latency code Q&A and editing while retaining strengths in instruction following and style.
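As a rough sketch of that workflow, based on the install and invocation steps in the Codex CLI's open-source release (the package name and commands below are from its documentation; the prompt is made up):

```bash
# Install the open-source Codex CLI from npm
npm install -g @openai/codex

# Authenticate via an API key in the environment
export OPENAI_API_KEY="sk-..."

# Ask a quick question about the repository in the current directory
codex "explain what the scheduler module does"
```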
For developers concerned about security, OpenAI emphasizes that the Codex agent operates entirely within secure, isolated containers in the cloud. During task execution, internet access is disabled, limiting interaction solely to the code explicitly provided via GitHub repositories and pre-installed dependencies configured by users.
"Our goal really is to help accelerate the useful work that is done, to help there be more software engineers in the world, so that there's more programming that is done that is useful and able to move forward the world," Brockman explained.
Of course, Codex isn’t perfect. It can take minutes—sometimes longer—for remote tasks to complete, and you’ll still want to eyeball every change before merging. But as model capabilities and infrastructure improve, OpenAI sees a future where developers offload ever-larger chunks of work—debugging sessions, cross-repo refactors, even CI failure resolutions—to AI colleagues.
Codex is rolling out today to ChatGPT Pro, Enterprise, and Team users globally, with generous access at no additional cost for the coming weeks. After this initial period, OpenAI plans to implement rate-limited access and flexible pricing options for additional usage. For developers building with the smaller codex-mini-latest model available on the Responses API, pricing is set at $1.50 per 1 million input tokens and $6 per 1 million output tokens, with a 75% prompt caching discount.
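For a sense of scale, a minimal call to codex-mini-latest through the Responses API in OpenAI's Python SDK might look like the sketch below; the model name and pricing come from the announcement, while the prompt and instructions are placeholder assumptions:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Minimal Responses API call against the CLI-focused model
response = client.responses.create(
    model="codex-mini-latest",
    instructions="You are a concise coding assistant.",
    input="Write a Python function that returns the n-th Fibonacci number iteratively.",
)

print(response.output_text)  # concatenated text output of the response
```

At the listed rates, a call consuming 2,000 input tokens and 500 output tokens would cost about $0.003 for input plus $0.003 for output, roughly $0.006 total before any prompt caching discount.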
“Codex feels like a co-worker with its own computer,” OpenAI’s Greg Brockman said in the launch demo. “You ask it to run tests or fix typos, and it just does it while you keep coding or grab lunch.” It’s a striking vision: a hybrid of real-time pairing and asynchronous delegation that could redefine engineering workflows.