Forget chatbots—OpenAI may be working on AI "agents" that can autonomously complete tasks directly on your devices. According to a new report from The Information, OpenAI is building two types of agents: one that can take over a user's device to carry out complex workflows, and another focused on gathering data and completing web-based tasks.
The computer-using agents would complete tasks by effectively assuming control of a user's device, performing the "clicks, cursor movements, text typing and other actions humans take as they work with different apps." For example, the agent could transfer data between documents, populate expense reports, or perform other repetitive jobs involving multiple applications. Unlike ChatGPT, which operates entirely in the cloud, portions of this new agent may reside on local devices.
OpenAI's other efforts center on agents that can complete web-based tasks. Such an agent might aid in "gathering public data about a set of companies, creating itineraries under a certain budget or booking flight tickets." Google and Meta are reportedly also working on similar projects.
AI agents have been one of the hottest areas of development within the AI community over the last year. Startups like Adept AI are building their entire businesses around them. This widespread interest underscores the transformative impact that AI agents could have in automating complex, unstructured tasks that require a nuanced understanding of human intent and interaction.
However, highly capable autonomous AI agents with extensive device control raise understandable concerns around privacy, safety and security. Users will need to explicitly grant access and approve actions, and thoughtful guardrails will need to be put in place to prevent critical missteps.