OpenAI is letting a select group of users test a new ChatGPT feature that uses artificial intelligence to navigate the web and carry out tasks such as booking trips, buying groceries, searching for deals, and handling other online chores.
The tool, called Operator, is an AI agent. It uses a model trained on both text and visual data to interpret user commands and work out how to operate a web browser to carry them out. OpenAI claims the technology could automate many everyday tasks and work-related errands.
OpenAI’s Operator follows similar demonstrations from Google and Anthropic, both of which have shown AI agents capable of browsing the internet. Such agents are widely regarded as the next stage in the evolution of artificial intelligence after chatbots. Many companies have jumped on the trend and promote their own agents, although a lot of these offerings are limited and simply use language models to automate chores normally handled by conventional software.
“AI is transitioning from being a tool that provides answers to one that can take action in the real world, executing complex, multi-step processes,” explains Peter Welinder, VP of product at OpenAI. “We anticipate significant effects on individuals’ productivity—and the overall quality of work they can achieve.”
OpenAI acknowledges that enabling ChatGPT to access a web browser brings about new risks and admits that Operator may occasionally misbehave. The company has established various safeguards and intends to gradually increase Operator’s capabilities.
Welinder and Yash Kumar, product and engineering lead for OpenAI’s Computer-Using Agent, say the strategy is to learn from how people use the tool. They acknowledge that Operator could make an accidental booking or purchase, but say considerable effort has gone into ensuring that it asks for confirmation before taking any significant action. “It will return to me for approvals before executing potentially irreversible actions,” Kumar states.
Today, OpenAI also released a new “system card” detailing potential problems with Operator. These include the agent misunderstanding commands or straying from what the user intended, being misused by the people operating it, and being targeted by cybercriminals.
“This also brings a host of safety concerns,” notes Kumar. “The risk and attack areas expand significantly.”
Operator will first be available as a “research preview” to subscribers of ChatGPT Pro, which costs $200 per month. The company intends to broaden access gradually and acknowledges that the tool will make some errors along the way.
In several demonstrations, Operator showed how AI could take on a more active role as an online assistant. The interface pairs a remote web browser with a chat window for communicating with the user.
At WIRED’s request, Operator was asked to book an Amtrak train journey from New Haven, Connecticut, to Washington, DC. It navigated to the correct website, entered the required information accurately to bring up the timetable, and then asked for further instructions. If a user were logged into the Amtrak website, or were using a browser profile with saved payment information, Operator could go on to purchase the ticket, although it is designed to ask for permission first.
Kumar instructed Operator to reserve a table at Beretta, a restaurant in San Francisco. The tool accessed the OpenTable website, located the appropriate restaurant, and checked availability before asking for next steps. OpenAI has partnered with several popular platforms, including OpenTable, to ensure Operator functions seamlessly with them.
Operator is built on OpenAI’s GPT-4o model, which can view a web browser and web pages while conversing with the user in text. The model also received additional training to improve its ability to carry out tasks online. OpenAI plans to offer its Computer-Using Agent through an API as well.
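The approval behavior Kumar describes, in which the agent pauses before a potentially irreversible step, can be illustrated with a minimal Python sketch. This is not OpenAI’s implementation or API. The `Action` class, the `run_with_approval` function, and the sample plan are all hypothetical names invented for illustration, and the sketch assumes each step is already labeled as reversible or irreversible.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Action:
    """Hypothetical description of one step an agent wants to take."""
    description: str
    irreversible: bool = False  # assumption: steps arrive pre-labeled


def run_with_approval(
    actions: Iterable[Action],
    approve: Callable[[Action], bool],
) -> List[str]:
    """Run steps in order, asking for approval before irreversible ones.

    Returns a log of what happened. Execution stops at the first
    irreversible step that the user declines.
    """
    log: List[str] = []
    for action in actions:
        if action.irreversible and not approve(action):
            log.append(f"paused: {action.description}")
            break
        # A real agent would drive the browser here; the sketch only logs.
        log.append(f"done: {action.description}")
    return log


# Illustrative plan loosely modeled on the train-booking demonstration.
plan = [
    Action("open the rail booking site"),
    Action("enter origin, destination, and date"),
    Action("display the timetable"),
    Action("purchase the ticket", irreversible=True),
]

# The user declines the purchase, so the agent stops before buying.
declined_log = run_with_approval(plan, approve=lambda action: False)

# The user approves the purchase, so every step completes.
approved_log = run_with_approval(plan, approve=lambda action: True)
```

In this sketch the `approve` callback stands in for the chat window: passing a function that returns `False` leaves the purchase step paused, while one that returns `True` lets the plan finish.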