Yesterday, California-based AI firm Adept announced Action Transformer (ACT-1), an AI model that can perform actions in software like a human assistant when given high-level written or verbal commands. It can reportedly operate web apps and run searches on websites by clicking, scrolling, and typing in the right fields, much as a person using the computer would.
In a demo video tweeted by Adept, the company shows someone typing, “Find me a house in Houston that works for a family of 4. My budget is 600K” into a text entry box. Once the task is submitted, ACT-1 automatically browses Redfin.com in a web browser, clicking the proper regions of the website, typing a search entry, and changing the search parameters until a matching house appears on the screen.
1/7 We built a new model! It’s called Action Transformer (ACT-1) and we taught it to use a bunch of software tools. In this first video, the user simply types a high-level request and ACT-1 does the rest. Read on to see more examples pic.twitter.com/mq7c0Vyd7N
— Adept (@AdeptAILabs) September 14, 2022
Another demonstration video on Adept’s website shows ACT-1 operating Salesforce with prompts such as “add Max Nye at Adept as a new lead” and “log a call with James Veel saying that he’s thinking about buying 100 widgets.” ACT-1 then clicks the right buttons, scrolls, and fills out the proper forms to complete these tasks. Other demo videos show ACT-1 navigating Google Sheets, Craigslist, and Wikipedia through a browser.