Claude 3.5 Sonnet. As of today, 23rd October 2024, this large language model (LLM) can now take over many functions on your computer. Yes, Claude AI can mimic human interactions with your PC, from moving the cursor to typing, clicking, and browsing.
This latest update, dubbed the “Computer Use” feature, makes it possible for Claude to control your system via simple commands. By analysing what's happening on your screen, Claude can automate tasks that previously needed your direct input. For example, it can extract information from one app (like a spreadsheet) and input it into another, such as an online form or document editor. In the demo shown by Anthropic, the AI was capable of filling out complex forms autonomously by pulling and processing data in real time.
So, how does it work? Claude relies on screenshots of your desktop and uses those visuals to understand what actions to take. The AI calculates how much to move the cursor or which keys to press based on what it "sees" on-screen. It’s not perfect currently. It can struggle with basic actions like scrolling and zooming, but it’s an impressive leap forward.
You can access this feature in beta via Anthropic’s API on platforms like Google Cloud’s Vertex AI and Amazon’s Bedrock. Developers are already experimenting with its capabilities to create tools that automate everything from simple admin tasks to app verification processes.
Users will still need to grant specific permissions, maintaining a level of control over what the AI can do. But, as Claude continues to evolve, it raises questions about how much autonomy we’re willing to hand over to AI systems and what safeguards are needed to prevent misuse. In short, this marks an exciting, slightly unnerving, moment in AI development. The question isn’t just “what can Claude do now?” but “how soon will it be doing even more?” Keep an eye on this space because, with capabilities like this, Claude is rapidly moving from assistant to autonomous operator.