Anthropic says Chinese state hackers deployed AI for autonomous attacks

Artificial intelligence company Anthropic has detected and disrupted what it described as the first documented cyber espionage campaign conducted largely autonomously by AI, marking a significant escalation in machine-learning-enabled attacks.

The operation, attributed to a Chinese state-sponsored group designated as GTG-1002, manipulated Anthropic’s Claude AI system to spy on and steal data from approximately 30 targets with minimal human intervention, according to a company report released Thursday.

The campaign, detected in mid-September, targeted major tech companies, financial institutions, and government agencies across multiple countries.

Anthropic said the attackers used Claude Code, its coding tool, to autonomously conduct 80 to 90 percent of the campaign’s activity at speeds impossible for human operators.

“This represents a fundamental shift in how advanced threat actors use AI,” the company said.

“Rather than merely advising on techniques, the threat actor manipulated Claude to perform actual cyber intrusion operations with minimal human oversight.”

California-based Anthropic was launched in 2021 by former OpenAI staff and positions itself as prioritizing safety in AI development. Its flagship product is the Claude chatbot.

The disclosure comes amid growing concern about AI’s role in cyber warfare.

Claude and rival chatbots, including OpenAI’s ChatGPT and Google’s Gemini, have been used to automate cyberattacks, but Anthropic’s report detailed the first known case of a generative AI model being left to carry out operations independently.

“The barriers to performing sophisticated cyberattacks have dropped substantially,” the company warned.

The attackers bypassed Claude’s safety mechanisms by convincing the AI they were legitimate cybersecurity professionals conducting authorized testing, according to the company.

Humans maintained strategic oversight, but the AI independently executed complex cyberattacks over multiple days without detailed guidance, the report said.

The sustained campaign eventually triggered the company’s built-in detection systems.

In a notable admission, Anthropic said the AI agents frequently overstated findings and occasionally fabricated data, claiming to have obtained credentials that did not work or identifying publicly available information as critical discoveries.

Such AI hallucinations remain a persistent limitation of the technology.

Upon detection, Anthropic banned the associated accounts, notified affected entities and authorities, and implemented enhanced detection capabilities.

The company defended its decision to continue developing powerful AI systems despite such misuse, arguing that the same capabilities enable defense against bad actors.

“When sophisticated cyberattacks inevitably occur, our goal is for Claude to assist cybersecurity professionals to detect, disrupt, and prepare for future versions of the attack,” it said.

Anthropic said it plans to release regular reports on detected attacks and called for increased industry data sharing, improved detection, and stronger safety controls across AI platforms.

“We’re sharing this case publicly to contribute to the work of the broader AI safety and security community,” the company said.

About the author

AFP

Agence France-Presse (AFP) is a French international news agency headquartered in Paris. Founded in 1835 as Havas, it is the world's oldest news agency.
