Claude Code Security & What It Actually Means for AppSec
Chris Hughes of Resilient Cyber cuts through the hype around Anthropic's Claude Code Security launch — and explains why discovery was never the bottleneck.
Cybersecurity Stocks in Free Fall
00:00 – 00:36
Last week, Anthropic dropped Claude Code Security, and within hours, cybersecurity stocks were in free fall.
Twitter and LinkedIn lit up. People were calling it the end of AppSec as we know it.
"Anthropic just ate the entire $15 billion AppSec industry's lunch."
And look, I get the excitement, but it was an overreaction. I'll walk through why with a more nuanced take, because there's a lot the hype is missing.
What Claude Code Security Actually Does
00:36 – 01:21
It's a capability built into Claude Code that scans code bases for vulnerabilities and suggests patches. But unlike traditional SAST tools that match code against known patterns, Claude reasons about your code contextually. It traces data flows, understands how components interact, and catches complex vulnerabilities that rule-based tools consistently miss — things like broken access control and business logic flaws.
Every finding goes through a multi-stage verification process, where Claude tries to disprove its own results before surfacing them, and nothing gets applied without human approval.
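To make that verification step concrete, here's a minimal sketch of a "disprove before surfacing" pipeline. This is my own illustration, not Anthropic's actual implementation: `llm_scan` and `llm_disprove` are hypothetical stand-ins for model calls, stubbed so the example runs on its own.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    file: str
    line: int
    cwe: str
    claim: str

def llm_scan(source: str) -> list[Finding]:
    # Stand-in for a contextual model pass over the codebase.
    return [Finding("auth.py", 42, "CWE-862", "missing authorization check")]

def llm_disprove(finding: Finding, source: str) -> bool:
    # Stand-in for the adversarial pass: the model argues against its own
    # finding (e.g., tracing the call path for an upstream access check).
    # Returns True if the finding is refuted.
    return False

def verified_findings(source: str) -> list[Finding]:
    # Only findings that survive the self-challenge are surfaced,
    # and even those still require human approval before any patch lands.
    return [f for f in llm_scan(source) if not llm_disprove(f, source)]
```

The design point is that the adversarial pass runs before anything reaches a human, which is what separates this from a scanner that simply dumps everything it matches.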
Key result: Using Opus 4.6, Anthropic found over 500 previously unknown high-severity vulnerabilities in production open-source code bases — bugs that had survived decades of expert review and fuzzing.
This matters because the vulnerability classes that actually get exploited in the wild are exactly the ones that legacy tools struggle with. Broken access control has been sitting at the top of the OWASP Top 10 for five years running. That's not a pattern matching problem. It's a reasoning problem. And LLMs are getting very good at reasoning.
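Here's the kind of bug that makes this a reasoning problem rather than a pattern-matching one. A hypothetical handler, boiled down to plain Python: there's no dangerous sink for a rule to match, yet any authenticated caller can read any other user's invoice (an IDOR, CWE-639). Spotting it requires understanding the data model, not matching a signature.

```python
INVOICES = {1: {"owner": "alice", "total": 120},
            2: {"owner": "bob", "total": 9000}}

def get_invoice(caller: str, invoice_id: int) -> dict:
    # BUG: the caller is authenticated but never authorized —
    # any user can fetch any invoice. No scanner rule fires here.
    return INVOICES[invoice_id]

def get_invoice_fixed(caller: str, invoice_id: int) -> dict:
    # Fix: check object ownership before returning it.
    inv = INVOICES[invoice_id]
    if inv["owner"] != caller:
        raise PermissionError("not your invoice")
    return inv
```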
The 91–98% Reduction
01:21 – 02:07
Multiple studies have shown that when you couple traditional SAST with LLMs as an intelligent triage layer, you can reduce false positives by 91 to 98%. That's not incremental. That's category changing.
Anyone who's working in AppSec knows that false positives are the number one reason developers stop trusting security tools. If you dump a spreadsheet of 500 findings on a dev team and most of them aren't exploitable, they stop looking at any of them.
So if LLMs can turn the mountain of noise into a handful of verified, actionable findings, that alone is transformative. The research consensus is also pretty clear at this point: the winning approach isn't LLMs alone or SAST alone. It's coupling them together so each compensates for the other's weakness.
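The coupling can be sketched in a few lines. This is an illustrative shape, not any real tool's API: `sast_scan` stands in for a high-recall, noisy rule-based scanner, and `llm_triage` stands in for the model reasoning about whether untrusted input actually reaches the sink.

```python
def sast_scan(path: str) -> list[dict]:
    # Stand-in for a rule-based scanner dump: high recall, lots of noise.
    return [
        {"id": "SQLI-01", "file": "db.py", "tainted_input_reaches_sink": True},
        {"id": "XSS-17", "file": "tmpl.py", "tainted_input_reaches_sink": False},
    ]

def llm_triage(finding: dict) -> bool:
    # Stand-in for the model's contextual data-flow reasoning: keep only
    # findings where attacker-controlled input plausibly reaches the sink.
    return finding["tainted_input_reaches_sink"]

def actionable(path: str) -> list[dict]:
    # Deterministic breadth first, probabilistic judgment second —
    # each layer covers the other's weakness.
    return [f for f in sast_scan(path) if llm_triage(f)]
```

Developers see only what survives both layers, which is the mechanism behind those 91–98% false-positive reductions.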
Finding Vulns vs. Creating Them
02:07 – 03:17
But here's where we need to pump the brakes, because LLMs come with real limitations that the hype cycle is glossing over.
SonarSource published their analysis of Opus 4.6's generated code the same day as the Claude Code Security announcement, and it points to a paradox: the same model that is excellent at finding vulnerabilities in existing code is simultaneously introducing vulnerabilities in the code it generates.
Endor Labs research found that AI-generated code is vulnerable 25% to 75% of the time, and it gets worse as task complexity increases. These models are trained on open-source code that is itself full of vulnerabilities. As Varun Badhwar of Endor Labs put it: if AI coding agents could handle security well, the code they generate would be secure. But it isn't.
Discovery Was Never the Bottleneck
03:17 – 04:37
Here's the part that really gets me, and I've been writing about this for a long time. Everyone is celebrating better vulnerability discovery. But discovery was never the bottleneck — remediation is.
CVE volume has grown almost 500% over the last decade, with roughly 30% year-over-year growth. Every organization I talk to is drowning in vulnerability backlogs that are growing faster than they can be triaged, let alone fixed.
Vulnerability exploitation tripled — 180% growth — and by 2025 it overtook phishing as the primary attack vector. Meanwhile, it takes organizations 55 days to patch just half of known exploitable vulnerabilities. By year end, 10% are still open.
Organizations are burning enormous amounts of engineering time forcing developers to fix things that pose little actual risk. The root cause isn't technical. It's organizational.
Engineering teams are under relentless pressure to ship features, hit revenue targets, and maintain velocity. Security remediation competes directly with these priorities, and in most organizations, it loses. That's why backlogs bloom.
Adding a more powerful discovery tool into that equation without addressing the remediation bottleneck just makes the pile bigger.
What Claude Code Security Doesn't Address
04:37 – 05:48
Static analysis, whether rule-based or LLM-driven, still operates without runtime context. Datadog's research showed that only 20% of critical vulnerabilities are actually exploitable at runtime. Endor Labs' reachability analysis shows that, without that context, 92% of vulnerability findings are pure noise.
The industry spent years "shifting left" by jamming noisy scanners into CI/CD pipelines and dumping findings without context onto developers. What matters is what's actually running in production. What's reachable? What's exploitable?
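Runtime-aware prioritization is conceptually just an intersection. A hedged sketch, with entirely hypothetical data shapes (real reachability tools expose richer models): intersect scanner findings with what's actually deployed and what the call graph says is reachable.

```python
findings = [
    {"cve": "CVE-2024-0001", "pkg": "libfoo", "severity": "critical"},
    {"cve": "CVE-2024-0002", "pkg": "libbar", "severity": "critical"},
]

deployed_pkgs = {"libfoo"}        # from a runtime inventory of what's in prod
reachable_cves = {"CVE-2024-0001"}  # from call-graph reachability analysis

def prioritize(findings: list[dict]) -> list[dict]:
    # A finding only counts if the package is actually running AND the
    # vulnerable code path is reachable; everything else is backlog noise.
    return [
        f for f in findings
        if f["pkg"] in deployed_pkgs and f["cve"] in reachable_cves
    ]
```

Two "critical" findings in, one actionable finding out — that filtering, not more discovery, is what shrinks the backlog.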
Independent verification problem: If the AI that writes the code then tells you the code is secure, who's checking it? Same model, same context window, same blind spots. You wouldn't let a developer sign off on their own security review — the same principle applies here.
And then there's enterprise governance — audit trails, compliance reporting, policy enforcement, role-based access control. James Berthoty from Latio Tech made a blunt observation that the capabilities Anthropic announced largely mirror what was already available in Claude Code. The enterprise workflow layer is where the real gap exists.
The Path Forward
05:48 – 06:52
Alan Cinnamon from Vial Adventures framed it well: Anthropic didn't kill cybersecurity. They validated that frontier AI is now a real participant in the security market. That makes the challenge more complex, not less.
We've seen this before with cloud security. Every CSP built native security tooling. It was supposed to eliminate standalone vendors — and yet Datadog built a $40 billion business, Wiz sold to Google for $32 billion. The platform providers validated the problem. The focused companies built the products that actually solved it.
The winning approach: LLMs becoming a powerful layer within AppSec, coupled with deterministic analysis for trust, reachability analysis for prioritization, runtime context for real-world risk, and enterprise governance for operational readiness.
The full stack is what wins — not a single model or a single tool, no matter how capable. The idea that we can keep applying the methods of the past — noisy tools, contextless findings, shift-left dogma — that really needs to die here. Not AppSec itself.
The frontier labs are in cybersecurity now. The vendors, practitioners, and organizations that adapt the fastest — leveraging these capabilities rather than competing against them — will define what AppSec looks like in the agentic era.
See how AppSecAI solves the remediation bottleneck.
Automated triage at 97% accuracy. Code fixes delivered as pull requests. Results in 30 minutes.
Schedule a Demo →