Claude Code Flaws

-PSA-

AI red teaming used to be about a "don't do this, do that" scenario. It was kinda like a Jedi mind trick: these aren't the droids you're looking for, look at this trash can instead, it's a droid, I swear.

Most attacks in that style don't seem to work anymore. The models have been patched, and the websites and subreddits that hosted the prompts are now banned.

So naturally, attackers do what they always do: adapt.

It turns out that if you drop instructions into CLAUDE.md, Claude Code just accepts them and carries on. It's not a bug, it's a feature working exactly as designed, and that is much worse.
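
If you want a quick tripwire before opening somebody else's project, here's a minimal sketch in Python. It walks a directory, finds every CLAUDE.md, and flags lines that match a few suspicious patterns. The pattern list and the function name scan_instruction_files are my own assumptions for illustration, not a vetted ruleset and not anything Claude Code ships. A determined attacker can phrase around it, so treat a hit as "go read this file yourself", not as a clean bill of health when nothing matches.

```python
import re
from pathlib import Path

# Hypothetical heuristics (an assumption, not a vetted ruleset):
# phrases that rarely belong in ordinary project instructions.
SUSPICIOUS_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"curl\s+[^|\n]*\|\s*(ba)?sh",          # pipe a download straight into a shell
    r"do not (tell|mention|inform)",        # asking the agent to hide something
    r"(\.env|id_rsa|\.aws/credentials)",    # references to secrets on disk
    r"base64\s+(-d|--decode)",              # decoding an opaque payload
]


def scan_instruction_files(root, filename="CLAUDE.md"):
    """Return (path, line number, line) for each suspicious line found."""
    findings = []
    for path in sorted(Path(root).rglob(filename)):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for pattern in SUSPICIOUS_PATTERNS:
                if re.search(pattern, line, flags=re.IGNORECASE):
                    findings.append((str(path), lineno, line.strip()))
                    break  # one finding per line is enough
    return findings


if __name__ == "__main__":
    for path, lineno, line in scan_instruction_files("."):
        print(f"{path}:{lineno}: {line}")
```

Run it from the root of a freshly cloned template before you point a coding agent at it, and read anything it prints by hand.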

One guy used vibe coding to build a C2 malware network that threat analysts judged to be the real thing and attributed to an entire team. They only caught him because he made a rookie mistake. The code was great, and that's the problem.

The dos and don'ts:

Treat project templates like supply chain risk.
Don't give your coding agent prod access.
Don't give your web application outbound network access (or restrict it to only the destinations it needs).
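
For the last point, here's roughly what "only the destinations it needs" can look like inside the application, as a minimal Python sketch. The hostnames in ALLOWED_HOSTS and the function name is_egress_allowed are placeholders I made up for illustration. This is a second layer at best: the real restriction belongs in your firewall or egress proxy, because code running inside a compromised app can simply skip this check.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist (an assumption): replace with the hosts your app really needs.
ALLOWED_HOSTS = {"api.example.com", "pypi.org"}


def is_egress_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Allow only HTTPS requests whose exact hostname is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL: deny by default
    if parts.scheme != "https":
        return False
    # Compare the parsed hostname, not the raw string, so tricks like
    # "https://api.example.com@evil.test/" resolve to "evil.test" and fail.
    host = (parts.hostname or "").lower().rstrip(".")
    return host in allowed_hosts
```

The check matches exact hostnames on purpose. Substring or suffix matching is how lookalike hosts such as api.example.com.evil.test slip through.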