
Why we still do pure-manual penetration testing
Automated scanners have improved. Pentest-as-a-service platforms are everywhere. Here's why our senior engineers still test by hand — and what they catch that scanners don't.
By AnySec Engineering
The pitch we keep hearing
"Pentest in a box." "Continuous pentest as a service." "Always-on AI-assisted vulnerability discovery."
The pitch is seductive: you pay a monthly subscription, agents crawl your infrastructure, AI ranks the findings, your developers see prioritised tickets in Jira, and you stop paying boutique pentest firms €15K–€50K for the annual report.
The pitch is also mostly nonsense for high-stakes targets. Here's why we still build Penetration Testing around senior engineers testing by hand, and what specifically they catch that the automated platforms don't.
What automated platforms catch well
Let's give them their due. Automated platforms are genuinely good at:
- Known CVE detection. If you're running an outdated WordPress plugin, an automated scanner will find it within hours.
- Common misconfigurations. S3 buckets, GCP IAM, AWS security groups, IIS defaults, exposed admin interfaces.
- Surface enumeration. Subdomain discovery, port scanning, certificate transparency monitoring.
- Regression detection. Did a previously-fixed finding come back? Automated platforms shine here.
- Cheap broad coverage. If your attack surface is large, automated tools cover it cheaply.
A mid-tier automated platform will catch maybe 60% of what a junior pentester would catch. That's genuinely useful — at the right price, for the right use case.
What automated platforms miss
The remaining 40% includes most of what we care about for serious targets. Specifically:
Business logic flaws
The classic example: a casino's "first deposit bonus" endpoint that's supposed to fire once per account. The automated scanner sees a normal POST request that returns 200. It doesn't know the business meaning of "once per account" so it can't test the negative case. A human pentester sees the endpoint, reads the response body, notices the bonus_amount field, and within minutes is testing whether the bonus_id can be replayed, whether KYC bypass changes the response, whether timing the request next to a withdrawal makes a difference.
Last quarter we found a single business logic flaw that, if exploited, would have allowed unlimited bonus accumulation against a major operator. The automated scanner they were running before us had reported "no critical findings" against the same endpoint for 18 months.
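The negative case a human tests for can be sketched in a few lines. This is a toy in-memory model, not the operator's real API: `BonusService`, `claim_first_deposit_bonus`, and the way `bonus_id` is used are all hypothetical names chosen for illustration. The flaw modelled here (uniqueness enforced per `bonus_id` instead of per account) is one plausible shape such a bug could take, not a description of the finding above.

```python
from dataclasses import dataclass, field


@dataclass
class BonusService:
    """Toy stand-in for a first-deposit bonus endpoint (hypothetical)."""
    bonus_amount: int = 100
    balances: dict = field(default_factory=dict)
    used_bonus_ids: set = field(default_factory=set)

    def claim_first_deposit_bonus(self, account_id: str, bonus_id: str) -> dict:
        # Flawed rule: "once" is enforced per bonus_id, not per account,
        # so a fresh bonus_id lets the same account claim again.
        if bonus_id in self.used_bonus_ids:
            return {"status": 409, "bonus_amount": 0}
        self.used_bonus_ids.add(bonus_id)
        self.balances[account_id] = self.balances.get(account_id, 0) + self.bonus_amount
        return {"status": 200, "bonus_amount": self.bonus_amount}


def count_successful_claims(service: BonusService, account_id: str, bonus_ids: list) -> int:
    """Negative test: claim repeatedly for one account and count the 200s."""
    responses = [service.claim_first_deposit_bonus(account_id, b) for b in bonus_ids]
    return sum(1 for r in responses if r["status"] == 200)


service = BonusService()
# Replay the same bonus_id once, then try two fresh ones.
claims = count_successful_claims(service, "acct-1", ["b1", "b1", "b2", "b3"])
print(claims, service.balances["acct-1"])  # business rule says claims should be 1
```

Every individual response here is a well-formed 200 or 409, which is why a scanner sees nothing wrong. Only the expectation "at most one successful claim per account" turns the result into a finding, and that expectation comes from knowing the business rule.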
Chained exploits across endpoints
Automated platforms find findings. Humans find narratives. A real attack narrative looks like:
"An unauthenticated user can register an account using only an email (finding 3, low severity). The registration creates a session that survives a logout (finding 7, medium severity). The session, when paired with a forgot-password flow that doesn't invalidate active sessions (finding 12, low severity), allows an attacker to maintain access after the victim resets their password. Combined impact: persistent account takeover via three findings, none of them rated above medium on its own."
Each finding by itself would be a yellow tile in a dashboard. Combined they're an existential risk. Automated platforms don't do this kind of chaining because each finding lives in its own scanner module.
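One way to picture the chaining step is to treat each finding as something that requires certain attacker capabilities and grants new ones, then search for a path to a goal. The sketch below is our own illustration, not how any particular platform or tester works: the `Finding` type, the capability labels, and `build_chain` are hypothetical, and the severities are taken from the narrative above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    ident: int
    severity: str
    requires: frozenset  # capabilities the attacker needs first
    grants: frozenset    # capabilities gained by exploiting the finding


def build_chain(findings, start, goal):
    """Forward chaining: repeatedly apply any finding whose preconditions
    are met and that adds something new, until the goal is reached."""
    have = set(start)
    chain = []
    while goal not in have:
        usable = next(
            (f for f in findings
             if f not in chain and f.requires <= have and not f.grants <= have),
            None,
        )
        if usable is None:
            return None  # no path to the goal from these findings
        have |= usable.grants
        chain.append(usable)
    return chain


# Capability labels are assumptions made for this example.
findings = [
    Finding(12, "low", frozenset({"session_survives_logout"}), frozenset({"persistent_access"})),
    Finding(7, "medium", frozenset({"session"}), frozenset({"session_survives_logout"})),
    Finding(3, "low", frozenset(), frozenset({"session"})),
]

chain = build_chain(findings, start=set(), goal="persistent_access")
print([(f.ident, f.severity) for f in chain])
```

The search itself is trivial. The hard part, and the part a human supplies, is knowing what each finding actually requires and grants in the context of this particular application.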
Authentication and session edge cases
A human will spot that an OAuth state parameter is predictable or never validated against the session, that the JWT verification accepts the none algorithm under certain code paths, that the password reset token survives across IP changes when it shouldn't.
Automated platforms have rules that match some of these patterns. But novel patterns — and authentication code is full of novel patterns — slip through.
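To make the none-algorithm case concrete, here is a minimal sketch using only the Python standard library. It assumes an HS256-only service; `make_token`, `verify_hs256`, and the demo key are hypothetical helpers written for this example, not a substitute for a maintained JWT library. The point it illustrates is that the verifier pins the algorithm itself and never lets the token's header choose.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_token(header: dict, payload: dict, key: bytes) -> str:
    """Build a compact JWT; signs with HMAC-SHA256 only when alg is HS256."""
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    if header.get("alg") == "HS256":
        sig = hmac.new(key, signing_input.encode(), hashlib.sha256).digest()
        return signing_input + "." + b64url_encode(sig)
    return signing_input + "."  # unsigned token, as an attacker would forge it


def verify_hs256(token: str, key: bytes) -> dict:
    """Accept only HS256; the token's own header never picks the algorithm."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected alg: %r" % header.get("alg"))
    expected = hmac.new(
        key, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))


KEY = b"demo-secret"  # placeholder key for the example
good = make_token({"alg": "HS256", "typ": "JWT"}, {"sub": "alice"}, KEY)
forged = make_token({"alg": "none", "typ": "JWT"}, {"sub": "admin"}, KEY)

print(verify_hs256(good, KEY)["sub"])
try:
    verify_hs256(forged, KEY)
except ValueError as exc:
    print("rejected:", exc)
```

A scanner rule can check the main login path for this. What a human adds is finding the second, older verification routine behind some internal endpoint that skips the algorithm check.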
Race conditions
The classic withdrawal race condition: send two concurrent withdrawal requests, get the same balance withdrawn twice. The classic bonus race condition: send two concurrent first-deposit claims, get the bonus credited twice. Automated scanners almost never catch these because exploiting them requires multiple parallel sessions firing requests within a sub-second window of each other.
We found a race condition last year in a regulated crypto exchange's deposit confirmation flow that would have credited a single on-chain deposit to two accounts. Hand-crafted, ten minutes of careful curl scripting.
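The underlying bug is a check-then-act gap. The sketch below models it in memory with Python threads. It is not the exchange's code or the curl script we used: `Wallet` and `fire_concurrently` are hypothetical, and the `threading.Barrier` is a demonstration device that forces both requests past the balance check before either one debits, where a real attack relies on request timing to hit the same window.

```python
import threading


class Wallet:
    """Toy in-memory balance (hypothetical; a real system would use a database)."""

    def __init__(self, balance: int):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw_racy(self, amount: int, gate: threading.Barrier) -> bool:
        ok = self.balance >= amount      # check, outside any lock
        gate.wait()                      # both requests pass the check first
        if ok:
            with self._lock:             # locking only the write is not enough
                self.balance -= amount   # act
        return ok

    def withdraw_locked(self, amount: int, gate: threading.Barrier) -> bool:
        gate.wait()                      # release both requests together
        with self._lock:                 # check and act as one atomic step
            if self.balance < amount:
                return False
            self.balance -= amount
            return True


def fire_concurrently(method, amount: int, n: int = 2) -> list:
    """Run n withdrawals in parallel and collect their return values."""
    gate = threading.Barrier(n)
    results = []

    def worker():
        results.append(method(amount, gate))

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


racy = Wallet(100)
racy_results = fire_concurrently(racy.withdraw_racy, 100)
print(racy_results, racy.balance)    # both succeed, balance goes negative

safe = Wallet(100)
safe_results = fire_concurrently(safe.withdraw_locked, 100)
print(sorted(safe_results), safe.balance)
```

In a database-backed system the equivalent of the lock is usually a row lock or a conditional update, so that the balance check and the debit happen in one atomic operation.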
Anything requiring product context
When we test a casino's bonus flow, we need to know what the legitimate bonus rules are to spot the bypass. When we test an exchange's withdrawal flow, we need to know what the legitimate withdrawal flow is to spot deviation. An automated platform has no access to this context — it doesn't even have access to the documentation you'd give a human tester.
Authenticated (post-auth) surface
Almost all automated platforms test the unauthenticated surface well and the authenticated surface poorly. The reason is mundane: authenticated testing requires session management, and session management at scale across thousands of customers is a hard engineering problem nobody has fully solved. The platforms that do offer authenticated scanning tend to cover only the most generic post-auth patterns.
The interesting findings are almost always post-auth.
Anything novel
Every new framework, every new SaaS integration, every new business model creates a new class of vulnerabilities. By definition, automated platforms catch what they've been trained on. They lag behind real-world attacker novelty by 12–24 months. Human pentesters lag by however long it takes them to read a recent blog post.
What this means in practice
We sell Penetration Testing at €2,499 for a focused engagement and €5,999 for full-stack. The pure-manual delivery means a senior engineer spends 5–10 days probing your stack, writing exploit chains, and drafting a developer-actionable report.
A pentest-as-a-service subscription at €1,500/month comes to €18,000 a year, roughly three times the price of our full-stack engagement. For that you'll get one or two real findings per quarter, all of which would have been visible to a scanner anyway, and none of which would be the business-logic flaw that actually puts your operation at risk.
Both approaches have a place. If you're a Series A SaaS with a small attack surface and limited budget, the PtaaS subscription is fine — better than nothing, cheap, continuous. If you're a regulated casino, exchange, or bank with real money at stake, you want senior engineers reading your code by hand at least annually.
We typically recommend our clients run quarterly vulnerability assessments (broad coverage, partly automated, manually validated — see Vulnerability Assessment) and annual full penetration tests (deep, manual, chained). The combination catches the broad-but-shallow surface and the narrow-but-deep business logic that actually matters.
What we don't do
We don't claim to be 100% manual when we're not. Modern pentesting uses tools — Burp Suite Pro, Nuclei, custom scripts, language-specific static analysers. The "manual" claim is about who is in charge of the test, not whether tools are used.
When we say pure-manual, we mean:
- A senior engineer reads your app and decides what to test.
- They use tools to scale their reach (typing curl 4,000 times by hand would be silly).
- Every finding is hand-validated before it goes in the report.
- The report is written by a human who can answer "how did you find this" with a real story.
That's the standard. Anything below that is a scanner with extra steps.
If you want a real pentest done that way, book a scoping call — we'll be honest on the call about whether your situation actually needs senior-engineer manual work, or whether a cheaper subscription would serve you fine.