Validating Exploit Paths, Not Cataloging Vulnerabilities
- May 14, 2026
- Michelangelo Sidagni
CVE-2021-26855 — the original ProxyLogon — sits in most external pen test scopes as a check-the-box concern. The scanner has it flagged on an Exchange server. CVSS 9.8. Your patch management dashboard shows it remediated. Done?
Maybe. The scanner confirms a vulnerability is present. The pen test confirms whether the chain underneath it goes anywhere in your environment.
ProxyLogon is a server-side request forgery (SSRF) vulnerability in Microsoft Exchange Server that lets an unauthenticated attacker bypass authentication. Chained with a post-authentication arbitrary file write such as CVE-2021-27065, it leads to remote code execution (RCE). Useful starting point.
The real test starts after:
None of those questions are answered by the scanner. Most are not answered by the patch management dashboard either. They are answered by someone with a foothold and a few hours.
A CVE record tells you a software version contains a known vulnerability. That is the floor. It says nothing about whether the vulnerability is reachable, is exploitable in your runtime, is caught by your controls in production, or chains into anything that matters. This is why MITRE keeps ATT&CK as a framework distinct from CVE. CVE identifies the vulnerability. ATT&CK classifies what the attacker does: the techniques used to exploit, escalate, move, persist, and exfiltrate. The two describe different layers of the same attack surface, and a CVE on its own does not map automatically to ATT&CK TTPs (Tactics, Techniques, and Procedures).
ProxyLogon’s chain on a hardened, segmented Exchange deployment with mature EDR is a different finding from the same CVE on a flat network with default Exchange roles and no detection. Same CVE. Two different exposures. The CVE is the door. The chain is whether the door leads anywhere.
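One way to picture the difference is to record the chain as data rather than the CVE alone. The sketch below is illustrative only: the class names, the step outcomes, and the pairing of steps with ATT&CK technique IDs are assumptions made for the example, not results from any engagement or a NopSec data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChainStep:
    technique_id: str   # MITRE ATT&CK technique ID
    description: str
    succeeded: bool     # observed during testing, not inferred from a version string


@dataclass
class ExploitPath:
    cve_id: str
    environment: str
    steps: List[ChainStep] = field(default_factory=list)

    def depth_reached(self) -> int:
        """Count consecutive successful steps from the start of the chain."""
        depth = 0
        for step in self.steps:
            if not step.succeeded:
                break
            depth += 1
        return depth

    def is_validated(self) -> bool:
        """True only when every step in a non-empty chain succeeded."""
        return bool(self.steps) and self.depth_reached() == len(self.steps)


def proxylogon_path(environment: str, outcomes: List[bool]) -> ExploitPath:
    # Hypothetical three-step chain; the outcomes are supplied by the tester.
    labels = [
        ("T1190", "Exploit public-facing Exchange endpoint"),
        ("T1505.003", "Write a web shell to disk"),
        ("T1021", "Move laterally over remote services"),
    ]
    steps = [ChainStep(tid, desc, ok) for (tid, desc), ok in zip(labels, outcomes)]
    return ExploitPath("CVE-2021-26855", environment, steps)


# Made-up outcomes for two environments carrying the same CVE.
hardened = proxylogon_path("segmented, mature EDR", [True, False, False])
flat = proxylogon_path("flat network, no detection", [True, True, True])

for path in (hardened, flat):
    print(path.environment, path.depth_reached(), path.is_validated())
```

Both records carry the same CVE identifier. Only the chain depth separates a finding that stops at the door from one that walks through it.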
Severity scores rate the abstract characteristics of a vulnerability. They do not factor in compensating controls, asset criticality, network reachability, or active exploitation in the wild. NopSec research has consistently shown that the high-CVSS celebrity vulnerabilities that dominate news cycles are a small percentage of the vulnerabilities that actually get exploited.[2] Functional exploits and active malware live elsewhere — less common across the full set of published CVEs, but very common across the assets that matter. A team that prioritizes by CVSS alone ends up patching things that pose limited real-world risk in their environment, while real exploit paths sit unaddressed because no individual finding in the chain looked critical on its own.
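A minimal sketch of the gap between CVSS-only ranking and context-aware ranking follows. The fields, the weights, and the two findings are hypothetical, chosen only to show the mechanism. This is not NopSec's scoring model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    label: str
    cvss: float                 # base score, 0.0 to 10.0
    reachable: bool             # can an attacker reach the service at all?
    exploit_validated: bool     # did an exploit attempt succeed in this environment?
    asset_criticality: int      # 1 (low) to 3 (high), assigned by the asset owner
    compensating_control: bool  # a control was shown to interrupt the chain


def contextual_priority(f: Finding) -> float:
    """Illustrative score: CVSS scaled by environment context (assumed weights)."""
    if not f.reachable:
        return 0.0
    score = f.cvss * f.asset_criticality / 3
    score *= 1.5 if f.exploit_validated else 0.5
    if f.compensating_control:
        score *= 0.5
    return round(score, 2)


findings = [
    Finding("celebrity-cve (hypothetical)", 9.8, reachable=False,
            exploit_validated=False, asset_criticality=2, compensating_control=True),
    Finding("mid-severity-cve (hypothetical)", 6.5, reachable=True,
            exploit_validated=True, asset_criticality=3, compensating_control=False),
]

by_cvss = sorted(findings, key=lambda f: f.cvss, reverse=True)
by_context = sorted(findings, key=contextual_priority, reverse=True)

print([f.label for f in by_cvss])
print([f.label for f in by_context])
```

Sorted by CVSS, the unreachable 9.8 comes first. Sorted by context, the validated 6.5 on a critical asset does.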
The validation work in the middle of a pen test is what makes the engagement take weeks and cost five figures. Senior tester capacity is finite. Engagements get scoped to a one- or two-week window because that is what the labor supply supports, and the validation work does not compress on demand. Sitting with a finding for an hour, failing, trying a different approach, writing a proof-of-concept when no off-the-shelf exploit fits the conditions in front of you — none of that parallelizes. The procurement cycle runs annually because that is how vendor agreements work, and the labor supply behind it has set the cadence as much as risk has.
The result is a report every twelve months describing what got tested in the time available, with the implicit “and here is what we did not have time to look at” part left out.
NopSec Adversarial Simulation is the validation work, scaled.
The same five-phase methodology our pen testing team has run for eighteen years — reconnaissance, vulnerability enumeration, exploitation, privilege escalation, report generation — executed by purpose-built agents under human oversight. The agents do the parts that do not compress: reachability checks, exploit attempts against confirmed vulnerabilities, on-the-fly proof-of-concept generation when nothing off-the-shelf fits, attack chaining the way an attacker would actually walk it.
Human-on-the-loop is structural. Scope confirmation, exploitation approval, finding validation, and report sign-off all flow through a human reviewer before anything reaches the customer. When the auditor asks who decided a finding was a false positive, the answer is a person, not a model.
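The four review points can be sketched as hard gates that each record a named reviewer. This is a simplified illustration under assumed names (Gate, Engagement, release_report) and does not describe how the product is actually built.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Gate(Enum):
    SCOPE_CONFIRMATION = "scope confirmation"
    EXPLOITATION_APPROVAL = "exploitation approval"
    FINDING_VALIDATION = "finding validation"
    REPORT_SIGN_OFF = "report sign-off"


@dataclass
class Engagement:
    name: str
    approvals: Dict[Gate, str] = field(default_factory=dict)  # gate -> reviewer

    def approve(self, gate: Gate, reviewer: str) -> None:
        # Every approval must be attributable to a person.
        if not reviewer.strip():
            raise ValueError("an approval must name a human reviewer")
        self.approvals[gate] = reviewer

    def missing_gates(self) -> List[Gate]:
        return [g for g in Gate if g not in self.approvals]

    def release_report(self) -> str:
        missing = self.missing_gates()
        if missing:
            names = ", ".join(g.value for g in missing)
            raise PermissionError(f"release blocked, missing: {names}")
        signer = self.approvals[Gate.REPORT_SIGN_OFF]
        return f"{self.name}: released, signed off by {signer}"


engagement = Engagement("external-test (hypothetical)")
engagement.approve(Gate.SCOPE_CONFIRMATION, "reviewer-a")

try:
    engagement.release_report()
except PermissionError as err:
    print(err)  # three gates are still open, so nothing reaches the customer
```

Because each approval stores a reviewer name, the question of who decided what has a recorded answer.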
The output is a real pen test report. Scope, methodology, findings with severity ratings, exploitation paths, remediation guidance, full attack log. The kind of report a senior tester would produce after a manual engagement, generated faster and at the cadence at which your environment actually changes. After remediation, the same engagement re-runs and confirms the path is closed.
$2,999 per test through the Pen Test Starter tier. Early access opens May 19. Inside sales works directly with each customer to scope the first test.
The price reflects what audit-grade external validation costs when eighteen years of methodology is engineered into automation and human oversight is built into the workflow rather than billed by the hour. The test itself contains what a manual engagement would contain. That part we held the line on.
We’re opening the list soon. Stay tuned!