Most AI compliance scanners are far less deterministic than users think. Paste a URL, wait 30 seconds, get a report. The workflow looks simple. The behavior in production is not.
While improving ComplianceRadar.dev, we repeatedly hit one of the most common and most dangerous issues in automated compliance tooling: false positives in cookie consent detection.
Sites that clearly had consent banners were still flagged as missing a consent mechanism. That sounds like a small bug. In compliance systems, it is a trust problem.
Why This Happens
The naive implementation is everywhere: fetch HTML, send text to an LLM, and ask whether a consent banner exists. It works in demos and fails on modern websites.
Today's sites render consent UI dynamically, inject consent management platform (CMP) components asynchronously, localize wording, isolate interfaces in iframes or shadow DOM, and change behavior by region and browser context. Static HTML alone often misses the real UI, and without deterministic evidence the model is pushed into guesswork.
The Distinction Many Scanners Miss
Two different conditions are frequently conflated:
- No consent UI exists.
- Consent UI exists, but tracking starts before consent.
These are not equivalent findings. A site can present a visible banner, offer reject controls, and still load analytics or marketing scripts before the user has made a choice. Good compliance scanning must preserve that nuance.
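One way to keep the two conditions apart is to give each its own finding code instead of a single pass/fail flag. The sketch below is illustrative only: the enum, its member names, and the `classify` function are hypothetical, and it assumes the two boolean signals have already been collected elsewhere.

```python
from enum import Enum


class ConsentFinding(Enum):
    """Hypothetical finding codes; the names are assumptions for illustration."""
    NO_CONSENT_UI = "no_consent_ui"
    PRE_CONSENT_TRACKING = "pre_consent_tracking"
    NO_ISSUE_OBSERVED = "no_issue_observed"


def classify(consent_ui_detected: bool, tracking_before_consent: bool) -> ConsentFinding:
    """Map two independent signals to distinct findings."""
    if not consent_ui_detected:
        # Nothing resembling a consent interface was found.
        return ConsentFinding.NO_CONSENT_UI
    if tracking_before_consent:
        # A banner exists, but tracking started before any user choice.
        return ConsentFinding.PRE_CONSENT_TRACKING
    # "No issue observed" is deliberately weaker than "compliant".
    return ConsentFinding.NO_ISSUE_OBSERVED
```

Keeping the third outcome worded as "no issue observed" rather than "compliant" reflects that a scan can only report what it saw.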
Moving Beyond LLM Guessing
To reduce false positives, we changed the scanning pipeline by introducing a lightweight browser evidence layer.
- UI evidence: banner detection, reject controls, CMP hints
- Runtime signals: script timing, cookie activity, request behavior
- Language coverage: multilingual consent snippets and vendor variants
The system now collects structured technical signals during the normal scan path, not only in fallback rendering. That shifts behavior from inference-first to evidence-first analysis.
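A structured evidence record of this kind might look like the following sketch. Every class and field name here is an assumption made for illustration, not the actual ComplianceRadar.dev schema. The point is that the model receives explicit signals rather than raw HTML.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class UiEvidence:
    banner_detected: bool = False
    reject_control_detected: bool = False
    cmp_hints: list[str] = field(default_factory=list)  # e.g. vendor markers found in the DOM


@dataclass
class RuntimeSignals:
    scripts_before_interaction: list[str] = field(default_factory=list)
    cookies_before_interaction: list[str] = field(default_factory=list)
    requests_before_interaction: int = 0


@dataclass
class ScanEvidence:
    url: str
    ui: UiEvidence = field(default_factory=UiEvidence)
    runtime: RuntimeSignals = field(default_factory=RuntimeSignals)
    matched_languages: list[str] = field(default_factory=list)

    def has_consent_ui(self) -> bool:
        """True if any UI-level signal suggests a consent interface exists."""
        return (
            self.ui.banner_detected
            or self.ui.reject_control_detected
            or bool(self.ui.cmp_hints)
        )

    def to_prompt_context(self) -> str:
        """Serialize the evidence so it can accompany the model prompt."""
        return json.dumps(asdict(self), sort_keys=True)
```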
The Hard Part Nobody Talks About
Detecting that a banner exists is comparatively easy. Determining whether tracking respects consent state is much harder.
- Is analytics loaded immediately?
- Do ad scripts fire before interaction?
- Are third-party cookies written before consent?
- Does reject behavior differ materially from accept?
Those are runtime questions. They require instrumentation, request observation, timing awareness, and strict timeout controls.
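As a rough illustration of the timing question, the sketch below filters an observed request log for tracker requests that fired before the consent interaction. It assumes a browser layer has already recorded each request with a timestamp. The `ObservedRequest` type, the function name, and the tiny host list are hypothetical stand-ins, and a real scanner would need a far larger, maintained tracker list.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Illustrative sample only, not a complete tracker list.
TRACKER_HOSTS = {
    "www.google-analytics.com",
    "www.googletagmanager.com",
    "connect.facebook.net",
}


@dataclass(frozen=True)
class ObservedRequest:
    url: str
    t_ms: int  # milliseconds since navigation start


def pre_consent_tracker_requests(
    requests: list[ObservedRequest], consent_at_ms: Optional[int]
) -> list[ObservedRequest]:
    """Return tracker requests seen before the consent interaction.

    consent_at_ms=None means no interaction happened during the observation
    window, so every tracker request counts as pre-consent.
    """
    hits = []
    for req in requests:
        host = urlparse(req.url).hostname or ""
        if host not in TRACKER_HOSTS:
            continue
        if consent_at_ms is None or req.t_ms < consent_at_ms:
            hits.append(req)
    return hits
```

Running the same check once after clicking reject and once after clicking accept gives a simple way to compare the two behaviors.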
Why We Added Calibration Guardrails
We also observed occasional model overconfidence. Even with stronger evidence, the model could still overstate its conclusions, for example by asserting that no consent mechanism exists at all.
To prevent this, we added a calibration layer after model parsing. If the scanner sees consent UI evidence, CMP hints, or reject controls, hard "no mechanism exists" language is downgraded to findings such as possible pre-consent tracking, incomplete enforcement, or insufficient evidence.
We explicitly model uncertainty now. It is less flashy than confident claims, but materially better for long-term trust.
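A calibration step along these lines can be sketched as a post-processing function over the parsed model output. The `Finding` shape, the finding codes, and the regular expression are assumptions chosen for illustration. The actual rules in any given scanner will differ.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for absolute "nothing exists" wording in model output.
ABSOLUTE_ABSENCE = re.compile(
    r"\bno (cookie )?consent (mechanism|banner)\b|\bmissing (a )?consent mechanism\b",
    re.IGNORECASE,
)


@dataclass
class Finding:
    code: str
    summary: str
    confidence: str  # "high", "medium", or "low"


def calibrate(
    finding: Finding,
    *,
    banner_detected: bool,
    reject_control_detected: bool,
    cmp_hints: list[str],
    pre_consent_tracking_observed: bool,
) -> Finding:
    """Downgrade absolute-absence claims that contradict UI evidence."""
    has_ui_evidence = banner_detected or reject_control_detected or bool(cmp_hints)
    claims_absence = (
        finding.code == "no_consent_mechanism"
        or ABSOLUTE_ABSENCE.search(finding.summary) is not None
    )
    if not (has_ui_evidence and claims_absence):
        return finding  # nothing to correct
    if pre_consent_tracking_observed:
        return Finding(
            "possible_pre_consent_tracking",
            "Consent UI was detected, but tracking activity was observed before interaction.",
            "medium",
        )
    return Finding(
        "insufficient_evidence",
        "Consent UI was detected; the scan could not confirm how it is enforced.",
        "low",
    )
```

The function only ever weakens a claim. It never upgrades a finding, which keeps the guardrail easy to reason about.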
Multilingual Compliance Detection Is Underrated
Europe is multilingual, and consent systems reflect that. English-only detection misses valid interfaces in Croatian, German, French, Italian, Dutch, and mixed-language deployments.
We expanded multilingual consent patterns and added regression cases for delayed rendering, vendor-specific flows, and ambiguous evidence scenarios.
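A minimal version of multilingual snippet matching could look like the sketch below. The phrase lists are small illustrative samples of common button labels, not the patterns the scanner actually ships, and the function names are hypothetical. Text is normalized before matching so that casing and whitespace differences do not cause misses.

```python
import unicodedata

# Illustrative samples only; real coverage needs many more phrases and vendors.
CONSENT_SNIPPETS = {
    "en": ("accept all", "reject all", "cookie settings"),
    "de": ("alle akzeptieren", "alle ablehnen", "cookie-einstellungen"),
    "fr": ("tout accepter", "tout refuser", "paramètres des cookies"),
    "it": ("accetta tutti", "rifiuta tutti"),
    "nl": ("alles accepteren", "alles weigeren"),
    "hr": ("prihvati sve", "odbij sve", "postavke kolačića"),
}


def normalize(text: str) -> str:
    """Unicode-normalize, casefold, and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).casefold().split())


def matched_languages(visible_text: str) -> list[str]:
    """Return sorted language codes whose consent snippets appear in the text."""
    haystack = normalize(visible_text)
    return sorted(
        lang
        for lang, snippets in CONSENT_SNIPPETS.items()
        if any(normalize(snippet) in haystack for snippet in snippets)
    )
```

Returning every matching language, rather than the first one, is what lets mixed-language deployments show up in the evidence.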
Operational Reality Matters Too
Better evidence is not free. Browser-level checks add latency, cost, timeout complexity, and concurrency pressure.
To keep those costs bounded, we settled on lightweight probes, strict timeout budgets, best-effort cancellation, caps on the amount of structured evidence collected, and scan deduplication to reduce repeated load under real traffic.
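Three of these controls (timeout budgets, evidence caps, and deduplication of concurrent scans) can be sketched with `asyncio` as follows. The constants, function names, and the `probe` callable are assumptions for illustration. The probe stands in for whatever browser-level check actually runs.

```python
import asyncio
from typing import Awaitable, Callable

MAX_EVIDENCE_ITEMS = 50   # hypothetical cap on evidence items per scan
PROBE_TIMEOUT_S = 8.0     # hypothetical per-probe time budget

Probe = Callable[[str], Awaitable[list]]

# URL -> scan task currently running, so concurrent requests share one scan.
_in_flight: dict[str, asyncio.Task] = {}


async def run_probe(url: str, probe: Probe, timeout_s: float = PROBE_TIMEOUT_S) -> dict:
    """Run a probe under a strict budget; wait_for cancels it on timeout."""
    try:
        items = await asyncio.wait_for(probe(url), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"url": url, "status": "timeout", "evidence": []}
    return {"url": url, "status": "ok", "evidence": items[:MAX_EVIDENCE_ITEMS]}


async def scan_deduplicated(url: str, probe: Probe, timeout_s: float = PROBE_TIMEOUT_S) -> dict:
    """Share one in-flight scan among concurrent callers for the same URL."""
    task = _in_flight.get(url)
    if task is None:
        task = asyncio.create_task(run_probe(url, probe, timeout_s))
        _in_flight[url] = task
        task.add_done_callback(lambda _t: _in_flight.pop(url, None))
    # shield: one caller being cancelled does not cancel the shared scan.
    return await asyncio.shield(task)
```

Cancellation here is best effort: `wait_for` requests cancellation of the probe, but a probe that holds external resources such as a browser page still has to release them in its own cleanup path.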
What We Learned
Compliance scanning is not a binary classification problem. It is a probabilistic evidence problem.
Deterministic signals matter. Calibration matters. Uncertainty handling matters. Operational engineering matters.
The goal is not to claim guaranteed compliance. The goal is to surface meaningful risk signals earlier, with higher consistency and lower false confidence.
This article is informational and does not constitute legal advice.

