AI-code audit
By 2027, 40-60% of code submissions will be AI-generated. The regulated-industry compliance question - "which findings sit on AI-touched code?" - needs a real answer. Vulkro is the only SAST tool that ships a defender-side AI-origin audit today.
This page covers the three CLI surfaces that consume the audit:
- AI-origin fingerprinting (per-tool markers: Claude, Copilot, Cursor, Aider, ChatGPT, generic AI-generated)
- --ai-code-segregation: segregation report
- --attest-reviewed: reviewer attestation
Why this is on the SAST tool
The traditional security-tool answer is "we run the same detectors on every line". That's correct for technique but wrong for posture: AI-generated code is empirically more prone to specific failure modes - boilerplate auth that enforces nothing, copy-pasted secrets, missing error handling, hallucinated dependency names. The right place to surface "this finding is on AI-touched code" is inside the scanner that already knows about the finding.
Vulkro stays on the defender side: this module audits AI-generated code; it does not generate fixes via a vendor LLM. That matches the offline-first posture - your code never leaves the host.
How origin is detected
src/security/ai_origin.rs scans each module for canonical per-tool markers:
| Tool | Markers it looks for |
|---|---|
| Claude | Generated by Claude, Generated with Claude, Co-Authored-By: Claude, @claude-generated, @anthropic-generated |
| GitHub Copilot | GitHub Copilot, Copilot-generated, @copilot |
| Cursor | cursor.sh, @cursor-generated |
| Aider | aider:, // aider |
| ChatGPT | Generated by ChatGPT, OpenAI GPT-... |
| Generic | AI-generated, Generated by AI, # generated by ..., // generated by ... |
The marker scan currently favors precision over recall: if the source doesn't carry an explicit marker, the file is treated as hand-written. Heuristic shape detection (long boilerplate docstrings, uniform variable naming) is tracked for a follow-up. The strict marker scan is the safer first step because false-positive "this was written by Copilot" findings on hand-written files would be worse than the current under-coverage.
Marker line numbers are preserved so a per-file marker-ratio ("3 of 280 lines carry an AI marker") can be computed.
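To make the marker scan and the per-file ratio concrete, here is a minimal Python sketch. It is not the code in src/security/ai_origin.rs: the function name scan_markers, the returned dictionary shape, the case-insensitive substring matching, and the choice to count each line at most once are all assumptions of this example. Only a subset of the markers from the table is included.

```python
# Illustrative subset of the marker table; not the full list Vulkro ships.
MARKERS = {
    "claude": ["Generated by Claude", "Generated with Claude",
               "Co-Authored-By: Claude", "@claude-generated",
               "@anthropic-generated"],
    "copilot": ["GitHub Copilot", "Copilot-generated", "@copilot"],
    "cursor": ["cursor.sh", "@cursor-generated"],
    "generic-ai": ["AI-generated", "Generated by AI"],
}


def scan_markers(source: str) -> dict:
    """Record which lines carry an explicit marker (hypothetical helper).

    Assumption: matching is a case-insensitive substring test, and a line
    is attributed to the first tool whose marker matches.
    """
    lines = source.splitlines()
    hits = []  # (1-based line number, tool name)
    for lineno, line in enumerate(lines, start=1):
        lowered = line.lower()
        for tool, needles in MARKERS.items():
            if any(needle.lower() in lowered for needle in needles):
                hits.append((lineno, tool))
                break  # count each line once
    ratio = 100.0 * len(hits) / len(lines) if lines else 0.0
    return {
        "hits": hits,
        "marker_lines": len(hits),
        "total_lines": len(lines),
        "ratio_percent": round(ratio, 2),
    }
```

With 3 marker lines in a 280-line file, this sketch reports a marker-line ratio of 1.07%. A file with no hits would be treated as hand-written, mirroring the strict behavior described above.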
--ai-code-segregation
Adds a markdown report after the scan output:
$ vulkro scan . --ai-code-segregation
... (normal scan output) ...
# Vulkro AI-Code Segregation Report
- Total findings: 42
- Findings on AI-touched files: 11 (26.2%)
## Per-tool breakdown
- claude: 7 finding(s)
- copilot: 3 finding(s)
- generic-ai: 1 finding(s)
## AI-touched files (marker-line ratio)
- src/api/users.ts: 0.71%
- src/api/auth.ts: 1.43%
- src/services/notifier.ts: 0.36%
The report is emitted to stderr; the main output stream (table / JSON / SARIF) is unchanged so machine-readable formats stay parse-clean.
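Because the report arrives on stderr as plain markdown, a wrapper can capture that stream and parse it separately from the main output. The sketch below is one way to do that in Python, assuming the report keeps exactly the layout of the sample above; parse_segregation_report and its return shape are hypothetical names for this example, not part of Vulkro.

```python
import re


def parse_segregation_report(stderr_text: str) -> dict:
    """Parse the markdown report shown above into a dictionary.

    Assumption: the headings and bullet formats match the sample output
    exactly; lines that do not match are ignored.
    """
    report = {"total": None, "ai_touched": None, "per_tool": {}, "files": {}}
    section = None
    for raw in stderr_text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            section = line[3:]
            continue
        m = re.match(r"- Total findings: (\d+)$", line)
        if m:
            report["total"] = int(m.group(1))
            continue
        m = re.match(r"- Findings on AI-touched files: (\d+)", line)
        if m:
            report["ai_touched"] = int(m.group(1))
            continue
        if section == "Per-tool breakdown":
            m = re.match(r"- ([\w-]+): (\d+) finding\(s\)$", line)
            if m:
                report["per_tool"][m.group(1)] = int(m.group(2))
        elif section is not None and section.startswith("AI-touched files"):
            m = re.match(r"- (.+): ([\d.]+)%$", line)
            if m:
                report["files"][m.group(1)] = float(m.group(2))
    return report
```

For the sample report, the per-tool counts (7 + 3 + 1) add up to the 11 findings on AI-touched files, which is a cheap sanity check a wrapper can run before trusting the parse.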
--attest-reviewed
Stamps every finding on an AI-touched file with a human-reviewed-ai-code evidence row. This lets a downstream compliance emit show "every AI-touched finding has a human-review sign-off" without per-finding manual tagging.
$ vulkro scan . --attest-reviewed --reviewer jane@team
... (normal scan output) ...
✓ attest-reviewed: stamped 11 finding(s) on AI-touched files (reviewer: jane@team)
The reviewer name defaults to $USER; pass --reviewer NAME to override. The attestation is recorded as evidence[].signal = "human-reviewed-ai-code" and evidence[].detail = "Reviewer attestation: AI-touched code reviewed by ...", so it persists into SARIF, JSON, JUnit, and the database emit.
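A downstream check can look for that evidence row on each finding. The Python sketch below assumes the findings have already been loaded as a list of dictionaries, each with an optional evidence list; only the signal value comes from this page. The surrounding layout (a plain list, an id key) and the helper names are hypothetical, and the caller is assumed to pass only findings on AI-touched files.

```python
ATTESTATION_SIGNAL = "human-reviewed-ai-code"  # value documented above


def is_attested(finding: dict) -> bool:
    """True if the finding carries at least one attestation evidence row."""
    return any(
        row.get("signal") == ATTESTATION_SIGNAL
        for row in finding.get("evidence", [])
    )


def unattested(findings: list) -> list:
    """Return the findings that lack a reviewer attestation.

    Assumption: `findings` holds only findings on AI-touched files; how
    the real output flags those is not covered by this sketch.
    """
    return [f for f in findings if not is_attested(f)]
```

An empty result from unattested(...) is what the "every AI-touched finding has a human-review sign-off" claim amounts to in practice.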
Use cases
HIPAA / PCI-DSS / FedRAMP audit evidence
The auditor asks "show me that every AI-touched line has been
reviewed by a human". Run vulkro scan . --ai-code-segregation --attest-reviewed --format sarif > scan.sarif. The SARIF carries
both the per-tool breakdown (properties rolling up segregation
counts) and the per-finding attestation evidence row. Hand the
auditor the SARIF file.
Pre-merge gate on AI-shop PRs
CI runs vulkro scan . --ai-code-segregation on every PR. The
stderr report goes into the PR's CI log; if the AI-touched share
exceeds the team's threshold, a wrapping script fails the build and
posts the report as a PR comment.
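The wrapping script is left to the team. As a rough sketch in Python, assuming the stderr report has been captured to a string (for example by redirecting stderr to a file in the CI step) and that it contains the two summary lines from the sample report, the gate could look like this. The function names and the threshold value are hypothetical.

```python
import re


def ai_touched_share(report_text: str) -> float:
    """Percentage of findings that sit on AI-touched files.

    Recomputed from the two counts rather than read from the printed
    percentage, so rounding in the report does not affect the gate.
    """
    total = re.search(r"Total findings: (\d+)", report_text)
    touched = re.search(r"Findings on AI-touched files: (\d+)", report_text)
    if total is None or touched is None:
        raise ValueError("segregation report summary lines not found")
    total_count = int(total.group(1))
    if total_count == 0:
        return 0.0
    return 100.0 * int(touched.group(1)) / total_count


def gate(report_text: str, threshold_percent: float) -> int:
    """Return a process exit code: 1 if the share exceeds the threshold."""
    return 1 if ai_touched_share(report_text) > threshold_percent else 0
```

A CI step would then end with something like sys.exit(gate(report, 25.0)). With the sample report (11 of 42 findings), the share is about 26.2%, so a 25% threshold fails the build and a 30% threshold passes. Posting the PR comment depends on the CI platform and is left out here.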
Forensics after a security incident
A security event surfaces in production. Run
vulkro scan . --ai-code-segregation against the commit that shipped
the bug. If the offending file carries an AI marker, the output tells
you which tool the code is attributed to, so the team can update its
tool-specific reviewer playbook.
Related
- vulkro scan - the full flag list
- Confidence model - how the attestation rows feed into the confidence aggregator