Output formats

vulkro scan, vulkro discover, and most other commands take a --format flag (alias -f) that selects the output format. Every format is rendered from the same in-memory ScanReport, so all formats report the same findings; only the rendering differs.

Quick reference

| Format | What it's for | File extension |
| --- | --- | --- |
| table | Default. Colourised summary on the terminal. | - |
| json | Machine-readable. Pipe to jq, save to disk. | .json |
| sarif | GitHub Code Scanning, VS Code, Azure DevOps. SARIF 2.1.0 with partialFingerprints + codeFlows + per-rule helpUri. | .sarif |
| gh-pr | One PR summary comment. Markdown. | .md |
| gh-pr-inline-comments | One inline review comment per finding, one JSON object per line. NDJSON, ready to pipe to gh api. | .ndjson |
| github-annotations | One GitHub Actions / GitLab CI workflow-command line per finding (::error file=...,line=...,title=<rule id>::<message>). When printed to a CI job's stdout, each line becomes an inline PR annotation on the diff. | - |
| junit | GitLab MR test report, Jenkins, etc. | .xml |
| csv | Spreadsheet hand-off. | .csv |
| cyclonedx | CycloneDX 1.6 SBOM (JSON). Package inventory (purl, version, scope) plus a vulnerabilities[] block and a root dependencies edge. | .json |
| cbom | CycloneDX 1.6 Cryptographic Bill of Materials (JSON). | .json |
| spdx | SPDX 2.3 SBOM (JSON). Package inventory with DESCRIBES relationships. | .json |
| pdf | Executive HTML rendered to PDF. Requires wkhtmltopdf on PATH. | .pdf |
| ropa-md | GDPR Art. 30 Records of Processing - Markdown. | .md |
| ropa-html | GDPR Art. 30 Records of Processing - HTML. | .html |

Specification versions

The standards-based formats emit a specific spec version. The table below is generated from the emitter source (src/output/sarif.rs, src/output/sbom.rs) via vulkro formats --format markdown and is held in sync by tests/format_docs_drift.rs - if a version bumps in code and this table is not regenerated, the build fails. Tooling can read the same catalogue as JSON with vulkro formats --format json.

| Format | --format | Spec version | Specification | Extension |
| --- | --- | --- | --- | --- |
| SARIF | sarif | 2.1.0 | 2.1.0 | .sarif |
| CycloneDX SBOM | cyclonedx | 1.6 | 1.6 | .json |
| CycloneDX SBOM | cyclonedx-1.7 | 1.7 | 1.7 | .json |
| CycloneDX CBOM | cbom | 1.6 | 1.6 | .json |
| CycloneDX CBOM | cbom-1.7 | 1.7 | 1.7 | .json |
| SPDX SBOM | spdx | SPDX-2.3 | SPDX-2.3 | .json |
| SPDX SBOM | spdx3 | SPDX-3.0.1 | SPDX-3.0.1 | .json |
| OpenVEX | openvex | 0.2.0 | 0.2.0 | .json |
| CycloneDX VEX | cyclonedx-vex | 1.6 | 1.6 | .json |
| JUnit XML | junit | Ant/Surefire (de-facto) | Ant/Surefire (de-facto) | .xml |

Examples

vulkro scan . --format json | jq '.findings[] | select(.severity == "Critical")'
vulkro scan . --format sarif > vulkro.sarif
vulkro scan . --format gh-pr > comment.md
gh pr comment "$PR" --body-file comment.md
vulkro scan . --format cyclonedx > sbom.json
vulkro scan . --format ropa-md > ropa.md
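
The jq filter in the first example can also be reproduced in a script. The following is a minimal Python sketch that assumes only what that filter implies: a top-level findings array whose items carry a severity string. The function name, the rule_id key, and the sample data are hypothetical.

```python
import json


def findings_with_severity(report_json, severity="Critical"):
    """Return the findings whose "severity" equals the given label.

    Assumes a top-level "findings" array, as the jq filter above implies.
    """
    report = json.loads(report_json)
    return [f for f in report.get("findings", []) if f.get("severity") == severity]


# Hypothetical sample; a real report carries more fields per finding.
sample = json.dumps({
    "findings": [
        {"severity": "Critical", "rule_id": "OWASP-API1"},
        {"severity": "Low", "rule_id": "EXAMPLE-RULE"},
    ]
})

print(len(findings_with_severity(sample)))  # prints 1
```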

SARIF specifics

  • Spec version 2.1.0. The emitted $schema points at the immutable OASIS-published standard (.../sarif/v2.1.0/errata01/os/schemas/sarif-schema-2.1.0.json), not a mutable Git branch.
  • Each finding maps to one result with ruleId, ruleIndex, level, rank (0..100 mapped from Severity), message, locations, and properties (carrying confidence, confidence_reason, compliance_controls, plus the vulkro.fix / vulkro.fixes remediation strings for autofix-aware consumers).
  • tool.driver.rules enumerates every detector that contributed to this scan, with descriptions.
  • partialFingerprints populated per result with two versioned keys:
    • vulkro/v1: SHA-256 of {rule_id}|{owasp_category}|{file}|{message_head}. Stable across line drift; lets GitHub Code Scanning / Sonar / DefectDojo dedupe a finding across runs even when unrelated edits shift its line number.
    • vulkro/locHash: SHA-256 of {file}:{line}. Exact identity; useful for joining against a fresh scan of the same commit.
  • relatedLocations[] materialised from SecurityFinding.trace. Every trace hop becomes a labelled location in the related-locations panel of the SARIF consumer (sanitizer gap, missing auth middleware, the source of the taint, etc.).
  • codeFlows[].threadFlows[].locations[] built from trace items whose kind == Taint. Renders as the data-flow ladder view in GitHub Code Scanning and matches how Semgrep / CodeQL show taint findings today. Cross-method Python and JavaScript / TS findings from the language-neutral taint engine populate this automatically.
  • Compatible with GitHub Code Scanning's SARIF uploader; the vulkro/v1 fingerprint replaces GitHub's auto-derived fingerprint so dedup is deterministic.
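
To illustrate why the two partialFingerprints keys behave differently under line drift, here is a minimal Python sketch of the hash inputs described above. It is not Vulkro's emitter: the UTF-8 encoding, the hex digest, the way message_head is derived, and the sample category value "API1" are all assumptions made for the example.

```python
import hashlib


def fingerprint_v1(rule_id, owasp_category, file, message_head):
    """SHA-256 over rule_id|owasp_category|file|message_head (no line number)."""
    payload = "|".join([rule_id, owasp_category, file, message_head])
    # Assumption: UTF-8 input and hex output; the real encoding may differ.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def loc_hash(file, line):
    """SHA-256 over file:line, so it changes whenever the line moves."""
    return hashlib.sha256(f"{file}:{line}".encode("utf-8")).hexdigest()


# The same finding before and after an unrelated edit pushed it down 3 lines.
before = {"file": "src/auth.ts", "line": 42}
after = {"file": "src/auth.ts", "line": 45}

v1_before = fingerprint_v1("OWASP-API1", "API1", before["file"], "IDOR on /users/:id")
v1_after = fingerprint_v1("OWASP-API1", "API1", after["file"], "IDOR on /users/:id")

print(v1_before == v1_after)                    # True: the line is not an input
print(loc_hash(**before) == loc_hash(**after))  # False: the line is an input
```

A consumer that wants cross-run deduplication would key on vulkro/v1, and one that wants an exact match within a single commit would key on vulkro/locHash.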

gh-pr-inline-comments specifics

NDJSON, one object per line. Each line is the shape gh api repos/:owner/:repo/pulls/:N/comments wants:

{
  "path": "src/auth.ts",
  "line": 42,
  "side": "RIGHT",
  "severity": "high",
  "rule_id": "OWASP-API1",
  "fingerprint": "<sha256>",
  "body": "**vulkro [High]** (OWASP-API1): ...\n\n**Fix.** ...\n\n<!-- vulkro:fingerprint:HASH -->"
}

The repo ships a companion script, integrations/gh-cli/post-inline-comments.sh, that reads the stream and POSTs each comment via gh api. The only permission required is pull-requests: write on the default GITHUB_TOKEN that GitHub Actions mints for each job. No GitHub App middleman.

Behaviour when piped from vulkro gate:

  • Only new findings (vs the baseline) are emitted. Pre-existing tech debt does not flood a PR with first-time inline comments.
  • Findings without a concrete file or with line == 0 are silently skipped: GitHub's review-comment API rejects them, and a failed POST would only add noise to the Actions log.
  • The body carries an HTML-comment fingerprint footer (<!-- vulkro:fingerprint:HASH -->) so the poster can dedupe comments across runs. The fingerprint matches the SARIF vulkro/v1 value so an inline comment can be joined back to its SARIF row by id.
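
The fingerprint footer makes cross-run deduplication a simple set lookup. The sketch below is a hypothetical Python filter, not the companion script: it parses the NDJSON stream, extracts the footer fingerprint from each body, and drops comments whose fingerprint is already known. How the set of already-posted fingerprints is collected (for example, by reading existing PR comments) is left out and assumed to be done elsewhere.

```python
import json
import re

# Matches the HTML-comment footer described above.
FOOTER = re.compile(r"<!-- vulkro:fingerprint:([0-9A-Za-z]+) -->")


def comments_to_post(ndjson, already_posted):
    """Parse an NDJSON stream and drop comments whose fingerprint was already posted."""
    fresh = []
    for raw in ndjson.splitlines():
        if not raw.strip():
            continue  # tolerate blank lines in the stream
        comment = json.loads(raw)
        match = FOOTER.search(comment.get("body", ""))
        # Fall back to the top-level field when the footer is missing.
        fingerprint = match.group(1) if match else comment.get("fingerprint")
        if fingerprint in already_posted:
            continue
        fresh.append(comment)
    return fresh


# Hypothetical two-comment stream with shortened fingerprints.
stream = "\n".join(json.dumps(c) for c in [
    {"path": "src/auth.ts", "line": 42, "side": "RIGHT",
     "body": "... <!-- vulkro:fingerprint:abc123 -->"},
    {"path": "src/db.ts", "line": 7, "side": "RIGHT",
     "body": "... <!-- vulkro:fingerprint:def456 -->"},
])

print([c["path"] for c in comments_to_post(stream, {"abc123"})])  # ['src/db.ts']
```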

github-annotations specifics

One GitHub Actions workflow command per finding, newline-terminated:

::error file=src/auth.ts,line=42,endLine=44,col=1,title=OWASP-API1::IDOR on /users/:id - rebind the lookup (https://vulkro.com/docs/rules/broken-object-level-auth)

When these lines are printed to a GitHub Actions job's stdout, GitHub turns each into an inline annotation pinned to file:line on the Files Changed tab of the PR (and on the job summary). ::error / ::warning / ::notice set the annotation severity (Critical / High become error, Medium becomes warning, Low / Info become notice). GitLab CI's annotation parser reads the same grammar, so one --format github-annotations lane drives PR annotations on both hosts.

CI wiring (minimal GitHub Actions step):

- name: vulkro PR annotations
  run: vulkro scan . --format github-annotations

Region precision mirrors SARIF: line is the 1-based start line, endLine is emitted when the finding spans a block, and the message tail carries the rule's helpUri so a reviewer can click through to the rule docs. Findings without a concrete file or with line == 0 are skipped (GitHub drops annotations that lack a location). When piped from vulkro gate, only new findings (vs the baseline) are annotated, so a first run does not flood the PR with annotations for pre-existing issues.
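
The severity mapping and skip rule above can be sketched as a small formatter. This is an illustrative Python sketch, not Vulkro's emitter: the keys of the finding dict are hypothetical, col is omitted for brevity, and the percent-escaping follows GitHub's workflow-command conventions (whether Vulkro applies the same escaping is an assumption).

```python
# Severity-to-command mapping as described in the text above.
LEVELS = {
    "Critical": "error", "High": "error",
    "Medium": "warning",
    "Low": "notice", "Info": "notice",
}


def _escape_data(text):
    # Message escaping used by GitHub workflow commands.
    return text.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def _escape_property(text):
    # Property values additionally escape ':' and ','.
    return _escape_data(text).replace(":", "%3A").replace(",", "%2C")


def annotation(finding):
    """Return one workflow-command line, or None when there is no usable location."""
    if not finding.get("file") or not finding.get("line"):
        return None  # no file, or line == 0: skipped
    props = [f"file={_escape_property(finding['file'])}", f"line={finding['line']}"]
    if finding.get("end_line"):
        props.append(f"endLine={finding['end_line']}")
    props.append(f"title={_escape_property(finding['rule_id'])}")
    level = LEVELS[finding["severity"]]
    return f"::{level} {','.join(props)}::{_escape_data(finding['message'])}"


finding = {
    "file": "src/auth.ts", "line": 42, "end_line": 44,
    "rule_id": "OWASP-API1", "severity": "High",
    "message": "IDOR on /users/:id",
}
print(annotation(finding))
# ::error file=src/auth.ts,line=42,endLine=44,title=OWASP-API1::IDOR on /users/:id
```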

CycloneDX specifics

  • Spec 1.6 JSON (schema). The 1.6 bump lets the cbom format reuse the same emitter for cryptographic-asset components.
  • components[] lists every statically resolved dependency as a library component with a stable bom-ref, purl, name, version, and scope (required / optional, derived from dev-dependency status).
  • vulnerabilities[] carries one entry per matched (package, CVE) pair, sourced to OSV, with a ratings[] severity and an affects[].ref pointing back at the vulnerable component's bom-ref. The array is present-but-empty when the scan found no dependency CVEs (the "scanned, none found" signal), so consumers can tell that apart from "not scanned".
  • dependencies[] records the one edge Vulkro can prove - the root application depends on every resolved package. Vulkro does not resolve the transitive dependency tree, so leaf sub-graphs are intentionally left unstated rather than fabricated.
  • Licences emit the SPDX sentinel NOASSERTION: the dependency model does not yet carry per-package licence data.
  • cbom is a separate CycloneDX 1.6 document of cryptographic-asset components only (MD5, SHA-1, ECB, RC4, DES, static IV, insecure RNG) with file:line occurrences. Use it for FedRAMP / post-quantum reviews alongside the library SBOM.
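
A consumer can join vulnerabilities[] back to components[] through bom-ref, and can use the presence of the vulnerabilities key to separate "scanned, none found" from "not scanned". The following Python sketch shows one way to do that against the CycloneDX fields named above; the function name and the sample document (package, CVE id, severity) are hypothetical.

```python
def vulnerable_components(bom):
    """Return None when CVEs were not scanned, else (purl, vuln id, severity) rows."""
    if "vulnerabilities" not in bom:
        return None  # key absent: treated as "not scanned"
    by_ref = {c["bom-ref"]: c for c in bom.get("components", [])}
    rows = []
    for vuln in bom["vulnerabilities"]:
        ratings = vuln.get("ratings", [])
        severity = ratings[0].get("severity") if ratings else None
        for affected in vuln.get("affects", []):
            component = by_ref.get(affected["ref"])
            if component is not None:
                rows.append((component.get("purl"), vuln.get("id"), severity))
    return rows


# Hypothetical document; the package and CVE id are placeholders.
bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.6",
    "components": [
        {"type": "library", "bom-ref": "pkg-1", "name": "example-lib",
         "version": "1.0.0", "purl": "pkg:npm/example-lib@1.0.0"},
    ],
    "vulnerabilities": [
        {"id": "CVE-0000-0000", "ratings": [{"severity": "high"}],
         "affects": [{"ref": "pkg-1"}]},
    ],
}

print(vulnerable_components(bom))
# [('pkg:npm/example-lib@1.0.0', 'CVE-0000-0000', 'high')]
```

An empty list means the scan ran and matched nothing; None means the document carries no vulnerability data at all.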

SPDX specifics

  • Spec version SPDX-2.3 (schema), JSON.
  • One SPDXRef-Package-N per resolved dependency, each with a purl external reference.
  • The document declares a top-level DESCRIBES relationship to every package (both the canonical relationships[] form and the documentDescribes array), so strict validators (the SPDX online tool, pyspdxtools) accept it.
  • licenseConcluded, licenseDeclared, copyrightText, and downloadLocation are NOASSERTION for the same model-fidelity reason as CycloneDX above.
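
The DESCRIBES coverage described above can be checked mechanically. Below is a minimal Python sketch that reads the standard SPDX 2.3 JSON fields (packages, relationships, documentDescribes) and reports any package missing from either form. The function name and sample document are hypothetical, and the check is a simplification of what a full validator does.

```python
def undescribed_packages(doc):
    """Return package SPDXIDs missing from either DESCRIBES form."""
    root = doc.get("SPDXID", "SPDXRef-DOCUMENT")
    packages = {p["SPDXID"] for p in doc.get("packages", [])}
    via_relationships = {
        r["relatedSpdxElement"]
        for r in doc.get("relationships", [])
        if r.get("relationshipType") == "DESCRIBES" and r.get("spdxElementId") == root
    }
    via_document_describes = set(doc.get("documentDescribes", []))
    return (packages - via_relationships) | (packages - via_document_describes)


# Hypothetical one-package document.
doc = {
    "spdxVersion": "SPDX-2.3",
    "SPDXID": "SPDXRef-DOCUMENT",
    "packages": [{"SPDXID": "SPDXRef-Package-0", "name": "example-lib"}],
    "relationships": [
        {"spdxElementId": "SPDXRef-DOCUMENT",
         "relationshipType": "DESCRIBES",
         "relatedSpdxElement": "SPDXRef-Package-0"},
    ],
    "documentDescribes": ["SPDXRef-Package-0"],
}

print(undescribed_packages(doc))  # set()
```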

RoPA (GDPR Art. 30)

The Record of Processing Activities format is intended as a starting point for a GDPR audit pack. It enumerates:

  • Each endpoint that handles personal data (PII detected by the privacy engine - see Privacy).
  • The detected categories of personal data per endpoint.
  • The stated purpose, retention, and lawful basis (you fill these in).
  • The controls Vulkro detected as in place / missing.

ropa-md is best for Git review; ropa-html is best for emailing to a DPO who doesn't read Markdown.

PDF

PDF rendering shells out to wkhtmltopdf. If the binary isn't on PATH, the format errors out cleanly with a hint. The PDF source is the same executive HTML report vulkro report produces.
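
A wrapper script can run the same PATH check up front, before invoking the pdf format. This is a minimal Python sketch using the standard library's shutil.which; the function name and messages are hypothetical and do not reproduce Vulkro's own hint.

```python
import shutil


def pdf_renderer_path(which=shutil.which):
    """Return the wkhtmltopdf path, or None when it is not on PATH."""
    return which("wkhtmltopdf")


path = pdf_renderer_path()
if path is None:
    print("wkhtmltopdf not found on PATH; install it or choose another --format")
else:
    print(f"wkhtmltopdf found at {path}")
```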