Steampipe integration
| Tool URL | https://steampipe.io |
| License | AGPL-3.0 |
| Category | Cloud / SaaS inventory + compliance via SQL |
| Rookery attestor used today | steampipe (native) |
| Predicate type | https://aflock.ai/attestations/steampipe/v0.1 |
| Validated example | 31-steampipe |
Steampipe is Turbot's SQL engine for cloud and SaaS APIs — query AWS, GCP, Azure, Kubernetes, GitHub, Okta, Slack, and dozens of other providers as if they were Postgres tables, then layer on compliance mods (steampipe-mod-aws-compliance, steampipe-mod-aws-cis, steampipe-mod-kubernetes-compliance) that ship FedRAMP / CIS / NIST / HIPAA benchmarks as SQL. CI/lock wraps each query so the JSON rows become a signed v0.1 Steampipe attestation with per-row subject digests (aws:account:<id>, github:repo:<owner/name>, k8s:uid:<uid>, etc.) — evidence a Rego gate can match against attestationsFrom.
Validated invocation
cilock run --step steampipe-query \
--signer-file-key-path key.pem \
--outfile attestation.json \
--attestations steampipe,environment,git \
--enable-archivista=false \
-- sh -c 'steampipe query --output json "select arn, name from aws_iam_user limit 5" > steampipe.json'
Two Steampipe-specific quirks are baked into that command — both matter for clean attestations:
--output jsonis required. Thesteampiperookery attestor consumes the query's JSON product inPostProductRunType. The default--output tablewrites ANSI-formatted text that the attestor will skip (it filters onapplication/jsonMIME type or.jsonsuffix). Pass--output json(or--output csvand reformat;jsonis the supported shape).sh -c '... > out.json'is a single-shell-redirect. Steampipe writes query output to stdout, not to a file. Wrapping the run insh -croutes that stdout intosteampipe.jsonso theproduct/v0.3attestor can hash it.command-run/v0.1records the fullsh -cargv — that's a tool-output limitation (stdout-only), NOT thecpantipattern.
What gets captured
Each cilock run emits an in-toto envelope whose predicate carries the following attestor types:
| Attestor type | Captures |
|---|---|
https://aflock.ai/attestations/command-run/v0.1 | Real sh -c 'steampipe query ...' argv, env, exit code, stdout/stderr |
https://aflock.ai/attestations/material/v0.3 | Merkle tree of working-directory inputs (.sql files, mod definitions) |
https://aflock.ai/attestations/product/v0.3 | Merkle tree of outputs, including steampipe.json |
https://aflock.ai/attestations/steampipe/v0.1 | Parsed query rows, resultHash (SHA-256 of raw JSON), frontmatter (id, KSI, NIST, plugin, severity), per-row subjects |
https://aflock.ai/attestations/environment/v0.1 | OS, arch, user, env vars (PII-filtered) |
https://aflock.ai/attestations/git/v0.1 | Commit SHA, branch, remotes |
The steampipe/v0.1 predicate's results[].resultHash matches the SHA-256 of the steampipe.json leaf in the product/v0.3 tree. That is the chain that makes the rows verifiable — you can't swap in different JSON without invalidating the product tree. Per-row subjects (aws:account:<id>, aws:arn:<arn>, aws:region:<region> for the AWS plugin; github:repo:<owner/name>, github:org:<login> for GitHub; k8s:uid:<uid>, k8s:namespace:<ns> for Kubernetes; okta:user:<id>, okta:org:<organization> for Okta) are surfaced via Subjects() so policy graphs can join across tools.
Why this shape
| Antipattern | This page |
|---|---|
cilock run ... -- bash -c "cp /tmp/steampipe.json out.json" | cilock run ... -- sh -c 'steampipe query ... > steampipe.json' |
command-run records bash -c "cp ..." — useless | command-run records the real steampipe query argv inside sh -c |
Product attestor digests the cp destination | Product attestor digests Steampipe's actual stdout-redirect target |
| Tool execution happens outside the attestation | Steampipe runs as CI/lock's grandchild via sh -c; spy still traces |
Steampipe writes its query results to stdout, not to a file the binary owns. That's an upstream limitation — --output json controls the format but there's no --output-file option. A single sh -c '... > out.json' is the minimum-glue redirect that gets the bytes into a file the product/v0.3 Merkle tree can hash. command-run/v0.1 records the full ["sh", "-c", "steampipe query --output json ... > steampipe.json"] argv, so the recipe is auditable end-to-end — no shell logic, no string concatenation, no cp laundering.
Validate it locally
# Generate a signing key (one-time).
openssl genpkey -algorithm ed25519 -out key.pem
# Install the Steampipe plugin for the cloud you want to query, and configure credentials.
steampipe plugin install aws
export AWS_PROFILE=your-profile # or rely on instance role / SSO
# Run cilock + Steampipe against any cloud API.
cilock run --step steampipe-query \
--signer-file-key-path key.pem \
--outfile attestation.json \
--attestations steampipe,environment,git \
--enable-archivista=false \
-- sh -c 'steampipe query --output json "select arn, name from aws_iam_user limit 5" > steampipe.json'
# Confirm the predicate carries the expected attestor types.
jq -r '.payload' attestation.json | base64 -d \
| jq '.predicate.attestations | map(.type)'
Expected output:
[
"https://aflock.ai/attestations/environment/v0.1",
"https://aflock.ai/attestations/git/v0.1",
"https://aflock.ai/attestations/material/v0.3",
"https://aflock.ai/attestations/command-run/v0.1",
"https://aflock.ai/attestations/product/v0.3",
"https://aflock.ai/attestations/steampipe/v0.1"
]
# Confirm the real argv (including the steampipe query string) landed in command-run.
jq -r '.payload' attestation.json | base64 -d \
| jq '.predicate.attestations[]
| select(.type=="https://aflock.ai/attestations/command-run/v0.1")
| .attestation.cmd'
Expected output (literal argv — proof the cp antipattern is absent):
[
"sh",
"-c",
"steampipe query --output json \"select arn, name from aws_iam_user limit 5\" > steampipe.json"
]
# Inspect the per-query results: row count, result hash, frontmatter, subjects.
jq -r '.payload' attestation.json | base64 -d \
| jq '.predicate.attestations[]
| select(.type=="https://aflock.ai/attestations/steampipe/v0.1")
| .attestation
| {results: (.results | map({frontmatterId: .frontmatter.id,
plugin: .frontmatter.plugin,
rowCount, resultHash}))}'
Against select arn, name from aws_iam_user limit 5 you should see a single QueryResult with rowCount: 5 (or fewer, if the account has fewer IAM users), plugin: "aws", and a resultHash matching steampipe.json's digest in the product/v0.3 tree.
Notes
- Mods for compliance benchmarks. Beyond ad-hoc queries, Steampipe ships compliance mods that map to named regimes:
steampipe-mod-aws-compliance(FedRAMP, NIST 800-53, HIPAA, PCI),steampipe-mod-aws-cis(CIS AWS Foundations v1.2 – v3.0),steampipe-mod-kubernetes-compliance,steampipe-mod-gcp-compliance,steampipe-mod-azure-compliance, and several others in Turbot's mod gallery. Each mod is a Git repo of.sqlqueries +benchmark.spdefinitions; you can run a whole benchmark (steampipe check benchmark.cis_v300— produces a posture report) or extract individual controls as queries. - Query vs check vs dashboard. Steampipe has three execution modes.
steampipe queryruns ad-hoc SQL and emits rows.steampipe checkruns a benchmark from a mod and emits per-control pass/fail (also exportable as JSON).steampipe dashboardserves an interactive web UI (not used under CI/lock; the attestor consumes JSON files, not a live server). The validated invocation usesquerybecause it's the simplest path to a single signed JSON product. - Plugins drive subjects. The
steampipe/v0.1attestor only emits subjects for plugins in its convention table —aws,github,okta,kubernetestoday. Queries against other plugins (gcp,azure,slack,okta,datadog, …) still produce a signed attestation with rows andresultHash, but the per-row subject fan-out won't include identity axes the table doesn't know about. Extendconventions.goin rookery if you need new ones. - Output formats. Steampipe supports
--output json,--output csv,--output table,--output line. Onlyjsonflows into thesteampipeattestor — the others are filtered out (MIME type /.jsonsuffix check). Stick to JSON for signed evidence; use CSV / table for human-readable runs. - Plugin + credentials prerequisite.
steampipe plugin install aws(orgcp,azure,kubernetes,github,slack,okta, …) must run before the query, and the plugin needs cloud credentials configured —AWS_PROFILE/AWS_ACCESS_KEY_IDin env for AWS,GOOGLE_APPLICATION_CREDENTIALSfor GCP,KUBECONFIGfor Kubernetes,GITHUB_TOKENfor GitHub. Theenvironmentattestor will record the env vars present at run time (PII-filtered), but credentials themselves are not leaked into the envelope. - Frontmatter routing. Real-world recipes drive Steampipe from
.sqlfiles with a YAML frontmatter block (id,task,ksis,nist,plugin,severity). The recipe driver parses the frontmatter and stamps it onto the attestor viaWithFrontmatter/WithSQL; the resultingsteampipe/v0.1predicate carries those fields, and a Rego gate can route onfrontmatter.ksis(e.g.KSI-IAM-MFA) to match envelopes to FedRAMP 20x indicators. Ad-hocsteampipe query(no.sqlfile) getsfrontmatter.id = "anonymous".
FAQ
Does CI/lock support Steampipe?
Yes. CI/lock invokes the upstream steampipe binary unchanged, redirects its JSON output into a file, and the built-in steampipe rookery attestor parses that JSON in the PostProduct phase. The predicate type is https://aflock.ai/attestations/steampipe/v0.1 — distinct from sarif/v0.1, because Steampipe's row-oriented data shape doesn't fit SARIF's finding model. No Steampipe fork, no plugin install on the CI/lock side.
Which clouds can Steampipe scan under CI/lock?
Anything Steampipe has a plugin for — AWS, GCP, Azure, Kubernetes, GitHub, GitLab, Slack, Okta, Datadog, Snowflake, Salesforce, Jira, plus 140+ other providers in Steampipe's hub. The steampipe attestor's subject fan-out today covers AWS, GitHub, Okta, and Kubernetes identity axes; queries against other plugins still produce a signed envelope with rows and a resultHash, just without per-row subject digests until the rookery conventions.go table is extended.
How is Steampipe different from Prowler?
Prowler is a fixed CLI that runs a baked-in catalog of AWS / Azure / GCP / Kubernetes security checks and emits OCSF or ASFF findings. Steampipe is a SQL engine — you write the query (or pull one from a mod). Prowler's findings have severity, status, and remediation guidance built in; Steampipe gives you raw rows and lets the policy layer decide what's pass / fail. Use Prowler when you want an off-the-shelf compliance scan; use Steampipe when you want to define your own controls in SQL or join across providers (e.g. "every GitHub repo whose CODEOWNERS lists an Okta-disabled user").
Can I use custom SQL queries as compliance gates?
Yes — that's the primary CI/lock + Steampipe pattern. Author a .sql file with a YAML frontmatter block (id: my-control, ksis: [KSI-IAM-MFA], nist: [IA-2(1)], plugin: aws, severity: high), run it under CI/lock, and the steampipe/v0.1 predicate carries both the rows and the frontmatter. A Rego policy on the verify side reads attestation.frontmatter.ksis to route the envelope and attestation.results[].rows to gate on row contents (e.g. "no row where mfa_enabled = false").
What if the query returns no rows?
That's a valid attestation. steampipe/v0.1 records rowCount: 0, resultHash over the empty [] JSON array, and emits no per-row subjects (because there are no rows to fan out). A Rego gate can treat "zero rows matched the bad-state query" as the pass condition — which is the natural shape for negative-finding controls (e.g. "select * from aws_iam_user where mfa_enabled = false" → zero rows means the control passes).
See also
steampipeattestor — the underlying ingestion path, predicate schema, and convention table- Validated example:
31-steampipe— the end-to-end reproduce script - Steampipe — upstream project (Turbot)
- Steampipe Hub: mods — compliance benchmark catalogs (
aws_compliance,aws_cis,kubernetes_compliance, …) - Steampipe Hub: plugins — 140+ cloud / SaaS data sources
- Tools index