Most HIPAA CI/CD content describes the controls. This one describes the architecture.
A healthcare engineering team I worked with had six weeks to make their CI/CD pipeline audit-ready. They had GitLab. They had AWS. They had a smart team that had already read 45 CFR § 164. They knew what HIPAA required.
They were stuck anyway. Every guide they could find described which controls HIPAA mandates and which scanners to run. None of them described how to actually build the pipeline that emits the controls.
Three architectural decisions separate HIPAA-compliant pipelines from generic CI/CD: parent/child pipeline separation, isolated runners per environment, and security scanners as policy gates rather than advisory output. Everything else flows from these three.
If you've already read the five patterns that fail HIPAA audits, this is the implementation side of the same coin. The earlier post was about what goes wrong. This one is about what to build instead. The code examples assume GitLab CI/CD as the primary platform. The patterns port cleanly to GitHub Actions and Argo CD; I'll call out the translations where they matter. The cloud examples cover GCP and AWS specifically.
Section 01: What HIPAA actually requires from a pipeline
HIPAA's Security Rule doesn't specify pipelines. It specifies safeguards. A correctly built CI/CD pipeline is one of the most efficient places to satisfy those safeguards continuously, instead of producing them as quarterly evidence runs before audit windows.
Five controls touch the pipeline most directly:
- § 164.308(a)(5)(ii)(C) Log-in monitoring. Every deployment must be attributable to an authenticated identity with audited access. In pipeline terms: who triggered this deploy, with what credentials, and is that identity authorized for this environment?
- § 164.308(a)(8) Periodic evaluation. The rule asks for periodic technical evaluation; a pipeline can do better than a calendar. Vulnerability scans, dependency checks, and policy evaluations run on every push, so every change is an evaluation.
- § 164.312(b) Audit controls. The pipeline records who deployed what, when, against which approval chain. Every deploy is queryable months later without forensic reconstruction.
- § 164.312(c)(1) Integrity controls. Artifacts are signed, signatures are verified before deployment, and tampering is structurally detectable. The artifact you deploy is provably the one that passed scanning.
- § 164.312(e)(1) Transmission security. Deployments to PHI-bearing environments use mutually authenticated, encrypted channels. No deploys over unencrypted protocols, no bearer-token-only authentication to production.
None of these are exotic. All of them are routinely missed in pipelines built without the controls in mind from day one. The framing that helps most: HIPAA doesn't tell you how to build a pipeline. It tells you what evidence the pipeline must produce. Once you accept that, the architecture follows.
The pillar page covers the full control mapping in more detail. The five above are enough to anchor every architectural decision that follows.
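To make that concrete, here's a minimal sketch of the evidence record a single deploy should leave behind, one field group per control above. The field names are illustrative, not a standard; map them to whatever your evidence store expects.

```python
import json
from datetime import datetime, timezone

def build_deploy_evidence(deployer_id, pipeline_id, artifact_digest,
                          signature_ok, channel):
    """Illustrative evidence record: one field group per Security Rule control."""
    return {
        "login_monitoring": {          # § 164.308(a)(5)(ii)(C)
            "deployer_id": deployer_id,
            "authenticated": True,
        },
        "evaluation": {                # § 164.308(a)(8)
            "scans_run": ["sast", "container", "iac", "secrets", "deps"],
        },
        "audit": {                     # § 164.312(b)
            "pipeline_id": pipeline_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
        "integrity": {                 # § 164.312(c)(1)
            "artifact_digest": artifact_digest,
            "signature_verified": signature_ok,
        },
        "transmission": {              # § 164.312(e)(1)
            "channel": channel,        # e.g. "mtls"
        },
    }

record = build_deploy_evidence("user-42", "pipeline-1001",
                               "sha256:abc123", True, "mtls")
print(json.dumps(record, indent=2))
```

The point is the shape: every control maps to a concrete, queryable field, not a narrative someone reconstructs at audit time.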
Section 02: Three architectural decisions
Most HIPAA pipeline guides are checklists. Run SAST. Run container scanning. Sign your artifacts. Encrypt your transmissions. Store your audit logs. The checklists are correct, and almost completely useless.
They're useless because they describe outputs without describing the architecture that produces them. A team can implement every item on the checklist and still build a pipeline that fails audit. I've seen it happen four times.
Three decisions, made early, determine whether a HIPAA pipeline actually works. The next three sections walk through each one: what goes wrong without it, what the architecture looks like, and what the code looks like.
Decision 01: Parent/child pipeline separation
Most HIPAA pipelines start as a single .gitlab-ci.yml file. It works fine for the first three months. By month six it's 800 lines. By month twelve it's 1,500 lines and nobody touches it without flinching.
The problem isn't just maintainability. It's auditability. A 1,500-line pipeline file mixes environment-level concerns (production approval gates, evidence aggregation, signing) with service-level concerns (unit tests, container builds, lint checks). The auditor's question, "show me that production deploys are gated", requires reading the entire file to answer. And every YAML change risks dropping a compliance gate that nobody noticed was load-bearing.
The fix is structural separation. A parent pipeline owns environment-level concerns: compliance gates, approvals, evidence aggregation, deployment authorization. It changes rarely and is the system of record for "what happened." Child pipelines own service-level concerns: build, test, scan, container packaging. They evolve independently per service.
The pattern is platform-independent. In GitLab CI/CD it's implemented with trigger: jobs and downstream artifact propagation. In GitHub Actions, with reusable workflows (workflow_call). In Argo CD, with ApplicationSets and progressive delivery patterns. The platform changes; the architecture doesn't.
A simplified parent pipeline looks like this:
```yaml
# .gitlab-ci.yml (parent)
# Owns: compliance gates, evidence, deploy authorization
stages:
  - authorize
  - build
  - aggregate-evidence
  - policy-gate
  - deploy

variables:
  HIPAA_ENVIRONMENT: ${CI_COMMIT_BRANCH}
  EVIDENCE_BUCKET: "gs://hipaa-evidence-${HIPAA_ENVIRONMENT}"

authorize:
  stage: authorize
  script:
    - ./scripts/verify-identity.sh "$GITLAB_USER_ID" "$HIPAA_ENVIRONMENT"
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'

trigger-build:
  stage: build
  trigger:
    include: .gitlab/child-build.yml
    strategy: depend
  variables:
    PARENT_PIPELINE_ID: $CI_PIPELINE_ID

aggregate-evidence:
  stage: aggregate-evidence
  script:
    # In the parent's own scope, $CI_PIPELINE_ID is the parent pipeline ID
    - ./scripts/collect-evidence.sh "$CI_PIPELINE_ID"
    - gsutil cp evidence-bundle.json "$EVIDENCE_BUCKET/$CI_PIPELINE_ID/"
  artifacts:
    paths: [evidence-bundle.json]
  needs: ["trigger-build"]

policy-gate:
  stage: policy-gate
  image:
    name: openpolicyagent/opa:latest-debug  # -debug variant ships a shell for CI scripts
    entrypoint: [""]
  script:
    # --fail: exit non-zero unless the expression evaluates to a defined result
    - opa eval --fail -d policies/ -i evidence-bundle.json "data.deploy.hipaa.allow == true"
  needs: ["aggregate-evidence"]

deploy-production:
  stage: deploy
  tags: ["hipaa-prod-runner"]
  environment: production
  when: manual
  script:
    - ./scripts/deploy-signed.sh
  needs: ["policy-gate"]
```
That's the entire parent pipeline: a few dozen lines, every stage doing one thing, every job tied to an explicit compliance concern. The child pipeline handles build/test/scan in a separate file that the service team owns. The parent owns the gates; the child owns the code.
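For contrast, a matching child pipeline sketch. The file name follows the parent's trigger include; the script names and the Trivy invocation are placeholders for whatever the service team already runs.

```yaml
# .gitlab/child-build.yml (child)
# Owns: build, test, scan — service team edits freely
stages:
  - build
  - test
  - scan

build-image:
  stage: build
  script:
    - ./scripts/build-image.sh "$CI_COMMIT_SHA"

unit-tests:
  stage: test
  script:
    - ./scripts/run-tests.sh

container-scan:
  stage: scan
  script:
    # Emit structured results for the parent's evidence aggregation
    - trivy image --format json --output trivy-report.json "app:$CI_COMMIT_SHA"
  artifacts:
    paths: [trivy-report.json]
```

Service teams can add jobs, change test frameworks, and reorganize builds here without ever touching the compliance gates the parent owns.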
More on the three architecture principles we apply on every HIPAA pipeline build →
Decision 02: Isolated runners per environment
The most underappreciated HIPAA control isn't in the pipeline file. It's in the runner.
Most teams use shared GitLab or GitHub Actions runners with IAM credentials broad enough to reach multiple environments. A misconfigured .gitlab-ci.yml from a dev branch can deploy to production, because the runner has the credentials and nothing stops the YAML from using them. Worse: the runner itself becomes an attack surface. A compromised runner with production IAM access reaches PHI directly, and your audit logs don't show that as a privileged access path because the runner is "just CI."
Auditors want a specific story: this deployment to production went through this specific runner, with this specific identity, at this specific time, signed by this specific key. If you can't tell that story crisply, the runner is your audit finding.
Four patterns isolate runners properly:
- Dedicated runners per environment. Production deploys run on production-only runners. Dev branches cannot trigger production runners through any path.
- Scoped IAM per runner. Dev runner has dev IAM; prod runner has prod IAM. Neither can deploy to the other. The pipeline file cannot lift its own privileges.
- Signed, version-pinned runner images. No gitlab-runner:latest. Image signature verified before runner starts.
- HIPAA-aligned runner infrastructure. Encrypted at rest, audit-logged, ephemeral where possible, isolated network egress.
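On the registration side, the same boundary shows up in the runner's own config. A sketch of the production runner's config.toml, with URL and token omitted and the image digest a placeholder. Pair this with untagged jobs disabled in the runner's GitLab settings, so nothing lands here without the production tag:

```toml
# /etc/gitlab-runner/config.toml on the production runner host (sketch)
[[runners]]
  name     = "hipaa-prod-runner"
  executor = "kubernetes"

  [runners.kubernetes]
    namespace = "gitlab-runner-prod"
    # Pinned digest, never :latest — placeholder registry and digest
    image = "registry.example.com/runner-base@sha256:<pinned-digest>"
```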
On GCP, the cleanest implementation is a dedicated GKE node pool per environment, with Workload Identity binding runner service accounts to environment-scoped IAM roles:
```terraform
# Terraform: GCP HIPAA runner infrastructure
resource "google_container_node_pool" "hipaa_prod_runners" {
  name       = "hipaa-prod-runners"
  cluster    = google_container_cluster.hipaa.id
  node_count = 2

  node_config {
    machine_type = "n2-standard-4"
    image_type   = "COS_CONTAINERD"

    # Workload Identity: runner SA cannot escape its scope
    workload_metadata_config {
      mode = "GKE_METADATA"
    }

    service_account = google_service_account.runner_prod.email
    oauth_scopes    = ["https://www.googleapis.com/auth/cloud-platform"]

    # Encrypted boot disk
    disk_size_gb = 100
    disk_type    = "pd-ssd"

    # Network isolation: only egress to allowed targets
    tags = ["hipaa-prod-runner"]
  }

  # Auto-upgrade for CIS-aligned base images
  management {
    auto_upgrade = true
    auto_repair  = true
  }
}

resource "google_service_account" "runner_prod" {
  account_id   = "hipaa-prod-runner"
  display_name = "HIPAA Production CI Runner"
}

# Runner SA can deploy to prod GKE only (not dev, not staging)
resource "google_project_iam_member" "runner_prod_deploy" {
  project = var.project_id
  role    = "roles/container.developer"
  member  = "serviceAccount:${google_service_account.runner_prod.email}"

  condition {
    title      = "Production cluster only"
    expression = "resource.name.startsWith('projects/${var.project_id}/zones/${var.zone}/clusters/hipaa-prod')"
  }
}
```
The IAM condition is the load-bearing part. Even if a developer writes a pipeline that tries to deploy to staging from a production runner, the IAM denies the action at the GCP API layer. The pipeline file cannot grant itself privileges the runner doesn't have.
The 5 mistakes post covers why pipeline-runner isolation matters in more depth. The TL;DR: shared runners with broad IAM are the most common HIPAA pipeline finding I see, and the cheapest to fix.
Decision 03: Security scanners as policy gates
Almost every HIPAA pipeline guide includes security scanners. SAST, DAST, container CVE scanning, dependency checking, IaC scanning. The tools are right. The configuration is usually wrong.
Most teams run scanners as advisory: scans execute, results are saved to a dashboard, the deploy proceeds regardless. Auditor asks: "what happens when a critical vulnerability is found?" The honest answer is usually "the developer gets notified and decides what to do." That answer fails audit.
The right answer: the deploy is blocked by policy. The human approver only sees the deploy button if every scanner produced evidence that passed threshold. Scanners produce inputs to a policy engine. The policy engine decides whether the deploy proceeds. Humans provide judgment on top of that decision, not validation that the pipeline ran.
Five scanners earn their place in a HIPAA pipeline:
- SAST. Semgrep with custom rules for healthcare-specific patterns. CodeQL is a strong alternative if you're already on GitHub Advanced Security.
- Container CVE scanning. Trivy. Grype is acceptable. Don't rely on cloud-provider-only scanners (ECR scan, Artifact Registry scan) as your only line.
- IaC scanning. tfsec plus Checkov. They catch different things; run both.
- Secret scanning. Gitleaks in pre-commit and in CI. TruffleHog also works.
- Dependency scanning. OSV-Scanner for transitive vulnerabilities. Dependabot for routine updates.
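The glue between a scanner and the policy gate is deliberately small: reduce each tool's JSON to the counts and timestamps the policy consumes. A sketch for Trivy, whose JSON report nests vulnerabilities under Results; the output field names line up with the policy input used later in this section, and where the summary gets written is up to your evidence pipeline.

```python
import time

def summarize_trivy(report: dict) -> dict:
    """Reduce a Trivy JSON report to the fields the policy gate consumes."""
    counts = {"CRITICAL": 0, "HIGH": 0, "MEDIUM": 0, "LOW": 0}
    for result in report.get("Results", []):
        # Trivy emits null (not []) when a target has no vulnerabilities
        for vuln in result.get("Vulnerabilities") or []:
            sev = vuln.get("Severity", "UNKNOWN")
            if sev in counts:
                counts[sev] += 1
    return {
        "critical": counts["CRITICAL"],
        "high": counts["HIGH"],
        "timestamp": time.time_ns(),   # matches the policy's time.now_ns() comparison
    }

sample = {"Results": [{"Vulnerabilities": [
    {"Severity": "CRITICAL"}, {"Severity": "HIGH"}, {"Severity": "HIGH"}]}]}
print(summarize_trivy(sample))
```

The same reduction applies to SARIF from Semgrep or Checkov: a few lines of normalization per tool, and the policy engine never needs to understand any scanner's native format.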
The architecture is the same regardless of which tools you pick. Scanner runs, produces structured output (JSON or SARIF), evidence is signed and stored, policy engine evaluates evidence, gate opens or closes:
```rego
# policies/hipaa_deploy.rego
# OPA policy gate for HIPAA production deploys
package deploy.hipaa

default allow = false

# Deploy allowed only if every condition holds.
# Note: freshness and findings are separate rules; two rule bodies
# with the same name would be OR'd in Rego, which is not what we want.
allow {
    scan_evidence_fresh
    scan_findings_clean
    signature_valid
    approver_authorized
    target_environment_matches
}

# All scanners produced evidence in the last 24 hours
scan_evidence_fresh {
    input.scans.container.timestamp > time.now_ns() - (24 * 60 * 60 * 1e9)
    input.scans.sast.timestamp > time.now_ns() - (24 * 60 * 60 * 1e9)
    input.scans.iac.timestamp > time.now_ns() - (24 * 60 * 60 * 1e9)
}

# Zero critical findings across all scanners
scan_findings_clean {
    input.scans.container.critical == 0
    input.scans.sast.critical == 0
    input.scans.iac.critical == 0
    input.scans.secrets.findings == 0
}

# Artifact signature verified by Cosign
signature_valid {
    input.artifact.cosign_verified == true
    input.artifact.signed_by == input.expected_signer
}

# Approver is on the authorized list for this environment
approver_authorized {
    authorized := data.approvers[input.target_environment]
    input.approver_id == authorized[_]
}

# Deploy target matches the artifact's intended environment
target_environment_matches {
    input.artifact.target_env == input.target_environment
}
```
Every condition in that policy is checkable, auditable, and structurally enforced. When the auditor asks "how do you guarantee scans pass before deploy?", the answer is a one-page Rego file you can hand them.
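For intuition about what the gate consumes, here's a Python mirror of the same checks run against a passing evidence bundle. This is illustration only; in the pipeline, OPA is the enforcement point, and the approver list lives in OPA's data document rather than in code.

```python
import time

AUTHORIZED = {"production": ["alice", "bob"]}   # mirrors OPA's data.approvers

evidence = {
    "scans": {
        "container": {"critical": 0, "timestamp": time.time_ns()},
        "sast":      {"critical": 0, "timestamp": time.time_ns()},
        "iac":       {"critical": 0, "timestamp": time.time_ns()},
        "secrets":   {"findings": 0},
    },
    "artifact": {
        "cosign_verified": True,
        "signed_by": "ci-signer",
        "target_env": "production",
    },
    "expected_signer": "ci-signer",
    "approver_id": "alice",
    "target_environment": "production",
}

def allow(doc):
    """Python mirror of the Rego gate, for illustration only."""
    day_ns = 24 * 60 * 60 * 10**9
    fresh = all(doc["scans"][s]["timestamp"] > time.time_ns() - day_ns
                for s in ("container", "sast", "iac"))
    clean = (all(doc["scans"][s]["critical"] == 0
                 for s in ("container", "sast", "iac"))
             and doc["scans"]["secrets"]["findings"] == 0)
    signed = (doc["artifact"]["cosign_verified"]
              and doc["artifact"]["signed_by"] == doc["expected_signer"])
    approved = doc["approver_id"] in AUTHORIZED.get(doc["target_environment"], [])
    env_ok = doc["artifact"]["target_env"] == doc["target_environment"]
    return fresh and clean and signed and approved and env_ok

print(allow(evidence))  # True for this bundle
```

Flip any single field (a critical finding, a mismatched signer, an off-list approver) and the gate closes. That single-field sensitivity is exactly what an auditor wants to see demonstrated.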
Section 03: Putting it together: the reference architecture
The three decisions compose into a six-stage pipeline. Each stage emits evidence. Each evidence artifact is signed and stored separately from code. The policy gate evaluates the full evidence bundle before any deploy proceeds.
The diagram makes one thing clear that the prose can hide: evidence flows in one direction only. Child pipelines write evidence to the bucket; nothing reads from it except the policy gate. The bucket is write-once from CI's perspective, and read-only from the gate's perspective. Engineers cannot tamper with evidence after the fact, because the bucket's IAM policy doesn't allow CI to modify or delete existing objects.
This is what auditors mean by "audit controls." Not a logfile somewhere. A separate, immutable, queryable record of everything the pipeline did, signed by the pipeline's identity, retained per your compliance policy.
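A sketch of that bucket in Terraform for GCS. The bucket name, project, service accounts, and retention period are placeholders; the load-bearing parts are the locked retention policy and the objectCreator-but-not-objectAdmin grant for CI. (GCS retention policies and object versioning are mutually exclusive, so the retention lock is the write-once mechanism here.)

```terraform
# Sketch: evidence bucket with write-once semantics for CI
resource "google_storage_bucket" "evidence" {
  name     = "hipaa-evidence-prod"   # placeholder name
  location = "US"

  retention_policy {
    retention_period = 15552000      # 180 days, per your compliance policy
    is_locked        = true          # Bucket Lock: nobody can shorten it
  }
}

# CI can create evidence objects but never overwrite or delete them
resource "google_storage_bucket_iam_member" "ci_write_once" {
  bucket = google_storage_bucket.evidence.name
  role   = "roles/storage.objectCreator"
  member = "serviceAccount:ci-evidence-writer@example.iam.gserviceaccount.com"  # placeholder
}

# The policy gate gets read-only access
resource "google_storage_bucket_iam_member" "gate_read" {
  bucket = google_storage_bucket.evidence.name
  role   = "roles/storage.objectViewer"
  member = "serviceAccount:policy-gate@example.iam.gserviceaccount.com"  # placeholder
}
```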
The same architecture satisfies most of the technical controls in CMMC 2.0 (Levels 2 and 3) and FedRAMP Moderate. The control mappings differ. CMMC inherits NIST 800-171 control families; FedRAMP uses NIST 800-53. The underlying engineering is identical. Parent/child pipelines satisfy CM-3 (Configuration Change Control) and AU-2 (Audit Events). Isolated runners satisfy AC-3 (Access Enforcement) and SC-7 (Boundary Protection). Policy gates satisfy SI-2 (Flaw Remediation) and CA-7 (Continuous Monitoring).
For defense workloads, the only material differences are runner placement (CI runners must operate inside the GovCloud or Azure Government boundary) and KMS configuration (FIPS 140-2 validated). Build the pipeline correctly for HIPAA and the FedRAMP version is mostly a deployment configuration change.
Section 04: GCP-specific implementation
On GCP, the HIPAA pipeline architecture maps to a specific set of native services. The translations matter because cloud-specific service quirks determine whether the pattern actually satisfies the underlying control.
- Evidence storage. GCS bucket with a locked retention policy (Bucket Lock) for write-once semantics. Use a separate project for the evidence bucket so its IAM is independently managed.
- Signing keys. Cloud KMS with automatic rotation. Cosign integrates natively. Use HSM-backed keys for production signing.
- Runner infrastructure. GKE Autopilot or Standard with environment-specific node pools. Workload Identity binds runner pods to GCP service accounts. Per-environment service accounts with IAM conditions enforce boundary.
- Artifact storage. Artifact Registry, not the deprecated Container Registry. Turn on vulnerability scanning at the registry layer as a second line.
- VPC-SC perimeters. Restrict service-to-service access at the network layer. CI runners outside the perimeter cannot reach PHI-bearing services.
- Identity-Aware Proxy. When the pipeline must reach VMs (rare but real, especially for legacy environments), IAP provides the audited, encrypted, identity-aware channel.
A simplified GCP deploy job looks like this:
```yaml
# .gitlab/child-deploy-gcp.yml
# HIPAA-aligned GKE deploy, GCP-specific
deploy-gcp:
  stage: deploy
  image: google/cloud-sdk:slim
  tags: ["hipaa-prod-runner"]
  before_script:
    # Workload Identity provides short-lived credentials; no key file on disk
    - gcloud config set project "$GCP_PROJECT_ID"
  script:
    # Verify artifact signature before deploy
    - >
      cosign verify
      --key gcpkms://projects/$GCP_PROJECT_ID/locations/$GCP_REGION/keyRings/hipaa/cryptoKeys/signing
      $ARTIFACT_REGISTRY_URL/hipaa-app:$CI_COMMIT_SHA
    # Deploy to GKE with image digest (not tag)
    - >
      kubectl set image deployment/hipaa-app
      hipaa-app=$ARTIFACT_REGISTRY_URL/hipaa-app@sha256:$ARTIFACT_DIGEST
    # Emit deployment evidence
    - ./scripts/emit-evidence.sh deploy-complete "$CI_PIPELINE_ID"
  environment:
    name: production-gcp
    deployment_tier: production
  rules:
    - if: '$CI_COMMIT_TAG =~ /^v\d+\.\d+\.\d+$/'
```
Three things to notice. First, no JSON service account key is ever written to disk; Workload Identity issues short-lived tokens. Second, the deploy uses the image digest, not the tag, so a race condition between tag-and-deploy can't substitute a different image. Third, evidence emission is part of the deploy job itself; if the deploy succeeds, evidence is written, and the parent pipeline can see it.
Section 05: AWS-specific implementation
On AWS the architecture is the same; the services change names. For teams operating in both clouds (increasingly common in healthcare), the parent pipeline can dispatch to either cloud's child pipeline based on a variable. Both clouds emit evidence to a centralized bucket, both clouds use OPA for policy gating, both clouds use Cosign for signing.
- Evidence storage. S3 with Object Lock in Compliance Mode, versioning enabled, cross-region replication for redundancy.
- Signing keys. AWS KMS with customer-managed keys (CMKs), automatic rotation, CloudTrail logging on every key use.
- Runner infrastructure. EKS managed node groups with IRSA (IAM Roles for Service Accounts). Separate node groups per environment with taints and tolerations.
- Artifact storage. ECR with image scanning enabled and immutable tags. Lifecycle policies for evidence retention.
- Network isolation. Transit Gateway for environment-level isolation. Security groups reference principals (other security groups, tags) rather than IP allowlists.
- GovCloud variant. Same architecture, runners placed inside the GovCloud boundary for FedRAMP-aligned workloads.
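The S3 side of the evidence bucket, sketched in Terraform. The bucket name and retention period are placeholders; Object Lock requires versioning, and Compliance Mode means even the root account cannot shorten or remove retention:

```terraform
# Sketch: S3 evidence bucket with Object Lock in Compliance Mode
resource "aws_s3_bucket" "evidence" {
  bucket              = "hipaa-evidence-prod"   # placeholder name
  object_lock_enabled = true
}

resource "aws_s3_bucket_versioning" "evidence" {
  bucket = aws_s3_bucket.evidence.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_object_lock_configuration" "evidence" {
  bucket = aws_s3_bucket.evidence.id
  rule {
    default_retention {
      mode = "COMPLIANCE"   # immutable even to root for the retention window
      days = 180            # per your compliance policy
    }
  }
}
```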
IRSA is the AWS analog to GCP's Workload Identity. The Terraform looks similar but the trust policy is what does the work:
```terraform
# Terraform: AWS HIPAA runner IAM role (IRSA)
resource "aws_iam_role" "hipaa_prod_runner" {
  name = "hipaa-prod-runner"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.eks.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          # Scope to one specific namespace + one service account
          "${replace(aws_iam_openid_connect_provider.eks.url, "https://", "")}:sub" = "system:serviceaccount:gitlab-runner:hipaa-prod-runner"
        }
      }
    }]
  })
}

# Scoped policy: deploy to prod cluster only
resource "aws_iam_role_policy" "hipaa_prod_deploy" {
  role = aws_iam_role.hipaa_prod_runner.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "eks:DescribeCluster",
        "eks:ListClusters"
      ]
      Resource = aws_eks_cluster.hipaa_prod.arn
      Condition = {
        StringEquals = {
          "aws:RequestedRegion" = "us-east-1"
        }
      }
    }]
  })
}
```
The IRSA trust policy is the boundary enforcement. The runner pod can only assume this role if it runs in the specific namespace with the specific service account. A dev pipeline cannot bypass it; an attacker who compromises a dev runner cannot pivot to it.
Section 06: What this looks like in production
One of our healthcare engagements involved auditing a GCP-based platform running production workloads with PHI in flight. The internal team had been told to be ready for a third-party HIPAA assessment in six weeks. They had GitLab CI/CD, a working pipeline, and a compliance team that knew what HIPAA wanted. What they didn't have was visibility into VM-level inventory or evidence that their pipeline controls were enforced rather than recommended.
We started with the architectural audit: was the pipeline parent/child or monolithic (monolithic, 1,200 lines), were runners isolated per environment (no, one shared runner with broad IAM), were scanners producing policy gate inputs (no, scanners ran as advisory). Three findings, all structural, all addressable.
The remediation was the architecture above. Parent pipeline owns gates; child pipelines own builds. Dedicated runner per environment with scoped IAM. Scanners producing signed evidence into an immutable bucket; OPA policy evaluating before deploy. Six weeks of work, fixed scope, fixed fee.
The team passed the third-party audit on the first review. More importantly, the pipeline kept passing through subsequent quarterly audits without remediation work, because the architecture made the controls structural instead of procedural.
That's the test of a HIPAA pipeline: does it pass audit when the auditor changes, the team changes, and the code changes? An architecturally correct pipeline does. A checklist-compliant pipeline doesn't.
Section 07: Tooling recommendations
Opinionated picks, based on what actually holds up in regulated environments. Substitutions are fine; the architecture matters more than the specific tool.
| Stage | Recommended | Acceptable alternative | Avoid |
|---|---|---|---|
| Build | Buildah, Kaniko (rootless) | docker:dind (with caveats) | Privileged Docker on shared runners |
| SAST | Semgrep with custom rules | CodeQL (if on GitHub) | Cloud-vendor scanners alone |
| Container scanning | Trivy | Grype | ECR / Artifact Registry scan as only line |
| IaC scanning | tfsec + Checkov (both) | KICS | Manual review |
| Secret scanning | Gitleaks (pre-commit + CI) | TruffleHog | Regex-only homebrew checks |
| Signing | Cosign with KMS-backed keys | Notation (Notary v2) | Manual signing, local keys |
| Policy gates | OPA / Rego | Kyverno (k8s-specific) | Manual approval only |
| Evidence storage | S3 Object Lock / GCS Bucket Lock | Versioned bucket only | Same Git repo as code |
Section 08: The CI/CD platform question
Three platforms cover roughly 90% of HIPAA CI/CD work I see: GitLab CI/CD, GitHub Actions, and Argo CD. None of them are wrong choices for HIPAA work. Each makes the architecture above easier or harder in specific ways.
GitLab CI/CD is the cleanest fit for parent/child pipelines. Native trigger: jobs, downstream artifact propagation, and self-hosted runners with custom tags make the runner isolation pattern straightforward. The compliance gates ride naturally on the parent pipeline structure.
GitHub Actions can do the same with reusable workflows (workflow_call), but the model is more constrained. Runner isolation requires GitHub Enterprise or self-hosted runners with custom labels. Workable, not as clean.
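A sketch of what that translation looks like. File paths, runner labels, and job names are illustrative; the policy-gate step assumes opa is on the runner, and the evidence hand-off between jobs (via artifacts) is omitted for brevity.

```yaml
# .github/workflows/parent-deploy.yml (parent: owns the gates)
name: hipaa-parent
on:
  push:
    branches: [main]

jobs:
  build-and-scan:
    # Child: a reusable workflow the service team owns
    uses: ./.github/workflows/child-build.yml

  policy-gate:
    needs: build-and-scan
    runs-on: [self-hosted, hipaa-prod]   # custom-labeled isolated runner
    environment: production              # required reviewers approve here
    steps:
      - uses: actions/checkout@v4
      - run: opa eval --fail -d policies/ -i evidence-bundle.json "data.deploy.hipaa.allow == true"

# --- .github/workflows/child-build.yml (child: build/test/scan) ---
# on:
#   workflow_call: {}
# jobs:
#   build:
#     runs-on: [self-hosted, hipaa-dev]
#     steps:
#       - uses: actions/checkout@v4
#       - run: ./scripts/build-and-scan.sh   # placeholder
```

The `environment: production` key is the GitHub analog of GitLab's manual gate: required reviewers and deployment protection rules attach there, on top of the policy gate, not instead of it.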
Argo CD is excellent for the deploy stage. It's not a complete CI tool. Pair it with GitLab CI/CD or GitHub Actions for build/test/scan, then use Argo CD for the actual deploy with GitOps semantics. ApplicationSets are an underappreciated mechanism for environment isolation.
For new HIPAA pipelines, my opinionated default is GitLab CI/CD plus Argo CD. Stronger primitives for parent/child separation, cleaner runner isolation, and Argo CD's progressive delivery patterns reduce blast radius. Don't try to use Jenkins for new HIPAA pipelines.
Section 09: Common mistakes to avoid
Five quick callouts from the field. Each one fails audits more often than it should.
- Single thousand-line pipeline file. Refactor to parent/child before the file passes 500 lines. Past 1,000 it's an audit finding waiting to surface.
- Shared runners with broad IAM. Isolated runners per environment, scoped IAM, no exceptions for "the deploy job."
- Scanners running as advisory. Scanners produce evidence. Evidence feeds policy. Policy decides. The human approves with judgment, not by validating the pipeline ran.
- Audit evidence in Git. Evidence lives in a separate immutable bucket. The same Git repo as code is the wrong answer.
- Manual approval as the only control. A human clicking a button satisfies nothing on its own. The button only appears when the policy gate passes.
For longer-form examples of these failure modes, the earlier post on five patterns that fail HIPAA audits walks through each with healthcare engagement examples.
Section 10: Conclusion
The team that ships well in regulated environments isn't the one with the most paperwork. It's the one whose architecture makes compliance violations structurally difficult.
"Audit-ready" isn't a state you achieve. It's a property of how the pipeline operates. Parent/child separation isolates compliance from delivery so both can evolve. Isolated runners per environment stop pipeline files from lifting their own privileges. Security scanners as policy gates turn checklists into enforcement.
Build the architecture correctly and audits become queries. Build it incorrectly and audits become fire drills, no matter how many tools you bolt on.
If you're working through this at your team: Stonebridge runs two-week HIPAA CI/CD audits that map your existing pipeline against the Security Rule and produce a written remediation roadmap. Fixed fee, founder-led, the report holds up under first-party review.