I've spent the last six years building cloud infrastructure and CI/CD pipelines for healthcare and defense engineering teams. The same five mistakes keep showing up across every HIPAA engagement I take on, and none of them are about not knowing what HIPAA requires.
Engineers in healthcare aren't dumb. They've read 45 CFR § 164. They know what audit logs are. They've sat through compliance training that lasted longer than their last on-call rotation.
The problem is structural. Most CI/CD pipelines were designed for unregulated software, then had compliance controls bolted on afterward. The result is pipelines that satisfy neither engineers nor auditors: slow, brittle, and somehow still failing audits.
Here are the five patterns I see most often, what goes wrong with each, and what actually works.
Mistake 01: Treating compliance as a final gate
The most common pattern: your pipeline runs build, test, and deploy as normal. Then somewhere near the end, a "compliance check" stage runs that produces a report. Audit time rolls around and someone has to dig through six months of build artifacts to assemble evidence.
This fails for two reasons.
First, you can't debug what failed. When the auditor asks "show me the security review for build #4,827 from last March," nobody knows where that evidence lives. It's somewhere in CI logs that have probably rolled off retention.
Second, the late-stage gate creates a false sense of security. Engineers learn that compliance is "the thing that happens at the end," so they stop thinking about it during development. Vulnerabilities ship to staging, get caught at the gate, and the pipeline gets blocked. Now you have an angry developer, a bottlenecked release, and a control that's optimized for catching mistakes rather than preventing them.
What works instead: emit compliance evidence as a property of every stage. Build stage produces an SBOM. Test stage produces signed test results. Scan stage produces vulnerability data. Sign stage produces a signature chain. All of it gets pushed to immutable storage with retention locks the moment it's generated.
When the auditor asks for evidence, the answer is a query, not a project.
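As a sketch in GitLab CI terms, here is what "evidence as a property of the stage" looks like for the build stage. The tool choices (syft for SBOM generation) and the bucket name are illustrative assumptions, not prescriptions:

```yaml
# Sketch: the build stage emits its own evidence and ships it to
# immutable storage the moment it exists. Bucket name is hypothetical.
build:
  stage: build
  script:
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
    # Generate an SBOM as a build-stage byproduct (syft is one option)
    - syft "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" -o spdx-json > sbom.json
    # Push it immediately; the bucket has Object Lock / retention
    # enabled, so nothing written here can later be altered or deleted
    - aws s3 cp sbom.json "s3://pipeline-evidence/$CI_PIPELINE_ID/sbom.json"
  artifacts:
    paths: [sbom.json]
```

The same shape repeats per stage: test results, scan output, and signatures each get pushed at the point of generation, keyed by pipeline ID, so the audit query is a prefix lookup rather than an archaeology project.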
Mistake 02: A single thousand-line .gitlab-ci.yml
I've inherited too many pipelines that look like this:
```yaml
# .gitlab-ci.yml, 1,847 lines
stages:
  - build
  - test
  - scan
  - deploy
  - notify
  - more-things
  - even-more-things

build:dev:
  # 200 lines

build:staging:
  # 200 lines, mostly copy-pasted from above

build:prod:
  # 200 more lines

test:dev:
  # ...
```
You cannot review this file. You cannot test it. You cannot meaningfully change it without breaking something else. And every time you push to fix a typo, the entire pipeline runs.
For a HIPAA workload, this is doubly bad. Auditors specifically ask whether you can demonstrate that controls are applied consistently across environments. With a monolithic file, the answer is "trust us." With auditors, "trust us" is the wrong answer.
What works instead: parent/child pipeline architecture.
Your parent pipeline handles environment-level concerns: compliance gates, deployment authorization, evidence aggregation. It's stable, rarely changes, and is the system of record for "what happened."
Your child pipelines handle service-level concerns: build, test, scan, deploy. They're triggered by the parent and can evolve independently per service.
In GitLab, this is implemented with trigger: jobs and include: directives. In GitHub Actions, with workflow_call. In Argo CD, with ApplicationSets.
The pattern matters more than the platform. Once you separate "what runs" from "what's allowed to run," your pipeline becomes auditable instead of incomprehensible.
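A minimal GitLab sketch of the split (the service and script paths are placeholders I've invented for illustration):

```yaml
# Parent pipeline: stable, owns the gates, triggers each service.
stages: [gate, services]

compliance-gate:
  stage: gate
  script:
    - ./verify-baseline.sh   # hypothetical gate script

payments-service:
  stage: services
  trigger:
    include: services/payments/.gitlab-ci.yml  # child pipeline, evolves independently
    strategy: depend  # parent waits for, and reflects, the child's result
```

The `strategy: depend` setting is what makes the parent the system of record: the parent pipeline's status is only green when every child it authorized has finished clean.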
Mistake 03: Manual approval as the only meaningful control
You've seen this in every regulated environment: a deployment job that requires a human to click "approve" before production. Sometimes there's even a Slack notification asking three people to thumbs-up before it proceeds.
Auditors love seeing this in pipeline diagrams. Until you ask: what is the human actually approving?
If the answer is "they're confirming the build looks good," that's not a control. That's theater.
A meaningful approval is one backed by something. The signed artifacts checked out clean. The vulnerability scan came back below threshold. The compliance baseline was verified. The deployment target is in the right environment. The approver is providing judgment on top of those things, not validating that the pipeline ran.
If your approver is just looking at a green checkmark and clicking yes, you have a process that looks like a control but doesn't actually gate anything. Auditors who know what they're doing catch this immediately.
What works instead: policy-as-code gates before the human approval. We use Open Policy Agent (OPA) with Rego policies that verify:
- Container signature chain validates against trusted keys
- SBOM exists and matches the deployed image
- Vulnerability scan is < 24 hours old and below threshold
- Approver identity matches an authorized list
- Target environment matches what was scanned
The human approver only sees the deploy button if all of those pass. Their job is judgment ("should we ship this on a Friday?"), not validation ("is this safe?"). That's automated.
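The gate itself can be a short Rego policy. The shape of the `input` document below is an assumption about what your pipeline hands to OPA; adapt the field names to whatever your evidence actually contains:

```rego
package deploy

import future.keywords

# Deny by default; the deploy button only appears when every check passes.
default allow := false

allow if {
    input.signature.verified                       # signature chain validated against trusted keys
    input.sbom.image_digest == input.image.digest  # SBOM matches the deployed image
    input.scan.age_hours < 24                      # scan freshness
    input.scan.critical_count == 0                 # below vulnerability threshold
    input.approver.id in data.authorized_approvers # approver is on the authorized list
    input.target.environment == input.scan.environment  # deploying what was scanned
}
```

Because the default is `false`, a missing field fails closed: evidence that was never generated reads the same as evidence that failed.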
Mistake 04: No environment isolation in the CI runners themselves
This one slipped past me on a project a few years ago, and I see it almost everywhere. You're running CI on shared runners. The runners have IAM credentials with access to multiple AWS accounts, including production. A misconfigured .gitlab-ci.yml from a developer branch could, in theory, deploy to production.
Worse: the runner itself becomes a giant attack surface. If anyone compromises a CI job (through a malicious dependency, a typosquatted image, anything), they have whatever access the runner has.
For a HIPAA workload, this is catastrophic. Your runner now has access to PHI-bearing environments, and your audit logs don't show that as a privileged access path because the runner is "just CI."
What works instead: runner isolation at the network and IAM level.
- Production deployments run on dedicated runners in a network that can only reach production. Dev branches can't trigger them.
- Runner IAM roles are scoped per-environment. The dev runner can deploy to dev. The prod runner can deploy to prod. Neither can deploy to the other.
- Runner images are signed and version-pinned. We don't pull gitlab-runner:latest.
- Runners themselves are HIPAA-aligned: encrypted at rest, audit-logged, ephemeral where possible.
The pattern auditors want to see is: "the deployment to production went through this specific runner, with this specific identity, at this specific time, signed by this specific key." If you can't tell that story, your pipeline is the audit finding.
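In GitLab terms, the isolation shows up as runner tags plus branch rules (the tag name and deploy script here are assumptions; the runner behind the tag should also be registered as protected so only protected refs can use it):

```yaml
deploy:prod:
  stage: deploy
  tags:
    - prod-runner   # only the dedicated runner in the prod-only network picks this up
  environment:
    name: production
  rules:
    # Only the protected default branch can even create this job,
    # so a misconfigured dev-branch pipeline never reaches the prod runner.
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
  script:
    - ./deploy.sh production   # hypothetical deploy script
```

The IAM half lives outside the YAML: the prod runner's instance role is the only identity that can assume the prod deploy role, and that assumption is itself logged.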
Mistake 05: Audit evidence stored alongside the code
Here's the one I see catch teams off guard the most. They've done everything else right. Pipeline emits evidence, signed artifacts, policy gates, the works. Then the auditor asks: where does the evidence live?
"In our Git repo. We commit pipeline logs and scan results."
This fails immediately. The control says "the evidence must be tamper-evident." If engineers have write access to the repository (and they do, that's the whole point of the repo), then engineers have write access to the evidence. The control is broken.
I had this exact conversation with a healthcare team a couple of years back. Smart engineers, modern pipeline, decent security posture. They'd been storing artifacts and audit logs in the same repo as their Terraform code. Worse: the team had GCP service account credentials committed to the Terraform code base. Not by malice. Just because someone had been moving fast on a Friday and pushed credentials to make a CI test work.
The credentials were in the repo. The audit logs were in the repo. The Terraform that provisioned the audit logging bucket was in the repo. None of that survived an honest auditor's first questions.
What works instead: evidence storage is its own thing, separate from CI infrastructure entirely.
- Object storage with versioning + retention locks (S3 Object Lock, GCS retention policies, Azure Blob immutable storage). Once written, can't be modified or deleted within the retention window.
- Write-only access from CI. The pipeline can push evidence. Nobody (including the people who run the pipeline) can modify or delete it.
- Read access is logged separately. Auditors can read; reads are themselves audit-logged.
- Encryption keys are managed in a different account or project than where the pipeline runs.
The control to demonstrate is: "even if every engineer on the team colluded, they couldn't tamper with the audit trail." That's what the auditor wants to verify. If you can't structurally guarantee it, you don't have a control. You have a hope.
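The storage side of this is small in Terraform. A rough AWS sketch follows (the bucket name and the six-year retention window are assumptions; GCS retention policies and Azure immutable blob storage have equivalent settings):

```hcl
resource "aws_s3_bucket" "evidence" {
  bucket              = "pipeline-audit-evidence"  # hypothetical name
  object_lock_enabled = true                       # must be set at bucket creation
}

# Object Lock requires versioning to be enabled.
resource "aws_s3_bucket_versioning" "evidence" {
  bucket = aws_s3_bucket.evidence.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_object_lock_configuration" "evidence" {
  bucket = aws_s3_bucket.evidence.id
  rule {
    default_retention {
      mode = "COMPLIANCE"  # cannot be shortened or removed, even by the root account
      days = 2190          # ~6 years, a common HIPAA retention target
    }
  }
}
```

COMPLIANCE mode (as opposed to GOVERNANCE) is the structural guarantee: within the retention window, no identity in the account can delete or overwrite a version, which is exactly the "even if everyone colluded" property above.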
If you have credentials in your Terraform code base, rotate them today, then move them into a secrets manager (GCP Secret Manager, AWS Secrets Manager, HashiCorp Vault) referenced by Terraform but never stored in plaintext. Treat any committed credential as compromised regardless of whether the repo is private.
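The secrets-manager side, again as a Terraform sketch using GCP since that's where this team was (the secret name is a placeholder):

```hcl
# Read the credential at plan/apply time; the value never lives in the repo.
data "google_secret_manager_secret_version" "ci_deploy" {
  secret = "ci-deploy-credentials"  # hypothetical secret name; reads the latest version
}

# Consumers reference the data source instead of a literal, e.g.:
# password = data.google_secret_manager_secret_version.ci_deploy.secret_data
```

One caveat worth knowing: the resolved value still lands in Terraform state, so the state backend needs the same encryption and access controls you'd give the secret itself.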
The pattern: The principle under all five
If you read these carefully, the same principle keeps showing up: the controls have to be properties of the system, not activities humans remember to perform.
Human-dependent controls fail because humans forget, get busy, get tired, or move on to other teams. System-dependent controls fail only when someone changes the system, and changing the system is itself an audit-loggable event.
Healthcare engineering teams that ship well in regulated environments aren't the ones with the most paperwork. They're the ones whose pipeline architecture makes compliance violations structurally difficult.
That's what auditors actually want to see, and it's also what lets your engineering team ship without fighting compliance the whole time.