A federal customer was ready to procure. The architecture was not.
The client is an AI SaaS vendor running production on commercial AWS with a strong engineering culture, modern Terraform practice, and zero prior federal experience. A federal customer had committed to procurement contingent on a FedRAMP Moderate path, with an aggressive timeline tied to a fiscal year.
The internal team understood the application deeply and had read enough FedRAMP material to know they were in trouble. The architecture decisions that worked beautifully for commercial customers (shared accounts, hosted vector store, OpenAI behind the application, observability stack outside the cloud) would each fail boundary review. The remediation list grew faster than the engineering team could work through it, and the timeline did not move.
They reached out for boundary architecture help: not policy writing, not 3PAO selection. Engineering work to redesign the cloud footprint so that the boundary could be drawn cleanly and the assessment could proceed.
Draw the boundary first. Then write Terraform.
The most common FedRAMP failure pattern is to start with the existing environment and try to bend it to fit the boundary. We do not do that. The first deliverable was a boundary diagram drawn from scratch, before any IaC was touched, that identified every service inside, outside, and at the edge of the authorization boundary.
From the boundary, every other architectural decision derived. The Terraform module library was written against the boundary, not the existing accounts. Identity federation, network architecture, KMS topology, and logging followed.
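One way to make a boundary diagram checkable is to keep it as data alongside the drawing. The following is a minimal sketch, not the engagement's actual tooling: the `Service` record, the `Placement` values, and the inventory entries are all hypothetical, loosely modeled on the services named in this case study.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    INSIDE = "inside"    # within the authorization boundary
    EDGE = "edge"        # interconnection crossing the boundary
    OUTSIDE = "outside"  # not part of the authorized system


@dataclass(frozen=True)
class Service:
    name: str
    placement: Placement
    handles_customer_data: bool


def boundary_violations(services):
    """Return names of services that handle customer data but sit outside or at the edge."""
    return [
        s.name
        for s in services
        if s.handles_customer_data and s.placement is not Placement.INSIDE
    ]


# Hypothetical inventory for illustration only.
INVENTORY = [
    Service("bedrock-inference", Placement.INSIDE, True),
    Service("pgvector-store", Placement.INSIDE, True),
    Service("corporate-idp", Placement.OUTSIDE, False),
    Service("siem-privatelink", Placement.EDGE, False),
    Service("commercial-observability", Placement.OUTSIDE, True),
]

print(boundary_violations(INVENTORY))
```

With an inventory like this, a question such as "should this service be inside?" becomes a lookup rather than a debate, and the check can run in CI next to the Terraform.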
Key decisions
- Account Topology: Four-account model in AWS GovCloud: production, staging, logging, and shared services. Cross-account access exclusively via PrivateLink with mutual authentication. No shared accounts with the commercial environment.
- Identity: IAM Identity Center federating from the existing corporate IdP (source outside the boundary). Time-bounded role assumption, MFA enforced via FIPS-validated authenticator, no long-lived access keys anywhere in the boundary.
- Model Endpoints: Inference served exclusively via Amazon Bedrock under the GovCloud BAA path. No commercial model endpoints inside the boundary. RAG vector store inside the boundary on a managed pgvector deployment.
- Logging: Centralized logging account with S3 Object Lock retention. Engineers in production and staging have zero write access. SIEM integration is read-only across PrivateLink.
- KMS: Customer-managed keys with automatic annual rotation. Key usage logged to the centralized logging account. Separate keys for data classifications; key policy denies cross-classification use.
- Pipeline: GitHub Actions self-hosted runners inside the boundary; no commercial GitHub-hosted runners touch boundary state. Signed container images, OPA admission control on the EKS clusters.
- Documentation: Every Terraform module shipped with a control narrative file mapping the module's resources to the NIST 800-53 controls it satisfies. The SSP is generated from those narratives, not written separately.
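The KMS decision above says the key policy denies cross-classification use. One plausible way to express that, sketched here under assumptions, is a Deny statement keyed on a principal tag. The tag name `DataClassification`, the value `cui`, and the helper function are hypothetical; the source does not say how the deny was actually implemented. `aws:PrincipalTag/<key>` is a standard IAM global condition key.

```python
import json

# Cryptographic operations the deny should cover (administrative actions are left alone).
CRYPTO_ACTIONS = ["kms:Encrypt", "kms:Decrypt", "kms:ReEncrypt*", "kms:GenerateDataKey*"]


def deny_cross_classification(classification: str, tag_key: str = "DataClassification") -> dict:
    """Build one key-policy statement that denies crypto use by principals
    whose tag does not match this key's classification.

    StringNotEquals also matches when the tag is absent, so untagged
    principals are denied as well.
    """
    return {
        "Sid": "DenyCrossClassificationUse",
        "Effect": "Deny",
        "Principal": {"AWS": "*"},
        "Action": CRYPTO_ACTIONS,
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {f"aws:PrincipalTag/{tag_key}": classification}
        },
    }


# Hypothetical classification label for illustration.
statement = deny_cross_classification("cui")
print(json.dumps(statement, indent=2))
```

This is a single statement, not a complete key policy; a real policy would also need the Allow statements for key administrators and for the roles permitted to use the key.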
A boundary the team could ship into, not just defend.
The engagement delivered the full set of artifacts a 3PAO would expect to see during readiness review, plus the engineering primitives the client team needed to keep operating inside the boundary after we left.
Terraform module library
A module library covering account scaffolding, IAM federation, KMS topology, EKS cluster baseline, RDS with regulated defaults, Bedrock endpoint configuration, and the centralized logging account. Every module ships a control narrative file. Policy gates (OPA + Sentinel) reject non-regulated services, non-FIPS endpoints, and unsigned container images at plan time.
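The policy gates in this engagement were written in OPA and Sentinel. As a language-neutral illustration of the same idea, the sketch below checks the JSON form of a plan (as produced by `terraform show -json`) against an allow-list of resource types. The allow-list contents and the `aws_hypothetical_service` type are invented for the example; only the `resource_changes`, `address`, `type`, and `change.actions` fields come from Terraform's plan representation.

```python
import json

# Hypothetical allow-list; a real one would be derived from the boundary diagram.
ALLOWED_TYPES = {
    "aws_kms_key",
    "aws_s3_bucket",
    "aws_eks_cluster",
    "aws_db_instance",
    "aws_iam_role",
}


def rejected_resources(plan_json: str, allowed=ALLOWED_TYPES):
    """Return addresses of resources being created or updated whose type is not allow-listed."""
    plan = json.loads(plan_json)
    rejected = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # Ignore no-op, read, and delete-only changes; the gate targets new or changed resources.
        if "create" not in actions and "update" not in actions:
            continue
        if rc.get("type") not in allowed:
            rejected.append(rc.get("address"))
    return rejected


# Hand-written sample in the shape of a Terraform plan; not real output.
SAMPLE_PLAN = json.dumps({
    "resource_changes": [
        {"address": "aws_kms_key.logs", "type": "aws_kms_key",
         "change": {"actions": ["create"]}},
        {"address": "aws_hypothetical_service.demo", "type": "aws_hypothetical_service",
         "change": {"actions": ["create"]}},
        {"address": "aws_s3_bucket.archive", "type": "aws_s3_bucket",
         "change": {"actions": ["no-op"]}},
    ]
})

print(rejected_resources(SAMPLE_PLAN))
```

A CI step would fail the build whenever the returned list is non-empty, which keeps the rejection at plan time rather than at review time.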
Identity and access
IAM Identity Center as the only path into the boundary. Role catalog with time-bounded assumption windows. Break-glass procedure documented and rehearsed. Access reviews automated quarterly with output flowing to the logging account.
Logging architecture
The logging account is isolated: production and staging deliver logs to it on a write-only basis. S3 Object Lock retention is configured for 7 years. SIEM (Splunk) integration runs over PrivateLink and is read-only. The CloudTrail organization trail, VPC Flow Logs, EKS audit logs, and KMS key usage logs all flow in.
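The 7-year Object Lock retention can be written down as a small configuration structure. This sketch builds the dictionary shape accepted by the S3 `PutObjectLockConfiguration` API. The choice of `COMPLIANCE` mode is an assumption; the source states only that Object Lock retention was configured, not which mode was used.

```python
def object_lock_configuration(years: int = 7, mode: str = "COMPLIANCE") -> dict:
    """Build a default-retention Object Lock configuration for a log bucket.

    mode is an assumption: COMPLIANCE prevents anyone from shortening
    retention, GOVERNANCE allows specially permissioned users to bypass it.
    """
    if mode not in ("COMPLIANCE", "GOVERNANCE"):
        raise ValueError(f"unsupported Object Lock mode: {mode}")
    if years < 1:
        raise ValueError("retention must be at least one year")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Years": years}},
    }


config = object_lock_configuration()
print(config)
```

With boto3, a dictionary of this shape is passed as the `ObjectLockConfiguration` argument of the S3 client's `put_object_lock_configuration` call. In this engagement the equivalent setting would live in the Terraform logging module rather than in a script.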
Pipeline
GitHub Actions self-hosted runners inside the boundary on isolated EKS node groups. Container images built and signed inside the boundary. Bedrock-backed inference workloads deployed via GitOps (Argo CD) with admission control rejecting unsigned images and non-baseline configurations.
Control documentation
SSP-ready control narratives generated from the Terraform module library. Continuous monitoring plan documented. Incident response runbooks rehearsed with the client engineering team. 3PAO readiness pack delivered with the boundary diagram, data flow diagrams, and inherited-service mapping.
The boundary review went ahead on schedule. The procurement is live.
FedRAMP Moderate boundary, ready for 3PAO assessment.
The client passed first-party 3PAO readiness review on schedule. The authorization boundary in the SSP matches the architecture diagrams; the architecture diagrams match the Terraform; the Terraform produces the resources the auditor will examine. There is no gap between document and reality, and such gaps are the single most common cause of 3PAO findings.
Procurement with the original federal customer is proceeding. The client engineering team is operating the boundary themselves, with Stonebridge available for continuing compliance support on a managed retainer basis.
Three decisions that compounded over the engagement.
Drawing the boundary before writing Terraform
Every subsequent architectural decision derived from the boundary diagram. When a question arose ("should this service be inside?") the answer was already documented. There was no negotiation mid-engagement about whether a service was in scope.
Control narratives shipped alongside modules
Writing the control narrative when the Terraform module is written is the highest-leverage discipline in FedRAMP engineering. The narrative becomes a property of the module; the SSP is generated; the auditor's questions are answered by the same source of truth that produced the resources.
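The idea of generating the SSP from per-module narratives can be sketched as a simple merge keyed by control ID. The narrative format is not described in the source, so the in-memory structure below, the module names, and the specific control mappings are illustrative assumptions rather than the engagement's actual files.

```python
from collections import defaultdict

# Hypothetical narratives: module name -> {NIST 800-53 control ID: narrative text}.
NARRATIVES = {
    "kms-topology": {
        "SC-12": "Customer-managed keys with automatic annual rotation.",
        "AU-2": "Key usage is logged to the centralized logging account.",
    },
    "logging-account": {
        "AU-9": "Logs are stored under S3 Object Lock in an isolated account.",
        "AU-2": "CloudTrail, VPC Flow Logs, and EKS audit logs are collected.",
    },
}


def build_ssp_sections(narratives):
    """Merge per-module narratives into one text section per control, sorted by control ID."""
    by_control = defaultdict(list)
    for module in sorted(narratives):
        for control, text in narratives[module].items():
            # Prefix each line with its module so the auditor can trace it to the code.
            by_control[control].append(f"[{module}] {text}")
    return {control: "\n".join(by_control[control]) for control in sorted(by_control)}


sections = build_ssp_sections(NARRATIVES)
for control, body in sections.items():
    print(control)
    print(body)
```

Because each line carries the module it came from, a change to a module and a change to its narrative land in the same commit, which is what keeps the generated SSP from drifting away from the deployed resources.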
Policy gates rejecting non-regulated patterns
OPA and Sentinel policy gates at plan time prevented the most common drift pattern: an engineer reaching for a familiar commercial-region service that is not FedRAMP-authorized. The gate is the boundary discipline at the speed of `terraform plan`.