Accelerating Classified Software Delivery on EKS in AWS GovCloud

Colorado Springs, CO

Executive Summary

Client Overview

THNKBIG and a premier U.S. defense contractor collaborated to revolutionize the contractor's delivery of mission-critical software capabilities to multiple Department of Defense (DoD) agencies. This collaboration focused on overcoming the unique challenges of deploying classified systems in highly secure cloud environments while maintaining the agility needed for modern warfare requirements.

The engagement centered on architecting a solution that met stringent DoD Impact Level 5 (IL-5) compliance standards in AWS GovCloud (US-East and US-West), requiring complete isolation from public internet connectivity. Together, we designed and implemented an approach that balanced uncompromising security with the need for rapid feature deployment, so that warfighters received new capabilities without compromising the integrity of classified systems.

Through this partnership, we established a new paradigm for secure software delivery that transformed the contractor's ability to:

  • Maintain absolute security in air-gapped environments
  • Accelerate development cycles for classified systems
  • Automate compliance processes without sacrificing rigor
  • Enable continuous delivery of mission-critical features

  • 94% Fewer Critical Vulnerabilities at Release
  • 90% Faster IL-5 Environment Provisioning
  • 50% Faster ATO Approvals (6 Months → <3 Months)

Solution Implemented

  • Air-Gapped Rancher Landing Zone
    • Automated Terraform + Ansible AWX deployment of RKE2 clusters in isolated VPCs.
    • ECR Private for secure, nightly mirroring of approved container images.
    • VPC Endpoints only—zero public internet exposure.
  • GitLab-Centric GitOps Pipeline
    • Merge requests triggered immutable Helm releases via Rancher Fleet.
    • SBOM generation (cosign-signed) and storage in Harbor for full provenance.
    • Automated STIG checks embedded in CI/CD, failing non-compliant builds.
  • NeuVector Zero-Trust Runtime Security
    • Layer-7 segmentation blocked 97% of unnecessary east-west traffic.
    • Deep packet inspection (DPI) baseline established in 48 hours.
    • Continuous runtime monitoring for container escape attempts (0 successful breaches).
  • Automated Compliance & Continuous ATO
    • InSpec, OpenSCAP, and Trivy scans auto-uploaded to eMASS.
    • Auto-generated SSP, SAR, and POA&M docs (349 NIST 800-53 controls mapped).
    • Non-compliant findings fail pipelines, preventing vulnerable deployments.

Outcomes Expected

  • 90% Faster Provisioning
    • IL-5 environments spun up in 5 days (vs. 8 weeks manually).
    • 65% reduction in engineer hours (400h → 140h per release).
  • Dramatic Security Improvements
    • 94% fewer critical CVEs (200+ → ≤12 per release).
    • 98% STIG compliance rate (up from 60%).
  • Accelerated Compliance & ATO
    • ATO cycles cut by >50% (6–8 months → <3 months).
    • Continuous compliance monitoring keeps ATO "evergreen."
  • Mission-Readiness & Scalability
    • Zero downtime during blue-green migration of 112 microservices.
    • Weekly secure releases enabled without re-accreditation delays.
    • Future-proofed against evolving threats (NIST 800-207 zero-trust alignment).

Challenge

The client faced four major bottlenecks that hindered their ability to deliver secure, compliant software at the speed required by DoD missions.

First, the manual provisioning of IL-5 enclaves took 8 weeks per environment, with STIG hardening consuming 400+ engineer-hours—a significant drag on agility.

Second, security gaps were pervasive: only 60% of containers passed DISA STIG scans, and each release introduced 200+ critical CVEs, exposing mission systems to unacceptable risk.

Third, siloed workflows forced teams to manually compile ATO documentation across five different tools, delaying accreditation by 6–8 months per release.

Finally, the air-gapped environment complicated software supply chains, requiring sneakernet transfers of container images and Helm charts, which introduced provenance risks and blind spots in tamper detection.

These challenges were not just operational inefficiencies—they directly impacted mission readiness. Slow deployments meant warfighters waited months for critical updates, while inconsistent security postures left systems vulnerable to exploitation. The manual compliance processes were unsustainable, creating ATO backlogs that stifled innovation. The lack of automated SBOM tracking and image signing in the air-gapped environment also meant the client could not fully verify the integrity of deployed artifacts, a major concern for IL-5 systems handling classified data.

Solution

The implemented solution automated, hardened, and streamlined the entire software delivery pipeline while maintaining strict IL-5 compliance. The Air-Gapped Rancher Landing Zone used Terraform and Ansible to deploy pre-hardened RKE2 clusters in isolated VPCs, reducing provisioning time from weeks to days. ECR Private ensured secure image replication by mirroring approved containers from a staging enclave, eliminating risky sneakernet transfers. GitLab became the orchestration hub, with pipelines auto-generating signed SBOMs (via cosign) and pushing them to Harbor, ensuring full artifact traceability. Rancher Fleet enabled GitOps-driven Helm deployments, making releases immutable and auditable.

To enforce runtime security, NeuVector was deployed cluster-wide, applying layer-7 segmentation policies that blocked 97% of unnecessary east-west traffic—a major improvement over the previous permissive network model. Automated compliance checks were embedded into every pipeline, with InSpec, OpenSCAP, and Trivy scans feeding directly into eMASS. This allowed the system to auto-generate ATO packages (SSP, SAR, POA&M) and enforce 349 NIST 800-53 controls as code. Non-compliant builds would fail fast, preventing vulnerabilities from reaching production.
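
The fail-fast behavior described above can be sketched as a small gate function. This is a minimal illustration, not the client's actual pipeline: the `Finding` shape, the `BLOCKING_SEVERITIES` policy, and the function names are assumptions, and real InSpec, OpenSCAP, or Trivy reports would first need to be parsed into this shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One normalized scan result (hypothetical schema)."""
    scanner: str   # e.g. "trivy", "openscap", "inspec"
    rule_id: str   # scanner-specific identifier
    severity: str  # "critical", "high", "medium", or "low"


# Assumed policy: these severities block a release.
BLOCKING_SEVERITIES = frozenset({"critical", "high"})


def blocking_findings(findings):
    """Return only the findings that should stop the build."""
    return [f for f in findings if f.severity.lower() in BLOCKING_SEVERITIES]


def gate_exit_code(findings):
    """Return 0 when the build may proceed, 1 when it must fail."""
    return 1 if blocking_findings(findings) else 0
```

A CI job could end with `sys.exit(gate_exit_code(findings))`, so that any blocking finding produces a nonzero exit status and stops the pipeline before deployment.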

Implementation

The rollout followed a phased, risk-mitigated approach, beginning with a 3-week discovery phase to map dependencies and design the Rancher Landing Zone. The 4-week pilot focused on core infrastructure: RKE2 hardening, Traefik/ALB integration, and NeuVector validation. Security testing took 3 weeks, including automated STIG enforcement and red-team container escape tests (which resulted in zero successful breaches). The production migration was remarkably fast—just 5 days per cluster—using a blue-green strategy to move 112 microservices with zero downtime.

The ATO process, traditionally a 6–8 month ordeal, was completed in just 90 days, thanks to automated evidence collection and documentation. Only three POA&M items were identified, all resolved within 30 days. This acceleration was possible because compliance was continuously validated, not manually assembled at the last minute. The phased approach ensured that each component was battle-tested before full deployment, minimizing disruptions to mission operations.

Results & Impact

The improvements were dramatic and measurable. Environment spin-up time dropped by 90% (from 8 weeks to 5 days), enabling rapid scaling for new missions. Critical vulnerabilities plummeted by 94% (from 200+ to ≤12 per release), drastically reducing exploit risks. STIG compliance surged from 60% to 98%, ensuring consistent adherence to DoD security standards. Most importantly, ATO cycles were cut by more than half (from 6–8 months to under 3 months), allowing weekly mission updates without re-accreditation delays.
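
The headline percentages follow from the before-and-after figures quoted in this case study. The short check below is a sketch under two assumptions: 8 weeks is counted as 56 calendar days, and the open-ended figures ("200+", "≤12") are taken at their stated bounds.

```python
def percent_reduction(before, after):
    """Percentage decrease from `before` to `after`."""
    return (before - after) * 100 / before


# Provisioning: 8 weeks (assumed 56 calendar days) down to 5 days.
provisioning = round(percent_reduction(8 * 7, 5), 1)  # 91.1, reported as 90%

# Critical CVEs per release: 200 down to 12.
critical_cves = percent_reduction(200, 12)  # 94.0

# Engineer hours per release: 400 down to 140.
engineer_hours = percent_reduction(400, 140)  # 65.0

# ATO cycle: lower bound of 6 months down to 3 months.
ato_cycle = percent_reduction(6, 3)  # 50.0

# STIG compliance is a gain in percentage points, not a reduction.
stig_gain_points = 98 - 60  # 38
```

If provisioning time were measured in working days instead (40 down to 5), the same formula would give 87.5%, so the reported 90% is best read as an approximate figure.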

Beyond metrics, the transformation fundamentally changed how the organization delivered software. Engineers saved 65% of their time per release (400h → 140h), allowing them to focus on innovation rather than compliance paperwork. The automated, zero-trust architecture also future-proofed the system against evolving threats, ensuring long-term compliance with NIST 800-207 (zero-trust guidelines). The shift to continuous ATO meant the platform remained "evergreen," eliminating the traditional "compliance crunch" before audits.

Key Takeaways

This engagement proves that even highly restricted IL-5 environments can achieve DevSecOps agility with the right automation and architecture. Rancher Prime + RKE2 + IaC provides a scalable, repeatable model for air-gapped Kubernetes, while GitLab-driven GitOps ensures every change is tracked, signed, and validated before deployment. NeuVector’s zero-trust enforcement fills a critical gap in container security, going beyond static scans to block runtime threats. Most importantly, compliance automation (InSpec, OpenSCAP, eMASS integration) turns ATO from a yearly burden into a continuous process, keeping systems secure without slowing missions.

The lessons here extend beyond defense contracting—any organization handling sensitive workloads in regulated environments (e.g., healthcare, finance, critical infrastructure) can apply similar principles. Automated SBOMs, immutable deployments, and runtime DPI are becoming industry standards for high-assurance computing. By treating security and compliance as code, teams can move fast without sacrificing rigor, ensuring both mission speed and mission safety. This project sets a new benchmark for secure, air-gapped DevSecOps at scale.

---

**Ready to secure your Kubernetes environment?**

Explore our Kubernetes consulting services →

Learn about Zero Trust compliance →

DoD IL-5 DevSecOps Case Study: From Bottlenecks to Continuous ATO

The Challenge

A DoD program operating at Impact Level 5 (IL-5) needed to deliver secure, compliant software at mission speed but faced four critical bottlenecks:

1. Dangerously Slow Environment Provisioning

  • IL-5 enclave provisioning was fully manual and took ~8 weeks per environment.
  • STIG hardening alone consumed 400+ engineer-hours per enclave.
  • Result: severely constrained agility and long lead times for new mission capabilities.

2. Pervasive Security Gaps

  • Only ~60% of containers passed DISA STIG scans.
  • Each release introduced 200+ critical CVEs into production.
  • Result: mission systems were routinely exposed to unacceptable cyber risk.

3. Siloed, Manual Compliance Workflows

  • ATO documentation was manually assembled across five disparate tools.
  • Accreditation cycles stretched 6–8 months per release.
  • No automated SBOM tracking or image signing in the air-gapped enclave.
  • Result: limited artifact integrity verification for IL-5 workloads handling classified data.

4. Air-Gap Supply Chain Risks

  • Container images and Helm charts were moved via sneakernet.
  • Provenance and tamper detection were inconsistent and difficult to prove.
  • Result: increased supply chain risk and blind spots in detecting compromise.

These issues directly degraded mission readiness: warfighters waited months for critical updates, and inconsistent security postures left systems vulnerable to exploitation.

The Solution: Automated, Zero-Trust IL-5 Platform

THNKBIG implemented an automated, hardened, and fully auditable software delivery platform tailored for IL-5 constraints.

Air-Gapped Rancher Landing Zone

  • RKE2 clusters via IaC: Terraform + Ansible used to deploy pre-hardened RKE2 clusters into isolated VPCs.
  • Private image mirroring: ECR Private mirrored approved containers from a staging enclave, eliminating risky sneakernet transfers.
  • No internet exposure: VPC Endpoints only; no public internet access paths.
  • Outcome: Provisioning time dropped from 8 weeks to 5 days per IL-5 environment.

GitLab-Centric GitOps Pipeline

  • GitLab as the control plane: Central orchestration for CI/CD and compliance workflows.
  • Signed SBOMs by default: Pipelines auto-generate SBOMs, sign them with cosign, and push to Harbor.
  • GitOps deployments with Rancher Fleet: Helm-based releases are immutable, auditable, and fully traceable.
  • Embedded STIG enforcement: Automated STIG checks in CI/CD; non-compliant builds fail fast before deployment.
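
One building block of the provenance checks described above is comparing an artifact's digest with the digest recorded for it at build time. The sketch below shows only that comparison, using the Python standard library; it is not cosign and does not verify signatures, and the function names and the `sha256:` prefix convention are assumptions for illustration.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return a digest string in the assumed 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def matches_recorded_digest(artifact: bytes, recorded: str) -> bool:
    """True when the artifact hashes to the recorded digest.

    hmac.compare_digest performs a constant-time comparison; for str
    arguments it requires both values to be ASCII.
    """
    return hmac.compare_digest(sha256_digest(artifact), recorded)
```

In a real pipeline the recorded digest would itself come from a signed record, so that a tampered artifact cannot simply ship with a matching but forged digest.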

NeuVector Zero-Trust Runtime Security

  • Layer-7 microsegmentation: Policies blocked 97% of unnecessary east–west traffic.
  • Rapid DPI baselining: Deep packet inspection baselines established in 48 hours.
  • Continuous runtime defense: Ongoing monitoring and policy enforcement; zero successful container escape attempts during testing.
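
The idea behind blocking unnecessary east-west traffic is default-deny: a flow between services is permitted only if it appears on an explicit allowlist. The sketch below illustrates that logic only. It is not NeuVector's policy format or API, and the service names, the `Flow` shape, and the allowlist contents are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """A service-to-service connection (hypothetical shape)."""
    source: str
    destination: str
    protocol: str  # layer-7 protocol, e.g. "http" or "grpc"


# Hypothetical allowlist; anything not listed is denied.
ALLOWLIST = frozenset({
    Flow("frontend", "orders", "http"),
    Flow("orders", "inventory", "grpc"),
})


def is_allowed(flow, allowlist=ALLOWLIST):
    """Default-deny: only explicitly listed flows pass."""
    return flow in allowlist


def blocked_fraction(flows, allowlist=ALLOWLIST):
    """Share of observed flows that the allowlist would deny."""
    if not flows:
        return 0.0
    blocked = sum(1 for f in flows if not is_allowed(f, allowlist))
    return blocked / len(flows)
```

Running `blocked_fraction` over flows observed during a baselining window gives a figure comparable in spirit to the percentage of east-west traffic blocked, although how that metric was measured in this engagement is not stated in the source.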

Automated Compliance & Continuous ATO

  • Integrated scanning: InSpec, OpenSCAP, and Trivy results automatically uploaded to eMASS.
  • Controls as code: ATO packages (SSP, SAR, POA&M) are auto-generated, with 349 NIST 800-53 controls enforced as code.
  • Pipeline gates on compliance: Any non-compliant findings fail the pipeline, preventing vulnerable deployments from reaching production.
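
Generating POA&M entries from scan results amounts to grouping open findings by the control they relate to. The sketch below assumes a hand-maintained mapping from scanner rule IDs to NIST 800-53 control IDs. The rule IDs, the mapping, and the output shape are hypothetical, and this is not the eMASS data format.

```python
from collections import defaultdict

# Hypothetical mapping from scanner rule IDs to NIST 800-53 controls.
RULE_TO_CONTROL = {
    "rule-ssh-root-login": "AC-6",  # Least Privilege
    "rule-audit-logging": "AU-2",   # Event Logging
    "rule-tls-version": "SC-8",     # Transmission Confidentiality and Integrity
}


def poam_entries(open_rule_ids):
    """Group open findings by control, sorted for stable output."""
    by_control = defaultdict(list)
    for rule_id in open_rule_ids:
        # Findings with no known control are surfaced, not dropped.
        control = RULE_TO_CONTROL.get(rule_id, "UNMAPPED")
        by_control[control].append(rule_id)
    return [
        {"control": control, "open_findings": sorted(rules)}
        for control, rules in sorted(by_control.items())
    ]
```

Keeping unmapped findings visible under an explicit `UNMAPPED` bucket is a deliberate choice in this sketch, so that gaps in the mapping show up in review instead of silently disappearing.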

Implementation Approach

A phased rollout minimized risk while accelerating value:

  1. Discovery (3 weeks)
    • Mapped application and infrastructure dependencies.
    • Designed the Rancher Landing Zone architecture for IL-5 constraints.
  2. Pilot (4 weeks)
    • Hardened RKE2 clusters and integrated Traefik/ALB.
    • Validated NeuVector policies and observability.
  3. Security Testing (3 weeks)
    • Automated STIG enforcement wired into CI/CD.
    • Red-team container escape tests conducted, with no successful breaches.
  4. Production Migration (5 days per cluster)
    • Blue–green migration strategy for 112 microservices.
    • Achieved zero downtime during cutover.

The ATO process, historically a 6–8 month effort, was completed in 90 days due to automated evidence collection and documentation. Only three POA&M items were identified, all closed within 30 days. Continuous validation replaced last-minute manual document assembly.

Results & Measurable Impact

  • 90% faster provisioning: IL-5 environments now spin up in 5 days instead of 8 weeks.
  • 94% reduction in critical CVEs: Critical vulnerabilities per release dropped from 200+ to 12 or fewer.
  • 98% STIG compliance: STIG compliance improved from 60% to 98% across containers and hosts.
  • ATO cycles cut by more than half: Accreditation shrank from 6–8 months to under 3 months.
  • 65% reduction in engineer hours: Per-release engineering effort fell from 400 hours to 140 hours.
  • Zero downtime migrations: 112 microservices migrated via blue–green with no service interruption.
  • Weekly secure releases: The program now ships weekly, secure releases without re-accreditation delays.

The resulting zero-trust, automated architecture is aligned with NIST 800-207 and keeps the platform evergreen, eliminating the traditional compliance crunch before audits.

Key Takeaways

  1. IL-5 can move at DevSecOps speed.
    Rancher Prime + RKE2 + Infrastructure as Code provides a scalable, repeatable pattern for air-gapped Kubernetes.
  2. GitLab-driven GitOps makes every change provable.
    All changes are tracked, signed, and validated before deployment, making immutable releases the default.
  3. NeuVector enforces zero trust at runtime.
    It goes beyond static scanning to actively block runtime threats and lateral movement.
  4. Compliance automation enables continuous ATO.
    ATO becomes an ongoing, low-friction process instead of a yearly, high-friction event.
  5. This model generalizes beyond defense.
    Healthcare, finance, and critical infrastructure can apply the same patterns to move fast without compromising rigor.

Our Approach

Our Kubernetes consulting methodology combines deep platform expertise with proven enterprise practices. We begin with a comprehensive assessment of your current state, including infrastructure inventory, application architecture review, and team capability evaluation. This foundation enables us to develop a tailored roadmap that addresses your specific business objectives while establishing sustainable operational practices.

Engagement Phases

  1. Discovery and Assessment: Infrastructure audit, application portfolio analysis, and skills gap identification
  2. Architecture Design: Platform architecture, networking topology, security controls, and GitOps workflow design
  3. Platform Build: Cluster provisioning, CI/CD pipeline setup, monitoring stack deployment, and policy implementation
  4. Migration Execution: Workload containerization, staged migration, performance validation, and cutover planning
  5. Operations Enablement: Runbook development, team training, on-call procedures, and knowledge transfer

Key Deliverables

  • Production-ready Kubernetes platform with hardened security configurations
  • GitOps-based deployment pipelines with automated testing gates
  • Comprehensive monitoring and alerting with custom dashboards
  • Disaster recovery procedures with tested failover capabilities
  • Team enablement program with hands-on training and documentation

Frequently Asked Questions

How long does a typical Kubernetes implementation take?

The timeline for Kubernetes implementation varies based on complexity and scope. A basic production cluster can be deployed in 4-6 weeks, while enterprise-scale implementations with multiple clusters, advanced networking, and comprehensive security typically require 3-6 months. We recommend a phased approach that delivers value incrementally while building toward the complete target architecture.

What Kubernetes distributions do you work with?

We have deep expertise across all major Kubernetes distributions including Amazon EKS, Azure AKS, Google GKE, Red Hat OpenShift, and Rancher. We also work with vanilla Kubernetes and specialized distributions for edge computing and air-gapped environments. Our recommendations are based on your specific requirements rather than vendor preferences.

What compliance frameworks do you support?

We have experience implementing controls for SOC 2, PCI DSS, HIPAA, FedRAMP, NIST 800-53, and CMMC. Our approach uses policy-as-code to automate compliance validation and evidence collection, reducing audit burden while maintaining continuous compliance posture visibility.

How do you implement zero-trust in Kubernetes environments?

We implement zero-trust through multiple layers: service mesh for mutual TLS between services, network policies for microsegmentation, workload identity for cloud resource access, and policy engines like OPA for fine-grained authorization. Every request is authenticated and authorized regardless of network location.

How do you approach client engagements?

Every engagement begins with a thorough discovery phase to understand your current state, business objectives, and constraints. We develop tailored recommendations rather than applying one-size-fits-all solutions. Our consultants work alongside your team to transfer knowledge and build sustainable capabilities. We measure success by business outcomes, not just technical deliverables.
