Why Terraform for Multi-Cloud?
HashiCorp Terraform has become the de facto Infrastructure as Code tool for organisations running workloads across multiple cloud providers. Its registry lists over 3,000 providers — including AWS, Azure, GCP, Kubernetes, Cloudflare, GitHub, Datadog, and PagerDuty — meaning virtually any infrastructure resource can be expressed as a Terraform resource. The declarative HCL syntax, plan-before-apply workflow, and state management model are well-suited to the needs of platform engineering teams building reliable, reproducible infrastructure.
For multi-cloud specifically, Terraform offers a unique advantage: a single tool with consistent workflow concepts (init, plan, apply, destroy) regardless of which provider you are targeting. Your team learns one CLI, one state model, one module system, and one CI/CD integration pattern — then applies it to AWS in one configuration and Azure in another, rather than maintaining expertise in AWS CloudFormation and Azure Bicep and GCP Deployment Manager simultaneously.
Repository Structure
The most critical architectural decision in a Terraform multi-cloud codebase is the repository structure. We recommend a monorepo structure for organisations of up to 30 engineers. The modules/ directory contains reusable, versioned infrastructure building blocks organised by cloud provider. The environments/ directory contains environment-specific root configurations that instantiate modules with environment-appropriate variable values. This separation ensures that the same validated module code runs in dev, staging, and production — only the input variables change.
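A layout consistent with this split might look as follows (the specific module names under modules/ are illustrative, not prescriptive):

```
.
├── modules/                  # reusable, versioned building blocks
│   ├── aws/
│   │   ├── vpc/
│   │   └── eks-cluster/
│   └── azure/
│       ├── vnet/
│       └── aks-cluster/
└── environments/             # root configurations, one per environment
    ├── dev/
    ├── staging/
    └── production/
```

Each directory under environments/ holds its own root module that calls into modules/ with that environment's variable values.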
Remote State Management
Terraform state is the source of truth about your infrastructure. In a team environment, state must be stored remotely with locking to prevent concurrent applies that would corrupt state. The recommended backends are AWS S3 with DynamoDB locking for AWS-primary organisations, Azure Blob Storage with lease-based locking for Azure-primary organisations, or Terraform Cloud/Enterprise for cloud-agnostic state management with additional features like policy enforcement and drift detection.
For multi-cloud setups, we recommend a dedicated AWS account or Azure subscription for shared infrastructure tooling including the Terraform state bucket, CI/CD pipelines, and monitoring tools. State is partitioned by environment and service, with separate state files for networking, compute, and data tiers — limiting the blast radius of a botched apply to one tier rather than the entire environment.
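For an AWS-primary organisation, a per-tier backend block might look like the sketch below. The bucket, table, and region names are illustrative; the partitioning by environment and tier is expressed in the state key path:

```hcl
# environments/production/networking/backend.tf
# State for the networking tier of production only — a botched apply
# here cannot touch the compute or data tiers.
terraform {
  backend "s3" {
    bucket         = "example-org-terraform-state"       # lives in the shared tooling account
    key            = "production/networking/terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "terraform-state-locks"             # prevents concurrent applies
    encrypt        = true
  }
}
```

Each tier's root configuration carries its own backend block with a distinct `key`, so locking and blast radius are both scoped to the tier.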
Module Design Principles
Well-designed Terraform modules share four characteristics:
- Single Responsibility: A module should provision one logical resource group — a VPC, a Kubernetes cluster, or a database cluster. Avoid kitchen-sink modules that create dozens of resource types; they become impossible to test and reuse.
- Opinionated Defaults: Modules should encode your organisation's standards as defaults. A module for an EKS cluster should default to encrypted EBS volumes, CloudWatch logging enabled, and a specific validated node group configuration — so application teams cannot accidentally deploy insecure clusters.
- Explicit Outputs: All resources that consumers might need to reference — VPC IDs, subnet IDs, security group IDs, cluster endpoints — should be explicit module outputs to avoid requiring consumers to use data sources to look up resources.
- Semantic Versioning: Tag module releases with semantic versions and reference specific versions in environment configurations to prevent upstream changes from automatically affecting production environments.
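A hypothetical module fragment illustrating three of these principles together — an opinionated default, an explicit output, and version-pinned consumption (all names and the repository URL are invented for illustration):

```hcl
# modules/aws/eks-cluster/variables.tf — opinionated default
variable "encrypt_ebs_volumes" {
  description = "Encrypt node EBS volumes (organisation standard)"
  type        = bool
  default     = true # consumers must opt out explicitly, not opt in
}

# modules/aws/eks-cluster/outputs.tf — explicit output
output "cluster_endpoint" {
  description = "EKS API server endpoint for consumers"
  value       = aws_eks_cluster.this.endpoint
}

# environments/production/main.tf — pinned semantic version
module "eks" {
  source = "git::https://github.com/example-org/terraform-modules.git//aws/eks-cluster?ref=v2.3.1"
  # only environment-specific inputs here; defaults encode the standards
}
```

Bumping the `ref` in an environment configuration is then an explicit, reviewable change rather than an automatic upgrade.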
CI/CD Integration
Infrastructure changes should follow the same review and approval process as application code. A typical Terraform CI/CD pipeline in GitHub Actions works as follows: on pull request, run terraform fmt -check, terraform validate, and terraform plan, posting the plan output as a PR comment. Run tfsec and checkov for security policy checks, and require plan approval from a platform team member before merge. On merge to main, run terraform apply against the non-production environment automatically. Require a manual approval gate before applying to production.
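A minimal sketch of the pull-request half of such a pipeline. The security-scan and PR-comment steps are elided for brevity; this assumes the workflow runs from an environment's root configuration directory:

```yaml
# .github/workflows/terraform-plan.yml — illustrative, not a complete pipeline
name: terraform-plan
on: pull_request
permissions:
  contents: read
jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Format check
        run: terraform fmt -check -recursive
      - name: Init and validate
        run: |
          terraform init -input=false
          terraform validate
      - name: Plan
        run: terraform plan -input=false -no-color
```

The apply half runs on merge to main, gated as described above.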
Provider Authentication
Never hardcode cloud credentials in Terraform configurations or CI/CD environment variables. Use OIDC federation between your CI/CD system and your cloud providers to issue short-lived credentials for each pipeline run. GitHub Actions supports OIDC federation with AWS, Azure, and GCP out of the box — eliminating long-lived access keys entirely. For human operator access, use Vault dynamic credentials or cloud-native temporary credentials rather than static IAM user keys.
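For AWS, the OIDC pattern in a GitHub Actions job looks roughly like this — the role ARN is a placeholder, and the role's trust policy must already allow GitHub's OIDC identity provider:

```yaml
permissions:
  id-token: write   # allows the job to request an OIDC token from GitHub
  contents: read
steps:
  - uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::123456789012:role/terraform-ci  # illustrative ARN
      aws-region: eu-west-1
```

The job receives short-lived credentials scoped to that role for the duration of the run; no access keys are stored anywhere.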
Drift Detection
Run terraform plan on a nightly schedule and alert on any detected drift — live resource configuration that has diverged from what state records, or resources in state that have been modified or deleted outside Terraform. Drift indicates either unmanaged manual changes or a bug in a previous apply. PCCVDI Solutions implements drift detection for all clients on managed infrastructure retainers, typically surfacing several manual console changes per month that would otherwise create inconsistencies between environments and potential security policy violations.
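A nightly check can be driven by terraform plan's -detailed-exitcode flag; the alerting integration itself is left out of this sketch:

```yaml
# Illustrative scheduled drift check
name: drift-detection
on:
  schedule:
    - cron: "0 3 * * *"   # nightly at 03:00 UTC
jobs:
  drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init -input=false
      - name: Detect drift
        # -detailed-exitcode: 0 = no changes, 1 = error, 2 = non-empty plan.
        # A failure here (exit 2) should page the platform team via your
        # alerting integration rather than silently fail the workflow.
        run: terraform plan -detailed-exitcode -input=false
```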