How to explain to students
Open with the war story: "Production goes down. The team finds a security group rule was 'temporarily' loosened 3 weeks ago for a vendor demo. Nobody documented it. Nobody can answer 'who, when, why'." That's click-ops — managing infrastructure through console clicks. It scales to one engineer; it does not scale to a team.
With Infrastructure as Code (IaC), you describe infrastructure in text files (HCL, YAML, JSON), commit them to git, and apply them with a tool. Now every change has an author, a date, a reason, a review, and a rollback path. "git blame the security group" becomes a real workflow.
git blame on a Terraform file shows who changed every line and why.
git revert && terraform apply = roll back any infra change.
🎯 Practice Questions
Three failure modes of click-ops:
1. No audit trail beyond CloudTrail's retention. Who changed what 6 months ago is irrecoverable.
2. Drift between environments. "It works in staging but not prod" because someone clicked something in one and not the other. IaC makes both structurally identical.
3. No review. A single misclick rolls out to production with no second pair of eyes. IaC routes infra changes through PR review, like code.
IaC is also reversible (git revert), portable (same code spins up a new region in 10 minutes), and self-documenting (the code is the architecture diagram).
How to explain to students
A provider is a plugin that knows how to talk to a specific cloud (aws, google, cloudflare, github). A resource is one thing in that cloud (an S3 bucket, an EC2 instance, a DNS record). The state is Terraform's mental map of "what I created last time"; without it, Terraform can't know whether to create, update, or skip.
The lifecycle: init (download providers) → plan (compare desired vs current state, show diff) → apply (do it). Always read the plan before applying. The plan is your safety net.
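A minimal sketch of how provider, resource, and state fit together. The region and bucket name are placeholders chosen for illustration, not values from the course.

```hcl
# main.tf: a minimal configuration (all names are illustrative assumptions)
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Provider: the plugin that talks to AWS. `terraform init` downloads it.
provider "aws" {
  region = "eu-west-1" # assumed region
}

# Resource: one thing that should exist in the cloud.
resource "aws_s3_bucket" "notes" {
  bucket = "example-notes-bucket-12345" # hypothetical; bucket names are globally unique
}

# State: after `terraform apply`, Terraform records that
# aws_s3_bucket.notes maps to the real bucket, so the next
# `terraform plan` can decide whether to create, update, or skip it.
```

Running terraform init, then terraform plan, then terraform apply against this file walks through the whole lifecycle on a single resource.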
🎯 Practice Questions
Provider: a plugin that knows how to talk to a specific cloud (aws, google, cloudflare). Downloaded by terraform init.
Resource: one cloud thing you want to exist (S3 bucket, EC2 instance, DNS record).
State: Terraform's record of what it has created. Maps each resource block to a real cloud ID.
Losing state is the scariest. Without state, Terraform doesn't know which resources it owns. terraform plan will propose creating everything fresh alongside your existing resources; applying that plan leaves you with duplicates, or with errors wherever a name must be unique. Recovery is manual: terraform import for every resource. Always store state remotely with versioning + locking.
~> operator.
Someone runs terraform apply in production while you're running terraform plan locally. What can go wrong without remote state + locking?
How to explain to students
Variables = inputs (what changes between dev/prod). Outputs = exports (the bucket URL, the EC2 IP — what other code needs to consume). Locals = computed values (DRY constants used inside the module). Data sources = read-only lookups against the cloud (e.g. "give me the latest Ubuntu AMI").
Always pass environment-specific values (region, instance size, domain) as variables. .tfvars files hold the actual values per environment. Never hard-code anything that changes between dev and prod.
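The sketch below shows a variable, a local, and an output working together. The names (environment, name_prefix, bucket_url, the myapp prefix) are assumptions made up for this example.

```hcl
# Variable: an input that differs between environments.
variable "environment" {
  type        = string
  description = "Deployment environment, e.g. dev or prod"
}

# Locals: computed values reused inside the module.
locals {
  name_prefix = "myapp-${var.environment}" # "myapp" is a hypothetical project name
  common_tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

resource "aws_s3_bucket" "assets" {
  bucket = "${local.name_prefix}-assets"
  tags   = local.common_tags
}

# Output: a value that other code (or a human) needs to consume.
output "bucket_url" {
  value = "https://${aws_s3_bucket.assets.bucket_regional_domain_name}"
}

# dev.tfvars would then contain a single line:
#   environment = "dev"
```

With a file like this, terraform apply -var-file=dev.tfvars and terraform apply -var-file=prod.tfvars produce two differently named buckets from the same code.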
🎯 Practice Questions
Why use a data "aws_ami" lookup instead of hard-coding an AMI ID like ami-0abcd1234?
Hard-coding ami-0abcd1234 means:
1. Your code only works in one region.
2. After a few months, the AMI is deprecated and replaced, and you don't get OS security patches without manual updates.
3. Spinning up a fresh dev environment in a new region requires editing the code.
Using data "aws_ami" with most_recent = true and a name pattern (ubuntu-jammy-22.04-amd64-server-*) makes your config region-portable and automatically picks up the latest patched AMI on every terraform apply.
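One way the lookup could be written, closely following the pattern in the AWS provider documentation. The full name prefix (ubuntu/images/hvm-ssd/) and the instance type are assumptions added for the example.

```hcl
# Look up the newest Ubuntu 22.04 AMI in whatever region the provider uses.
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id # no hard-coded AMI ID
  instance_type = "t3.micro"             # assumed size
}
```

A side effect worth showing students: when a newer AMI is published, the next plan proposes replacing the instance, so read the plan before applying.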
terraform output -raw bucket_url
How to explain to students
Beyond init/plan/apply/destroy, the commands you'll reach for most: fmt + validate in CI, state list/show/rm to inspect and surgically edit state, import to bring existing resources under Terraform management, and taint/untaint (taint is now superseded by the -replace option of plan and apply) to force a re-create.
🎯 Practice Questions
1. Write the Terraform code that matches the existing resource (a resource "aws_s3_bucket" "legacy" block with the same bucket name, tags, etc.).
2. Import the existing resource: terraform import aws_s3_bucket.legacy <real-bucket-name>. This populates state without touching the live bucket.
3. Run terraform plan. If your HCL exactly matches the live config, plan should show "No changes". If it shows differences, your HCL is incomplete; iterate until plan is clean.
Tip: tools like terraformer or aws2tf can auto-generate the HCL for many resource types, which is useful for adopting large legacy estates.
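The steps above can also be expressed declaratively. Since Terraform 1.5 an import block does the same job as the terraform import command, with the advantage that the import shows up in the plan and goes through PR review. The bucket name old-data is a placeholder.

```hcl
# Step 1: HCL that mirrors the existing bucket.
resource "aws_s3_bucket" "legacy" {
  bucket = "old-data" # hypothetical name of the bucket that already exists
}

# Step 2, declarative form (requires Terraform 1.5 or newer):
# tells Terraform to adopt the live bucket instead of creating a new one.
import {
  to = aws_s3_bucket.legacy
  id = "old-data"
}

# Step 3 is unchanged: run `terraform plan` and iterate on the resource
# block until the plan shows the import and nothing else to change.
```

Once the import has been applied, the import block can be deleted; the resource stays in state.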
Why use terraform plan -out=tfplan + terraform apply tfplan instead of just terraform apply in CI?
terraform fmt -check -recursive. Why -check, not just fmt?
When would terraform state rm be the right tool, vs terraform destroy?
How to explain to students
A module is a folder of .tf files with declared inputs (variables) and outputs. You "call" a module from another Terraform configuration to instantiate the resources it describes — like calling a function with arguments. Same module, called twice with different vars, gives you dev and prod.
The Terraform Registry (registry.terraform.io) hosts thousands of community modules. terraform-aws-modules/vpc/aws creates a production-grade VPC in 3 lines. Compose them; don't reinvent.
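A sketch of both kinds of module call. The local module path and its inputs (domain, bucket_name, tags) follow the assignment later in this guide; the domains, bucket names, CIDR ranges, and availability zones are invented for illustration.

```hcl
# Same local module, called twice with different arguments.
module "site_dev" {
  source      = "./modules/static-site"
  domain      = "dev.example.com" # hypothetical
  bucket_name = "example-site-dev"
  tags        = { Environment = "dev" }
}

module "site_prod" {
  source      = "./modules/static-site"
  domain      = "example.com" # hypothetical
  bucket_name = "example-site-prod"
  tags        = { Environment = "prod" }
}

# Community module from the Terraform Registry, pinned to an exact version.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.5.0"

  name           = "demo"
  cidr           = "10.0.0.0/16"
  azs            = ["eu-west-1a", "eu-west-1b"] # assumed region
  public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
}
```

After adding or changing a module block, terraform init has to be re-run so the module source is downloaded.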
🎯 Practice Questions
Why pin a module version (version = "5.5.0") and not just track main?
main can change at any moment. A maintainer might:
1. Push a breaking change (rename a variable, restructure resources) that your terraform apply would happily execute, destroying live infra.
2. Be compromised, with a malicious commit injected into main.
3. Refactor in a way that triggers replacements of stable resources.
Pinning to 5.5.0 means your infrastructure is reproducible across team members and CI runs, and supply-chain attacks are blocked unless you explicitly bump the version. Treat module versions like npm dependencies: pin and review every upgrade.
What should variables.tf, outputs.tf, and main.tf contain?
How to explain to students
By default, Terraform stores state in terraform.tfstate in your local folder. This is fine for solo learning. It is fatal for teams: two engineers running apply at the same time can corrupt state. Worse, state files contain secrets (DB passwords, API keys) in plain text. They must never go to git.
The standard fix: store state in S3 with DynamoDB locking. S3 holds the file (versioned + encrypted); DynamoDB holds a lock so only one apply can run at a time. Five extra lines of HCL.
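Those extra lines might look like the sketch below. The bucket and table names are placeholders, and both must already exist before terraform init is run.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-tf-state"             # hypothetical, pre-existing bucket
    key            = "myapp/prod/terraform.tfstate" # path of this config's state file
    region         = "eu-west-1"                    # assumed region
    dynamodb_table = "example-tf-locks"             # hypothetical lock table
    encrypt        = true
  }
}
```

Backend blocks cannot reference variables or locals, so these values are literals (or supplied at init time with -backend-config).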
Add *.tfstate* to .gitignore.
🎯 Practice Questions
Two engineers run terraform apply on the same config at the same time. Without a backend lock, what corrupted-state scenarios result?
Common patterns for bootstrapping the state backend:
1. Manual bootstrap via aws s3 mb + aws dynamodb create-table once, run by hand. Document the steps in the repo's README. The bucket is then "outside" Terraform forever, which is fine because it is stable.
2. A small separate Terraform config (bootstrap/) that uses local state and creates the S3 bucket + DynamoDB table. Run once; commit the local tfstate (it contains no secrets at this point, just bucket names).
3. Terragrunt auto-creates the backend bucket if missing.
Whatever you pick, document it loudly so the next engineer doesn't run terraform destroy on the bootstrap.
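Pattern 2 could be sketched roughly as follows. Resource names, the bucket name, and the region are assumptions; the LockID string hash key is what the S3 backend expects for DynamoDB locking.

```hcl
# bootstrap/main.tf: run once with local state (names are hypothetical)
provider "aws" {
  region = "eu-west-1" # assumed region
}

resource "aws_s3_bucket" "state" {
  bucket = "example-tf-state"

  lifecycle {
    prevent_destroy = true # guards against an accidental terraform destroy
  }
}

# Versioning lets you recover an earlier state file.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Lock table: the S3 backend writes one item per held lock.
resource "aws_dynamodb_table" "locks" {
  name         = "example-tf-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

The prevent_destroy flag is one way to make "document it loudly" enforceable: a destroy on this config fails until someone deliberately removes the flag.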
terraform force-unlock <LOCK_ID>, but verify nothing is actually running first.
key attribute (e.g. myapp/prod/terraform.tfstate vs myapp/dev/terraform.tfstate)? What pattern does this enable?
How to explain to students
Terraform errors are notoriously specific ("InvalidParameterCombination: ... requires the InstanceType to be a member of the t3 family"). AI is great at translating these into the actual missing argument. It's also great at scaffolding modules: "Write me a module for an S3 bucket + CloudFront + Route 53 alias for a given domain" gets you 90% of the way.
The trap: AI sometimes uses deprecated resource arguments (e.g. an inline acl argument on aws_s3_bucket directly; that setting moved to the separate aws_s3_bucket_acl resource in v4 of the AWS provider). Always run terraform validate + terraform plan, and check the registry docs for any unfamiliar argument.
🎯 Practice Questions
AI generates an aws_s3_bucket with an inline acl = "private" argument. Why might that fail in AWS provider v5+, and what's the modern pattern?
Version 4 of the AWS provider split aws_s3_bucket into the bucket itself + many separate resources for individual settings: aws_s3_bucket_acl, aws_s3_bucket_versioning, aws_s3_bucket_public_access_block, etc.
AI trained on older docs sometimes generates the inline form. terraform validate may pass but apply fails or, worse, silently ignores the setting.
Modern pattern:
resource "aws_s3_bucket" "this" {
  bucket = "..."
}
resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id
  versioning_configuration {
    status = "Enabled"
  }
}
Always cross-check AI output against registry.terraform.io/providers/hashicorp/aws for the current resource shape.
Why is pasting a terraform.tfstate file to AI a security mistake? What's safe to paste instead?
How to explain to students
In the AWS module, students built a portfolio site by clicking through S3, CloudFront, ACM, Route 53. Now they rebuild it as Terraform — same architecture, but every resource declared in HCL. The artefact: a terraform apply that produces a working https://<name>.com portfolio in 3 minutes, and a terraform destroy that removes it cleanly.
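One detail of this architecture deserves a sketch: CloudFront only accepts ACM certificates issued in us-east-1, so a site hosted elsewhere needs a second, aliased provider. The home region and domain below are assumptions for illustration.

```hcl
# Default provider: where the bucket and most resources live.
provider "aws" {
  region = "eu-west-1" # assumed home region
}

# Aliased provider: used only for the CloudFront certificate.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

resource "aws_acm_certificate" "site" {
  provider          = aws.us_east_1 # select the aliased provider
  domain_name       = "example.com" # hypothetical domain
  validation_method = "DNS"

  lifecycle {
    # Issue the replacement before removing the old certificate, so the
    # distribution is never left pointing at a deleted cert.
    create_before_destroy = true
  }
}
```

Resources without a provider argument keep using the default provider, so only the certificate is created in us-east-1.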
provider "aws" + an alias = "us_east_1" for the cert. One config, two regions.
Sample quiz questions (interactive)
Fill-in-the-command
Import the existing bucket old-data into Terraform state as aws_s3_bucket.legacy.
Assignment
📋 Assignment Requirements
- Take the AWS-module portfolio site (S3 + CloudFront + Route 53) and recreate it entirely in Terraform
- State must live in S3 + DynamoDB locking — local state is an automatic fail
- Wrap the architecture in a reusable module at ./modules/static-site
- Module variables: domain, bucket_name, tags. Module outputs: distribution_id, bucket_id
- Use data "aws_route53_zone" to look up the existing hosted zone
- Pin AWS provider version (~> 5.0) and Terraform itself (required_version)
- Include terraform fmt -check + terraform validate in a CI job
- Commit a README.md with: how to bootstrap state, how to apply, the architecture diagram
- Bonus: Use tfsec or checkov to scan and fix at least one finding
- Bonus: Pass environment as a variable and call the module twice (dev + prod) with different domains
- Bonus: Use a community VPC module from the registry to spin up a VPC alongside
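For reference when grading, the version-pinning and hosted-zone requirements could look something like this. The minimum Terraform version and the zone name are assumptions, not part of the assignment.

```hcl
terraform {
  required_version = ">= 1.5.0" # assumed minimum; any explicit constraint satisfies the requirement

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # allows any 5.x release, never 6.0
    }
  }
}

# Read-only lookup of a hosted zone that already exists in the account.
data "aws_route53_zone" "site" {
  name = "example.com" # hypothetical zone name
}
```

The zone's ID is then available as data.aws_route53_zone.site.zone_id for the Route 53 alias record.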
create_before_destroy on cert, hard-coded bucket name (collides on second apply), local state.
terraform plan on every PR.