$ git push origin main # then watch the magic

CI/CD Pipelines
Instructor Guide

Automate test, build, and deploy with GitHub Actions — from first workflow to OIDC-secured AWS deploys

01
Why CI/CD Matters in DevOps
From "deploy on Friday and pray" to "ship 30 times a day with confidence"

How to explain to students

Ask the room: "Imagine you copy-paste a deploy on Friday at 5pm. Something breaks. It's now 7pm and you don't remember which file you edited. What happens next?" That's the world without CI/CD. CI = Continuous Integration: every push runs the tests automatically. CD = Continuous Delivery / Deployment: every green build can be pushed to production with one click — or zero, if you trust your tests.

The numbers come from the 2021 Accelerate State of DevOps report (the DORA research program behind the book Accelerate): elite teams deploy 973× more frequently than low performers, with a 3× lower change failure rate. CI/CD is how that math works.

bash — the lifecycle
# Day 1: Manual deploys (the bad path)
$ ssh prod "cd /app && git pull && pm2 restart all"  # 😱

# Day 30: With CI/CD
$ git push origin main
→ tests run on every push (CI)
→ image built + pushed to ECR
→ staging auto-deployed
→ production deploy: one click on GitHub (CD)
→ rollback in < 30 seconds if something breaks

# Same engineer, same code — different outcomes
🚀
Ship faster
From weekly to multiple-times-daily deploys.
🛡️
Catch bugs early
Tests run on every commit, not just before release.
↩️
Easy rollback
Every deploy is a tagged artifact — revert with one click.
📜
Auditable
Every deploy logged: who, when, what commit, with what tests passing.

🎯 Practice Questions

Q1.
In one sentence each, define CI, CD (Continuous Delivery), and CD (Continuous Deployment). Where do they differ?
Answer:
CI (Continuous Integration): every push to the repo automatically builds the code and runs the test suite. Goal — never let main rot.

CD (Continuous Delivery): every green build is automatically packaged into a deployable artifact (Docker image, JAR, etc.) and pushed to a staging environment. A human still clicks "deploy to production."

CD (Continuous Deployment): same as above, but production deploys are automatic too — no human in the loop. Requires very strong test coverage and observability. Most teams stop at Continuous Delivery.
Q2.
List three concrete things that go wrong with the "ssh + git pull + restart" deploy pattern that CI/CD prevents.
Q3.
A startup wants to "do CI/CD" but has no automated tests. Why is that the wrong order? What's the first investment they should make instead?
💡 CI without tests is just a glorified npm install.
02
Pipeline Stages — Build, Test, Deploy
The anatomy of every CI/CD pipeline, regardless of which tool runs it

How to explain to students

Every CI/CD pipeline is a directed sequence of stages. Earlier stages are cheap and fast (linting); later stages are slow and expensive (deploying to production). Fail-fast: don't waste 20 minutes building a Docker image when npm run lint would have caught the bug in 5 seconds.

The canonical sequence: checkout → install deps → lint → test → build → scan → publish artifact → deploy. Some pipelines add quality gates between stages — for example, "block production deploy unless code coverage ≥ 80%."

.github/workflows/ci.yml — minimal stages
name: CI
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npm run lint  # fast — fail early
      - run: npm test -- --coverage
      - run: npm run build
      - uses: actions/upload-artifact@v4
        with: { name: dist, path: dist/ }

# Visualization in the GitHub UI:
checkout (1s)
setup-node (3s)
npm ci (12s)
lint (4s)
test (28s)
build (18s)
upload-artifact (2s)
checkout · setup-node · npm ci · lint · test · build · deploy
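
The quality gate mentioned above ("block unless code coverage ≥ 80%") can be sketched as part of the test step. This is a minimal sketch that assumes the project tests with Jest; the 80% figure and the choice of line coverage are illustrative, and the same threshold could instead live in the Jest config under coverageThreshold.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      # Assumption: Jest is the test runner. Jest exits non-zero when global
      # line coverage is under 80%, which fails this step and skips the build.
      - run: |
          npx jest --coverage --coverageThreshold='{"global":{"lines":80}}'
      - run: npm run build
```

Placing the gate in the test step keeps it ahead of the slower build and deploy stages, in line with the fail-fast ordering.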

🎯 Practice Questions

Q1.
Why does npm ci appear in CI pipelines instead of npm install? What's the difference?
Answer:
npm install resolves dependencies from package.json — and updates package-lock.json if it can. npm ci ("clean install") refuses to touch the lock file: it does an exact, reproducible install based on package-lock.json alone. If the lock file is out of sync, ci fails the build.

Why CI prefers ci: reproducibility — every build gets the exact same dependency tree. install can silently upgrade a transitive dep and ship a different bundle than yesterday.
Q2.
Order these stages from cheap+fast to slow+expensive: npm test, npm run lint, docker build, terraform plan, aws ecs update-service. Why does this order matter?
Q3.
A pipeline runs lint, test, and build in sequence (one job). Each takes 30s. Convert it to three parallel jobs so total wall-clock time approaches the slowest job (about 30s plus per-job setup) instead of 90s. Sketch the YAML structure.
💡 Multiple jobs at the top level run in parallel by default.
Q4.
Add a "code-coverage gate" to the pipeline: fail the build if coverage drops below 80%. Where in the pipeline does it go, and what's one tool that enforces it?
03
Deployment Strategies — Blue-Green, Canary, Rolling
How to deploy without downtime — and roll back in seconds when things break

How to explain to students

There are three big strategies that show up in every interview. Use the traffic light analogy: imagine a busy intersection with old + new traffic patterns.

Rolling: replace one server at a time. Default for ECS and Kubernetes; also available on Elastic Beanstalk. Cheap. Slow rollback.
Blue-Green: spin up a full second environment ("green"), test it, then switch the load balancer over. Instant rollback. Doubles infra cost during deploy.
Canary: route 1% → 10% → 50% → 100% of traffic to the new version, watching error rates. Best for risky changes. Most complex.

strategy comparison
# ── ROLLING ────────────────────────────────────
[v1][v1][v1][v1] → [v2][v1][v1][v1] → [v2][v2][v1][v1] → [v2][v2][v2][v2]
# Pros: cheap, default in K8s/ECS Cons: slow rollback, mixed versions during deploy

# ── BLUE-GREEN ─────────────────────────────────
BLUE [v1][v1][v1][v1] ← 100% traffic
GREEN [v2][v2][v2][v2] ← warmed up, smoke-tested
↓ flip load balancer
BLUE [v1][v1][v1][v1] ← still hot, instant rollback
GREEN [v2][v2][v2][v2] ← 100% traffic
# Pros: instant rollback, no mixed versions Cons: 2× infra cost during cutover

# ── CANARY ─────────────────────────────────────
v1 ████████████████████ 99%
v2 ▓ 1% ← watch error rate, latency
v2 ▓▓▓▓ 10% ← 5min later, looks good
v2 ▓▓▓▓▓▓▓▓▓▓ 50%
v2 ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 100% ← v1 retired
# Pros: lowest blast radius Cons: complex traffic shaping, longer deploy
🟢
Rolling = default
Free, supported everywhere. Use unless you have a reason not to.
🔵
Blue-green = critical apps
When seconds of downtime cost more than 2× infra for an hour.
🐤
Canary = risky changes
Major refactors, schema migrations, ML model rollouts.
🩺
Health checks are non-optional
Without a healthcheck, a "successful deploy" can serve 500s for an hour.
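
To make the rolling strategy and the health-check point concrete, here is a minimal sketch of a Kubernetes Deployment. The app name, image, port, and /healthz path are hypothetical; the strategy and readinessProbe fields are the parts that matter.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                      # hypothetical name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                  # at most one extra pod during the rollout
      maxUnavailable: 0            # never drop below 4 ready pods
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:v2   # hypothetical image
          ports:
            - containerPort: 8080
          readinessProbe:          # a new pod receives traffic only after this passes
            httpGet:
              path: /healthz       # hypothetical health endpoint
              port: 8080
            periodSeconds: 5
            failureThreshold: 3
```

With maxUnavailable set to 0, a new pod that never becomes ready stalls the rollout instead of replacing healthy v1 pods.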

🎯 Practice Questions

Q1.
A bank's payment service can't tolerate any in-flight version mixing during deploys. Which strategy fits, and why are rolling deploys problematic here?
Q2.
Your team wants canary deploys on AWS. Beyond the application code, what infrastructure piece must exist to shape traffic 1% → 10% → 50% → 100%?
Answer:
A traffic-shaping load balancer or service mesh. On AWS, the common options are:

1. ALB weighted target groups (CodeDeploy can drive this).
2. App Mesh / Istio / Linkerd for fine-grained percentage routing.
3. API Gateway canary releases for serverless/REST APIs.

Without one of these, you can only do "all-or-nothing" deploys. The percentage-based shifting is the load balancer's job, not the application's.
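
As one illustration of option 1, the weighted split can be expressed on an ALB listener. This is a sketch of a CloudFormation fragment under assumptions: the logical IDs AppLoadBalancer, BlueTargetGroup, and GreenTargetGroup are hypothetical resources defined elsewhere in the same template.

```yaml
Resources:
  AppListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn: !Ref AppLoadBalancer        # hypothetical ALB
      Port: 80
      Protocol: HTTP
      DefaultActions:
        - Type: forward
          ForwardConfig:
            TargetGroups:
              - TargetGroupArn: !Ref BlueTargetGroup    # v1
                Weight: 99
              - TargetGroupArn: !Ref GreenTargetGroup   # v2 canary
                Weight: 1
```

Moving from 1% to 10% to 50% to 100% then means updating the two Weight values, either by hand or through a tool such as CodeDeploy.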
Q3.
Why is "instant rollback" the killer feature of blue-green? Sketch what rolling-deploy rollback looks like in comparison.
Q4.
A canary at 5% reports 2× the error rate of v1. List three things you check before deciding to abort vs. continue.
💡 Sample size, error type, latency.
04
GitHub Actions Basics — Workflows, Jobs, Steps & Triggers
The mental model: workflow → jobs → steps. Triggered by push, PR, cron, or manual dispatch.

How to explain to students

A workflow is one YAML file in .github/workflows/. It contains one or more jobs. Each job runs on a fresh runner (a clean VM). Jobs run in parallel by default; use needs: to make one wait for another. Inside a job, steps run sequentially.

The trigger is the on: key. Common options: push, pull_request, schedule (cron), workflow_dispatch (manual button), release. You can also filter by branch (branches: [main]) or path (paths: ['src/**']).

.github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main, develop]
  pull_request:
    paths: ['src/**', 'package*.json']
  schedule:
    - cron: '0 2 * * *'  # nightly 2am UTC
  workflow_dispatch:  # manual "Run workflow" button

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run lint

  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test

  build:
    needs: [lint, test]  # waits for both
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build
workflow · job · step · on: push · on: pull_request · schedule · workflow_dispatch · needs:

🎯 Practice Questions

Q1.
Two jobs are defined at the top level with no needs: between them. Do they run in parallel or sequentially? On the same runner or different runners?
Answer:
Parallel, on different runners. Each job spawns its own fresh ubuntu-latest VM (or whatever the runs-on says). They share no filesystem state by default — a file written in job A is invisible to job B unless you upload it as an artifact.

If you need a strict order, add needs: [job-a] to the dependent job. If you need to share files, use actions/upload-artifact + actions/download-artifact.
Q2.
Write the on: block for a workflow that runs on every push to main, every pull request targeting main, and at 3am UTC daily.
Q3.
Add path filtering so the workflow only runs when files under src/ or package*.json change. Why does this matter for monorepos?
Q4.
A teammate uses uses: actions/checkout@main. You convince them to switch to @v4. What's the security risk of pinning to a moving branch?
💡 Supply-chain attack — if the action's repo is compromised, you'll silently pull the malicious code.
05
Matrix Builds — Test Across Versions, OSes, & Browsers
One YAML block, dozens of parallel test runs

How to explain to students

Matrix builds are how you say "run this job for every combination of these inputs". Most common: testing a Node.js library against Node 18, 20, and 22 simultaneously. Or a Python tool on Ubuntu, macOS, and Windows. GitHub spins up a runner per combination — all in parallel.

.github/workflows/test.yml — matrix
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false  # let all combos finish even if one fails
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        node: [18, 20, 22]
        exclude:
          - { os: windows-latest, node: 18 }  # skip combos
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}  # block style: an unquoted ${{ }} inside { } is invalid YAML
      - run: npm ci && npm test

# Result: 8 parallel runs (3 OS × 3 node, minus 1 excluded)
✓ test (ubuntu-latest, 18) ✓ test (macos-latest, 18)
✓ test (ubuntu-latest, 20) ✓ test (macos-latest, 20) ✓ test (windows-latest, 20)
✓ test (ubuntu-latest, 22) ✓ test (macos-latest, 22) ✓ test (windows-latest, 22)
All in parallel
8 combos = 8 runners. Total time ≈ slowest single run.
🚫
exclude / include
Skip impossible combos or add specific extras to the matrix.
⏸️
fail-fast: false
Let all combos finish — gives you the full picture, not just the first failure.
💸
Mind the minutes
macOS runners cost 10× Linux. Don't matrix everything — be intentional.
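
When only a handful of combinations matter, an include-only matrix lists them explicitly instead of carving them out of a full cross product. A minimal sketch, with the three combinations chosen arbitrarily for illustration:

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        include:                   # exactly these three runs, no cross product
          - os: ubuntu-latest
            node: 18
          - os: ubuntu-latest
            node: 22
          - os: windows-latest
            node: 22
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm ci && npm test
```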

🎯 Practice Questions

Q1.
Write a matrix that tests against Python 3.10, 3.11, and 3.12 on Ubuntu only. Include the matrix.python reference in actions/setup-python@v5.
Q2.
A 3×3×2 matrix runs 18 jobs but you only need 6 specific combinations. Which is more idiomatic — using exclude heavily, or rewriting with include?
💡 Read the GitHub docs for include — it lets you specify exact combos.
Q3.
Why is fail-fast: false usually the right choice for a matrix that tests cross-OS compatibility?
Answer:
With fail-fast: true (the default), the first failing combination cancels all the others. You then know that one failed, but not whether the others would have passed or failed too.

For cross-OS / cross-version compatibility tests, you want the full picture: "Windows + Node 22 fails, but everything else passes" tells you it's a Windows-specific issue. Setting fail-fast: false lets every combo run to completion so you see all the data at once.

For deploy pipelines (where you only need one green run), the default fail-fast: true is correct — there's no value in burning compute on combos you'll cancel anyway.
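
For the Python matrix in Q1, one detail is worth showing on screen: version numbers should be quoted, because YAML reads an unquoted 3.10 as the number 3.1. A minimal sketch, assuming the project has a requirements.txt and uses pytest:

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python: ['3.10', '3.11', '3.12']   # quoted so 3.10 stays a string
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python }}
      - run: python -m pip install -r requirements.txt   # assumes requirements.txt exists
      - run: python -m pytest                            # assumes pytest is the test runner
```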
06
Secrets, Environments & Reusable Workflows
How to keep credentials out of YAML and reuse logic across repos

How to explain to students

Never paste secrets into YAML. GitHub provides three layers: repo-level secrets, environment-level secrets (with optional approval gates), and org-level secrets (shared across all or selected repos). Always use ${{ secrets.NAME }} in workflows.

Environments are how you wire approval gates: a job targeting the production environment can require a human reviewer click "approve" before it runs. Reusable workflows let you DRY up shared logic — one "deploy.yml" reused by 10 microservices.

workflow with environments + secrets
jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging  # pulls staging-scoped secrets
    steps:
      - run: aws s3 sync ./dist s3://staging-bucket/
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

  deploy-prod:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://app.example.com
    steps:
      - run: aws ecs update-service ...  # waits for human approval

Practical: A reusable workflow that 10 repos can call

.github/workflows/reusable-deploy.yml (in shared repo)
on:
  workflow_call:  # this is what makes it reusable
    inputs:
      service:
        required: true
        type: string
    secrets:
      aws-key: { required: true }
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - run: aws ecs update-service --service ${{ inputs.service }}

# Caller workflow in a different repo
jobs:
  deploy-api:
    uses: my-org/shared-workflows/.github/workflows/reusable-deploy.yml@v1
    with: { service: api }
    secrets:
      aws-key: ${{ secrets.AWS_KEY }}  # block style: an unquoted ${{ }} inside { } is invalid YAML

🎯 Practice Questions

Q1.
An engineer commits AWS_ACCESS_KEY=AKIA... directly into a workflow YAML and pushes. They realise the mistake 30 seconds later and force-push to remove it. Are they safe? Why or why not?
Answer:
No, not safe. Even after a force-push, the original commit may still exist:

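1. On GitHub itself: the orphaned commit can remain reachable by its SHA in cached views (and from any pull request that references it) until GitHub garbage-collects it.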
2. In any clone any teammate or CI runner pulled before the rewrite.
3. In cached search-engine results if the repo is public.
4. Bots scrape commits in real time — automated AWS-key scanners detect leaks within minutes and abuse them.

Real fix: immediately rotate the credential in AWS (delete the IAM access key), generate a new one, store it in GitHub Secrets, and review CloudTrail for any unauthorized usage. Then add the secret-scanning push-protection rule so it can't happen again.
Q2.
Why use a GitHub environment ("production") rather than a regular repo secret for production credentials? What capabilities does an environment unlock?
Q3.
Sketch a reusable workflow your team would call from 5 different microservice repos. What inputs / secrets would it accept?
Q4.
A secret printed via echo $MY_SECRET in a step appears as *** in the logs. Does that mean it's safe? What if you base64-encoded it first and printed?
💡 GitHub masks the literal string only.
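
The hint can be demonstrated with the add-mask workflow command, which registers an extra string for masking. This is a sketch under assumptions: MY_SECRET is a hypothetical secret name, and base64 -w0 relies on GNU coreutils as found on ubuntu-latest.

```yaml
jobs:
  demo:
    runs-on: ubuntu-latest
    steps:
      - name: Mask a derived value before it can reach the logs
        env:
          MY_SECRET: ${{ secrets.MY_SECRET }}   # hypothetical secret name
        run: |
          # The encoded form is a different string, so GitHub would not mask it on its own.
          encoded=$(printf '%s' "$MY_SECRET" | base64 -w0)
          # Register the derived string as a masked value for the rest of the job.
          echo "::add-mask::$encoded"
          echo "encoded value: $encoded"
```

Masking is a safety net rather than a guarantee; the more reliable habit is not to print secrets or anything derived from them.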
07
OIDC Federation — AWS Auth Without Long-Lived Keys
The modern, secure way to deploy from GitHub Actions to AWS — no AWS_SECRET_ACCESS_KEY in sight

How to explain to students

The old way: create an IAM user, generate an access key, paste it into secrets.AWS_SECRET_ACCESS_KEY, hope nobody leaks it. The key never expires, and if it leaks it stays valid until someone notices and rotates it, while automated scanners can start abusing it within minutes.

The new way: OpenID Connect federation. GitHub mints a short-lived JWT for each workflow run; AWS verifies that the JWT was signed by GitHub and matches a trusted repo+branch; AWS then returns temporary STS credentials (one hour by default with aws-actions/configure-aws-credentials, configurable down to 15 minutes). No long-lived secrets exist anywhere. This is the AWS-recommended pattern as of 2023+.

aws — set up the OIDC trust (one time)
# 1. Register GitHub as an OIDC provider in your AWS account
$ aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1

# 2. Create an IAM role that trusts GitHub for a SPECIFIC repo + branch
# Save the following as trust-policy.json (JSON itself does not allow comments)
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com" },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
      },
      "StringLike": {
        "token.actions.githubusercontent.com:sub": "repo:my-org/my-app:ref:refs/heads/main"
      }
    }
  }]
}

$ aws iam create-role --role-name gh-deployer \
  --assume-role-policy-document file://trust-policy.json
.github/workflows/deploy.yml — using OIDC
permissions:
  id-token: write  # required to mint the JWT
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gh-deployer
          aws-region: eu-west-1

      - run: aws s3 ls  # now authenticated, no AWS_* secrets needed
🔐
No long-lived keys
Short-lived STS credentials minted per run, so there is no stored key to leak.
🎯
Repo + branch scoped
Trust policy locks the role to one specific repo and branch.
📜
Auditable in CloudTrail
Every AssumeRoleWithWebIdentity call is logged with the token's subject (repo and ref or environment), so each deploy traces back to a repository.
🆔
id-token: write
Required permission. Forget it and the auth fails with no helpful error.

🎯 Practice Questions

Q1.
A workflow with OIDC fails with "Could not load credentials from any providers." The role ARN is correct. What's the most likely missing line?
Answer:
The permissions: id-token: write block at the workflow or job level. Without it, GitHub will not mint the OIDC JWT for the runner, and aws-actions/configure-aws-credentials has no token to exchange.

Add at the top of the workflow:
permissions:
  id-token: write
  contents: read


Other common causes: trust policy sub condition doesn't match the repo / branch / environment that's actually running.
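
The sub mismatch is easiest to see side by side. In this sketch the repo name and role ARN reuse the examples from this section; the comments show the subject GitHub puts in the token for each job shape.

```yaml
permissions:
  id-token: write
  contents: read

jobs:
  build:
    runs-on: ubuntu-latest
    # No environment. On a push to main the token's sub is:
    #   repo:my-org/my-app:ref:refs/heads/main
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gh-deployer
          aws-region: eu-west-1

  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment: production
    # With an environment set, the sub becomes:
    #   repo:my-org/my-app:environment:production
    # A trust policy that only allows ...:ref:refs/heads/main rejects this job.
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gh-deployer
          aws-region: eu-west-1
```

One way to handle both jobs is to list both subjects in the trust policy's sub condition, or to give the environment-gated job its own role.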
Q2.
Compare the blast radius of a leaked long-lived AWS access key vs a leaked GitHub OIDC token. Which costs more to rotate, and why?
Q3.
Write the StringLike condition that lets the role be assumed only from PRs targeting main in my-org/my-app.
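💡 PRs use a different sub claim than branch pushes: by default it is repo:my-org/my-app:pull_request. It does not carry the target branch, so the "targeting main" part has to come from the workflow's pull_request branches filter.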
Q4.
Why is OIDC federation more secure than even a properly-scoped IAM user with a rotated key? What attack does it prevent that rotation doesn't?
08
Using AI to Write & Debug Workflows
Workflow YAML is the perfect AI task — finicky syntax, well-known patterns, lots of training data

How to explain to students

YAML indentation is a frequent source of CI failures. AI is great at generating workflow scaffolds and debugging the cryptic errors, but it can also confidently invent action names that don't exist. Verify by clicking the action's GitHub link: if it 404s, the AI hallucinated.

bash — AI prompt examples
# ❌ Weak prompt
"Write a GitHub Actions workflow"

# ✅ Strong prompt
"Write a GitHub Actions workflow that:
- Runs on push to main and on every PR targeting main
- Has 3 parallel jobs: lint (eslint), test (jest --coverage), typecheck (tsc --noEmit)
- A 4th job that builds a Docker image, depends on all 3 above
- Uses OIDC to push the image to ECR (no long-lived AWS keys)
- Tags the image with the git SHA
- Pin every action to a specific @vN tag (no @main)
After the YAML, list 3 things that will break this workflow in production."

# Debug-by-AI: paste the EXACT error
Error: Resource not accessible by integration
→ Prompt: "GitHub Actions error 'Resource not accessible by integration'
when calling the GitHub API. What permissions key am I missing in my workflow?"
ChatGPT · Claude · GitHub Copilot · Verify action exists · Pin versions
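
For the "Resource not accessible by integration" example, the fix usually lands in the permissions block, because the default GITHUB_TOKEN lacks a scope the API call needs. Which scope depends on the call; the sketch below assumes the workflow comments on a pull request, and the comment text is a placeholder.

```yaml
permissions:
  contents: read
  pull-requests: write        # assumption: the failing call writes to a pull request

jobs:
  comment:
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    steps:
      - run: gh pr comment "$PR_NUMBER" --body "CI passed"
        env:
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
```

A good check on an AI-suggested fix is whether it names the narrowest scope for the failing call rather than granting write access across the board.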

🎯 Practice Questions

Q1.
An AI-generated workflow uses uses: super-actions/awesome-deployer@latest. List two red flags before you merge.
Answer:
1. The action might not exist. AI sometimes invents plausible action names. Click the link github.com/super-actions/awesome-deployer — if 404, it's a hallucination.
2. @latest is not a valid GitHub Actions ref. Actions are pinned by branch, tag, or SHA — there is no automatic "@latest". Even if it resolved to "default branch", that's a supply-chain risk: anyone who compromises the action's repo gets RCE in your CI.

Fix: verify the action exists, read its README, then pin to a specific version tag (@v3) or — for high-security CI — a full commit SHA (@a1b2c3d...) so even tag retargeting can't compromise you.
Q2.
Take a 10-line workflow with broken indentation that fails with "mapping values are not allowed here". Write the AI prompt that would identify the exact line and fix.
Q3.
Why should you never paste your secrets.AWS_SECRET_ACCESS_KEY into an AI tool, even when asking for help?
09
Project: Full CI Pipeline — Test → Build → Deploy to AWS via GHA
The capstone — combine everything into one production-grade workflow

How to explain to students

Walk through this on screen, then have students recreate it on their own Node app. The workflow combines: matrix testing, Docker build, OIDC AWS auth, ECR push, ECS deploy, environment-gated production approval. This is what a real team's main.yml looks like.

.github/workflows/main.yml — full pipeline
name: Build and Deploy
on:
  push: { branches: [main] }
  pull_request: { branches: [main] }

permissions:
  id-token: write
  contents: read

jobs:

  # ── 1. Test (parallel matrix) ───────────────
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix: { node: [18, 20, 22] }
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}  # block style: an unquoted ${{ }} inside { } is invalid YAML
          cache: npm
      - run: npm ci && npm run lint && npm test

  # ── 2. Build + push image to ECR (only on main) ─
  build:
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    outputs:
      tag: ${{ steps.meta.outputs.tag }}
    steps:
      - uses: actions/checkout@v4
      - id: meta
        run: echo "tag=${GITHUB_SHA::7}" >> $GITHUB_OUTPUT
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123:role/gh-deployer
          aws-region: eu-west-1
      - uses: aws-actions/amazon-ecr-login@v2
      # Build, tag, and push the image under the short SHA.
      # The script body must be indented deeper than the run: key.
      - run: |
          docker build -t myapp:${{ steps.meta.outputs.tag }} .
          docker tag myapp:${{ steps.meta.outputs.tag }} \
            123.dkr.ecr.eu-west-1.amazonaws.com/myapp:${{ steps.meta.outputs.tag }}
          docker push 123.dkr.ecr.eu-west-1.amazonaws.com/myapp:${{ steps.meta.outputs.tag }}

  # ── 3. Deploy to production (gated) ──────────
  deploy:
    needs: build
    runs-on: ubuntu-latest
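    # Requires approval. With an environment set, the OIDC sub is
    # repo:ORG/REPO:environment:production, so the role's trust policy must allow it.
    environment: production
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123:role/gh-deployer
          aws-region: eu-west-1
      # Restarts tasks from the current task definition. To run the new SHA tag,
      # register a task definition revision that references it (not shown).
      - run: |
          aws ecs update-service \
            --cluster prod --service myapp \
            --force-new-deployment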
🔁
Matrix on test only
Build + deploy run once — test runs N× across versions.
🚦
if: ref == main
PRs run tests but don't deploy. Only main triggers the build/deploy chain.
🔐
OIDC end-to-end
Zero long-lived AWS credentials in the workflow.
🛂
Production approval
environment: production requires a human click before deploy.
10
Quiz: GitHub Actions YAML + Trigger Events
5 MCQs + 2 fill-in-the-command questions

Sample quiz questions (interactive)

Q1. Two top-level jobs have no needs: between them. How do they execute?
A. In parallel, on separate runners
B. Sequentially, on the same runner
C. Sequentially, on separate runners
D. Order is undefined
Q2. Which permission is required for OIDC authentication to AWS?
A. contents: write
B. id-token: write
C. packages: write
D. actions: write
Q3. A canary deployment routes 5% of traffic to v2 and shows 2× error rate. Best first action?
A. Push to 100%: a bigger sample is better
B. Ignore and continue rollout
C. Hold the canary, inspect error type and sample size before deciding
D. Roll back the entire infrastructure
Q4. Of rolling, blue-green, and canary, which strategy has the lowest deploy-time infra cost?
A. Rolling
B. Blue-green
C. Canary
D. They're identical
Q5. Why npm ci over npm install in CI?
A. It's faster only, with no other difference
B. It's reproducible: fails if package-lock is out of sync, never mutates it
C. It installs devDependencies only
D. It runs npm audit automatically

Fill-in-the-command

Fill 1: Cron expression for "every weekday (Mon–Fri) at 9:00 AM UTC".
Fill 2: The full on: trigger that fires on every push to main and on the manual "Run workflow" button.
11
Assignment: Add a Test + Build Workflow to an Existing Repo
Take any of your earlier projects and bolt on a real CI pipeline

How to explain to students

Frame as a hiring task: "Pick any of your existing GitHub repos. Add a CI workflow that runs lint + test + build on every PR. Add a green check to the README. You have a weekend." This is a common DevOps interview prompt for junior roles.

📋 Assignment Requirements

  • Pick any existing repo (yours or a fork). Must have at least 3 source files and a test command.
  • Create .github/workflows/ci.yml that triggers on push and pull_request to main
  • 3 parallel jobs: lint, test, build (use needs: only where required)
  • Use a matrix to test against at least 2 versions of your runtime (Node 18+20, Python 3.11+3.12, etc.)
  • Cache dependencies via actions/setup-node (or equivalent) to keep CI under 2 minutes
  • Pin every action to a specific @vN tag — no @main or @latest
  • Add a status badge to your README that links to the workflow runs
  • Bonus: A 4th job that builds (but does not push) a Docker image, only on push to main
  • Bonus: Add Trivy scanning for the Docker image and fail on HIGH/CRITICAL CVEs
  • Bonus: Configure branch protection so PRs cannot merge without all 3 jobs passing
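
For the first bonus item, the push-to-main condition is the part students tend to get wrong. A minimal sketch of the extra job, assuming a Dockerfile at the repo root, a hypothetical image name myapp, and lint, test, and build jobs defined elsewhere in the same workflow:

```yaml
jobs:
  # lint, test, and build jobs omitted here
  docker:
    # Runs only for pushes to main; pull_request runs skip this job.
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    needs: [lint, test, build]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Builds on the runner only; nothing is pushed to a registry.
      - run: docker build -t myapp:${{ github.sha }} .
```

Checking both the event name and the ref matters because a pull_request run reports a refs/pull/... ref, so either check alone can read as correct while behaving differently from what the student expects.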
expected README badge
[![CI](https://github.com/USER/REPO/actions/workflows/ci.yml/badge.svg)](https://github.com/USER/REPO/actions/workflows/ci.yml)

# In a passing PR:
✓ lint (8s)
✓ test (Node 20) (24s)
✓ test (Node 22) (22s)
✓ build (16s)
All checks have passed — Mergeable
📊
Grading rubric
Workflow runs: 25pts. Matrix correct: 20pts. Caching works: 15pts. Pinned actions: 15pts. Badge in README: 10pts. Code quality: 15pts.
🎯
Common mistakes
Forgot cache: npm, used @main, jobs run sequentially because of unnecessary needs:, badge URL wrong.
💡
Stretch goal
Add OIDC + an environment-gated production deploy step that's actually wired to a real AWS account.