Why CI/CD Matters in DevOps

From "deploy on Friday and pray" to "ship 30 times a day with confidence"

How to explain to students

Ask the room: "Imagine you copy-paste a deploy on Friday at 5pm. Something breaks. It's now 7pm and you don't remember which file you edited. What happens next?" That's the world without CI/CD. CI = Continuous Integration: every push runs the tests automatically. CD = Continuous Delivery / Deployment: every green build can be pushed to production with one click — or zero, if you trust your tests.

The numbers come from the 2021 Accelerate State of DevOps report (the DORA research program behind the seminal DevOps book Accelerate): elite teams deploy 973× more frequently than low performers, with a 3× lower change failure rate. CI/CD is how that math works.

bash — the lifecycle

# Day 1: Manual deploys (the bad path)

$ ssh prod "cd /app && git pull && pm2 restart all" # 😱

# Day 30: With CI/CD

$ git push origin main

→ tests run on every push (CI)

→ image built + pushed to ECR

→ staging auto-deployed

→ production deploy: one click on GitHub (CD)

→ rollback in < 30 seconds if something breaks

# Same engineer, same code — different outcomes

🚀

Ship faster

From weekly to multiple-times-daily deploys.

🛡️

Catch bugs early

Tests run on every commit, not just before release.

↩️

Easy rollback

Every deploy is a tagged artifact — revert with one click.

📜

Auditable

Every deploy logged: who, when, what commit, with what tests passing.

🎯 Practice Questions

Q1.

In one sentence each, define CI, CD (Continuous Delivery), and CD (Continuous Deployment). Where do they differ?

Answer

CI (Continuous Integration): every push to the repo automatically builds the code and runs the test suite. Goal — never let main rot.

CD (Continuous Delivery): every green build is automatically packaged into a deployable artifact (Docker image, JAR, etc.) and pushed to a staging environment. A human still clicks "deploy to production."

CD (Continuous Deployment): same as above, but production deploys are automatic too — no human in the loop. Requires very strong test coverage and observability. Most teams stop at Continuous Delivery.

Q2.

List three concrete things that go wrong with the "ssh + git pull + restart" deploy pattern that CI/CD prevents.

Q3.

A startup wants to "do CI/CD" but has no automated tests. Why is that the wrong order? What's the first investment they should make instead?

💡 CI without tests is just a glorified npm install.

02

Pipeline Stages — Build, Test, Deploy

The anatomy of every CI/CD pipeline, regardless of which tool runs it

How to explain to students

Every CI/CD pipeline is a directed sequence of stages. Earlier stages are cheap and fast (linting); later stages are slow and expensive (deploying to production). Fail-fast: don't waste 20 minutes building a Docker image when npm run lint would have caught the bug in 5 seconds.

The canonical sequence: checkout → install deps → lint → test → build → scan → publish artifact → deploy. Some pipelines add quality gates between stages — for example, "block production deploy unless code coverage ≥ 80%."
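To make fail-fast and quality gates concrete for students, here is a minimal Python sketch. It is not how any CI tool is implemented; `run_pipeline`, `coverage_gate`, and the stage names are hypothetical, chosen only to mirror the sequence described above.

```python
from typing import Callable

# Hypothetical stage list: (name, check) pairs ordered cheap-to-expensive.
Stage = tuple[str, Callable[[], bool]]


def run_pipeline(stages: list[Stage]) -> list[str]:
    """Run stages in order and stop at the first failure (fail-fast).

    Returns the names of the stages that actually ran.
    """
    ran: list[str] = []
    for name, check in stages:
        ran.append(name)
        if not check():
            break  # later, more expensive stages never start
    return ran


def coverage_gate(coverage_percent: float, minimum: float = 80.0) -> bool:
    """Quality gate: pass only if coverage meets the threshold."""
    return coverage_percent >= minimum


if __name__ == "__main__":
    stages: list[Stage] = [
        ("lint", lambda: False),  # pretend lint finds a problem
        ("test", lambda: True),
        ("coverage-gate", lambda: coverage_gate(76.5)),
        ("build", lambda: True),
    ]
    print(run_pipeline(stages))  # only the lint stage runs
```

The point to draw out: when the cheap stage fails, the expensive ones cost nothing, because they never start.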

.github/workflows/ci.yml — minimal stages

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npm run lint   # fast: fail early
      - run: npm test -- --coverage
      - run: npm run build
      - uses: actions/upload-artifact@v4
        with: { name: dist, path: dist/ }

# Visualization in the GitHub UI:

● checkout (1s)

● setup-node (3s)

● npm ci (12s)

● lint (4s)

● test (28s)

● build (18s)

● upload-artifact (2s)

🎯 Practice Questions

Q1.

Why does npm ci appear in CI pipelines instead of npm install? What's the difference?

Answer

npm install resolves dependencies from package.json — and updates package-lock.json if it can. npm ci ("clean install") refuses to touch the lock file: it does an exact, reproducible install based on package-lock.json alone. If the lock file is out of sync, ci fails the build.

Why CI prefers ci: reproducibility — every build gets the exact same dependency tree. install can silently upgrade a transitive dep and ship a different bundle than yesterday.

Q2.

Order these stages from cheap+fast to slow+expensive: npm test, npm run lint, docker build, terraform plan, aws ecs update-service. Why does this order matter?

Q3.

A pipeline runs lint, test, and build in sequence (one job). Each takes 30s. Convert it to parallel jobs to cut the wall-clock time from roughly 90s toward 30s (each job adds its own checkout and install). Sketch the YAML structure.

💡 Multiple jobs at the top level run in parallel by default.

Q4.

Add a "code-coverage gate" to the pipeline: fail the build if coverage drops below 80%. Where in the pipeline does it go, and what's one tool that enforces it?

03

Deployment Strategies — Blue-Green, Canary, Rolling

How to deploy without downtime — and roll back in seconds when things break

How to explain to students

There are three big strategies that show up in every interview. Use the traffic light analogy: imagine a busy intersection with old + new traffic patterns.

Rolling: replace one server at a time. Default for ECS, Kubernetes, Beanstalk. Cheap. Slow rollback.
Blue-Green: spin up a full second environment ("green"), test it, then switch the load balancer over. Instant rollback. Doubles infra cost during deploy.
Canary: route 1% → 10% → 50% → 100% of traffic to the new version, watching error rates. Best for risky changes. Most complex.
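A small Python sketch can help students see why rolling deploys pass through mixed-version states while a blue-green rollback is a single switch. The function names and the four-instance fleet are illustrative assumptions, not any orchestrator's API.

```python
def rolling_deploy(fleet: list[str], new_version: str) -> list[list[str]]:
    """Return every fleet state as instances are replaced one at a time."""
    states = [list(fleet)]
    current = list(fleet)
    for i in range(len(current)):
        current[i] = new_version  # replace one instance
        states.append(list(current))
    return states


def blue_green_switch(active: str) -> str:
    """Blue-green cutover (and rollback) is one pointer flip at the load balancer."""
    return "green" if active == "blue" else "blue"


if __name__ == "__main__":
    for state in rolling_deploy(["v1"] * 4, "v2"):
        print(state)
    print(blue_green_switch("blue"))
```

Every intermediate state of the rolling deploy serves both versions at once, and undoing it means walking the same steps backwards. The blue-green rollback is just calling the switch a second time.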

strategy comparison

# ── ROLLING ────────────────────────────────────

[v1][v1][v1][v1] → [v2][v1][v1][v1] → [v2][v2][v1][v1] → [v2][v2][v2][v2]

# Pros: cheap, default in K8s/ECS Cons: slow rollback, mixed versions during deploy

# ── BLUE-GREEN ─────────────────────────────────

BLUE [v1][v1][v1][v1] ← 100% traffic

GREEN [v2][v2][v2][v2] ← warmed up, smoke-tested

↓ flip load balancer

BLUE [v1][v1][v1][v1] ← still hot, instant rollback

GREEN [v2][v2][v2][v2] ← 100% traffic

# Pros: instant rollback, no mixed versions Cons: 2× infra cost during cutover

# ── CANARY ─────────────────────────────────────

v1 ████████████████████ 99%

v2 ▓ 1% ← watch error rate, latency

v2 ▓▓▓▓ 10% ← 5min later, looks good

v2 ▓▓▓▓▓▓▓▓▓▓ 50%

v2 ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 100% ← v1 retired

# Pros: lowest blast radius Cons: complex traffic shaping, longer deploy

🟢

Rolling = default

Free, supported everywhere. Use unless you have a reason not to.

🔵

Blue-green = critical apps

When seconds of downtime cost more than 2× infra for an hour.

🐤

Canary = risky changes

Major refactors, schema migrations, ML model rollouts.

🩺

Health checks are non-optional

Without a healthcheck, a "successful deploy" can serve 500s for an hour.

🎯 Practice Questions

Q1.

A bank's payment service can't tolerate any in-flight version mixing during deploys. Which strategy fits, and why are rolling deploys problematic here?

Q2.

Your team wants canary deploys on AWS. Beyond the application code, what infrastructure piece must exist to shape traffic 1% → 10% → 50% → 100%?

Answer

A traffic-shaping load balancer or service mesh. On AWS, the common options are:

1. ALB weighted target groups (CodeDeploy can drive this).
2. App Mesh / Istio / Linkerd for fine-grained percentage routing.
3. API Gateway canary releases for serverless/REST APIs.

Without one of these, you can only do "all-or-nothing" deploys. The percentage-based shifting is the load balancer's job, not the application's.
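To show what "percentage-based shifting" means mechanically, here is a hedged Python sketch of weighted routing. It is an illustration only, not how ALB, a service mesh, or API Gateway implement it; `pick_version` and the idea of hashing a request key (for example a user or session id) are assumptions made for the example.

```python
import hashlib


def pick_version(request_key: str, canary_percent: int) -> str:
    """Route roughly canary_percent% of keys to "v2", the rest to "v1".

    Hashing the key keeps each caller on the same version for a given
    percentage, and a caller on v2 at 10% is still on v2 at 50%.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "v2" if bucket < canary_percent else "v1"


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(10_000)]
    for percent in (1, 10, 50, 100):
        share = sum(pick_version(k, percent) == "v2" for k in keys) / len(keys)
        print(f"{percent:>3}% target -> {share:.1%} observed on v2")
```

Note that the routing decision lives entirely outside the application code, which is the point of the answer above.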

Q3.

Why is "instant rollback" the killer feature of blue-green? Sketch what rolling-deploy rollback looks like in comparison.

Q4.

A canary at 5% reports 2× the error rate of v1. List three things you check before deciding to abort vs. continue.

💡 Sample size, error type, latency.
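The "sample size" part of the hint can be demonstrated with a standard two-proportion z-test. This Python sketch is one possible way to reason about it, not a recommended alerting rule; the request counts are invented for illustration.

```python
import math


def two_proportion_z(errors_a: int, total_a: int, errors_b: int, total_b: int) -> float:
    """z-score for the error-rate difference between b (canary) and a (baseline)."""
    rate_a = errors_a / total_a
    rate_b = errors_b / total_b
    pooled = (errors_a + errors_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 0.0  # identical all-success or all-failure samples: nothing to compare
    return (rate_b - rate_a) / std_err


if __name__ == "__main__":
    # Baseline: 1% errors over 10,000 requests. Canary: 2% errors in both cases.
    small = two_proportion_z(100, 10_000, 2, 100)      # canary saw 100 requests
    large = two_proportion_z(100, 10_000, 100, 5_000)  # canary saw 5,000 requests
    print(f"small sample z = {small:.2f}, large sample z = {large:.2f}")
```

Both canaries show "2× the error rate", but only the large sample clears the usual 1.96 threshold for significance at the 5% level. With 100 requests, two errors could easily be noise.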

04

GitHub Actions Basics — Workflows, Jobs, Steps & Triggers

The mental model: workflow → jobs → steps. Triggered by push, PR, cron, or manual dispatch.

How to explain to students

A workflow is one YAML file in .github/workflows/. It contains one or more jobs. Each job runs on a fresh runner (a clean VM). Jobs run in parallel by default; use needs: to make one wait for another. Inside a job, steps run sequentially.

The trigger is the on: key. Common options: push, pull_request, schedule (cron), workflow_dispatch (manual button), release. You can also filter by branch (branches: [main]) or path (paths: ['src/**']).
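If students struggle with "parallel by default, needs: to wait", a short Python sketch of the scheduling idea can help. This is a simplified model and not GitHub's scheduler; `execution_waves` is a hypothetical helper, and the job names are just the usual lint, test, and build trio.

```python
def execution_waves(needs: dict[str, list[str]]) -> list[list[str]]:
    """Group jobs into waves; every job in a wave can run in parallel.

    `needs` maps each job name to the jobs it waits for, like the
    `needs:` key in a workflow.
    """
    remaining = {job: set(deps) for job, deps in needs.items()}
    done: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        ready = sorted(job for job, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("cycle or unknown job in needs")
        waves.append(ready)
        done.update(ready)
        for job in ready:
            del remaining[job]
    return waves


if __name__ == "__main__":
    print(execution_waves({"lint": [], "test": [], "build": ["lint", "test"]}))
```

Jobs with no dependencies land in the first wave together, and a job that needs both of them lands in the second.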

.github/workflows/ci.yml

on:
  push:
    branches: [main, develop]
  pull_request:
    paths: ['src/**', 'package*.json']
  schedule:
    - cron: '0 2 * * *'   # nightly 2am UTC
  workflow_dispatch:      # manual "Run workflow" button

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run lint

  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test

  build:
    needs: [lint, test]   # waits for both
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build

🎯 Practice Questions

Q1.

Two jobs are defined at the top level with no needs: between them. Do they run in parallel or sequentially? On the same runner or different runners?

Answer

Parallel, on different runners. Each job spawns its own fresh ubuntu-latest VM (or whatever the runs-on says). They share no filesystem state by default — a file written in job A is invisible to job B unless you upload it as an artifact.

If you need a strict order, add needs: [job-a] to the dependent job. If you need to share files, use actions/upload-artifact + actions/download-artifact.

Q2.

Write the on: block for a workflow that runs on every push to main, every pull request targeting main, and at 3am UTC daily.

Q3.

Add path filtering so the workflow only runs when files under src/ or package*.json change. Why does this matter for monorepos?

Q4.

A teammate uses uses: actions/checkout@main. You convince them to switch to @v4. What's the security risk of pinning to a moving branch?

💡 Supply-chain attack — if the action's repo is compromised, you'll silently pull the malicious code.

05

Matrix Builds — Test Across Versions, OSes, & Browsers

One YAML block, dozens of parallel test runs

How to explain to students

Matrix builds are how you say "run this job for every combination of these inputs". Most common: testing a Node.js library against Node 18, 20, and 22 simultaneously. Or a Python tool on Ubuntu, macOS, and Windows. GitHub spins up a runner per combination — all in parallel.

.github/workflows/test.yml — matrix

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false   # let all combos finish even if one fails
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        node: [18, 20, 22]
        exclude:
          - { os: windows-latest, node: 18 }   # skip combos
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          # block style: an unquoted ${{ }} expression is not valid inside { }
          node-version: ${{ matrix.node }}
      - run: npm ci && npm test

# Result: 8 parallel runs (3 OS × 3 node, minus 1 excluded)

✓ test (ubuntu-latest, 18) ✓ test (macos-latest, 18)

✓ test (ubuntu-latest, 20) ✓ test (macos-latest, 20) ✓ test (windows-latest, 20)

✓ test (ubuntu-latest, 22) ✓ test (macos-latest, 22) ✓ test (windows-latest, 22)
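The "3 × 3, minus 1" arithmetic above can be reproduced with a few lines of Python. This sketch models only the cross product and `exclude`; it does not model `include`, and `expand_matrix` is a hypothetical helper rather than anything GitHub ships.

```python
from itertools import product
from typing import Optional


def expand_matrix(axes: dict[str, list], exclude: Optional[list[dict]] = None) -> list[dict]:
    """Expand matrix axes into one dict per job, dropping excluded combos.

    A combo is dropped when it matches every key/value of an exclude entry.
    """
    rules = exclude or []
    names = list(axes)
    combos = [dict(zip(names, values)) for values in product(*axes.values())]
    return [
        combo
        for combo in combos
        if not any(all(combo.get(k) == v for k, v in rule.items()) for rule in rules)
    ]


if __name__ == "__main__":
    jobs = expand_matrix(
        {"os": ["ubuntu-latest", "macos-latest", "windows-latest"], "node": [18, 20, 22]},
        exclude=[{"os": "windows-latest", "node": 18}],
    )
    print(len(jobs))  # number of runners GitHub would start
    for job in jobs:
        print(job)
```

Have students add a third axis to the dict and watch the job count multiply. It makes the "mind the minutes" point quickly.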

⚡

All in parallel

8 combos = 8 runners. Total time ≈ slowest single run.

🚫

exclude / include

Skip impossible combos or add specific extras to the matrix.

⏸️

fail-fast: false

Let all combos finish — gives you the full picture, not just the first failure.

💸

Mind the minutes

macOS runners cost 10× Linux. Don't matrix everything — be intentional.

🎯 Practice Questions

Q1.

Write a matrix that tests against Python 3.10, 3.11, and 3.12 on Ubuntu only. Include the matrix.python reference in actions/setup-python@v5.

Q2.

A 3×3×2 matrix runs 18 jobs but you only need 6 specific combinations. Which is more idiomatic — using exclude heavily, or rewriting with include?

💡 Read the GitHub docs for include — it lets you specify exact combos.

Q3.

Why is fail-fast: false usually the right choice for a matrix that tests cross-OS compatibility?

Answer

With fail-fast: true (the default), the first failing combination cancels all the others. You then know that one failed, but not whether the others would have passed or failed too.

For cross-OS / cross-version compatibility tests, you want the full picture: "Windows + Node 22 fails, but everything else passes" tells you it's a Windows-specific issue. Setting fail-fast: false lets every combo run to completion so you see all the data at once.

For deploy pipelines (where you only need one green run), the default fail-fast: true is correct — there's no value in burning compute on combos you'll cancel anyway.

06

Secrets, Environments & Reusable Workflows

How to keep credentials out of YAML and reuse logic across repos

How to explain to students

Never paste secrets into YAML. GitHub provides three layers: repo-level secrets, environment-level secrets (with optional approval gates), and org-level secrets (shared across all or selected repos in the organization). Always use ${{ secrets.NAME }} in workflows.

Environments are how you wire approval gates: a job targeting the production environment can require a human reviewer click "approve" before it runs. Reusable workflows let you DRY up shared logic — one "deploy.yml" reused by 10 microservices.

workflow with environments + secrets

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging   # pulls staging-scoped secrets
    steps:
      - run: aws s3 sync ./dist s3://staging-bucket/
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

  deploy-prod:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment:
      name: production   # the environment whose protection rules require approval
      url: https://app.example.com
    steps:
      - run: aws ecs update-service ...   # runs only after a reviewer approves

Practical: A reusable workflow that 10 repos can call

.github/workflows/reusable-deploy.yml (in shared repo)

on:
  workflow_call:   # this is what makes it reusable
    inputs:
      service:
        required: true
        type: string
    secrets:
      aws-key: { required: true }

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - run: aws ecs update-service --service ${{ inputs.service }}

# Caller workflow in a different repo
jobs:
  deploy-api:
    uses: my-org/shared-workflows/.github/workflows/reusable-deploy.yml@v1
    with: { service: api }
    secrets:
      # block style: an unquoted ${{ }} expression is not valid inside { }
      aws-key: ${{ secrets.AWS_KEY }}

🎯 Practice Questions

Q1.

An engineer commits AWS_ACCESS_KEY=AKIA... directly into a workflow YAML and pushes. They realise the mistake 30 seconds later and force-push to remove it. Are they safe? Why or why not?

Answer

No, not safe. Even after a force-push, the original commit may still exist:

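1. On GitHub itself: the orphaned commit can stay reachable by its SHA (and in cached views) until GitHub garbage-collects it, which you cannot trigger yourself.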
2. In any clone any teammate or CI runner pulled before the rewrite.
3. In cached search-engine results if the repo is public.
4. Bots scrape commits in real time — automated AWS-key scanners detect leaks within minutes and abuse them.

Real fix: immediately rotate the credential in AWS (delete the IAM access key), generate a new one, store it in GitHub Secrets, and review CloudTrail for any unauthorized usage. Then add the secret-scanning push-protection rule so it can't happen again.

Q2.

Why use a GitHub environment ("production") rather than a regular repo secret for production credentials? What capabilities does an environment unlock?

Q3.

Sketch a reusable workflow your team would call from 5 different microservice repos. What inputs / secrets would it accept?

Q4.

A secret printed via echo $MY_SECRET in a step appears as *** in the logs. Does that mean it's safe? What if you base64-encoded it first and printed?

💡 GitHub masks the literal string only.
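The hint can be demonstrated in a few lines of Python. This is a deliberately simplified model that assumes masking is a literal string replacement, as the hint describes; `mask_log` and the fake secret value are hypothetical.

```python
import base64


def mask_log(line: str, secrets: list[str]) -> str:
    """Replace each literal secret value in a log line with ***."""
    for secret in secrets:
        if secret:  # never replace the empty string
            line = line.replace(secret, "***")
    return line


if __name__ == "__main__":
    secret = "hunter2-not-a-real-key"  # made-up value for the demo
    encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")

    print(mask_log(f"token={secret}", [secret]))   # literal value is masked
    print(mask_log(f"token={encoded}", [secret]))  # encoded value passes through
```

Anyone who can read the second log line can decode it back to the secret, so "it shows as ***" is not a safety guarantee.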

07

OIDC Federation — AWS Auth Without Long-Lived Keys

The modern, secure way to deploy from GitHub Actions to AWS — no AWS_SECRET_ACCESS_KEY in sight

How to explain to students

The old way: create an IAM user, generate an access key, paste it into secrets.AWS_SECRET_ACCESS_KEY, and hope nobody leaks it. The key never expires on its own, and if it leaks, automated scanners can find and abuse it within minutes, long before anyone notices and rotates it.

The new way: OpenID Connect federation. GitHub mints a short-lived JWT for each workflow run; AWS verifies that the JWT was signed by GitHub and that its claims match a trusted repo and branch; AWS then returns temporary STS credentials (one hour by default, configurable down to 15 minutes). No long-lived secrets exist anywhere. This is the AWS-recommended pattern as of 2023+.

aws — set up the OIDC trust (one time)

# 1. Register GitHub as an OIDC provider in your AWS account
$ aws iam create-open-id-connect-provider \
    --url https://token.actions.githubusercontent.com \
    --client-id-list sts.amazonaws.com \
    --thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1

# 2. Create an IAM role that trusts GitHub for a SPECIFIC repo + branch
# trust-policy.json (123456789012 is a placeholder account ID)
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com" },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
      },
      "StringLike": {
        "token.actions.githubusercontent.com:sub": "repo:my-org/my-app:ref:refs/heads/main"
      }
    }
  }]
}

$ aws iam create-role --role-name gh-deployer \
    --assume-role-policy-document file://trust-policy.json
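To show students what the trust policy's Condition block is checking, here is a rough Python model of the evaluation. It is a teaching sketch under stated assumptions, not IAM's real evaluator: `trust_policy_allows` is hypothetical, claims are keyed with the same provider prefix as the policy for simplicity, and `fnmatchcase` only approximates IAM's StringLike (IAM supports `*` and `?`; `fnmatchcase` additionally treats `[...]` as special).

```python
from fnmatch import fnmatchcase

PROVIDER = "token.actions.githubusercontent.com"

# Same conditions as the trust policy above.
CONDITIONS = {
    "StringEquals": {f"{PROVIDER}:aud": "sts.amazonaws.com"},
    "StringLike": {f"{PROVIDER}:sub": "repo:my-org/my-app:ref:refs/heads/main"},
}


def trust_policy_allows(claims: dict[str, str], conditions: dict[str, dict[str, str]]) -> bool:
    """Evaluate StringEquals / StringLike conditions against token claims."""
    for key, expected in conditions.get("StringEquals", {}).items():
        if claims.get(key) != expected:
            return False
    for key, pattern in conditions.get("StringLike", {}).items():
        value = claims.get(key)
        if value is None or not fnmatchcase(value, pattern):
            return False
    return True


if __name__ == "__main__":
    def token(sub: str) -> dict[str, str]:
        return {f"{PROVIDER}:aud": "sts.amazonaws.com", f"{PROVIDER}:sub": sub}

    print(trust_policy_allows(token("repo:my-org/my-app:ref:refs/heads/main"), CONDITIONS))
    print(trust_policy_allows(token("repo:my-org/my-app:ref:refs/heads/feature-x"), CONDITIONS))
    print(trust_policy_allows(token("repo:other-org/my-app:ref:refs/heads/main"), CONDITIONS))
```

Only the first token is accepted. A run from another branch or another organization's repo carries a different sub claim, so the role cannot be assumed.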

.github/workflows/deploy.yml — using OIDC

permissions:
  id-token: write   # required to mint the JWT
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gh-deployer
          aws-region: eu-west-1
      - run: aws s3 ls   # now authenticated, no AWS_* secrets needed

🔐

No long-lived keys

Short-lived STS credentials are minted per run, so there is no standing key to leak.

🎯

Repo + branch scoped

Trust policy locks the role to one specific repo and branch.

📜

Auditable in CloudTrail

Every assume-role call logs the GitHub run URL — full traceability.

🆔

id-token: write

Required permission. Forget it and the auth fails with no helpful error.

🎯 Practice Questions

Q1.

A workflow with OIDC fails with "Could not load credentials from any providers." The role ARN is correct. What's the most likely missing line?

Answer

The permissions: id-token: write block at the workflow or job level. Without it, GitHub will not mint the OIDC JWT for the runner, and aws-actions/configure-aws-credentials has no token to exchange.

Add at the top of the workflow:

permissions:
  id-token: write
  contents: read

Other common causes: trust policy sub condition doesn't match the repo / branch / environment that's actually running.

Q2.

Compare the blast radius of a leaked long-lived AWS access key vs a leaked GitHub OIDC token. Which costs more to rotate, and why?

Q3.

Write the StringLike condition that lets the role be assumed only from PRs targeting main in my-org/my-app.

💡 PRs use a different sub claim than branch pushes.

Q4.

Why is OIDC federation more secure than even a properly-scoped IAM user with a rotated key? What attack does it prevent that rotation doesn't?

08

Using AI to Write & Debug Workflows

Workflow YAML is the perfect AI task — finicky syntax, well-known patterns, lots of training data

How to explain to students

YAML indentation is one of the most common sources of workflow failures. AI is great at generating workflow scaffolds and debugging the cryptic errors, but it can also confidently invent action names that don't exist. Verify by clicking the action's GitHub link: if it 404s, the AI hallucinated.

bash — AI prompt examples

# ❌ Weak prompt

"Write a GitHub Actions workflow"

# ✅ Strong prompt

"Write a GitHub Actions workflow that:

- Runs on push to main and on every PR targeting main

- Has 3 parallel jobs: lint (eslint), test (jest --coverage), typecheck (tsc --noEmit)

- A 4th job that builds a Docker image, depends on all 3 above

- Uses OIDC to push the image to ECR (no long-lived AWS keys)

- Tags the image with the git SHA

- Pin every action to a specific @vN tag (no @main)

After the YAML, list 3 things that will break this workflow in production."

# Debug-by-AI: paste the EXACT error

Error: Resource not accessible by integration

→ Prompt: "GitHub Actions error 'Resource not accessible by integration'

when calling the GitHub API. What permissions key am I missing in my workflow?"

🎯 Practice Questions

Q1.

An AI-generated workflow uses uses: super-actions/awesome-deployer@latest. List two red flags before you merge.

Answer

1. The action might not exist. AI sometimes invents plausible action names. Click the link github.com/super-actions/awesome-deployer — if 404, it's a hallucination.
2. @latest is not a special ref in GitHub Actions. Actions are referenced by branch, tag, or SHA, and there is no automatic "@latest"; it only resolves if the action's repo happens to have a branch or tag literally named latest. Even then it is a moving ref, which is a supply-chain risk: anyone who compromises the action's repo gets RCE in your CI.

Fix: verify the action exists, read its README, then pin to a specific version tag (@v3) or — for high-security CI — a full commit SHA (@a1b2c3d...) so even tag retargeting can't compromise you.
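The pinning check is easy to automate. Below is a minimal Python sketch of a reviewer's helper that flags `uses:` references not pinned to a vN-style tag or a full 40-character SHA. It is an assumption-laden illustration: `unpinned_actions` is hypothetical, it scans text with regular expressions instead of parsing YAML, and it cannot tell whether an action actually exists.

```python
import re

USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")
VERSION_TAG_RE = re.compile(r"^v\d+(\.\d+){0,2}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return `uses:` references that are not pinned to a vN tag or full SHA."""
    flagged = []
    for ref in USES_RE.findall(workflow_text):
        if ref.startswith(("./", "docker://")):
            continue  # local actions and container images are out of scope here
        _, sep, version = ref.partition("@")
        if not sep or not (FULL_SHA_RE.match(version) or VERSION_TAG_RE.match(version)):
            flagged.append(ref)
    return flagged


if __name__ == "__main__":
    sample = """
steps:
  - uses: actions/checkout@v4
  - uses: super-actions/awesome-deployer@latest
  - uses: actions/setup-node@main
"""
    for ref in unpinned_actions(sample):
        print("unpinned:", ref)
```

A check like this catches moving refs such as @main and @latest, but a human still has to open the action's repository and confirm it is real.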

Q2.

Take a 10-line workflow with broken indentation that fails with "YAML mapping values not allowed here". Write the AI prompt that would identify the exact line and fix.

Q3.

Why should you never paste your secrets.AWS_SECRET_ACCESS_KEY into an AI tool, even when asking for help?

09

Project: Full CI Pipeline — Test → Build → Deploy to AWS via GHA

The capstone — combine everything into one production-grade workflow

How to explain to students

Walk through this on screen, then have students recreate it on their own Node app. The workflow combines: matrix testing, Docker build, OIDC AWS auth, ECR push, ECS deploy, environment-gated production approval. This is what a real team's main.yml looks like.

.github/workflows/main.yml — full pipeline

on:
  push: { branches: [main] }
  pull_request: { branches: [main] }

permissions:
  id-token: write
  contents: read

jobs:
  # ── 1. Test (parallel matrix) ───────────────
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix: { node: [18, 20, 22] }
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
          cache: npm
      - run: npm ci && npm run lint && npm test

  # ── 2. Build + push image to ECR (only on main) ─
  build:
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    outputs:
      tag: ${{ steps.meta.outputs.tag }}
    steps:
      - uses: actions/checkout@v4
      - id: meta
        run: echo "tag=${GITHUB_SHA::7}" >> "$GITHUB_OUTPUT"
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gh-deployer
          aws-region: eu-west-1
      - uses: aws-actions/amazon-ecr-login@v2
      - run: |
          docker build -t myapp:${{ steps.meta.outputs.tag }} .
          docker tag myapp:${{ steps.meta.outputs.tag }} \
            123456789012.dkr.ecr.eu-west-1.amazonaws.com/myapp:${{ steps.meta.outputs.tag }}
          docker push 123456789012.dkr.ecr.eu-west-1.amazonaws.com/myapp:${{ steps.meta.outputs.tag }}

  # ── 3. Deploy to production (gated) ──────────
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment: production   # requires approval
    # With an environment set, the OIDC sub claim becomes
    # repo:ORG/REPO:environment:production, so the role's trust policy must allow it.
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gh-deployer
          aws-region: eu-west-1
      # Restarts tasks on the service's current task definition. To ship the
      # SHA-tagged image, the task definition must reference that tag.
      - run: |
          aws ecs update-service \
            --cluster prod --service myapp \
            --force-new-deployment

🔁

Matrix on test only

Build + deploy run once — test runs N× across versions.

🚦

if: ref == main

PRs run tests but don't deploy. Only main triggers the build/deploy chain.

🔐

OIDC end-to-end

Zero long-lived AWS credentials in the workflow.

🛂

Production approval

environment: production requires a human click before deploy.

10

Quiz: GitHub Actions YAML + Trigger Events

5 MCQs + 2 fill-in-the-command questions

Sample quiz questions (interactive)

Q1. Two top-level jobs with no needs: between them. How do they execute?

A. In parallel, on separate runners
B. Sequentially, on the same runner
C. Sequentially, on separate runners
D. Order is undefined

Q2. Which permission is required for OIDC authentication to AWS?

A. contents: write
B. id-token: write
C. packages: write
D. actions: write

Q3. A canary deployment routes 5% of traffic to v2 and shows 2× error rate. Best first action?

A. Push to 100%: a bigger sample is better
B. Ignore and continue rollout
C. Hold the canary, inspect error type and sample size before deciding
D. Roll back the entire infrastructure

Q4. Of the three strategies (rolling, blue-green, canary), which has the lowest deploy-time infra cost?

A. Rolling
B. Blue-green
C. Canary
D. They're identical

Q5. Why npm ci over npm install in CI?

A. It's only faster; there is no other difference
B. It's reproducible: it fails if package-lock.json is out of sync and never mutates it
C. It installs devDependencies only
D. It runs npm audit automatically

Fill-in-the-command

Fill 1: Cron expression for "every weekday (Mon–Fri) at 9:00 AM UTC".

Fill 2: The full on: trigger that fires on every push to main and on the manual "Run workflow" button.

11

Assignment: Add a Test + Build Workflow to an Existing Repo

Take any of your earlier projects and bolt on a real CI pipeline

How to explain to students

Frame as a hiring task: "Pick any of your existing GitHub repos. Add a CI workflow that runs lint + test + build on every PR. Add a green check to the README. You have a weekend." This is the single most common DevOps interview prompt for junior roles.

📋 Assignment Requirements

Pick any existing repo (yours or a fork). Must have at least 3 source files and a test command.
Create .github/workflows/ci.yml that triggers on push and pull_request to main
3 parallel jobs: lint, test, build (use needs: only where required)
Use a matrix to test against at least 2 versions of your runtime (Node 20+22, Python 3.11+3.12, etc.)
Cache dependencies via actions/setup-node (or equivalent) to keep CI under 2 minutes
Pin every action to a specific @vN tag — no @main or @latest
Add a status badge to your README that links to the workflow runs
Bonus: A 4th job that builds (but does not push) a Docker image, only on push to main
Bonus: Add Trivy scanning for the Docker image and fail on HIGH/CRITICAL CVEs
Bonus: Configure branch protection so PRs cannot merge without all 3 jobs passing

expected README badge

[![CI](https://github.com/USER/REPO/actions/workflows/ci.yml/badge.svg)](https://github.com/USER/REPO/actions/workflows/ci.yml)

# In a passing PR:

✓ lint (8s)

✓ test (Node 20) (24s)

✓ test (Node 22) (22s)

✓ build (16s)

All checks have passed — Mergeable

📊

Grading rubric

Workflow runs: 25pts. Matrix correct: 20pts. Caching works: 15pts. Pinned actions: 15pts. Badge in README: 10pts. Code quality: 15pts.

🎯

Common mistakes

Forgot cache: npm, used @main, jobs run sequentially because of unnecessary needs:, badge URL wrong.

💡

Stretch goal

Add OIDC + an environment-gated production deploy step that's actually wired to a real AWS account.