$ copilot suggest "fix this k8s ingress"

AI-Assisted DevOps
Instructor Guide

Use AI as a force multiplier — and keep the critical thinking. The pattern is: detailed prompt → AI draft → human verify → ship.

01
GitHub Copilot for Ops, Bash, Dockerfile & IaC
In-IDE inline suggestions are 80% of where AI shows up day-to-day

How to explain to students

GitHub Copilot sits inside your IDE (VS Code, JetBrains, Vim, Neovim) and suggests as you type. For DevOps work — bash, Dockerfile, Terraform, GitHub Actions YAML — it's particularly strong because the input is structured and the patterns are well-known. Type a comment describing what you want; Copilot completes the rest.

It is not magic. It works best when you've already loaded context: open the related files, write a clear comment header, name your variables well. "Garbage in, garbage out" applies. The single biggest skill upgrade is learning to write the comment first.

copilot inline — Dockerfile sample
# What you type:
# Multi-stage Dockerfile for TypeScript Node 20 API:
# - builder stage compiles src/ to dist/
# - final stage uses node:20-alpine, runs as non-root
# - only production deps in final image (npm ci --omit=dev)
# - HEALTHCHECK hits /healthz

# What Copilot suggests after pressing Tab:
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine
WORKDIR /app
RUN addgroup -S app && adduser -S app -G app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package*.json ./
RUN npm ci --omit=dev
USER app
HEALTHCHECK --interval=30s CMD wget -q -O- localhost:3000/healthz || exit 1
CMD ["node", "dist/server.js"]

# Press Tab to accept, Esc to dismiss, Alt+] for next suggestion
⌨️
In-IDE inline
Tab to accept. Tied to your file context — open relevant files first.
📝
Write the comment first
Detailed comment → great suggestion. Empty file → generic boilerplate.
🌐
Strong on DevOps formats
Dockerfile, YAML, HCL, bash — Copilot knows the dialects.
🚪
Free for students
GitHub Education Pack includes Copilot. Use it.

🎯 Practice Questions

Q1.
Why does Copilot give better suggestions when you have related files open in the editor? What's it actually using as context?
Answer:
For inline completions, Copilot's prompt to the underlying model includes your open file, the cursor position, and a window of nearby open files (file paths + relevant snippets). It does not read your entire repo, only what's loaded in the IDE.

Practical implications:
1. Open the test file before completing the function — Copilot will write code that satisfies the existing tests.
2. Open Dockerfile + package.json before writing compose.yaml — Copilot will use the right port + start command.
3. Keep the file count reasonable — every open file consumes context budget.
Q2.
A teammate writes # write a docker compose file and presses Tab. Copilot generates a generic 2-service stack. Why? What does the prompt need?
Q3.
Why is Alt+] (next suggestion) often more useful than the first suggestion?
02
Prompting AI to Debug Pipeline Errors
CI logs are dense, error messages cryptic, AI translates both — fast

How to explain to students

Pipeline errors are AI's natural habitat: they're text-heavy, the failure modes are well-documented, and the fix is usually a small config change. The pattern: (1) paste the FULL error, (2) paste the workflow file, and (3) phrase the question as "what's the most likely cause and the cheapest fix?". AI often gets to the root cause in seconds on issues that could take 30 minutes of Googling.

Two anti-patterns to avoid: (a) "my pipeline is broken, help" — no context, useless answer. (b) Pasting just the last line of the error — the line above is usually the root cause.

debug-by-AI prompts
# Weak prompt
"my github actions is broken"

# Strong prompt — full error + workflow + question
"GitHub Actions workflow fails with this error:
Error: Resource not accessible by integration
Run gh release create v1.0.0
HTTP 403: Resource not accessible by integration
My workflow:
on: push
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: gh release create v1.0.0 --generate-notes
        env: { GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} }
What's the most likely cause and the minimal fix?"

# AI replies: "Add 'permissions: { contents: write }' to the job.
# GITHUB_TOKEN defaults to read-only for organizations and personal
# repositories created since February 2023."

# Verify in 30 seconds: read GHA permissions docs, apply fix, push
# vs 30 minutes Googling the error message manually
prompt template
[CONTEXT] What system, what version, what stack
[ERROR] Paste the FULL error block, not just the last line
[CONFIG] Paste the relevant config file
[QUESTION] "Most likely cause + minimal fix?" or "Top 3 causes ranked"
[CONSTRAINT] Optional — what you already tried, what's off-limits

# Structure the prompt this way and AI usually hits the root cause on the first try.
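One way to turn the template into a habit is a small shell helper that assembles the sections from files on disk. This is only a sketch: the function name `build_prompt` and its argument order are hypothetical, and the default question is borrowed from the template.

```shell
# build_prompt CONTEXT ERROR_FILE CONFIG_FILE [QUESTION] [CONSTRAINT]
# Prints a prompt with the template's sections to stdout.
build_prompt() {
  context=$1
  error_file=$2
  config_file=$3
  question=${4:-Most likely cause + minimal fix?}
  constraint=${5:-}

  printf '[CONTEXT]\n%s\n\n' "$context"
  printf '[ERROR]\n'
  cat "$error_file"          # the FULL error block, not just the last line
  printf '\n[CONFIG]\n'
  cat "$config_file"         # the relevant config file, verbatim
  printf '\n[QUESTION]\n%s\n' "$question"
  # [CONSTRAINT] is optional, so only print it when one was given.
  if [ -n "$constraint" ]; then
    printf '\n[CONSTRAINT]\n%s\n' "$constraint"
  fi
}
```

Typical use would be something like `build_prompt "GitHub Actions, ubuntu-latest" error.txt .github/workflows/release.yml`, with the output pasted into whichever AI tool the student is using.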

🎯 Practice Questions

Q1.
Take "Terraform plan failed" and turn it into a 4-section prompt that gets a useful AI response.
Q2.
Why does pasting only the last line of an error often mislead the AI? Which line is usually the root cause?
Answer:
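CLI tools usually print errors as a chain: a summary line says only that the operation failed, and the actual cause sits elsewhere in the trace. It is usually above the final summary line, though some tools (npm in the example below) print the summary first.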

Example:
npm ERR! code 1
npm ERR! path /app
npm ERR! command failed
npm ERR! command sh -c node-gyp rebuild
npm ERR! make: *** [Release/foo.o] Error 1
npm ERR! ../foo.cc:42: error: 'X' was not declared in this scope ← actual cause

Pasting only "code 1" gets a generic answer. Pasting the full block gets a fix. Default to 30 lines above the failure.
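The "30 lines above the failure" default can be scripted so students stop trimming logs by hand. A minimal sketch, assuming the failure can be located with a simple regex; the function name `error_context` and the default pattern are illustrative, not taken from any tool.

```shell
# error_context LOGFILE [PATTERN]
# Prints from 30 lines above the last line matching PATTERN to the end of the log.
error_context() {
  log=$1
  pattern=$2
  [ -n "$pattern" ] || pattern='error|failed|fatal'   # assumed default

  # Line number of the last match (case-insensitive, extended regex).
  last=$(grep -n -i -E "$pattern" "$log" | tail -n 1 | cut -d: -f1)
  if [ -z "$last" ]; then
    echo "no line matches: $pattern" >&2
    return 1
  fi

  # Start 30 lines earlier, but never before line 1.
  start=$((last - 30))
  [ "$start" -ge 1 ] || start=1
  sed -n "${start},\$p" "$log"
}
```

Printing through to the end of the file, rather than stopping at the match, keeps the block useful for tools that emit their summary before the underlying cause.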
Q3.
Should you ever paste production logs to a public AI tool? List two scrub-this-first rules.
💡 PII, real account IDs, customer hostnames.
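A scrub step can be as simple as a `sed` filter run before anything is pasted. The sketch below is an assumption-laden starting point: `scrub_log` is a hypothetical name, and the three patterns (email addresses, IPv4 addresses, 12-digit AWS-style account IDs) are illustrative. Customer hostnames need a project-specific pattern that only the team can write.

```shell
# scrub_log: read a log on stdin, write a redacted copy to stdout.
scrub_log() {
  sed -E \
    -e 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/<EMAIL>/g' \
    -e 's/([0-9]{1,3}\.){3}[0-9]{1,3}/<IP>/g' \
    -e 's/(^|[^0-9])[0-9]{12}([^0-9]|$)/\1<ACCOUNT_ID>\2/g'
}
```

Usage would be `scrub_log < deploy.log > deploy.scrubbed.log`, followed by a manual read of the scrubbed file. Regexes catch the obvious identifiers, not every secret.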
03
AI-Assisted Linux & Bash Troubleshooting
"What does this error mean?" + "Write me a one-liner for X" — daily DevOps wins

How to explain to students

Linux errors range from clear ("No such file") to cryptic ("too many open files", kernel oops messages). AI is excellent at the cryptic kind — translating into "what's broken, where, why."

For one-liners: AI is faster than recall. Need a find ... | xargs ... | awk ... chain to "delete all log files older than 7 days but keep the latest 2 per directory"? Five seconds with AI. Five minutes from memory + man pages.

linux-debugging via AI
# Cryptic error → AI translation
$ docker run myapp
exec /usr/local/bin/docker-entrypoint.sh: exec format error

→ AI: "Architecture mismatch. The image was built for amd64 but you're running
on arm64 (Apple Silicon). Either build for the host's arch:
  docker build --platform linux/arm64 -t myapp .
Or use buildx for multi-arch:
  docker buildx build --platform linux/amd64,linux/arm64 -t myapp ."

# Bash one-liner generation
"Find every .log file under /var/log modified more than 7 days ago,
print its size + path, but skip the latest 2 per directory."

# AI returns:
$ find /var/log -type d | while IFS= read -r dir; do
  find "$dir" -maxdepth 1 -name '*.log' -mtime +7 \
    -printf '%T+ %s %p\n' | sort -r | tail -n +3 | cut -d' ' -f2-
done

# Always run with --dry-run / list-only first; verify behaviour before chaining rm

# Command walkthrough: explain this to me
"Walk me through what each part of this command does:
awk 'NR==2 {print $5}' /etc/passwd | tr ':' '\n' | sort -u
Explain it as if I'm 6 months into Linux."
explain-this · one-liner · cryptic-error · --dry-run first

🎯 Practice Questions

Q1.
An AI gives you a destructive bash one-liner (rm -rf, find ... -exec rm, truncate). What's the safety routine before running it?
Answer:
Three-step routine:
1. Read every flag. If you don't know what a flag does, look it up in man first. AI is fluent in obscure flags you might not know.
2. Replace the destructive verb with a safe one. -exec rm → -print. truncate -s 0 → ls -la. Run the modified version. Inspect the list.
3. Re-add the destructive verb only after you've eyeballed the list. If anything looks off — anything — abort.

Bonus: in production-like envs, run on one host first, then expand. "Try it on staging" is the cheap save.
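The routine can be built into the command itself, so that listing is the default and deleting needs an explicit opt-in. A minimal sketch, assuming the target is `*.log` files older than a given number of days; the function name `purge_old_logs` and the `--delete` switch are hypothetical.

```shell
# purge_old_logs DIR [DAYS] [--delete]
# Without --delete it only lists what WOULD be removed.
purge_old_logs() {
  dir=$1
  days=${2:-7}
  mode=${3:-list}

  if [ "$mode" = "--delete" ]; then
    # Destructive verb: only reached after the list has been inspected.
    find "$dir" -type f -name '*.log' -mtime +"$days" -exec rm -- {} +
  else
    # Safe verb: same selection, print instead of remove.
    find "$dir" -type f -name '*.log' -mtime +"$days" -print
  fi
}
```

Run `purge_old_logs /var/log/myapp` first, read the list, and only then repeat the call with `7 --delete`. Because both branches share the same `find` selection, what was listed is what gets removed.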
Q2.
"explain-this" prompts: paste a complex one-liner and ask AI to walk through it. Why is this a great learning loop?
Q3.
An AI generates a script using grep -P for Perl regex. It works on an Ubuntu dev box but fails on Alpine in Docker (and on a stock Mac). What's the explanation, and what's the more portable alternative?
💡 BSD grep vs GNU grep vs busybox grep.
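For discussion, here is one portable rewrite. `-P` needs a grep built with PCRE support, which BusyBox grep on Alpine lacks, while `-E` (extended regex) with POSIX character classes is understood by all three implementations in the hint. The function name `extract_versions` and the version-number task are made up for illustration.

```shell
# extract_versions: print every x.y.z version number found on stdin.
extract_versions() {
  # PCRE form an AI might produce:  grep -oP '\d+\.\d+\.\d+'
  # Portable form: -E instead of -P, [0-9] instead of \d.
  grep -oE '[0-9]+\.[0-9]+\.[0-9]+'
}
```

Lookarounds and other PCRE-only features have no `-E` equivalent. In those cases the usual fallbacks are `sed`/`awk`, or installing GNU grep in the image.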
04
Where AI Helps and Where It Costs Critical Thinking
An honest map. Lean on AI for the cheap; do not let it think for you on the expensive.

How to explain to students

AI is brilliant at generation (write me X) and translation (explain this). It is shaky on judgement (should we do X?) and weak on novel architecture (something not heavily represented in training data). The trap is using AI for judgement work because it produces a confident-sounding answer either way.

A senior engineer who relies on AI for judgement plateaus. A junior who uses AI for generation while practising judgement themselves accelerates. Ideally every prompt includes "explain why", so you build the muscle alongside the output.

where-ai-helps
Task                         AI                You
───────────────────────────  ────────────────  ───────────────
Generate Dockerfile          ✓ first draft     review + verify
Translate cryptic error      ✓ very good       validate
Explain unfamiliar code      ✓ great           confirm
Write awk/jq one-liner       ✓ instant         --dry-run first
Refactor a function          ✓ usually right   test coverage
Pick architecture            ✗ surface-level   YOUR CALL
Decide blue-green vs canary  ✗ context-blind   YOUR CALL
Triage a real incident       ✗ no situational  YOUR CALL
Convince a stakeholder       ✗ generic prose   YOUR CALL
Mentor a junior              ✗ no continuity   YOUR CALL

# Dangerous patterns to spot in yourself
- Accepting AI suggestions without reading them
- Using AI to skip the docs entirely
- Defending an AI-generated decision because "the AI said so"
- Skipping mental practice — only typing prompts, never solving
- Trusting AI's confidence rather than its evidence

# Healthy patterns
+ "Why does this work?" prompts after every accepted suggestion
+ AI for the first 80% (scaffold), human for the last 20% (judgement)
+ Re-derive a small piece of the answer manually as a check
+ Pair-prompting: explain to AI, listen to feedback, iterate
⚙️
Generate, then verify
AI saves typing. You still own the verify.
🎯
Judgement = you
Architecture, trade-offs, stakeholder pushback — practice these, don't outsource.
🧠
Practice retrieval
Type the answer from memory occasionally. Atrophy is real.
📚
Read primary docs
AI summarises. Docs are authoritative. Mix both.

🎯 Practice Questions

Q1.
A junior says "I don't read AWS docs anymore — I just ask Claude." List two skills they're failing to build.
Q2.
Sketch a 30-second self-check before clicking "merge" on an AI-generated PR.
Answer:
A useful checklist:
1. Can I explain every change in 1 sentence? If not, I don't understand it.
2. What test would catch this if it broke? Does that test exist?
3. What's the rollback path? One command? Many manual steps?
4. What did I assume the AI knew that it might not? (e.g. "this is the v5 AWS provider, not v4").
5. Would I have done it the same way without AI? If yes, ship. If no, write down why — that's where you learn.

30 seconds is enough. Skip it and you'll merge an AI-generated chmod 777 within a year.
Q3.
Pick a small thing you've done with AI in the last 24h. Re-do it from memory. What did you forget? That's your skill gap.
05
Agentic IDEs — Cursor, Claude Code & Aider
Beyond inline suggestions — let AI edit many files at once, with you as reviewer

How to explain to students

2024–2026 saw the rise of agentic IDEs. Where inline Copilot suggests a line or a block at a time, agentic tools take a goal ("add OIDC authentication to this CI workflow"), read your codebase, plan a multi-file edit, run terminal commands, and present a diff for you to approve. The leap is from "auto-complete" to "auto-PR".

Two major categories: IDE-native (Cursor, JetBrains AI Assistant, Continue, plus the browser-based GitHub Copilot Workspace) and CLI-native (Claude Code, Aider). Both have their place. CLI-native tools are particularly strong for DevOps because much of the work happens at the shell.

claude-code — agentic CLI session
$ claude
Welcome to Claude Code. Working directory: ~/projects/myapp

> "Add OIDC AWS auth to the existing GitHub Actions deploy workflow.
No long-lived AWS_* secrets. Add a permissions block."

[Reading: .github/workflows/deploy.yml]
[Reading: README.md]
[Plan]
  1. Add permissions: { id-token: write, contents: read } at workflow level
  2. Replace AWS_ACCESS_KEY_ID/SECRET steps with
     aws-actions/configure-aws-credentials@v4 + role-to-assume
  3. Update README's "secrets needed" section

[Diff preview — apply? y/n]
.github/workflows/deploy.yml: -8 +6
README.md: -2 +3

> y
[Applied. Run terraform plan + push to test? y/n]
> just push to a feature branch and open PR for review

[Created branch: ai/oidc-migration]
[Opened PR #234 with full diff + this conversation as context]
Cursor · Claude Code · GH Copilot Workspace · Aider · Continue · multi-file edit

🎯 Practice Questions

Q1.
Compare "inline-suggestion Copilot" vs "agentic Claude Code" — when is each the right tool?
Q2.
Why is requiring a PR-review checkpoint critical when an agentic tool edits multiple files at once?
Answer:
The blast radius of agentic tools is much larger than inline. A single approve can change 10 files, run 5 terminal commands, and update 2 cloud resources. Without a PR checkpoint:

1. Subtle errors slip in — a wrong region, an over-permissive IAM, a typo in a healthcheck path. Each looks fine alone; combined they break prod.
2. You lose the learning loop — by reviewing the diff, you absorb how the change works. Skipping that means you can't reason about it later when it breaks.
3. Audit trail — a PR with the AI conversation attached is searchable, blameable, revertable. A direct apply is none of those.

Treat agentic output like an intern's PR — high value, must review.
Q3.
Set up Aider (in your terminal) or Continue (in your editor) for one of your earlier projects. Use it for a 30-minute task. Reflect on where it shone vs failed.
06
Project: Use AI to Refactor a Workflow + Add a Feature
Real-world AI-pair-programming on a repo of your choice — and document the workflow that worked

How to explain to students

Pick one of the earlier projects (Docker, CI/CD, AWS) and use AI to do two real tasks: (1) refactor something already there (e.g. "convert this CI workflow to use OIDC"), (2) add a new feature (e.g. "add a Slack notification on deploy failure"). Document the prompts that worked, the false starts, and the verification steps. The artefact: a real PR + a short post-mortem write-up.

project-flow
# Suggested project layout
myapp/
├── .github/workflows/deploy.yml ← refactor target
├── ai-session.md ← prompt + responses log
└── retro.md ← what worked, what didn't

# ai-session.md — what to log
## Task 1: refactor deploy.yml to use OIDC
Prompt 1: [exact text I sent]
Response: [TL;DR what AI returned]
Verdict: applied / partially applied / discarded — why

## Task 2: add Slack notification on failure
Prompt: ...
Response: ...
Verdict: ...

# retro.md — short reflection
- What did AI nail first time?
- Where did it hallucinate / use deprecated args?
- How long did the task take vs estimate without AI?
- What did I learn about prompting that I'll keep doing?
📝
Log every prompt
Forces self-awareness. Builds your personal prompt library.
🔍
Verdict per response
"Applied / partial / discarded — why" trains your judgement.
⏱️
Time the task
Compare to your estimate without AI. Calibrates expectation.
🧪
Verify in CI
PR must pass CI. AI-generated code goes through the same gates as human code.
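Students are more likely to keep the log if adding an entry is a single command. The sketch below appends one entry in the ai-session.md layout shown earlier. `log_verdict` is a hypothetical helper name, and the field order simply mirrors the Prompt / Response / Verdict layout.

```shell
# log_verdict FILE TASK PROMPT RESPONSE VERDICT
# Appends one entry to the session log; creates the file if missing.
log_verdict() {
  file=$1
  {
    printf '## %s\n' "$2"
    printf 'Prompt: %s\n' "$3"
    printf 'Response: %s\n' "$4"
    printf 'Verdict: %s\n\n' "$5"
  } >> "$file"
}
```

For example: `log_verdict ai-session.md "Task 1: refactor deploy.yml to use OIDC" "exact prompt text" "TL;DR of the reply" "partially applied: role ARN was wrong"`.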
07
Quiz: Where AI Helps and Where It Costs Critical Thinking
5 MCQs + 2 fill-in-the-prompt questions

Sample quiz questions (interactive)

Q1. AI is consistently strong at:
A. Picking the right cloud architecture for your business
B. Generating Dockerfile / YAML / HCL drafts and translating cryptic errors
C. Triaging a real production incident in your stack
D. Understanding stakeholder politics

Q2. The most common AI failure mode in DevOps:
A. Refusing to answer security questions
B. Confidently inventing fake action names / deprecated arguments
C. Slow response times
D. Output truncation

Q3. The single biggest predictor of useful AI output:
A. Picking the right model
B. Quality of the prompt: context + constraints + verification ask
C. Long token limit
D. Internet access for the model

Q4. Inline Copilot vs agentic Claude Code: the right tool for "rename a variable across the file I'm editing" is:
A. Inline Copilot (faster, fewer dependencies)
B. Agentic Claude Code, always
C. Either, identical results
D. Neither, manual edit

Q5. Why is "ask AI to explain what each part does" a great learning loop?
A. Builds your mental model alongside the output
B. Generates more accurate output
C. Always finds bugs
D. Required by GitHub Actions

Fill-in-the-prompt

Fill 1: The acronym for the auth method that lets a GitHub Actions workflow authenticate to AWS without long-lived keys.
Fill 2: A single keyword for the AI failure mode of inventing plausible-looking facts.
08
Assignment: Use AI to Write a Lint/Format CI for a Real Service
A bounded, real-world task to practise prompt → draft → verify → ship

How to explain to students

Frame as a real task: "Pick any GitHub repo you maintain (or fork a public one). Use AI to add a CI job that lints + formats the code. Submit a PR. Document every prompt + verdict. Total time budget: 90 minutes."

📋 Assignment Requirements

  • Pick a real repo: yours or a fork. Must have at least one source language (TS/JS, Python, Go, Bash, HCL — any).
  • Add a .github/workflows/lint.yml that runs on pull_request: lint + format-check + (if relevant) typecheck
  • Use exactly one AI tool (Copilot, Claude, ChatGPT, Cursor, Claude Code — your choice). Document which.
  • Maintain an ai-log.md: every prompt + 1-line summary of the response + your verdict (applied / edited / discarded)
  • Final workflow must fail the build on any lint or format violation
  • Pin every action to @vN — no @main
  • Demonstrate end-to-end: push a deliberately badly-formatted commit on a branch; show the workflow blocking it via a screenshot
  • Write a 1-page retro.md covering: what AI did well, where it hallucinated, total time spent, what you'd prompt differently next time
  • Open the PR in the original repo (or your fork's main branch). Include the ai-log.md + retro.md alongside the workflow.
  • Bonus: Add a pre-commit hook mirror so contributors fail fast locally
  • Bonus: Use 2 different AI tools side-by-side and compare outcomes for the same prompt
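The "no @main" requirement is easy to check mechanically, which also makes a good grading aid. A rough sketch, assuming workflows live under `.github/workflows` and that branch references look like `@main` or `@master`; `check_pins` is a hypothetical name, and the regex does not handle quoted `uses:` values.

```shell
# check_pins [DIR]
# Exits non-zero if any `uses:` line tracks a branch instead of a version tag.
check_pins() {
  dir=${1:-.github/workflows}
  # Prints file:line for every offending reference, e.g. uses: owner/repo@main
  if grep -rnE 'uses:[[:space:]]*[^[:space:]@]+@(main|master)([[:space:]]|$)' "$dir"; then
    echo "unpinned action reference(s) found" >&2
    return 1
  fi
  echo "all action references are pinned"
}
```

Run it from the repo root as `check_pins`. Wiring the same check into the lint workflow itself would be a natural stretch goal.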
expected ai-log.md sample
# AI Session Log — Lint/Format CI
Tool: GitHub Copilot Chat (in VS Code)
Total time: 47 minutes

## Prompt 1 — initial scaffold
"Write a GitHub Actions workflow .github/workflows/lint.yml that runs
on pull_request to main. 3 jobs: eslint, prettier --check, tsc --noEmit.
Pin every action to a specific vN tag. Use Node 20."
Response: produced workflow with 3 jobs as requested.
Verdict: applied with 1 edit. Copilot used actions/setup-node@v3;
I bumped to v4 (the latest major at the time).

## Prompt 2 — type-check failed locally with "tsc not found"
[paste error]
"Why does the typecheck job fail? My package.json has typescript as a devDep."
Response: pointed out the job uses npm ci then runs tsc directly,
but npx tsc would find the local install. Suggested:
"run: npx tsc --noEmit" instead of "run: tsc --noEmit"
Verdict: applied verbatim. Worked.
📊
Grading rubric
Workflow runs + fails on bad commit: 30. ai-log.md substantive: 25. Verdicts thoughtful: 20. retro.md reflective: 15. Pinned actions: 10.
🎯
Common mistakes
Workflow runs but doesn't fail (lint warning-only); ai-log just lists prompts (no verdicts); retro is fluff ("AI is helpful").
💡
Stretch
Open the PR in someone else's open-source project — and get it merged. The most-impactful version of the assignment.