$ sudo apt install devops-knowledge

Linux for DevOps
Instructor Guide

Step-by-step explanations with real terminal examples for every module

01
Introduction to Linux & Why It Matters in DevOps
Setting the stage — what Linux is and where students will encounter it

How to explain to students

Start by asking: "When you visit Netflix, GitHub, or Google, what OS do you think is running those servers?" The answer is almost always Linux. Explain that Linux runs the large majority of public web servers (a commonly cited figure is about 96% of the top one million sites). DevOps is about automating those servers, so Linux is non-negotiable.

Use the analogy: "Windows is like an automatic car. Linux is a manual — harder to learn, but gives you full control over every gear."

bash — why-linux.sh
# Where will students use Linux in DevOps?
# ✅ Docker containers → run on Linux
# ✅ AWS/GCP/Azure VMs → Linux by default
# ✅ CI/CD pipelines (GitHub Actions, Jenkins)
# ✅ Kubernetes nodes → Linux

$ uname -a
Linux devbox 5.15.0 #1 SMP x86_64 GNU/Linux

# This single line tells you kernel, hostname, architecture
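To make the fields in that line concrete, the sketch below prints each one separately using standard uname flags. The variable names are illustrative only.

```shell
# Break the `uname -a` line into its parts, one flag per field.
kernel_name=$(uname -s)      # kernel name, e.g. Linux
host_name=$(uname -n)        # network node hostname
kernel_release=$(uname -r)   # kernel release, e.g. 5.15.0
machine_arch=$(uname -m)     # hardware architecture, e.g. x86_64 or aarch64

echo "kernel name:    $kernel_name"
echo "hostname:       $host_name"
echo "kernel release: $kernel_release"
echo "architecture:   $machine_arch"
```

Running the flags one at a time is a quick way for students to map each word of the combined line back to its meaning.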
🐧
Free & Open Source
No licensing costs. Companies save millions deploying Linux at scale.
Lightweight
Can run on minimal hardware — perfect for containers and cloud VMs.
🔧
Customizable
Every part of the OS can be tweaked, scripted, or automated.
🔒
Secure by design
Permissions, users, and groups are first-class citizens.

Practical: Inspecting a real server

When you SSH into a fresh cloud VM (AWS EC2, DigitalOcean, GCP), these are the first commands every DevOps engineer runs to "check the vitals".

bash — server-vitals.sh
# Which Linux distribution is this?
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="22.04.3 LTS (Jammy Jellyfish)"

# How much CPU and memory do we have?
$ lscpu | grep -E '^ *(Model name|CPU\(s\)):'
CPU(s): 4
Model name: Intel(R) Xeon(R) Platinum 8175M

$ free -h
       total   used    free    available
Mem:   7.6Gi   1.2Gi   3.4Gi   6.0Gi

# How much disk is available?
$ df -h /
Filesystem       Size  Used  Avail  Use%  Mounted on
/dev/nvme0n1p1   30G   6.4G  23G    22%   /

# Who am I and what's the hostname?
$ whoami && hostname && uptime
ubuntu
ip-10-0-1-42
10:24:07 up 14 days, 3:12, 1 user, load average: 0.08, 0.12, 0.10
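The checks above can be bundled into one function so the vitals print the same way on every server. This is a minimal sketch, not a standard tool: the function name server_vitals is hypothetical, and it assumes a Linux host where nproc, df, and awk are available.

```shell
# server_vitals: print a short summary of the host (hypothetical helper).
server_vitals() {
  local distro
  # Read the distro name in a subshell so the sourced variables do not leak out.
  if [ -r /etc/os-release ]; then
    distro=$( . /etc/os-release && echo "$PRETTY_NAME" )
  else
    distro="unknown"
  fi
  echo "distro:  $distro"
  echo "kernel:  $(uname -r)"
  echo "cpus:    $(nproc)"
  # df -P keeps each filesystem on one line; column 5 is Use%.
  echo "disk /:  $(df -P / | awk 'NR==2 {print $5}') used"
  # id -un and uname -n report the same values as whoami and hostname.
  echo "user:    $(id -un)@$(uname -n)"
}

server_vitals
```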

🎯 Practice Questions

Q1.
Name three SaaS products you use daily that almost certainly run on Linux servers. For one of them, explain why the team likely chose Linux over Windows Server.
Q2.
Run uname -a on your machine. Identify the kernel version, machine hardware architecture, and operating system from the single line of output.
Q3.
Which file holds the Linux distribution name and version? Show the command to print only the PRETTY_NAME line from it.
Answer:
The file is /etc/os-release. To extract just the pretty name:

grep PRETTY_NAME /etc/os-release

Or to strip the variable name and quotes:
. /etc/os-release && echo "$PRETTY_NAME"
Q4.
Explain the difference between a Linux distribution (Ubuntu, Fedora, Alpine) and the Linux kernel itself. Why do containers usually pick Alpine over Ubuntu?
💡 Think about image size and attack surface.
02
Terminal Navigation & File Management
Moving around the filesystem and managing files from the command line

How to explain to students

Compare the terminal to a GPS for your computer. pwd is "where am I?", ls is "what's around me?", and cd is "go there". Give students a real folder structure to navigate.

Live demo: Create a fake project structure on screen and let students replicate it. Nothing teaches mkdir and cp faster than actually using them.

bash — navigation demo
$ pwd  # Print current directory
/home/student

$ ls -la  # List all files with details
drwxr-xr-x 3 student student 4096 Apr 25 10:00 projects
-rw-r--r-- 1 student student 220 Apr 25 09:00 .bashrc

$ mkdir -p projects/webapp/config
$ cd projects/webapp
~/projects/webapp $ touch index.html app.js

$ cp index.html index.html.bak  # Backup before editing!
$ mkdir archive  # mv needs the target directory to exist
$ mv index.html.bak archive/
$ rm -rf archive/  # ⚠️ No recycle bin!
Key commands: pwd · ls -la · cd .. · mkdir -p · cp / mv / rm · ⚠️ rm -rf is permanent
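For the live demo, one low-risk option is to run every command inside a throwaway directory, so a mistyped rm -rf cannot touch real files. The sketch below assumes mktemp is available (it is on Ubuntu); the directory and file names are illustrative.

```shell
# Build the demo project inside a temporary sandbox directory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/projects/webapp/config" "$sandbox/projects/webapp/archive"

# Work in a command substitution (a subshell) so the caller's
# current directory is left unchanged.
listing=$(
  cd "$sandbox/projects/webapp" || exit 1
  touch index.html app.js
  cp index.html index.html.bak     # backup before editing
  mv index.html.bak archive/       # archive/ already exists, so this succeeds
  ls -R .
)
echo "$listing"

# Clean up: only the sandbox is removed.
rm -rf "$sandbox"
```

Students can repeat the destructive steps as often as they like because nothing outside the sandbox is touched.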

Practical: Finding and inspecting files like a DevOps engineer

In production you almost never cd blindly: you find, grep, and tail your way to the answer. The commands below solve most "where is X?" and "what's happening right now?" questions.

bash — search-and-inspect
# Find every nginx config file under /etc
$ find /etc -name "*.conf" -path "*nginx*"
/etc/nginx/nginx.conf
/etc/nginx/conf.d/default.conf

# Find files modified in the last 24 hours
$ find /var/log -type f -mtime -1

# Search inside files for a string (recursive, with line numbers)
$ grep -rn "ERROR" /var/log/app/
/var/log/app/api.log:142:ERROR connection refused

# Watch a log live (Ctrl+C to stop) — every DevOps engineer's bread and butter
$ tail -n 50 -f /var/log/syslog

# Where does this binary live? Is it even installed?
$ which docker && whereis docker
/usr/bin/docker
docker: /usr/bin/docker /etc/docker /usr/share/man/man1/docker.1.gz

# Disk hog hunt: top 10 biggest files and directories in /var (du -a lists both)
$ du -ah /var | sort -rh | head -n 10
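When grep -rn returns hundreds of matches, a per-file count is often a quicker first look. A small sketch using a throwaway directory with made-up log lines (the file names and messages are invented for the demo):

```shell
logdir=$(mktemp -d)
printf 'INFO started\nERROR connection refused\nERROR timeout\n' > "$logdir/api.log"
printf 'INFO ok\n' > "$logdir/worker.log"

# -r recurse into the directory, -c print a match count per file
grep -rc "ERROR" "$logdir" | sort

# -l lists only the names of files that contain at least one match
files_with_errors=$(grep -rl "ERROR" "$logdir")
echo "files with errors: $files_with_errors"

rm -rf "$logdir"
```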

🎯 Practice Questions

Q1.
Use find to locate every .conf file under /etc that was modified in the last 7 days. Show the command.
Q2.
What does cd - do? Try it after navigating between two different directories — describe what you observe.
Q3.
Print the last 50 lines of /var/log/syslog and follow new lines as they arrive. Write the exact one-line command.
Answer:
tail -n 50 -f /var/log/syslog

-n 50 sets how many existing lines to print first; -f ("follow") keeps the file open and streams new lines as they're appended. Press Ctrl+C to stop.
Q4.
Create the directory tree project/{src,test,docs,config} in a single mkdir command (no loops, no semicolons).
Q5.
Move all .log files from the current directory into an archive/ folder while preserving their original modification timestamps. Why might preserving timestamps matter for an audit log?
💡 Plain mv already preserves mtime — but think about what changes if you used cp instead.
Q6.
You SSH into a server and the disk is full. Write a one-liner that finds the 5 largest files in /var, sorted from biggest to smallest, in human-readable form.
03
File Permissions, Users, Groups & SSH Keys
Linux security model — who can read, write, and execute what

How to explain permissions

Use the apartment analogy: A file is an apartment. The owner has a key. The group is like the floor — neighbors with shared access. Others are strangers. Permissions (rwx) decide what each can do.

Teach chmod with the numeric system. It clicks instantly once students see 7 = 4+2+1 = read+write+execute.
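The arithmetic can be shown live with a few lines of shell. This is an illustrative sketch: octal_to_rwx is a hypothetical helper written for the demo, not a standard command, and it ignores special bits such as setuid and the sticky bit.

```shell
# octal_to_rwx: turn a 3-digit mode such as 755 into rwxr-xr-x.
octal_to_rwx() {
  local mode=$1 out="" digit r w x
  # Split the mode into single digits: 755 -> 7 5 5
  for digit in $(echo "$mode" | sed 's/./& /g'); do
    if [ $((digit & 4)) -ne 0 ]; then r="r"; else r="-"; fi   # 4 = read
    if [ $((digit & 2)) -ne 0 ]; then w="w"; else w="-"; fi   # 2 = write
    if [ $((digit & 1)) -ne 0 ]; then x="x"; else x="-"; fi   # 1 = execute
    out="$out$r$w$x"
  done
  echo "$out"
}

octal_to_rwx 755   # rwxr-xr-x
octal_to_rwx 644   # rw-r--r--
octal_to_rwx 600   # rw-------
```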

bash — permissions
# Read a permission string: -rwxr-xr--
# Position: [type][owner][group][others]
# r=4, w=2, x=1 → 7=rwx, 5=r-x, 4=r--

$ ls -l script.sh
-rw-r--r-- 1 student devs 512 Apr 25 script.sh

$ chmod 755 script.sh  # owner=rwx, group=r-x, others=r-x
$ ls -l script.sh
-rwxr-xr-x 1 student devs 512 Apr 25 script.sh

# SSH Key Setup
$ ssh-keygen -t ed25519 -C "student@devops"
Generating public/private ed25519 key pair.
Enter file: ~/.ssh/id_ed25519

$ ssh-copy-id student@server.example.com
Number of key(s) added: 1 ✓

# Now login without password:
$ ssh student@server.example.com
🔑
chmod 600 private keys
SSH private keys must be readable only by owner. SSH will refuse to work otherwise.
👥
useradd / usermod
Create users and add them to groups like sudo or docker.
🛡️
sudo vs su
sudo = one command as root. su = switch user entirely. Prefer sudo.

Practical: Real EC2 SSH workflow + troubleshooting

This is the exact sequence used to connect to an AWS EC2 instance — including the most common error beginners hit and how to fix it.

bash — ec2-ssh-workflow
# Download the .pem key from AWS Console and try to connect
$ ssh -i ~/Downloads/my-key.pem ubuntu@ec2-3-15-220-1.compute.amazonaws.com
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: UNPROTECTED PRIVATE KEY FILE! @
Permissions 0644 for 'my-key.pem' are too open.
It is required that your private key files are NOT accessible by others.

# Fix: tighten permissions to 0600 (owner read/write only)
$ chmod 600 ~/Downloads/my-key.pem
$ ls -l ~/Downloads/my-key.pem
-rw------- 1 student student 1675 Apr 25 my-key.pem

$ ssh -i ~/Downloads/my-key.pem ubuntu@ec2-3-15-220-1.compute.amazonaws.com
Welcome to Ubuntu 22.04.3 LTS (GNU/Linux 6.2.0-1017-aws x86_64)

# Better: move the key into ~/.ssh and add a host alias so you never type the long command again
$ mv ~/Downloads/my-key.pem ~/.ssh/
$ cat >> ~/.ssh/config <<'EOF'
Host prod-api
    HostName ec2-3-15-220-1.compute.amazonaws.com
    User ubuntu
    IdentityFile ~/.ssh/my-key.pem
EOF

$ ssh prod-api  # Now this is all you need!

# Bonus: change file ownership when copying app files
$ sudo chown -R www-data:www-data /var/www/html
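$ sudo chmod -R u=rwX,go=rX /var/www/html  # dirs 755, plain files 644; capital X adds execute only to directories and already-executable files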
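The key-permission fix can be checked before SSH ever complains. A minimal sketch, assuming GNU stat (stat -c '%a', as shipped on Ubuntu); the key here is an empty stand-in file created in a temporary directory, not a real key.

```shell
keydir=$(mktemp -d)
key="$keydir/my-key.pem"
touch "$key"
chmod 644 "$key"                      # simulate a freshly downloaded key

mode=$(stat -c '%a' "$key")           # numeric mode, e.g. 644
if [ "$mode" != "600" ] && [ "$mode" != "400" ]; then
  echo "key mode is $mode, tightening to 600"
  chmod 600 "$key"
fi

final_mode=$(stat -c '%a' "$key")
echo "final mode: $final_mode"
rm -rf "$keydir"
```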

🎯 Practice Questions

Q1.
Set deploy.sh so the owner can read/write/execute, the group can read/execute, and others have no access. What is the numeric permission and the chmod command?
Q2.
Why does SSH refuse to use a private key when its permissions are 0644? What is the minimum fix?
Answer:
SSH enforces strict permissions on private keys to prevent other users on the same machine from reading them. 0644 is world-readable, so anyone with an account on the machine can copy the key. SSH refuses to load it as a security measure.

Fix: chmod 600 ~/.ssh/id_ed25519 (owner read/write only). The parent ~/.ssh folder should also be 700.
Q3.
Add the user deploy to the docker group so they can run docker without sudo, then verify the change. What two commands do you run?
Q4.
Explain what the sticky bit (the t in drwxrwxrwt on /tmp) does. What problem would exist on a multi-user server without it?
💡 Think about what happens when two users both have write access to the same directory.
Q5.
A teammate ran chmod -R 777 /var/www/html "to fix a permissions error." List three concrete reasons this is dangerous in production.
04
Package Management & System Updates
apt (Ubuntu/Debian) and yum/dnf (RHEL/CentOS) — installing and updating software

How to explain to students

Compare apt to the App Store — but for the terminal, and way faster. Ask students: "Imagine installing Photoshop just by typing one command." That's what package managers do.

Emphasize that in DevOps, servers must be kept updated for security. Outdated packages are a top cause of breaches. apt upgrade is like pressing "Update All" on your phone.

bash — package management
# === Ubuntu / Debian (apt) ===
$ sudo apt update  # Refresh package list
$ sudo apt upgrade -y  # Upgrade all packages
$ sudo apt install nginx git curl -y
$ sudo apt remove nginx
$ apt search python3  # Find packages

# === RHEL / CentOS (yum/dnf) ===
$ sudo yum update -y
$ sudo dnf install nginx -y  # dnf = modern yum

# Check if a service is running after install
$ sudo systemctl status nginx
● nginx.service - A high performance web server
Active: active (running) since Fri 2025-04-25
Key commands: apt update · apt upgrade · apt install · yum / dnf · systemctl
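Scripts that must run on both distro families usually detect the package manager first. A hedged sketch: detect_pkg_manager is a hypothetical helper, and it only reports which tool is present on PATH without installing anything.

```shell
detect_pkg_manager() {
  # command -v succeeds only if the tool is found on PATH
  if command -v apt-get >/dev/null 2>&1; then
    echo "apt"
  elif command -v dnf >/dev/null 2>&1; then
    echo "dnf"          # checked before yum, since dnf is the newer tool
  elif command -v yum >/dev/null 2>&1; then
    echo "yum"
  else
    echo "unknown"
  fi
}

pm=$(detect_pkg_manager)
echo "package manager: $pm"
```

A provisioning script could branch on the result with a case statement and run the matching install command.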

Practical: Installing Docker from the official repository

Many production tools (Docker, Node.js, Postgres, Kubernetes) are either missing from the default Ubuntu repos or available there only in an older version. The standard pattern is: add GPG key → add repo → update → install. Memorise this flow once and you can install almost anything.

bash — install-docker-official.sh
# 1. Install prerequisites for adding HTTPS-based repos
$ sudo apt update
$ sudo apt install -y ca-certificates curl gnupg

# 2. Add Docker's official GPG key (proves the packages are authentic)
$ sudo install -m 0755 -d /etc/apt/keyrings
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
  sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

# 3. Add the Docker repo to apt's source list
$ echo "deb [signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list

# 4. Refresh and install
$ sudo apt update
$ sudo apt install -y docker-ce docker-ce-cli containerd.io

# 5. Verify it actually works
$ docker --version
Docker version 26.0.0, build 2ae903e
$ sudo systemctl enable --now docker  # Start now + auto-start on boot

# Bonus: which package owns the `dig` command?
# (apt-file is not installed by default: sudo apt install apt-file && sudo apt-file update)
$ apt-file search bin/dig
bind9-dnsutils: /usr/bin/dig

🎯 Practice Questions

Q1.
Install Node.js 20 LTS on Ubuntu using the NodeSource repository. Outline the GPG-key + repo-add + install steps without copying from the Docker example above.
Q2.
Why must you run apt update before apt install? What concretely goes wrong if you skip it?
Answer:
apt update refreshes the local package index — the list of "what packages and what versions exist in each configured repo."

If you skip it: apt may try to install a version that has since been removed from the mirror (404), or quietly install an older, vulnerable version because it doesn't know a newer one is available. After adding a new repo (like Docker's), apt update is mandatory — without it, apt has no idea the new repo's packages exist.
Q3.
On a fresh Ubuntu install, the dig command is missing. Find which package provides it (without Googling).
💡 The package apt-file or the command apt search can both help.
Q4.
What is the difference between apt remove pkg and apt purge pkg? Give a real scenario where you'd choose each.
Q5.
After upgrading a server, /boot fills up because old kernels keep accumulating. Which single command removes packages that are no longer needed?
05
Writing Bash Scripts for Automation
From one-liners to full automation scripts with variables, loops, and conditionals

How to explain to students

Tell students: "Every time you do something more than twice in the terminal, write a script." Scripts are just saved commands. Start with a simple "Hello, World" script, then build up to variables, if-else, and loops.

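The shebang line #!/bin/bash tells the OS which interpreter to use when the script is executed directly (./script.sh). Without it, the script is run by whatever shell happens to launch it, which may not be bash, so bash-specific syntax can fail.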

bash — first_script.sh
#!/bin/bash
# Always start with the shebang, alone on line 1 (a trailing comment would be passed to bash as an argument)

# Variables
NAME="DevOps Student"
DATE=$(date +%Y-%m-%d)  # Command substitution
echo "Hello $NAME — today is $DATE"

# If-else
DISK=$(df / | awk 'NR==2{print $5}' | tr -d '%')
if [ "$DISK" -gt 80 ]; then
  echo "⚠️ Disk usage is ${DISK}% — WARNING!"
else
  echo "✅ Disk usage is ${DISK}% — All good"
fi

# Loop through files
for FILE in /var/log/*.log; do
  echo "Found log: $FILE"
done
📝
Always use quotes
Use "$VAR" not $VAR to avoid word-splitting bugs with spaces.
🧪
Test with set -x
Add set -x at the top to print each command as it runs — great for debugging.
🚨
Exit on error
Add set -e to stop the script immediately if any command fails.
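The quoting rule is easiest to believe after seeing the argument count change. A short sketch with a made-up file name (count_args is a throwaway helper defined here, not a builtin):

```shell
# count_args prints how many arguments it received.
count_args() { echo "$#"; }

FILE="my report.txt"             # a value containing a space

unquoted=$(count_args $FILE)     # word-splits into two words: my, report.txt
quoted=$(count_args "$FILE")     # stays one argument

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

With the default IFS the first call sees two arguments and the second sees one, which is exactly the bug that bites commands like rm or cp on file names with spaces.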

Practical: A real "git pull and restart" deploy script

Many startups still deploy to a single VM with a script like this. It's small, real, and battle-tested — it pulls the latest code, installs dependencies, restarts the service, and reports status.

bash — deploy.sh
#!/bin/bash
set -euo pipefail  # strict mode

APP_DIR="/srv/myapp"
SERVICE="myapp"
BRANCH="${1:-main}"  # Default to main, override: ./deploy.sh staging

echo "→ Deploying $SERVICE on branch $BRANCH..."
cd "$APP_DIR"
PREV_COMMIT=$(git rev-parse HEAD)  # remember what is deployed now, for rollback

git fetch --all --prune
git checkout "$BRANCH"
git pull origin "$BRANCH"

npm ci --omit=dev  # reproducible install, no devDeps
npm run build

sudo systemctl restart "$SERVICE"
sleep 2

if systemctl is-active --quiet "$SERVICE"; then
  echo "✅ $SERVICE deployed successfully ($(git rev-parse --short HEAD))"
else
  echo "❌ $SERVICE failed to start, rolling back to ${PREV_COMMIT:0:7}"
  # Reset to the recorded commit: HEAD~1 would undo only one commit, and a pull can bring in several
  git reset --hard "$PREV_COMMIT" && sudo systemctl restart "$SERVICE"
  exit 1
fi
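A fixed sleep 2 can be too short for a slow-starting service. One common alternative is to poll with a bounded retry loop. A minimal sketch: the retry function is a hypothetical helper, not part of the deploy script above, and the attempt count and delay are arbitrary.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS tries have been used.
retry() {
  local attempts=$1 delay=$2 n=1
  shift 2
  while :; do
    if "$@"; then
      return 0                     # command succeeded
    fi
    if [ "$n" -ge "$attempts" ]; then
      return 1                     # out of attempts
    fi
    echo "attempt $n/$attempts failed, retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# In deploy.sh this could stand in for the sleep plus single check, for example:
#   if retry 5 2 systemctl is-active --quiet "$SERVICE"; then ...
```

Because retry is called as the condition of an if, a failing final attempt does not trip set -e; the script falls through to the rollback branch instead.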

🎯 Practice Questions

Q1.
Write a one-liner that counts how many .log files exist under /var/log recursively (including subdirectories).
Q2.
What is the difference between "$@" and "$*" in a bash script? Show a 3-line script that demonstrates the difference.
Q3.
Write a script that prints every argument passed to it, one per line, prefixed with its 1-based index. Example: ./args.sh foo bar prints 1: foo and 2: bar.
Answer:
#!/bin/bash
i=1
for arg in "$@"; do
  echo "$i: $arg"
  ((i++))
done

The quoting "$@" matters — it preserves arguments that contain spaces. $@ without quotes would split "hello world" into two iterations.
Q4.
Convert a script you've written to use set -euo pipefail. Explain in one line each what -e, -u, and -o pipefail protect against.
Q5.
Write a function log() that prefixes every message with the current ISO-8601 timestamp and writes to both stdout and /var/log/myscript.log in a single call.
💡 The tee -a command can split a stream to a file and the terminal at the same time.
Q6.
The deploy script above uses npm ci --omit=dev. Why npm ci instead of npm install in a deployment context?
06
Using AI to Write & Debug Bash Scripts
Leveraging AI tools as a DevOps pair programmer

How to explain to students

Teach students that AI is a force multiplier, not a replacement. The goal is to know enough Linux to verify and modify what AI generates. Bad prompt = bad script. Good prompt = great starting point.

Show a before/after: a vague prompt gives a generic script, a detailed prompt gives one that is much closer to production-ready. Teach students to always read, test, and understand AI output before running it on a server.

bash — AI prompting examples
# ❌ Weak prompt → weak script
"Write a bash script to monitor disk"

# ✅ Strong prompt → much stronger first draft
"Write a bash script that:
- Checks disk usage on / every 5 minutes
- Alerts if usage exceeds 85%
- Logs to /var/log/disk-monitor.log with timestamps
- Sends email using sendmail if available
- Handles errors and uses set -e, set -u"

# Debug tip: Paste error into AI with context
./monitor.sh: line 12: [: too many arguments
# Ask: "What does this bash error mean and how do I fix it?"
Tools: ChatGPT · Claude · GitHub Copilot. Rules: always review AI output · understand before running
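The error in the debug tip usually comes from an unquoted variable inside [ ]. It helps to reproduce it in a few lines before pasting it into an AI tool, so the explanation you get back can be checked against something runnable. A small sketch; the variable name and value are made up.

```shell
STATUS="disk almost full"     # a value with spaces

# Unquoted: the test receives  [ disk almost full = ok ]  and errors out
# (bash reports "[: too many arguments"). The message is hidden here on purpose.
if [ $STATUS = "ok" ] 2>/dev/null; then
  unquoted_result="matched"
else
  unquoted_result="error or no match"
fi

# Quoted: the value stays a single word and the comparison works.
if [ "$STATUS" = "disk almost full" ]; then
  quoted_result="matched"
else
  quoted_result="no match"
fi

echo "unquoted: $unquoted_result"
echo "quoted:   $quoted_result"
```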

Practical: Using AI to analyse a real log file

A common DevOps task: "Why did the API spike at 2am last night?" Instead of grepping by hand, you can paste a log snippet to AI and ask it to summarise. The trick is in the framing.

bash — ai-log-analysis prompt
# ✅ A great prompt for log analysis
"You are a senior SRE. Below is a 200-line nginx access log from a production
API server during a 5-minute window. Identify:
1. The top 5 IPs by request count
2. Any 5xx error patterns and which endpoints failed
3. Suspicious activity (unusual user agents, scraping, brute-force)
4. Suggest 3 mitigations I can apply with nginx or fail2ban
Format as a markdown table where helpful. Skip generic advice."

# Then provide actual log lines:
$ tail -n 200 /var/log/nginx/access.log | pbcopy  # pbcopy is macOS-only; on a Linux desktop use xclip -selection clipboard

# Even better: extract structured data first, then ask AI to interpret
$ awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head
4218 198.51.100.42
873 203.0.113.7
642 198.51.100.99

# ⚠️ Never paste real production credentials, customer PII, or API keys
# Replace them with <PLACEHOLDER> before sending to any AI tool.

🎯 Practice Questions

Q1.
Take the vague prompt "write a backup script" and rewrite it as a 5-bullet detailed prompt that would produce a production-ready result. Each bullet should remove a specific ambiguity.
Q2.
Why should you never paste production database URLs or API keys into an AI tool, even a "secure" one?
Answer:
Many hosted AI tools retain conversations for safety review, abuse detection, and (depending on the plan and settings) model improvement. Once pasted, your secret may be stored on systems you do not control, be visible to the provider's staff, and could in principle leak through later model output. Even self-hosted tools usually persist conversations to disk, and that disk gets backed up.

Rule: always replace secrets with placeholders like <DB_URL> or <API_KEY> in your prompt, then substitute the real value locally before running the script.
Q3.
Write a prompt that asks the AI to explain a bash error message rather than fix it. Why is asking for an explanation often more useful than asking for a fix?
Q4.
An AI gives you a script that uses eval $USER_INPUT. Should you ship it? Why or why not?
💡 Search for "bash eval security" if you're unsure — this is a classic vulnerability.
07
Project: Disk Usage Monitor with Alerts
Real-world bash script that monitors disk, logs results, and sends alerts

How to explain to students

This is where everything comes together. Walk through the script line by line before asking students to build it. Emphasize: scripts like this one run on real production servers.

bash — disk_monitor.sh (complete project)
#!/bin/bash
set -euo pipefail  # Strict mode

# ── Configuration ──────────────────────
THRESHOLD=85
LOG_FILE="/var/log/disk_monitor.log"
ALERT_EMAIL="admin@company.com"
# Timestamps are generated inside log() so each line records its own time

# ── Functions ───────────────────────────
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"; }

check_disk() {
  local USAGE
  USAGE=$(df / | awk 'NR==2 {print $5}' | tr -d '%')
  log "Disk usage: ${USAGE}%"

  if [[ "$USAGE" -ge "$THRESHOLD" ]]; then
    log "🚨 ALERT: Disk at ${USAGE}% — threshold is ${THRESHOLD}%"
    echo "Disk alert on $(hostname)" | mail -s "Disk Alert" "$ALERT_EMAIL" 2>/dev/null || true
  else
    log "✅ OK — ${USAGE}% used"
  fi
}

# ── Main ────────────────────────────────
log "=== Disk Monitor Started ==="
check_disk

# Schedule with cron: */5 * * * * /usr/local/bin/disk_monitor.sh
📅
Add to cron
Run crontab -e and schedule this script every 5 minutes automatically.
📂
Log rotation
Mention logrotate — logs need to be rotated or they fill the disk you're monitoring!
🧩
Extend it
Challenge: add monitoring for CPU, memory, or check multiple mount points.
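To test the alert path without actually filling a disk, the comparison can be pulled into a function that takes the usage as an argument. A minimal sketch under that assumption; disk_status is a hypothetical name and is not part of disk_monitor.sh as written above.

```shell
# disk_status USAGE THRESHOLD -> prints ALERT or OK
disk_status() {
  local usage=$1 threshold=$2
  if [ "$usage" -ge "$threshold" ]; then
    echo "ALERT"
  else
    echo "OK"
  fi
}

# Feed it fake numbers to exercise both branches:
disk_status 91 85    # ALERT
disk_status 85 85    # ALERT (-ge includes the threshold itself)
disk_status 42 85    # OK

# Real usage, parsed as in the script, with -P added so a long device
# name cannot wrap the df output onto a second line:
current=$(df -P / | awk 'NR==2 {print $5}' | tr -d '%')
echo "current /: ${current}% -> $(disk_status "$current" 85)"
```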

Practical: Extending the monitor with CPU + memory checks

Once disk monitoring works, students should layer on CPU and memory checks. Same shape, different commands. This is the modular pattern real observability scripts use — one function per metric.

bash — system_monitor.sh (extended)
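check_memory() {
  local USED_PCT
  USED_PCT=$(free | awk '/Mem:/ {printf "%.0f", $3/$2 * 100}')
  log "Memory: ${USED_PCT}% used"
  # Use if rather than "[[ ... ]] && log": under set -e a false test as the
  # last command makes the function return 1 and aborts the whole script
  if [[ "$USED_PCT" -ge 90 ]]; then
    log "🚨 Memory critical: ${USED_PCT}%"
  fi
}

check_cpu() {
  # 1-min load average vs CPU count
  local LOAD CORES
  LOAD=$(awk '{print $1}' /proc/loadavg)
  CORES=$(nproc)
  log "Load avg (1m): $LOAD on $CORES cores"
  # awk does the float comparison; it exits 0 when load per core is above 0.8
  if awk -v l="$LOAD" -v c="$CORES" 'BEGIN{exit !(l/c > 0.8)}'; then
    log "🚨 CPU pressure: load $LOAD on $CORES cores"
  fi
}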

check_services() {
  for svc in nginx docker; do
    if systemctl is-active --quiet "$svc"; then
      log "✅ $svc is active"
    else
      log "❌ $svc is DOWN"
    fi
  done
}

check_disk; check_memory; check_cpu; check_services
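One subtlety when check functions like these run under set -e: if a function's last command is a test that comes out false, the function returns non-zero and the whole script stops. The sketch below shows the effect in separate bash processes so it is safe to run; the function names are invented for the demo.

```shell
# Each demo runs in its own bash process so its set -e cannot stop this script.
short_status=0
bash -c '
  set -e
  check_short() { [ "$1" -ge 90 ] && echo "critical: $1"; }
  check_short 40          # test is false, function returns 1, set -e exits here
  echo "short form: reached the end"
' || short_status=$?

if_status=0
bash -c '
  set -e
  check_if() {
    if [ "$1" -ge 90 ]; then echo "critical: $1"; fi
  }
  check_if 40             # a false test just skips the body, function returns 0
  echo "if form: reached the end"
' || if_status=$?

echo "short form exit status: $short_status"
echo "if form exit status:    $if_status"
```

Only the if form reaches its final echo, which is why an if block is the safer shape for threshold checks in a strict-mode script.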

🎯 Practice Questions

Q1.
Extend disk_monitor.sh to also report which mount point is the culprit when usage is over the threshold. df can show all mounts — adapt the parsing.
Q2.
In the script, why use tee -a "$LOG_FILE" instead of >> "$LOG_FILE"? Both append. What does tee give you that >> doesn't?
Q3.
Convert this cron schedule to run only on weekdays (Mon–Fri) at 9 AM: */5 * * * *. Write the new cron expression.
Answer:
0 9 * * 1-5 /usr/local/bin/disk_monitor.sh

Field-by-field: 0 minute, 9 hour, * any day-of-month, * any month, 1-5 Monday through Friday. Use crontab.guru to sanity-check expressions.
Q4.
Add log rotation for /var/log/disk_monitor.log using logrotate. Where does the config file live, and what's the minimum set of directives you need (rotate count, frequency, compression)?
Q5.
The cron job runs but the script silently does nothing. List three of the most common reasons (cron-specific gotchas).
💡 Hint: $PATH, the working directory, and where stdout goes when there's no terminal.
08
Quiz: Linux Commands & Permissions
10 MCQs + 2 fill-in-the-command questions

Sample quiz questions (interactive)

Q1. Which command shows the current working directory?
  A. ls
  B. cd
  C. pwd
  D. echo
Q2. What does chmod 644 file.txt mean?
  A. Everyone can read and write
  B. Owner: rw-, Group: r--, Others: r--
  C. Owner: rwx, Group: r-x, Others: r--
  D. Read-only for everyone
Q3. Which command installs a package on Ubuntu?
  A. yum install nginx
  B. brew install nginx
  C. sudo apt install nginx
  D. sudo get nginx
Q4. What does set -e do in a bash script?
  A. Exit immediately if any command fails
  B. Enable debugging output
  C. Set an environment variable
  D. Suppress all errors
Q5. Which SSH key type is recommended for new setups?
  A. RSA 1024-bit
  B. DSA
  C. ed25519
  D. ECDSA 256-bit

Fill-in-the-command

Fill 1: What command makes script.sh executable by the owner only?
Fill 2: Write the cron expression to run a script every day at 2:30 AM.
09
Assignment: Backup Script with Logging
Write a bash script that backs up a folder and logs all results

How to explain to students

Frame this as a real job task: "Your manager asks you to set up an automated nightly backup for the /var/www/html folder. It must log success or failure. You have 24 hours." This mirrors what junior DevOps engineers actually do in their first week.

📋 Assignment Requirements

  • Accept a source folder as a script argument (e.g., ./backup.sh /var/www/html)
  • Create a timestamped .tar.gz backup in /backups/ directory
  • Log success or failure with timestamps to /var/log/backup.log
  • Handle edge cases: source not found, backup dir doesn't exist, disk full
  • Use functions, variables, and proper error handling (set -e)
  • Bonus: Delete backups older than 7 days automatically
  • Bonus: Send an email alert on failure
bash — expected output when grading
$ ./backup.sh /var/www/html
[2025-04-25 02:30:01] Backup started for: /var/www/html
[2025-04-25 02:30:03] ✅ Backup created: /backups/html_2025-04-25_02-30-01.tar.gz
[2025-04-25 02:30:03] Size: 24M
[2025-04-25 02:30:03] Cleaning up backups older than 7 days...
[2025-04-25 02:30:03] ✅ Backup completed successfully

$ cat /var/log/backup.log
[2025-04-25 02:30:01] Backup started for: /var/www/html
[2025-04-25 02:30:03] ✅ Backup created: /backups/html_2025-04-25_02-30-01.tar.gz
📊
Grading rubric
Script runs: 30pts. Logging works: 25pts. Error handling: 25pts. Code quality: 20pts.
🎯
Common mistakes
Hardcoded paths, no quotes around variables, missing shebang, no error checking.
💡
Extend challenge
Add the cron job that runs this script automatically at 2 AM every night.

🎯 Practice Questions (Stretch)

Q1.
Extend the assignment to support incremental backups using rsync --link-dest. Why is this dramatically more space-efficient than full .tar.gz snapshots?
Q2.
Modify the script so it uploads every backup to an S3 bucket using the AWS CLI (aws s3 cp). What permissions must the EC2 instance role have?
Q3.
Schedule the backup to run nightly at 2:00 AM using cron. Write the full crontab line and the path you'd put it under.
Answer:
Edit the user's crontab with crontab -e and add:

0 2 * * * /usr/local/bin/backup.sh /var/www/html >> /var/log/backup-cron.log 2>&1

Or for a system-wide schedule, drop a file at /etc/cron.d/backup with the same line plus a username field: 0 2 * * * root /usr/local/bin/backup.sh /var/www/html. Always redirect stdout+stderr to a log so silent failures are catchable.
Q4.
Your script suddenly fails because /backups is full. Make the script self-healing — before creating a new backup, ensure there's at least 1 GB free, deleting the oldest backups until there is.
Q5.
A backup that never gets restored is worthless. Write a companion verify.sh that picks the latest backup, untars it into a temporary directory, and confirms the file count matches the source.
💡 find … | wc -l on both sides + mktemp -d for the temp folder.