Lesson 12: CI/CD and DevOps
Learning Objectives
By the end of this lesson, you will be able to:
- Implement continuous integration pipelines that automatically build, test, and validate code changes
- Design deployment pipelines for continuous delivery and deployment to production environments
- Apply containerization using Docker to create consistent, portable application environments
- Use infrastructure as code to manage and provision infrastructure through automated, version-controlled configuration
- Adopt DevOps practices that foster collaboration between development and operations for faster, more reliable releases
Introduction
Traditional software development separated development and operations into silos, creating friction that slowed releases and increased failures. Developers “threw code over the wall” to operations, who struggled to deploy and maintain unfamiliar software. This handoff approach led to long deployment cycles, environment inconsistencies (“it works on my machine”), and finger-pointing when things went wrong [1].
DevOps emerged to break down these silos, fostering collaboration between development and operations through shared responsibilities, automated processes, and cultural change. The goal: enable teams to build, test, and release software rapidly, frequently, and reliably. Companies practicing DevOps deploy hundreds of times daily while maintaining high quality—something impossible with traditional approaches [2].
Continuous Integration (CI) and Continuous Deployment (CD) are DevOps’s technical foundations. CI automatically builds and tests every code change, catching integration problems immediately. CD extends CI by automatically deploying changes to production after passing tests. Together, CI/CD enables the rapid release cycles that define modern software development [3].
This lesson covers CI/CD pipelines, containerization with Docker, infrastructure as code, deployment strategies, and the cultural practices that make DevOps successful. Understanding these concepts is essential for modern software engineering—whether you’re building web applications, mobile apps, or distributed systems [4].
Core Content
Continuous Integration: Automated Build and Test
Continuous Integration means developers integrate code into a shared repository frequently—multiple times daily. Each integration triggers an automated build and test process that verifies the code works correctly [1].
Core CI Principles:
- Maintain a single source repository (Git, etc.)
- Automate the build (one command builds everything)
- Make builds self-testing (automated test suites)
- Everyone commits frequently (at least daily)
- Build and test quickly (< 10 minutes ideally)
- Test in a production-like environment
- Make it easy to get latest deliverables
- Everyone can see build results
- Automate deployment to test environments
Basic CI Pipeline:
# GitHub Actions example (.github/workflows/ci.yml)
name: CI Pipeline

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install -r requirements-dev.txt
      - name: Run linters
        run: |
          flake8 src/ --max-line-length=100
          black --check src/
          mypy src/
      - name: Run unit tests
        run: |
          pytest tests/unit -v --cov=src --cov-report=xml
      - name: Run integration tests
        run: |
          pytest tests/integration -v
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
      - name: Build application
        run: |
          python setup.py build
      - name: Create artifacts
        if: success()
        run: |
          python setup.py sdist bdist_wheel
      - name: Upload artifacts
        uses: actions/upload-artifact@v3
        with:
          name: dist-packages
          path: dist/

Jenkins Pipeline Example:
// Jenkinsfile
pipeline {
    agent any

    environment {
        PYTHON_VERSION = '3.11'
    }

    stages {
        stage('Checkout') {
            steps {
                checkout scm
            }
        }
        stage('Install Dependencies') {
            steps {
                sh '''
                    python -m venv venv
                    . venv/bin/activate
                    pip install -r requirements.txt
                    pip install -r requirements-dev.txt
                '''
            }
        }
        stage('Lint and Type Check') {
            parallel {
                stage('Flake8') {
                    steps {
                        sh '. venv/bin/activate && flake8 src/'
                    }
                }
                stage('Black') {
                    steps {
                        sh '. venv/bin/activate && black --check src/'
                    }
                }
                stage('MyPy') {
                    steps {
                        sh '. venv/bin/activate && mypy src/'
                    }
                }
            }
        }
        stage('Test') {
            parallel {
                stage('Unit Tests') {
                    steps {
                        sh '''
                            . venv/bin/activate
                            pytest tests/unit -v --junit-xml=test-results/unit.xml
                        '''
                    }
                }
                stage('Integration Tests') {
                    steps {
                        sh '''
                            . venv/bin/activate
                            pytest tests/integration -v --junit-xml=test-results/integration.xml
                        '''
                    }
                }
            }
        }
        stage('Build') {
            steps {
                sh '''
                    . venv/bin/activate
                    python setup.py bdist_wheel
                '''
            }
        }
        stage('Archive Artifacts') {
            steps {
                archiveArtifacts artifacts: 'dist/*.whl', fingerprint: true
                junit 'test-results/*.xml'
            }
        }
    }

    post {
        failure {
            mail to: '[email protected]',
                 subject: "Build Failed: ${env.JOB_NAME} - ${env.BUILD_NUMBER}",
                 body: "Build failed. Check console output at ${env.BUILD_URL}"
        }
    }
}

CI Best Practices:
- Fail fast: Run quick tests first (linting, unit tests); see the local script sketch after this list
- Fail loud: Notify team immediately when builds break
- Keep builds fast: Optimize to stay under 10 minutes
- Fix broken builds immediately: Don’t commit on broken builds
- Run tests locally first: Don’t use CI as your first test run
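To make "fail fast" and "run tests locally first" concrete, here is a minimal pre-push script in Python. It is a sketch, not a standard tool: it assumes the same checkers the pipelines above use (flake8, black, mypy, pytest) and runs them cheapest-first, stopping at the first failure.

#!/usr/bin/env python3
"""Minimal local pre-push check: run the cheapest checks first and stop at
the first failure, mirroring the fail-fast ordering of the CI pipeline.
The commands below are assumptions -- adjust to match your project."""
import subprocess
import sys

# Ordered cheapest-first so a style error fails in seconds,
# not after a full test run.
CHECKS = [
    ["flake8", "src/", "--max-line-length=100"],
    ["black", "--check", "src/"],
    ["mypy", "src/"],
    ["pytest", "tests/unit", "-q"],
]

for cmd in CHECKS:
    print(f"--> {' '.join(cmd)}")
    if subprocess.run(cmd).returncode != 0:
        # Fail fast: report the first broken check and stop.
        sys.exit(f"Check failed: {' '.join(cmd)}")

print("All local checks passed -- safe to push.")

Wiring a script like this into a Git pre-push hook means CI rarely has to be the first place a cheap check fails.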
Continuous Deployment: Automated Release
Continuous Deployment extends CI by automatically deploying code that passes all tests to production. CD means every commit that passes the pipeline goes live [3]. (Note that the pipeline below keeps a manual approval gate on the production stage, which is strictly continuous delivery; drop the `when: manual` line for true continuous deployment.)
Deployment Pipeline Stages:
# GitLab CI/CD example (.gitlab-ci.yml)
stages:
  - build
  - test
  - deploy_staging
  - test_staging
  - deploy_production

variables:
  DOCKER_IMAGE: registry.example.com/myapp

build:
  stage: build
  script:
    - docker build -t $DOCKER_IMAGE:$CI_COMMIT_SHA .
    - docker push $DOCKER_IMAGE:$CI_COMMIT_SHA

unit_tests:
  stage: test
  script:
    - docker run $DOCKER_IMAGE:$CI_COMMIT_SHA pytest tests/unit

integration_tests:
  stage: test
  script:
    - docker run $DOCKER_IMAGE:$CI_COMMIT_SHA pytest tests/integration

deploy_staging:
  stage: deploy_staging
  script:
    - kubectl set image deployment/myapp myapp=$DOCKER_IMAGE:$CI_COMMIT_SHA -n staging
    - kubectl rollout status deployment/myapp -n staging
  environment:
    name: staging
    url: https://staging.example.com

smoke_tests_staging:
  stage: test_staging
  script:
    - curl -f https://staging.example.com/health || exit 1
    - pytest tests/smoke --base-url=https://staging.example.com

deploy_production:
  stage: deploy_production
  script:
    - kubectl set image deployment/myapp myapp=$DOCKER_IMAGE:$CI_COMMIT_SHA -n production
    - kubectl rollout status deployment/myapp -n production
  environment:
    name: production
    url: https://example.com
  when: manual  # Requires manual approval
  only:
    - main  # Only deploy from main branch

Deployment Strategies:
Blue-Green Deployment:
# Two identical environments: blue (current) and green (new)
deploy_blue_green:
  script:
    # Deploy to green environment
    - kubectl apply -f k8s/deployment-green.yml
    - kubectl wait --for=condition=available deployment/myapp-green
    # Run smoke tests on green
    - pytest tests/smoke --base-url=https://green.example.com
    # Switch traffic from blue to green
    - kubectl patch service myapp -p '{"spec":{"selector":{"version":"green"}}}'
    # Keep blue environment for quick rollback if needed

Canary Deployment:
# Gradually shift traffic to the new version.
# NOTE: "kubectl set weight" is illustrative pseudo-syntax; plain kubectl
# cannot split traffic by weight. In practice this is done with a service
# mesh (e.g., Istio) or ingress canary annotations.
deploy_canary:
  script:
    # Deploy canary with 10% traffic
    - kubectl apply -f k8s/deployment-canary.yml
    - kubectl set weight deployment/myapp-canary 10
    # Monitor metrics for 15 minutes
    - ./scripts/monitor_canary.sh --duration=15m
    # If metrics look good, increase to 50%
    - kubectl set weight deployment/myapp-canary 50
    - ./scripts/monitor_canary.sh --duration=15m
    # If still good, full rollout
    - kubectl set weight deployment/myapp-canary 100

Rolling Deployment:
# Gradually replace old pods with new ones
deploy_rolling:
  script:
    - |
      kubectl set image deployment/myapp \
        myapp=$DOCKER_IMAGE:$CI_COMMIT_SHA \
        --record
    - kubectl rollout status deployment/myapp
    - kubectl rollout history deployment/myapp

Containerization with Docker
Docker packages applications with their dependencies into containers—lightweight, portable units that run consistently across environments [2].
Dockerfile Example:
# Multi-stage build for optimized images
FROM python:3.11-slim AS builder

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Final stage
FROM python:3.11-slim

WORKDIR /app

# Create non-root user first so copied files can be owned by it
RUN useradd -m appuser

# Copy only necessary files from builder (into the non-root user's home,
# since /root is unreadable once we drop privileges)
COPY --from=builder --chown=appuser:appuser /root/.local /home/appuser/.local
COPY --chown=appuser:appuser src/ ./src/
COPY --chown=appuser:appuser config/ ./config/

# Make sure scripts in .local are usable
ENV PATH=/home/appuser/.local/bin:$PATH

USER appuser

# Health check (raise_for_status makes non-2xx responses fail the check)
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD python -c "import requests; requests.get('http://localhost:8000/health').raise_for_status()"

# Expose port
EXPOSE 8000

# Run application
CMD ["python", "-m", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]

Docker Compose for Local Development:
# docker-compose.yml
version: '3.8'

services:
  web:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/myapp
      - REDIS_URL=redis://redis:6379
    depends_on:
      - db
      - redis
    volumes:
      - ./src:/app/src  # Mount code for live reload
    command: python -m uvicorn src.main:app --reload --host 0.0.0.0

  db:
    image: postgres:15-alpine
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=myapp
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - web

volumes:
  postgres_data:

Infrastructure as Code
Infrastructure as Code (IaC) manages and provisions infrastructure through machine-readable definition files rather than manual configuration [4].
Kubernetes Deployment:
# k8s/deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:latest
          ports:
            - containerPort: 8000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: myapp-secrets
                  key: database-url
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: LoadBalancer
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Terraform for Cloud Infrastructure:
# main.tf
provider "aws" {
  region = "us-east-1"
}

# Data source for the AZ lookup used by the subnets below
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name = "myapp-vpc"
  }
}

resource "aws_subnet" "public" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name = "myapp-public-${count.index}"
  }
}

resource "aws_ecs_cluster" "main" {
  name = "myapp-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

resource "aws_ecs_task_definition" "app" {
  family                   = "myapp"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = "256"
  memory                   = "512"

  container_definitions = jsonencode([{
    name  = "myapp"
    image = "registry.example.com/myapp:latest"
    portMappings = [{
      containerPort = 8000
      protocol      = "tcp"
    }]
    environment = [
      {
        name  = "ENV"
        value = "production"
      }
    ]
    logConfiguration = {
      logDriver = "awslogs"
      options = {
        "awslogs-group"         = "/ecs/myapp"
        "awslogs-region"        = "us-east-1"
        "awslogs-stream-prefix" = "ecs"
      }
    }
  }])
}

Monitoring and Observability
DevOps requires comprehensive monitoring to ensure system health and quickly diagnose issues [2].
Prometheus Metrics:
# app/metrics.py
import time

from fastapi import FastAPI, Request, Response
from prometheus_client import Counter, Histogram, Gauge, generate_latest

app = FastAPI()

# Counters
request_count = Counter(
    'http_requests_total',
    'Total HTTP requests',
    ['method', 'endpoint', 'status']
)
error_count = Counter(
    'http_errors_total',
    'Total HTTP errors',
    ['endpoint', 'error_type']
)

# Histograms
request_duration = Histogram(
    'http_request_duration_seconds',
    'HTTP request latency',
    ['method', 'endpoint']
)

# Gauges
active_users = Gauge(
    'active_users',
    'Number of active users'
)
database_connections = Gauge(
    'database_connections_active',
    'Active database connections'
)

# Middleware to track metrics on every request
@app.middleware("http")
async def metrics_middleware(request: Request, call_next):
    method = request.method
    endpoint = request.url.path
    start_time = time.time()
    try:
        response = await call_next(request)
        status = response.status_code
        request_count.labels(method=method, endpoint=endpoint, status=status).inc()
        duration = time.time() - start_time
        request_duration.labels(method=method, endpoint=endpoint).observe(duration)
        return response
    except Exception as e:
        error_count.labels(endpoint=endpoint, error_type=type(e).__name__).inc()
        raise

# Metrics endpoint for Prometheus to scrape
@app.get("/metrics")
async def metrics():
    return Response(content=generate_latest(), media_type="text/plain")

Logging Best Practices:
import logging
import json
from datetime import datetime

# Structured logging
class JSONFormatter(logging.Formatter):
    def format(self, record):
        log_data = {
            'timestamp': datetime.utcnow().isoformat(),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module,
            'function': record.funcName,
        }
        # Include request-scoped context when supplied via `extra`
        if hasattr(record, 'user_id'):
            log_data['user_id'] = record.user_id
        if hasattr(record, 'request_id'):
            log_data['request_id'] = record.request_id
        if record.exc_info:
            log_data['exception'] = self.formatException(record.exc_info)
        return json.dumps(log_data)

# Usage: attach the formatter to a handler, then log with extra context
handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logger = logging.getLogger(__name__)
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info('User logged in', extra={'user_id': 123, 'request_id': 'abc-def'})

Common Pitfalls
Pitfall 1: Manual Deployment Steps
Having manual steps in deployment processes defeats automation’s purpose and introduces human error.
Best Practice: Fully automate deployments. If something requires manual intervention, script it and add to the pipeline [3].
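For example, a database migration that someone used to run by hand can be wrapped in a script the pipeline invokes. A minimal sketch, assuming Alembic is the project's migration tool:

#!/usr/bin/env python3
"""Sketch of scripting a formerly manual release step (a database
migration) so the pipeline can run it. Assumes Alembic."""
import subprocess
import sys

# Apply all pending migrations; a non-zero exit aborts the deployment.
result = subprocess.run(["alembic", "upgrade", "head"])
if result.returncode != 0:
    sys.exit("Migration failed -- aborting deployment.")
print("Migrations applied.")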
Pitfall 2: Insufficient Test Coverage in CI
Running only unit tests in CI misses integration issues that cause production failures.
Best Practice: Include unit, integration, and smoke tests in CI. Follow the test pyramid: many unit tests, some integration tests, few end-to-end tests [1] (see the marker sketch below).
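One way to encode the pyramid is with pytest markers, so CI can run each tier at its own cadence. A minimal sketch; the marker names are assumptions and should be registered in pytest.ini to avoid warnings:

# tests/test_pyramid_sketch.py
# Run tiers separately, e.g.:
#   pytest -m unit          # every commit (fast)
#   pytest -m integration   # every merge
#   pytest -m smoke         # after each deployment
import pytest

@pytest.mark.unit
def test_discount_math():
    # Many small, fast unit tests form the base of the pyramid.
    assert round(100 * 0.85, 2) == 85.0

@pytest.mark.integration
def test_order_persists_and_loads(tmp_path):
    # Fewer integration tests exercise real I/O (here: the filesystem).
    f = tmp_path / "order.txt"
    f.write_text("order-42")
    assert f.read_text() == "order-42"

@pytest.mark.smoke
def test_health_payload_shape():
    # Fewest smoke tests; this one only sketches the assertion.
    payload = {"status": "ok"}  # stand-in for a real HTTP response body
    assert payload["status"] == "ok"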
Pitfall 3: No Rollback Strategy
Deploying without a quick rollback plan leaves you helpless when deployments fail.
Best Practice: Implement rollback mechanisms (blue-green, previous container versions). Practice rollbacks regularly [3].
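As a hedged illustration, the sketch below automates the rollback decision for a Kubernetes deployment: it assumes a deployment named myapp and a public /health endpoint (both placeholders), polls the endpoint after a rollout, and calls kubectl rollout undo if the service never reports healthy.

"""Minimal automated-rollback sketch; names and URLs are placeholders."""
import subprocess
import sys
import time

import requests

HEALTH_URL = "https://example.com/health"  # assumed health endpoint
DEPLOYMENT = "deployment/myapp"            # assumed deployment name

def healthy(retries: int = 5, delay: float = 10.0) -> bool:
    """Poll the health endpoint, tolerating transient failures."""
    for _ in range(retries):
        try:
            if requests.get(HEALTH_URL, timeout=3).status_code == 200:
                return True
        except requests.RequestException:
            pass
        time.sleep(delay)
    return False

if not healthy():
    # Roll back to the previous ReplicaSet revision.
    subprocess.run(["kubectl", "rollout", "undo", DEPLOYMENT], check=True)
    sys.exit("Deployment unhealthy -- rolled back to previous revision.")
print("Deployment healthy.")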
Pitfall 4: Ignoring Monitoring
Deploying without monitoring means you don’t know when things break until users complain.
Best Practice: Implement comprehensive monitoring with metrics, logs, and alerts. Monitor application and infrastructure health [2].
Summary
CI/CD and DevOps transform software delivery from slow, error-prone manual processes to fast, reliable automated pipelines. Continuous Integration automatically builds and tests every code change, catching issues immediately. Continuous Deployment extends CI by automatically releasing changes to production. Docker provides consistent, portable containers. Infrastructure as code manages infrastructure through version-controlled configuration. Monitoring ensures visibility into system health. Together, these practices enable rapid, reliable software delivery that defines modern development.
Practice Quiz
Question 1: Your CI pipeline takes 45 minutes to run, so developers don’t run it before pushing. What’s wrong, and how should you fix it?
Answer: The pipeline is too slow. CI should run in under 10 minutes ideally [1].
Solutions:
- Parallelize: Run tests in parallel
- Optimize: Use caching for dependencies
- Prioritize: Run fast tests first (linting, unit tests), slow tests later
- Separate: Run comprehensive tests nightly, essential tests on every commit
Question 2: Your team deploys to production weekly on Fridays. Why is this problematic from a DevOps perspective?
Answer: Infrequent deployments and Friday deployments violate DevOps principles [2].
Problems:
- Large batches of changes increase risk
- Friday deployments leave no time to fix issues
- Infrequent deployments make each one scary
DevOps approach:
- Deploy frequently (daily or more)
- Small batches reduce risk
- Never deploy on Fridays (or ensure weekend support)
- Automate everything
- Enable quick rollbacks
Question 3: What’s the difference between continuous delivery and continuous deployment?
Answer:
- Continuous Delivery: Code is always deployable, but requires manual approval to deploy to production
- Continuous Deployment: Code automatically deploys to production after passing all tests [3]
Both require automated testing and deployment pipelines. The difference is the final human approval gate.
References
[1] Humble, J., & Farley, D. (2010). Continuous Delivery. Addison-Wesley. URL: https://www.oreilly.com/library/view/continuous-delivery-reliable/9780321670250/, Quote: “Continuous integration requires that every commit triggers an automated build and test process. This catches integration problems immediately rather than discovering them weeks later. CI works best with frequent small commits, fast builds under 10 minutes, and comprehensive automated tests.”
[2] Kim, G., et al. (2016). The DevOps Handbook. IT Revolution Press. URL: https://itrevolution.com/product/the-devops-handbook/, Quote: “DevOps breaks down silos between development and operations through shared goals, automated processes, and cultural change. This enables organizations to deploy hundreds of times daily while maintaining high quality and reliability.”
[3] Forsgren, N., Humble, J., & Kim, G. (2018). Accelerate. IT Revolution Press. URL: https://itrevolution.com/product/accelerate/, Quote: “Elite performers deploy on demand, multiple times per day, with lead times under one hour and change failure rates below 15%. They achieve this through comprehensive automation, continuous integration and deployment, and quick recovery through automated rollbacks.”
[4] Morris, K. (2020). Infrastructure as Code. O’Reilly Media. URL: https://www.oreilly.com/library/view/infrastructure-as-code/9781098114664/, Quote: “Infrastructure as Code manages infrastructure through machine-readable definition files rather than manual configuration. This enables version control, automated testing, and consistent, repeatable infrastructure provisioning across environments.”