Lesson 7: Clean Code and Best Practices

Learning Objectives

By the end of this lesson, you will be able to:

  1. Apply meaningful naming conventions that make code self-documenting and immediately understandable to other developers
  2. Implement consistent formatting standards using automated tools and style guides to maintain code readability across teams
  3. Identify code smells and apply appropriate refactoring techniques to improve code quality without changing behavior
  4. Apply the DRY principle effectively to eliminate duplication while avoiding premature abstraction
  5. Write functions and classes that are focused, testable, and follow the Single Responsibility Principle

Introduction

The difference between code that works and code that’s maintainable often determines a project’s long-term success. While any developer can write code that executes correctly, professional developers write clean code—code that is easy to read, understand, and modify. As Martin Fowler famously observed, “Any fool can write code that a computer can understand. Good programmers write code that humans can understand” [1].

Clean code matters because software is read far more often than it’s written. A commonly cited estimate puts the ratio of time spent reading code to time spent writing it at more than 10 to 1 [2]. When you write clean code, you’re not just creating instructions for computers; you’re communicating with future developers (including yourself six months from now). Code that’s clear and well-structured reduces bugs, accelerates feature development, and makes onboarding new team members dramatically easier.

The principles of clean code transcend specific languages or frameworks. Whether you’re writing Python, JavaScript, Java, or any other language, the fundamentals remain constant: meaningful names, small focused functions, elimination of duplication, and clear structure. These aren’t arbitrary aesthetic preferences—they’re proven practices that reduce cognitive load, minimize errors, and create software that stands the test of time [3].

This lesson explores the core principles that distinguish clean code from merely functional code. You’ll learn specific, actionable techniques for naming, formatting, refactoring, and organizing code. More importantly, you’ll understand the reasoning behind these practices and when to apply them. Clean code isn’t about following rigid rules—it’s about consistently making choices that prioritize clarity, maintainability, and professionalism.


Core Content

The Power of Meaningful Names

Naming is one of the hardest and most important aspects of programming. Good names make code self-documenting, reducing the need for comments and making the codebase navigable. Poor names create confusion, requiring developers to constantly decode what variables, functions, and classes actually represent [2].

Principles of Good Naming:

Use Intention-Revealing Names: A name should answer why something exists, what it does, and how it’s used. If a name requires a comment to explain it, the name isn’t good enough [2].

# Bad - what does 'd' represent?
d = 86400  # elapsed time in seconds

# Better - but still requires comment
elapsed = 86400  # elapsed time in seconds

# Best - name reveals intention completely
elapsed_time_in_seconds = 86400
seconds_per_day = 86400

# Even better with descriptive constant
SECONDS_PER_DAY = 86400
elapsed_time = SECONDS_PER_DAY

Avoid Disinformation: Names shouldn’t hint at something they don’t represent. Don’t call something accounts_list unless it’s actually a list data structure. Don’t use similar names that differ only slightly [2].

# Bad - creates confusion
accounts_list = {'acct1': 1000, 'acct2': 2000}  # Actually a dict!
account_balance = get_accounts()  # Returns multiple accounts
ProcessData = process_data  # Looks like class but is function

# Good - accurate names
account_balances = {'acct1': 1000, 'acct2': 2000}
accounts = get_accounts()
# Keep snake_case for functions and PascalCase for classes;
# never alias a function under a class-style name

Make Meaningful Distinctions: Don’t add number series or noise words to satisfy uniqueness requirements. data1, data2 and info vs information provide no meaningful distinction [2].

# Bad - meaningless distinctions
def copy_chars(chars1, chars2):
    for char in chars1:
        chars2.append(char)

product_info = get_product()
product_data = get_product()

# Good - meaningful distinctions
# (destination is a list: strings are immutable, so += on a
# string parameter would never reach the caller)
def copy_chars(source, destination):
    for char in source:
        destination.append(char)

product = get_product()
product_with_reviews = get_product_including_reviews()

Use Pronounceable and Searchable Names: Names should be easy to say in conversation and easy to find when searching the codebase [2].

# Bad - hard to pronounce, hard to search
genymdhms = datetime.now()  # generation year month day hour minute second
dt = 86400

# Good - pronounceable and searchable
generation_timestamp = datetime.now()
SECONDS_PER_DAY = 86400

Naming Conventions by Context:

Variables and Functions: Use descriptive nouns for variables, verbs for functions. Use lowercase with underscores (snake_case) in Python, camelCase in JavaScript/Java.

# Variables - nouns describing what they hold
user_count = 50
customer_email = "customer@example.com"
is_active = True  # Boolean variables often start with is/has/can
has_permission = False
can_edit = True

# Functions - verbs describing what they do
def calculate_total(items):
    return sum(item.price for item in items)

def validate_email(email):
    return '@' in email and '.' in email

def fetch_user_by_id(user_id):
    return database.query(User).filter(User.id == user_id).first()

Classes: Use nouns or noun phrases with PascalCase. Class names should describe what the class represents or what objects of that class are [2].

# Good class names - nouns/noun phrases
class Customer:
    pass

class ShoppingCart:
    pass

class PaymentProcessor:
    pass

class UserAuthenticationService:
    pass

# Avoid verb-based or vague class names
# class Process:  # Too vague
# class Manager:  # What does it manage?

Constants: Use ALL_CAPS with underscores to distinguish constants from variables.

# Constants - configuration values that don't change
MAX_RETRY_ATTEMPTS = 3
API_TIMEOUT_SECONDS = 30
DATABASE_CONNECTION_STRING = "postgresql://localhost/mydb"
DEFAULT_PAGE_SIZE = 20

Private Members: In Python, prefix with underscore to indicate internal use. In languages with explicit access modifiers, use private or protected keywords [4].

class Account:
    def __init__(self, balance):
        self._balance = balance  # Protected - internal use
        self.__transaction_log = []  # Private - name mangling

    def deposit(self, amount):
        self._balance += amount
        self.__log_transaction('deposit', amount)

    def __log_transaction(self, transaction_type, amount):  # Private method
        self.__transaction_log.append((transaction_type, amount))
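
To make the difference between the two prefixes concrete, the sketch below inspects a trimmed-down copy of the Account class from outside the class. It relies only on standard Python behavior: a single underscore is a convention, while a double underscore triggers name mangling to _ClassName__name. The variable names used for the checks are invented for illustration.

```python
class Account:
    """Trimmed-down copy of the Account class above, for illustration only."""

    def __init__(self, balance):
        self._balance = balance        # single underscore: convention only
        self.__transaction_log = []    # double underscore: name-mangled

    def deposit(self, amount):
        self._balance += amount
        self.__transaction_log.append(('deposit', amount))


account = Account(100)
account.deposit(50)

# The single-underscore attribute is still reachable; the prefix is a request, not a lock.
balance_from_outside = account._balance

# Outside the class body, the attribute cannot be found under its written name...
has_plain_name = hasattr(account, '__transaction_log')

# ...because Python stored it as _Account__transaction_log.
mangled_log = account._Account__transaction_log
```

Neither prefix enforces privacy. Both signal intent to other developers, and the double underscore mainly prevents accidental name clashes in subclasses.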

Context-Specific Naming: Use domain language that stakeholders understand. If the business calls something an “account,” don’t call it a “user_record” in the code [2].

# If business domain uses specific terms, use them in code
class Policyholder:  # Insurance domain
    pass

class Beneficiary:  # Insurance domain
    pass

class Premium:  # Insurance domain
    pass

# This maintains ubiquitous language between code and business

Code Formatting and Structure

Consistent formatting makes code predictable and easy to scan. While specific styles vary by language and team, the key is consistency—use automated formatters to enforce standards without debate [3].

Vertical Formatting: Organizing Code Flow

Code should read like well-written prose: top to bottom, with related concepts kept close together [2].

# Good vertical organization
class Order:
    """Represents a customer order"""

    def __init__(self, customer, items):
        """Initialize order with customer and items"""
        self.customer = customer
        self.items = items
        self.status = 'pending'
        self.created_at = datetime.now()

    # Public interface first
    def submit(self):
        """Submit order for processing"""
        self._validate_items()
        self._calculate_total()
        self._process_payment()
        self.status = 'submitted'

    def cancel(self):
        """Cancel the order"""
        if self.status == 'shipped':
            raise ValueError("Cannot cancel shipped order")
        self.status = 'cancelled'

    # Private helpers below, in order they're called
    def _validate_items(self):
        """Ensure all items are valid"""
        if not self.items:
            raise ValueError("Order must contain items")

    def _calculate_total(self):
        """Calculate order total"""
        self.total = sum(item.price * item.quantity for item in self.items)

    def _process_payment(self):
        """Process payment for order"""
        # Payment processing logic
        pass

Vertical Openness: Separate concepts with blank lines. Each blank line is a visual cue that a new concept is starting [2].

# Good vertical openness
def process_order(order_data):
    # Validation section
    if not order_data:
        raise ValueError("Order data required")
    if not order_data.get('customer_id'):
        raise ValueError("Customer ID required")

    # Data retrieval section
    customer = fetch_customer(order_data['customer_id'])
    items = fetch_items(order_data['item_ids'])

    # Business logic section
    order = create_order(customer, items)
    calculate_shipping(order)
    apply_discounts(order)

    # Persistence section
    save_order(order)
    return order.id

Horizontal Formatting: Keep lines reasonably short (80-120 characters). Break long expressions into multiple lines for readability [3].

# Bad - too long, hard to read
result = calculate_complex_value(parameter1, parameter2, parameter3, parameter4, parameter5, parameter6, parameter7) + additional_processing(another_long_parameter_name) * multiplier

# Good - broken into readable chunks
result = calculate_complex_value(
    parameter1,
    parameter2,
    parameter3,
    parameter4,
    parameter5,
    parameter6,
    parameter7
) + additional_processing(another_long_parameter_name) * multiplier

# Even better - use intermediate variables
base_value = calculate_complex_value(
    parameter1, parameter2, parameter3,
    parameter4, parameter5, parameter6, parameter7
)
additional_value = additional_processing(another_long_parameter_name)
result = base_value + additional_value * multiplier

Indentation and Alignment: Use consistent indentation (4 spaces in Python; 2 or 4 spaces, or tabs, are common in other languages). Proper indentation shows code structure at a glance [3].

# Good indentation shows structure
def process_payment(amount, payment_method):
    if amount <= 0:
        raise ValueError("Amount must be positive")

    if payment_method == 'credit_card':
        try:
            result = charge_credit_card(amount)
            if result.success:
                log_transaction(amount, 'credit_card')
                return result
            else:
                raise PaymentError(result.error_message)
        except NetworkError as e:
            log_error(f"Network error: {e}")
            raise
    elif payment_method == 'paypal':
        return charge_paypal(amount)
    else:
        raise ValueError(f"Unknown payment method: {payment_method}")

Use Automated Formatters: Tools like Black (Python), Prettier (JavaScript), or gofmt (Go) enforce consistent formatting automatically, eliminating debates about style [3].

# Configure in project (pyproject.toml for Black)
[tool.black]
line-length = 100
target-version = ['py39']

# Run formatter
# $ black my_code.py
# reformatted my_code.py

The DRY Principle: Don’t Repeat Yourself

The DRY principle states that “every piece of knowledge must have a single, unambiguous, authoritative representation within a system” [5]. When you duplicate code or logic, changes must be made in multiple places, increasing the chance of inconsistencies and bugs.

Identifying Duplication:

Duplication isn’t just identical code—it’s duplicate knowledge or intent. Look for:

  • Copy-pasted code blocks
  • Similar logic with minor variations
  • Repeated validation or calculation logic
  • Multiple places that must change together

# Bad - violates DRY
def create_user(username, email):
    if not username or len(username) < 3:
        raise ValueError("Username must be at least 3 characters")
    if not email or '@' not in email:
        raise ValueError("Invalid email")
    # Create user...

def update_user(user_id, username, email):
    if not username or len(username) < 3:
        raise ValueError("Username must be at least 3 characters")
    if not email or '@' not in email:
        raise ValueError("Invalid email")
    # Update user...

def register_admin(username, email):
    if not username or len(username) < 3:
        raise ValueError("Username must be at least 3 characters")
    if not email or '@' not in email:
        raise ValueError("Invalid email")
    # Register admin...

# Good - follows DRY
def validate_username(username):
    """Single source of truth for username validation"""
    if not username or len(username) < 3:
        raise ValueError("Username must be at least 3 characters")
    return username

def validate_email(email):
    """Single source of truth for email validation"""
    if not email or '@' not in email:
        raise ValueError("Invalid email")
    return email

def create_user(username, email):
    validate_username(username)
    validate_email(email)
    # Create user...

def update_user(user_id, username, email):
    validate_username(username)
    validate_email(email)
    # Update user...

def register_admin(username, email):
    validate_username(username)
    validate_email(email)
    # Register admin...

DRY Through Abstraction:

Sometimes duplication is more subtle. Look for patterns and extract them into reusable abstractions [5].

# Before - subtle duplication of pattern
def get_active_users():
    users = database.query(User).all()
    return [u for u in users if u.is_active]

def get_active_products():
    products = database.query(Product).all()
    return [p for p in products if p.is_active]

def get_active_orders():
    orders = database.query(Order).all()
    return [o for o in orders if o.status == 'active']

# After - extracted pattern
def get_active_entities(entity_class, active_check=None):
    """Generic function for fetching active entities"""
    entities = database.query(entity_class).all()

    if active_check is None:
        active_check = lambda e: getattr(e, 'is_active', False)

    return [e for e in entities if active_check(e)]

# Usage
active_users = get_active_entities(User)
active_products = get_active_entities(Product)
active_orders = get_active_entities(Order, lambda o: o.status == 'active')

When NOT to DRY: Avoid premature abstraction. If code looks similar but represents different concepts or might diverge in the future, duplication might be acceptable [6]. The Rule of Three suggests waiting until you have three instances before extracting an abstraction.

# These look similar but represent different concepts
def calculate_employee_bonus(employee):
    base_salary = employee.salary
    performance_score = employee.performance_rating
    return base_salary * performance_score * 0.1

def calculate_shipping_cost(order):
    base_price = order.subtotal
    distance_score = order.shipping_distance
    return base_price * distance_score * 0.1

# Don't force them into a single abstraction just because the formula looks similar
# They represent different business concepts and will likely diverge

Writing Clean Functions

Functions are the first line of organization in any program. Clean functions are small, do one thing, and do it well [2].

Small Functions: Functions should be small—typically 5-20 lines. If a function is longer, it probably does multiple things and should be split [2].

# Bad - function too long, does too much
def process_order(order_data):
    # Validation
    if not order_data:
        raise ValueError("Missing order data")
    if 'customer_id' not in order_data:
        raise ValueError("Missing customer")
    if 'items' not in order_data or not order_data['items']:
        raise ValueError("No items in order")

    # Fetch customer
    customer = database.query(Customer).filter(
        Customer.id == order_data['customer_id']
    ).first()
    if not customer:
        raise ValueError("Customer not found")

    # Calculate total
    total = 0
    for item_data in order_data['items']:
        item = database.query(Product).filter(
            Product.id == item_data['product_id']
        ).first()
        if not item:
            raise ValueError(f"Product {item_data['product_id']} not found")
        total += item.price * item_data['quantity']

    # Apply discount
    if customer.loyalty_level == 'gold':
        total *= 0.9
    elif customer.loyalty_level == 'silver':
        total *= 0.95

    # Create order
    order = Order(customer=customer, total=total)
    database.add(order)
    database.commit()

    # Send confirmation
    send_email(
        customer.email,
        "Order Confirmation",
        f"Your order total is ${total}"
    )

    return order.id

# Good - broken into focused functions
def process_order(order_data):
    """Process customer order"""
    validate_order_data(order_data)
    customer = fetch_customer(order_data['customer_id'])
    items = fetch_order_items(order_data['items'])
    total = calculate_order_total(items, customer)
    order = create_order(customer, items, total)
    send_order_confirmation(customer, order)
    return order.id

def validate_order_data(order_data):
    """Validate order data completeness"""
    if not order_data:
        raise ValueError("Missing order data")
    required_fields = ['customer_id', 'items']
    for field in required_fields:
        if field not in order_data:
            raise ValueError(f"Missing required field: {field}")
    if not order_data['items']:
        raise ValueError("Order must contain items")

def fetch_customer(customer_id):
    """Retrieve customer by ID"""
    customer = database.query(Customer).filter(Customer.id == customer_id).first()
    if not customer:
        raise ValueError(f"Customer {customer_id} not found")
    return customer

def fetch_order_items(item_data_list):
    """Retrieve and validate order items"""
    items = []
    for item_data in item_data_list:
        product = database.query(Product).filter(
            Product.id == item_data['product_id']
        ).first()
        if not product:
            raise ValueError(f"Product {item_data['product_id']} not found")
        items.append({'product': product, 'quantity': item_data['quantity']})
    return items

def calculate_order_total(items, customer):
    """Calculate order total with customer discount"""
    subtotal = sum(item['product'].price * item['quantity'] for item in items)
    discount_multiplier = get_customer_discount(customer)
    return subtotal * discount_multiplier

def get_customer_discount(customer):
    """Get discount multiplier based on loyalty level"""
    discounts = {'gold': 0.9, 'silver': 0.95, 'bronze': 1.0}
    return discounts.get(customer.loyalty_level, 1.0)

def create_order(customer, items, total):
    """Create and persist order"""
    order = Order(customer=customer, items=items, total=total)
    database.add(order)
    database.commit()
    return order

def send_order_confirmation(customer, order):
    """Send order confirmation email"""
    send_email(
        customer.email,
        "Order Confirmation",
        f"Your order #{order.id} total is ${order.total:.2f}"
    )

The refactored version has many small, focused functions. Each is easy to understand, test, and modify. The main process_order function reads like a table of contents, showing the process at a high level [2].
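
One practical payoff of the split is that each helper can be checked in isolation, without a database or an email server. The sketch below copies get_customer_discount and calculate_order_total from the refactored version and exercises them with stand-in objects. The SimpleNamespace stand-ins, the test names, and the sample prices are assumptions made for illustration.

```python
from types import SimpleNamespace


def get_customer_discount(customer):
    """Get discount multiplier based on loyalty level (copied from above)."""
    discounts = {'gold': 0.9, 'silver': 0.95, 'bronze': 1.0}
    return discounts.get(customer.loyalty_level, 1.0)


def calculate_order_total(items, customer):
    """Calculate order total with customer discount (copied from above)."""
    subtotal = sum(item['product'].price * item['quantity'] for item in items)
    return subtotal * get_customer_discount(customer)


def test_gold_customer_gets_ten_percent_off():
    gold_customer = SimpleNamespace(loyalty_level='gold')
    items = [
        {'product': SimpleNamespace(price=20.0), 'quantity': 2},
        {'product': SimpleNamespace(price=10.0), 'quantity': 1},
    ]
    # Subtotal is 50.0; the gold multiplier of 0.9 brings it to 45.0.
    assert abs(calculate_order_total(items, gold_customer) - 45.0) < 1e-9


def test_unknown_loyalty_level_pays_full_price():
    new_customer = SimpleNamespace(loyalty_level='unknown')
    assert get_customer_discount(new_customer) == 1.0


# A test runner such as pytest would collect these; calling them directly also works.
test_gold_customer_gets_ten_percent_off()
test_unknown_loyalty_level_pays_full_price()
```

The original 45-line process_order could only be tested end to end, with every dependency in place.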

Single Responsibility: Each function should do one thing and do it well. If you can extract another function from a function, you should [2].

Function Arguments: Limit function arguments to 3 or fewer when possible. Many arguments suggest the function does too much or that related data should be grouped into an object [2].

# Bad - too many arguments
def create_user(first_name, last_name, email, phone, address, city, state, zip_code, country):
    pass

# Good - group related data
class Address:
    def __init__(self, street, city, state, zip_code, country):
        self.street = street
        self.city = city
        self.state = state
        self.zip_code = zip_code
        self.country = country

class UserData:
    def __init__(self, first_name, last_name, email, phone, address):
        self.first_name = first_name
        self.last_name = last_name
        self.email = email
        self.phone = phone
        self.address = address

def create_user(user_data: UserData):
    pass
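
In Python, the standard-library dataclasses module can generate the same constructors with less boilerplate. This is an alternative sketch of the parameter objects above, not a required style. The sample values are invented, and create_user only returns a display string as a stand-in for real persistence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    street: str
    city: str
    state: str
    zip_code: str
    country: str


@dataclass(frozen=True)
class UserData:
    first_name: str
    last_name: str
    email: str
    phone: str
    address: Address


def create_user(user_data: UserData) -> str:
    """Stand-in for real user creation: returns a display label."""
    return f"{user_data.first_name} {user_data.last_name} ({user_data.address.city})"


home = Address("1 Main St", "Springfield", "IL", "62701", "US")
user = UserData("Ada", "Lovelace", "ada@example.com", "555-0100", home)
label = create_user(user)
```

Setting frozen=True makes instances read-only after construction, which suits objects that simply carry data into a function.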

No Side Effects: Functions should either answer a question or perform an action, not both. Avoid hidden side effects where functions modify state unexpectedly [2].

# Bad - side effect in query function
# (hash_password is a placeholder for a real password-hashing routine;
# Python's built-in hash() must not be used for passwords)
def check_password(username, password):
    user = find_user(username)
    if user.password == hash_password(password):
        initialize_session(user)  # Side effect!
        return True
    return False

# Good - separate query from command
def is_password_valid(username, password):
    """Query - no side effects"""
    user = find_user(username)
    return user is not None and user.password == hash_password(password)

def login_user(username, password):
    """Command - performs action"""
    if is_password_valid(username, password):
        user = find_user(username)
        initialize_session(user)
        return True
    return False

Code Smells and Refactoring

Code smells are indicators that code might need refactoring. They’re not bugs—the code works—but they suggest structural problems that make the code harder to maintain [7].

Common Code Smells:

Long Method: Methods that are too long and do too much. Solution: Extract smaller methods [7].

Large Class: Classes with too many responsibilities. Solution: Split into focused classes following Single Responsibility Principle.
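
A minimal sketch of that split, using hypothetical class names: a UserManager that both checks passwords and formats welcome messages is divided into two classes, each with one reason to change.

```python
# Before (smell): one class checks passwords AND formats notifications.
class UserManager:
    def is_strong_password(self, password):
        return len(password) >= 8 and any(ch.isdigit() for ch in password)

    def welcome_message(self, username):
        return f"Welcome, {username}!"


# After: each class has a single responsibility.
class PasswordPolicy:
    MIN_LENGTH = 8

    def is_strong(self, password):
        has_digit = any(ch.isdigit() for ch in password)
        return len(password) >= self.MIN_LENGTH and has_digit


class WelcomeNotifier:
    def message_for(self, username):
        return f"Welcome, {username}!"


policy = PasswordPolicy()
notifier = WelcomeNotifier()
```

A change to the password rules now touches only PasswordPolicy, and a change to the message wording touches only WelcomeNotifier.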

Duplicate Code: Same or similar code in multiple places. Solution: Extract common code into reusable functions or classes [5].

Long Parameter List: Functions with many parameters. Solution: Group related parameters into objects [2].

# Smell: Long parameter list
def create_report(title, author, date, data, format, include_charts, page_size, orientation):
    pass

# Refactored: Parameter object
class ReportConfig:
    def __init__(self, title, author, date, format='pdf',
                 include_charts=True, page_size='A4', orientation='portrait'):
        self.title = title
        self.author = author
        self.date = date
        self.format = format
        self.include_charts = include_charts
        self.page_size = page_size
        self.orientation = orientation

def create_report(data, config: ReportConfig):
    pass

Primitive Obsession: Using primitive types instead of small objects for domain concepts [7].

# Smell: Primitive obsession
def send_email(email_string):
    if '@' not in email_string:
        raise ValueError("Invalid email")
    # Send email...

# Refactored: Domain object
class Email:
    def __init__(self, address):
        if '@' not in address:
            raise ValueError("Invalid email")
        self.address = address

    def __str__(self):
        return self.address

def send_email(email: Email):
    # Email is already validated
    # Send to email.address
    pass

Feature Envy: A method that seems more interested in another class than the one it’s in [7].

# Smell: Feature envy
class Order:
    def __init__(self, customer):
        self.customer = customer
        self.items = []

    def calculate_discount(self):
        # This method is too interested in customer details
        if self.customer.loyalty_years > 5:
            return 0.2
        elif self.customer.loyalty_years > 2:
            return 0.1
        else:
            return 0.05

# Refactored: Move to appropriate class
class Customer:
    def __init__(self, loyalty_years):
        self.loyalty_years = loyalty_years

    def get_discount_rate(self):
        """Customer knows its own discount rate"""
        if self.loyalty_years > 5:
            return 0.2
        elif self.loyalty_years > 2:
            return 0.1
        return 0.05

class Order:
    def __init__(self, customer):
        self.customer = customer
        self.items = []

    def calculate_discount(self):
        return self.customer.get_discount_rate()

Refactoring Safely:

Refactoring means improving code structure without changing behavior. Follow these guidelines [7]:

  1. Have tests in place before refactoring to ensure behavior doesn’t change
  2. Make small changes and test after each change
  3. Commit frequently so you can roll back if needed
  4. Use IDE refactoring tools that automate safe transformations
  5. Refactor when adding features, not as a separate activity
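
Guideline 1 can be sketched as a characterization test: pin down what the current code returns, refactor, then confirm the results still match. The function below is a hypothetical loyalty-discount calculation. The "before" and "after" versions sit side by side only so the comparison can run in one file.

```python
def discount_rate_before(loyalty_years):
    """Original version with nested branching."""
    if loyalty_years > 5:
        return 0.2
    else:
        if loyalty_years > 2:
            return 0.1
        else:
            return 0.05


# Thresholds are checked from highest to lowest.
DISCOUNT_TIERS = [(5, 0.2), (2, 0.1)]
DEFAULT_DISCOUNT = 0.05


def discount_rate_after(loyalty_years):
    """Refactored version: table-driven, same behavior."""
    for minimum_years, rate in DISCOUNT_TIERS:
        if loyalty_years > minimum_years:
            return rate
    return DEFAULT_DISCOUNT


def test_refactoring_preserves_behavior():
    # Include the boundary values, where behavior changes are most likely to hide.
    for years in [0, 1, 2, 3, 5, 6, 10]:
        assert discount_rate_after(years) == discount_rate_before(years)


test_refactoring_preserves_behavior()
```

In day-to-day work the old version lives in version control, not in the file, and the test compares the refactored function against expected values recorded before the change.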

Comments: When and How

Good code is self-documenting through clear names and structure. Comments should explain “why,” not “what” [2].

Good Comments:

import zlib

# Good - explains intent behind algorithm choice
def calculate_checksum(data):
    """
    Use CRC32 instead of MD5 for performance.
    We don't need cryptographic security here,
    just quick error detection.
    """
    return zlib.crc32(data)

# Good - legal or contractual requirement
# Copyright (c) 2025 Company Name. All rights reserved.

# Good - warning about consequences
def delete_all_users():
    """
    WARNING: This operation is irreversible.
    All user data will be permanently deleted.
    """
    database.execute("DELETE FROM users")

# Good - TODO for future improvement
def calculate_shipping(order):
    # TODO: Add international shipping calculation (JIRA-123)
    return order.weight * 2.5

Bad Comments:

# Bad - states the obvious
i = 0  # set i to zero

# Bad - redundant with function name
def calculate_total(items):
    """Calculate the total"""  # Adds no value
    return sum(item.price for item in items)

# Bad - commented-out code
def process_order(order):
    validate_order(order)
    # old_calculation = order.total * 0.9
    # if order.customer.is_premium:
    #     old_calculation *= 0.95
    apply_discount(order)
    save_order(order)

# Bad - using comment to explain bad code
def calc(x, y, z):  # x is price, y is quantity, z is discount
    return x * y * z  # Bad names require explanation

When to Comment:

  • Explaining complex algorithms or business rules
  • Warning about consequences or constraints
  • Legal requirements (copyright, licenses)
  • TODOs for future work (with ticket numbers)
  • Documenting public APIs (docstrings)
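
As a sketch of the last bullet, a docstring on a public function can record what the name and signature cannot: units, valid ranges, and error conditions. The function and its default rate are hypothetical.

```python
def calculate_shipping_cost(weight_kg, rate_per_kg=2.5):
    """Return the shipping cost for a parcel.

    Args:
        weight_kg: Parcel weight in kilograms. Must be positive.
        rate_per_kg: Price per kilogram in the store currency.

    Raises:
        ValueError: If weight_kg is zero or negative.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return weight_kg * rate_per_kg


cost = calculate_shipping_cost(4)
```

Unlike the redundant "Calculate the total" docstring shown among the bad comments, this one tells a caller something the code alone would force them to discover by reading the body.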

When NOT to Comment:

  • When good naming would suffice
  • When refactoring would make code clear
  • To explain bad code instead of fixing it
  • To keep old code versions (use version control instead)

Common Pitfalls

Pitfall 1: Inconsistent Naming Styles

Mixing naming conventions (camelCase and snake_case, or meaningful names with cryptic abbreviations) forces readers to keep switching mental models and slows code reading.

Best Practice: Adopt your language’s conventional style guide (PEP 8 for Python, Google Style Guide for JavaScript, etc.) and enforce it with automated tools. Consistency matters more than personal preference [3].

Pitfall 2: Over-DRYing Code

Aggressively eliminating all duplication can create inappropriate coupling, where unrelated code shares an abstraction that makes both harder to change.

Best Practice: Apply the Rule of Three—wait until you have three instances before extracting an abstraction. Prefer duplication over wrong abstraction. Ask: “If requirements for these two pieces change independently, will the shared abstraction become awkward?” [6]

Pitfall 3: Writing Overly Long Functions

Functions that scroll off the screen make it impossible to understand what they do at a glance. They inevitably do multiple things, violating the Single Responsibility Principle.

Best Practice: Keep functions small (5-20 lines typically). If a function is longer, look for extraction opportunities. Each function should have a single, clear purpose that its name describes [2].

Pitfall 4: Neglecting Code Formatting

Inconsistent indentation, spacing, and line breaks make code harder to read and increase merge conflicts in version control.

Best Practice: Use automated formatters (Black, Prettier, gofmt) that enforce consistent formatting without manual effort or debate. Configure them at the project level and run them automatically on commit [3].

Pitfall 5: Excessive Comments to Compensate for Unclear Code

Using comments to explain what code does suggests the code isn’t clear enough. Comments become outdated as code changes, creating misleading documentation.

Best Practice: Make code self-documenting through clear names and structure. Refactor unclear code rather than adding explanatory comments. Use comments only for “why” explanations, not “what” explanations [2].


Summary

Clean code is characterized by clarity, simplicity, and maintainability. It communicates intent clearly to human readers while executing correctly for computers. The principles and practices in this lesson aren’t arbitrary aesthetics—they’re proven techniques that reduce bugs, accelerate development, and create software that withstands change.

Meaningful names are the foundation of clean code. Names should reveal intent, avoid disinformation, make meaningful distinctions, and be pronounceable and searchable. Use nouns for variables and classes, verbs for functions. Follow language conventions consistently. Remember: if a name requires a comment to explain it, the name needs improvement.

Code formatting and structure create visual predictability that reduces cognitive load. Organize code vertically with related concepts close together, separate concerns with blank lines, and keep lines reasonably short. Use consistent indentation to show structure. Automated formatters eliminate debates and enforce consistency without effort.

The DRY principle—Don’t Repeat Yourself—eliminates duplication of knowledge, ensuring changes happen in one place. Look for repeated validation, calculation, or structural patterns. Extract them into reusable functions or classes. But avoid premature abstraction; use the Rule of Three and prefer duplication over inappropriate coupling.

Clean functions are small, do one thing, have descriptive names, take few arguments, and have no side effects. They read like well-organized prose, with main functions providing high-level steps and helper functions handling details. Extract functions aggressively—if you can pull out another function, you probably should.

Code smells indicate refactoring opportunities: long methods, large classes, duplicate code, long parameter lists, primitive obsession, and feature envy. Address smells through systematic refactoring—making small, tested changes that improve structure without altering behavior. Refactor continuously as you add features, not as separate cleanup projects.

Comments should explain “why,” not “what.” Good code is self-documenting through clear names and structure. Use comments for intent, warnings, legal requirements, and TODOs. Avoid obvious comments, outdated comments, commented-out code, and comments that excuse bad code. When tempted to write an explanatory comment, consider whether refactoring would make the code clearer.

Clean code requires discipline and practice. It’s not about following rigid rules but consistently making choices that prioritize clarity and maintainability. As you write code, constantly ask: “Will another developer (or future me) understand this easily?” That question, more than any specific rule, guides you toward clean code.


Practice Quiz

Question 1: You encounter this code: int d; // elapsed time in days. What clean code principles are violated, and how would you fix it?

Answer: This violates multiple naming principles:

  1. Cryptic name: d doesn’t reveal intent—you need the comment to understand it
  2. Comment dependency: The name requires explanation, indicating it’s inadequate
  3. Not searchable: Searching for d matches every occurrence of the letter, so you cannot find the places where this variable is used

Fix:

elapsed_time_in_days = 0
# Or if it's a constant:
ELAPSED_DAYS = 0

This name is self-documenting—no comment needed. It’s pronounceable (“elapsed time in days”) and searchable (find all uses by searching “elapsed_time_in_days” or “days”). The comment becomes redundant and can be removed [2].

Question 2: This validation logic appears in five different functions in your codebase:

if not email or '@' not in email or '.' not in email:
    raise ValueError("Invalid email")

What principle is violated, and what’s the proper fix?

Answer: This violates the DRY principle (Don’t Repeat Yourself). The email validation knowledge exists in five places, meaning any change to validation rules requires updating all five locations, risking inconsistency and bugs [5].

Fix:

def validate_email(email):
    """Single source of truth for email validation"""
    if not email or '@' not in email or '.' not in email:
        raise ValueError("Invalid email")
    return email

# All five functions now use:
validate_email(user_email)

Now the validation logic exists in one place. Changes to the validation rules (such as adding a domain allowlist) happen once and apply everywhere. This creates a single source of truth and removes the burden of keeping five copies in sync [5].
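To make the "one place" point concrete, here is a minimal sketch with two hypothetical callers (register_user and subscribe_to_newsletter are illustrative names, not part of the quiz). Both delegate to the shared validator, so tightening the rule later would change both without touching either.

```python
def validate_email(email):
    """Single source of truth for email validation."""
    if not email or '@' not in email or '.' not in email:
        raise ValueError("Invalid email")
    return email


# Hypothetical callers: each relies on the shared rule instead of copying it.
def register_user(email):
    return {"email": validate_email(email), "status": "registered"}


def subscribe_to_newsletter(email):
    return {"email": validate_email(email), "subscribed": True}


def is_rejected(func, value):
    """Return True if func raises ValueError for value (helper for the demo)."""
    try:
        func(value)
    except ValueError:
        return True
    return False


# Valid input passes through both callers; invalid input is rejected by both.
assert register_user("ada@example.com")["status"] == "registered"
assert is_rejected(register_user, "not-an-email")
assert is_rejected(subscribe_to_newsletter, "")
```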

Question 3: You have a function process_order that is 150 lines long. It validates input, fetches data from the database, calculates totals, applies discounts, saves the order, sends emails, and updates inventory. What clean code principles does this violate?

Answer: This violates the Single Responsibility Principle and the guideline that functions should be small and do one thing [2].

The function has multiple reasons to change:

  • Validation rules change
  • Database schema changes
  • Discount calculation changes
  • Email templates change
  • Inventory system changes

Fix: Extract focused functions:

def process_order(order_data):
    """Orchestrate order processing"""
    validate_order_data(order_data)
    customer = fetch_customer(order_data['customer_id'])
    items = fetch_items(order_data['items'])
    total = calculate_total(items, customer)
    order = save_order(customer, items, total)
    send_order_confirmation(customer, order)
    update_inventory(items)
    return order.id

Each extracted function does one thing, is easy to test, and is easy to understand. The main function reads like a table of contents, showing the high-level process [2].
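The "easy to test" claim can be sketched with one of the extracted helpers. The body of calculate_total below is an assumption made for illustration: the quiz does not define it, and the item fields (price, quantity) and the customer's discount_rate are hypothetical. The point is only that a small function with plain inputs and a return value can be checked without a database, mail server, or inventory system.

```python
def calculate_total(items, customer):
    """Sum the line totals, then apply the customer's discount rate.

    Assumed data shapes (not defined in the lesson): each item is a dict
    with "price" and "quantity"; customer may carry a "discount_rate".
    """
    subtotal = sum(item["price"] * item["quantity"] for item in items)
    return subtotal * (1 - customer.get("discount_rate", 0))


def test_calculate_total_applies_customer_discount():
    items = [
        {"price": 10, "quantity": 2},
        {"price": 5, "quantity": 4},
    ]
    customer = {"discount_rate": 0.25}
    # Subtotal is 40; a 25% discount leaves 30.
    assert calculate_total(items, customer) == 30.0


def test_calculate_total_without_discount():
    items = [{"price": 10, "quantity": 2}]
    assert calculate_total(items, {}) == 20


# Run the checks directly; a test runner would discover them by name.
test_calculate_total_applies_customer_discount()
test_calculate_total_without_discount()
```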

Question 4: Your colleague argues that this function doesn’t need refactoring because “it works fine”:

def calc(x, y, z, f):
    r = x * y
    if f:
        r = r * 0.9
    r = r - z
    return r

What clean code principles are violated, and how would you refactor this?

Answer: Multiple violations:

  1. Cryptic names: calc, x, y, z, f, and r reveal no intent
  2. Boolean flag argument: f suggests the function does two different things based on the flag
  3. No documentation: Without comments, the business logic is opaque

The argument “it works” misses the point: code is read far more often than it is written, by a ratio of well over 10 to 1 [2]. Working code that is hard to understand is also hard to modify and test safely.

Refactored:

def calculate_final_price(base_price, quantity, discount_amount, apply_loyalty_discount=False):
    """
    Calculate final price after discounts.

    Args:
        base_price: Unit price of the item
        quantity: Number of items
        discount_amount: Fixed discount to apply
        apply_loyalty_discount: Whether to apply 10% loyalty discount

    Returns:
        Final price after all discounts
    """
    subtotal = base_price * quantity

    if apply_loyalty_discount:
        subtotal *= 0.9  # 10% loyalty discount

    final_price = subtotal - discount_amount
    return final_price

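Now the function is self-documenting: it has clear parameter and variable names, a descriptive docstring, and a comment explaining the 0.9 multiplier. Two of the original problems are only partly addressed, however. The boolean flag is still a parameter, and 0.9 is still a literal in the code. Splitting the function in two and extracting a named constant would resolve both [2].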
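The refactored function above still takes a boolean flag and still contains the literal 0.9. One possible next step, sketched here with hypothetical names (LOYALTY_DISCOUNT_RATE, calculate_price, calculate_loyalty_price), is to give the rate a name and split the two behaviors into separate functions. This is one option under those assumptions, not the only reasonable design.

```python
LOYALTY_DISCOUNT_RATE = 0.10  # named constant replaces the bare 0.9 multiplier


def calculate_price(base_price, quantity, discount_amount):
    """Price for a regular customer: subtotal minus the fixed discount."""
    return base_price * quantity - discount_amount


def calculate_loyalty_price(base_price, quantity, discount_amount):
    """Price for a loyalty customer.

    The percentage discount is applied to the subtotal first, then the
    fixed discount is subtracted, matching the order in the original calc.
    """
    subtotal = base_price * quantity * (1 - LOYALTY_DISCOUNT_RATE)
    return subtotal - discount_amount


# Callers now state their intent through the function name, not a flag.
regular = calculate_price(10, 2, 5)
loyal = calculate_loyalty_price(100, 1, 10)
```

The trade-off is two small functions instead of one. If more pricing variants appear, that duplication may become a signal to revisit the design.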

Question 5: You’re reviewing code that has extensive comments explaining what each line does:

# Initialize counter to zero
count = 0

# Loop through all users
for user in users:
    # Check if user is active
    if user.is_active:
        # Increment counter
        count += 1

# Return the counter
return count

What’s the problem with these comments, and how should this be improved?

Answer: These comments violate the principle that good code is self-documenting and that comments should explain “why,” not “what” [2].

These comments add no value because they merely restate what the code obviously does. They create a maintenance burden (comments must be updated whenever the code changes) without providing insight. Worse, they clutter the code, making it harder to scan.

Improved version:

def count_active_users(users):
    """Return the number of active users in the provided list"""
    return sum(1 for user in users if user.is_active)

This version:

  • Uses a descriptive function name that documents purpose
  • Uses idiomatic Python (generator expression with sum)
  • Has zero comments because the code is clear
  • Is more concise (one expression instead of five statements and five comments)
  • Includes a docstring documenting the function’s purpose

If the business reason for counting active users needs explanation, that’s when a comment or docstring adds value:

def count_active_users(users):
    """
    Count active users for billing calculation.
    Only active users are billed in the monthly invoice.
    """
    return sum(1 for user in users if user.is_active)

This docstring explains why the count exists, which isn’t obvious from the code alone [2].


References

[1] Fowler, M. (2018). Refactoring: Improving the Design of Existing Code (2nd Edition). Addison-Wesley. URL: https://martinfowler.com/books/refactoring.html, Quote: “Any fool can write code that a computer can understand. Good programmers write code that humans can understand. The true measure of good code is how easy it is to change.”

[2] Martin, R. C. (2008). Clean Code: A Handbook of Agile Software Craftsmanship. Prentice Hall. URL: https://www.pearson.com/en-us/subject-catalog/p/clean-code-a-handbook-of-agile-software-craftsmanship/P200000009044, Quote: “The ratio of time spent reading versus writing code is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. Because this ratio is so high, we want the reading of code to be easy, even if it makes the writing harder. Clean code is characterized by meaningful names, small focused functions, and elimination of duplication.”

[3] PEP 8 – Style Guide for Python Code. Python Software Foundation. URL: https://peps.python.org/pep-0008/, Quote: “Code is read much more often than it is written. The guidelines provided here are intended to improve the readability of code and make it consistent across the wide spectrum of Python code. Readability counts. A style guide is about consistency, and consistency within a project is most important.”

[4] Bloch, J. (2018). Effective Java (3rd Edition). Addison-Wesley. URL: https://www.pearson.com/en-us/subject-catalog/p/effective-java/P200000009024, Quote: “Minimize the accessibility of classes and members. Make each class or member as inaccessible as possible. This is one of the most important principles of good design. By limiting accessibility, you minimize the coupling between classes and preserve the flexibility to change your implementation.”

[5] Hunt, A., & Thomas, D. (2019). The Pragmatic Programmer: Your Journey to Mastery (20th Anniversary Edition). Addison-Wesley. URL: https://pragprog.com/titles/tpp20/the-pragmatic-programmer-20th-anniversary-edition/, Quote: “DRY—Don’t Repeat Yourself. Every piece of knowledge must have a single, unambiguous, authoritative representation within a system. When you find yourself writing code that is similar to something you’ve written before, take a moment to think about what you’re doing. Chances are, you can reorganize things to eliminate the duplication.”

[6] Metz, S. (2016). “The Wrong Abstraction.” URL: https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction, Quote: “Duplication is far cheaper than the wrong abstraction. Prefer duplication over the wrong abstraction. The Rule of Three suggests that you should wait until you have three instances before introducing an abstraction. This helps you avoid premature generalization that creates inappropriate coupling.”

[7] Fowler, M., Beck, K., et al. (1999). Refactoring: Improving the Design of Existing Code. Addison-Wesley. URL: https://refactoring.com/, Quote: “Code smells are indicators that code might need refactoring. They’re not bugs—the code works—but they suggest structural problems that will make the code harder to maintain over time. Common smells include long methods, large classes, duplicate code, and long parameter lists.”