Lesson 13 - Pandas Plot Method

Quick Plotting with Pandas

You have learned matplotlib plotting with plt.plot() and plt.hist(). Now you will learn pandas plotting, a faster and more convenient way to create visualizations directly from DataFrames.

By the end of this lesson, you will be able to:

  • Use the .plot() method on DataFrames and Series
  • Create line, bar, histogram, and scatter plots with one line
  • Plot multiple columns simultaneously
  • Customize pandas plots
  • Choose between matplotlib and pandas plotting
  • Use pandas for rapid exploratory analysis

Pandas plotting is built on matplotlib but requires less code for common visualizations.


Why Pandas Plotting?

Matplotlib vs Pandas

Matplotlib approach:

plt.plot(bikes['dteday'], bikes['cnt'])
plt.xlabel('Date')
plt.ylabel('Rentals')
plt.title('Bike Rentals Over Time')

Pandas approach:

bikes.plot(x='dteday', y='cnt', title='Bike Rentals Over Time')

Advantages of pandas plotting:

  • Less code for standard plots
  • Automatic labels from column names
  • Easy to plot multiple columns
  • Perfect for quick exploration
  • Still uses matplotlib underneath (full customization available)
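To make the comparison concrete, here is a minimal sketch that draws the same line both ways. The five-row DataFrame is a made-up stand-in for day.csv (its values are hypothetical), so the snippet runs without the dataset.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for day.csv: five days of made-up rental counts
bikes = pd.DataFrame({
    'dteday': pd.date_range('2011-01-01', periods=5, freq='D'),
    'cnt': [985, 801, 1349, 1562, 1600],
})

# Matplotlib: create the axes, plot, and label by hand
fig, ax_mpl = plt.subplots()
ax_mpl.plot(bikes['dteday'], bikes['cnt'])
ax_mpl.set_xlabel('dteday')
ax_mpl.set_title('Bike Rentals Over Time')

# Pandas: one call; the x label and the legend entry come from the column names
ax_pd = bikes.plot(x='dteday', y='cnt', title='Bike Rentals Over Time')

# Inspect what pandas filled in automatically
auto_xlabel = ax_pd.get_xlabel()
auto_legend = ax_pd.get_legend_handles_labels()[1]
```

Calling plt.show() afterwards displays both figures side by side for comparison.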

When to use pandas:

  • Exploratory data analysis
  • Quick visualizations during analysis
  • Plotting directly from DataFrame operations

When to use matplotlib:

  • Complex multi-panel figures
  • Heavy customization needed
  • Publication-quality graphics

Basic Line Plots

Single Column

import pandas as pd
import matplotlib.pyplot as plt

bikes = pd.read_csv('day.csv')
bikes['dteday'] = pd.to_datetime(bikes['dteday'])

# Pandas plotting
bikes.plot(x='dteday', y='cnt', figsize=(12, 6))
plt.xlabel('Date')
plt.ylabel('Total Rentals')
plt.title('Daily Bike Rentals (2011-2012)')
plt.grid(True, alpha=0.3)
plt.show()

What happened:

  • .plot() created the line plot
  • x and y specified columns
  • figsize controlled size
  • Still use plt.xlabel(), plt.title() for customization
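The plt.xlabel() and plt.title() calls work because they act on the current axes. An alternative worth knowing is that .plot() returns the matplotlib Axes it drew on, so the same customization can be applied to that object directly. The sketch below assumes a small made-up DataFrame in place of day.csv.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for day.csv
bikes = pd.DataFrame({
    'dteday': pd.date_range('2011-01-01', periods=5, freq='D'),
    'cnt': [985, 801, 1349, 1562, 1600],
})

# .plot() returns a matplotlib Axes object
ax = bikes.plot(x='dteday', y='cnt', figsize=(12, 6))

# Customize through the returned Axes instead of the plt.* functions
ax.set_xlabel('Date')
ax.set_ylabel('Total Rentals')
ax.set_title('Daily Bike Rentals')
ax.grid(True, alpha=0.3)
```

Working with the returned Axes avoids ambiguity about which figure is "current" once several plots are open.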

Multiple Columns

import pandas as pd
import matplotlib.pyplot as plt

bikes = pd.read_csv('day.csv')
bikes['dteday'] = pd.to_datetime(bikes['dteday'])

# Plot casual and registered rentals together
bikes.plot(x='dteday', y=['casual', 'registered'], figsize=(12, 6))
plt.xlabel('Date')
plt.ylabel('Bike Rentals')
plt.title('Casual vs Registered Users Over Time')
plt.grid(True, alpha=0.3)
plt.legend(title='User Type')
plt.show()

Key feature: Pass a list of columns to the y parameter, and pandas draws one line per column and adds a legend automatically.
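As a quick check of the multi-column behavior, the sketch below plots two columns in two equivalent ways and reads back the legend entries. The data is made up for illustration and only mimics the casual and registered columns.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for day.csv with two user-type columns
bikes = pd.DataFrame({
    'dteday': pd.date_range('2011-01-01', periods=4, freq='D'),
    'casual': [331, 131, 120, 108],
    'registered': [654, 670, 1229, 1454],
})

# Option 1: pass a list of column names to y
ax1 = bikes.plot(x='dteday', y=['casual', 'registered'])

# Option 2: put the dates in the index, select the columns, then plot
ax2 = bikes.set_index('dteday')[['casual', 'registered']].plot()

# The legend entries are taken from the column names
legend_labels = [t.get_text() for t in ax1.get_legend().get_texts()]
```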


Bar Plots

Vertical Bars

import pandas as pd
import matplotlib.pyplot as plt

bikes = pd.read_csv('day.csv')

# Calculate mean by season
season_avg = bikes.groupby('season')['cnt'].mean()
season_avg.index = ['Spring', 'Summer', 'Fall', 'Winter']

# Pandas bar plot
season_avg.plot(kind='bar', figsize=(10, 6), color='steelblue', edgecolor='black')
plt.xlabel('Season')
plt.ylabel('Average Daily Rentals')
plt.title('Average Bike Rentals by Season')
plt.xticks(rotation=0)
plt.grid(True, alpha=0.3, axis='y')
plt.show()

Note: kind='bar' creates vertical bars. Alternative: .plot.bar()
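The accessor form can be sketched as follows, using a small made-up DataFrame instead of day.csv. The rot=0 argument is a pandas plotting parameter that keeps the tick labels horizontal, which replaces the separate plt.xticks(rotation=0) call.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for day.csv: two made-up days per season
bikes = pd.DataFrame({
    'season': [1, 1, 2, 2, 3, 3, 4, 4],
    'cnt': [1000, 2000, 4000, 5000, 5000, 6000, 3000, 5000],
})

season_avg = bikes.groupby('season')['cnt'].mean()

# .plot.bar() is equivalent to .plot(kind='bar')
ax = season_avg.plot.bar(color='steelblue', edgecolor='black', rot=0)
ax.set_ylabel('Average Daily Rentals')

# Each bar's height is the group mean
heights = [bar.get_height() for bar in ax.patches]
```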

Horizontal Bars

import pandas as pd
import matplotlib.pyplot as plt

bikes = pd.read_csv('day.csv')

# Weather impact
weather_avg = bikes.groupby('weathersit')['cnt'].mean()
weather_avg.index = ['Clear', 'Mist', 'Light Rain/Snow']

# Horizontal bar plot
weather_avg.plot(kind='barh', figsize=(10, 6), color='coral', edgecolor='black')
plt.xlabel('Average Daily Rentals')
plt.ylabel('Weather Condition')
plt.title('Weather Impact on Bike Rentals')
plt.grid(True, alpha=0.3, axis='x')
plt.show()

Note: kind='barh' creates horizontal bars. Alternative: .plot.barh()
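Assigning a list to .index labels the groups by position, which fails with a length mismatch if the data contains a different number of categories than expected. One possible alternative, sketched here with made-up data, is to rename the index through a dictionary so each code is matched to its own label.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for day.csv
bikes = pd.DataFrame({
    'weathersit': [1, 1, 2, 3, 2, 1],
    'cnt': [5000, 4000, 3000, 1000, 3500, 6000],
})

# Map each weather code to a label explicitly
weather_names = {1: 'Clear', 2: 'Mist', 3: 'Light Rain/Snow'}
weather_avg = (
    bikes.groupby('weathersit')['cnt']
    .mean()
    .rename(index=weather_names)
)

# .plot.barh() is equivalent to .plot(kind='barh')
ax = weather_avg.plot.barh(color='coral', edgecolor='black')
ax.set_xlabel('Average Daily Rentals')

# For horizontal bars the value is the bar width
widths = [bar.get_width() for bar in ax.patches]
```

Codes that are missing from the dictionary keep their original numeric label, which makes unexpected categories easy to spot.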


Histograms

Quick Distribution Analysis

import pandas as pd
import matplotlib.pyplot as plt

bikes_hour = pd.read_csv('hour.csv')

# Histogram with pandas
bikes_hour['cnt'].plot(kind='hist', bins=40, figsize=(10, 6),
                       color='skyblue', edgecolor='black', alpha=0.7)
plt.xlabel('Hourly Bike Rentals')
plt.ylabel('Frequency')
plt.title('Distribution of Hourly Rentals')
plt.grid(True, alpha=0.3, axis='y')
plt.show()

Note: kind='hist' creates a histogram. Alternative: .plot.hist()
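A minimal sketch of the accessor form follows. It uses randomly generated counts (an assumption, not the real hour.csv) and confirms that the bars account for every row and that pandas labels the y axis on its own.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for hour.csv: 1000 random hourly counts
rng = np.random.default_rng(0)
bikes_hour = pd.DataFrame({'cnt': rng.poisson(lam=150, size=1000)})

# .plot.hist() is equivalent to .plot(kind='hist')
ax = bikes_hour['cnt'].plot.hist(bins=40, color='skyblue',
                                 edgecolor='black', alpha=0.7)
ax.set_xlabel('Hourly Bike Rentals')

# One rectangle per bin; the heights add up to the number of rows
bar_heights = [bar.get_height() for bar in ax.patches]
default_ylabel = ax.get_ylabel()
```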

Multiple Distributions

import pandas as pd
import matplotlib.pyplot as plt

bikes_hour = pd.read_csv('hour.csv')

# Plot casual and registered distributions
bikes_hour[['casual', 'registered']].plot(kind='hist', bins=40,
                                           alpha=0.6, figsize=(10, 6),
                                           edgecolor='black')
plt.xlabel('Hourly Rentals')
plt.ylabel('Frequency')
plt.title('Rental Distribution by User Type')
plt.legend(title='User Type')
plt.grid(True, alpha=0.3, axis='y')
plt.show()

Powerful feature: Select multiple columns and plot their distributions together!
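The same idea in a compact, dataset-free form: selecting two columns and calling .plot.hist() draws one set of bars per column on shared axes. The random data below is only an illustrative assumption.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for hour.csv with two user-type columns
rng = np.random.default_rng(1)
bikes_hour = pd.DataFrame({
    'casual': rng.poisson(lam=35, size=500),
    'registered': rng.poisson(lam=150, size=500),
})

# One histogram per selected column; alpha keeps the overlap readable
ax = bikes_hour[['casual', 'registered']].plot.hist(bins=20, alpha=0.6,
                                                    edgecolor='black')

# The legend entries come from the column names
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```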


Scatter Plots

Two Column Relationship

import pandas as pd
import matplotlib.pyplot as plt

bikes = pd.read_csv('day.csv')

# Scatter plot with pandas
bikes.plot(kind='scatter', x='temp', y='cnt', figsize=(10, 6),
           alpha=0.5, s=30, color='steelblue')
plt.xlabel('Normalized Temperature')
plt.ylabel('Daily Bike Rentals')
plt.title('Temperature vs Bike Rentals')
plt.grid(True, alpha=0.3)
plt.show()

Note: kind='scatter' requires both x and y. Alternative: .plot.scatter()
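The following sketch, built on a small made-up DataFrame, shows the accessor form and what happens when y is left out: pandas raises a ValueError rather than guessing a column.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for day.csv
bikes = pd.DataFrame({
    'temp': [0.20, 0.35, 0.50, 0.65, 0.80],
    'cnt': [1200, 2500, 4100, 5600, 6000],
})

# .plot.scatter() is equivalent to .plot(kind='scatter')
ax = bikes.plot.scatter(x='temp', y='cnt', alpha=0.5, s=30, color='steelblue')

# Leaving out y is an error for scatter plots
missing_y_error = None
try:
    bikes.plot(kind='scatter', x='temp')
except ValueError as err:
    missing_y_error = str(err)
```

The axis labels default to the column names ('temp' and 'cnt' here) until you override them.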

Color by Category

import pandas as pd
import matplotlib.pyplot as plt

bikes = pd.read_csv('day.csv')

# Color points by season
colors = {1: 'lightgreen', 2: 'gold', 3: 'orange', 4: 'lightblue'}
bike_colors = bikes['season'].map(colors)

bikes.plot(kind='scatter', x='temp', y='cnt', c=bike_colors,
           figsize=(10, 6), alpha=0.6, s=40, edgecolor='black')
plt.xlabel('Normalized Temperature')
plt.ylabel('Daily Bike Rentals')
plt.title('Temperature vs Rentals (colored by season)')
plt.grid(True, alpha=0.3)
plt.show()
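The color mapping above produces no legend, so the reader cannot tell which color belongs to which season. One possible way to get a legend, sketched here with made-up data, is to draw one scatter call per season on the same axes and give each call a label.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for day.csv: two made-up days per season
bikes = pd.DataFrame({
    'season': [1, 1, 2, 2, 3, 3, 4, 4],
    'temp': [0.20, 0.25, 0.50, 0.55, 0.70, 0.75, 0.40, 0.35],
    'cnt': [1200, 1500, 4500, 4800, 5600, 5900, 4000, 3800],
})
season_names = {1: 'Spring', 2: 'Summer', 3: 'Fall', 4: 'Winter'}
colors = {1: 'lightgreen', 2: 'gold', 3: 'orange', 4: 'lightblue'}

fig, ax = plt.subplots(figsize=(10, 6))

# One scatter layer per season, all drawn on the same axes
for code, group in bikes.groupby('season'):
    group.plot.scatter(x='temp', y='cnt', ax=ax, s=40, edgecolor='black',
                       color=colors[code], label=season_names[code])

ax.set_title('Temperature vs Rentals (colored by season)')
legend_labels = ax.get_legend_handles_labels()[1]
```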

Plotting from Grouped Data

Aggregation + Plotting

import pandas as pd
import matplotlib.pyplot as plt

bikes_hour = pd.read_csv('hour.csv')

# Group by hour and calculate mean
hourly_pattern = bikes_hour.groupby('hr')['cnt'].mean()

# Plot directly
hourly_pattern.plot(kind='line', figsize=(12, 6), linewidth=2, color='darkblue')
plt.xlabel('Hour of Day')
plt.ylabel('Average Bike Rentals')
plt.title('Average Hourly Bike Rental Pattern')
plt.xticks(range(0, 24))
plt.grid(True, alpha=0.3)
plt.show()

Workflow: GroupBy → Aggregate → Plot in one smooth pipeline!
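Written as a single chained expression, the pipeline might look like the sketch below. The six-row DataFrame is made up so the group means are easy to verify by hand.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for hour.csv: two made-up readings for three hours
bikes_hour = pd.DataFrame({
    'hr': [0, 0, 8, 8, 17, 17],
    'cnt': [20, 40, 300, 500, 450, 550],
})

# GroupBy -> aggregate -> plot, without an intermediate variable
ax = (
    bikes_hour
    .groupby('hr')['cnt']
    .mean()
    .plot(kind='line', linewidth=2, color='darkblue')
)

# The plotted points are the group keys (x) and the group means (y)
plotted_x = ax.lines[0].get_xdata().tolist()
plotted_y = ax.lines[0].get_ydata().tolist()
```

Because the grouped Series is indexed by hr, pandas uses the hours as x values and 'hr' as the default x label.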

Multiple Aggregations

import pandas as pd
import matplotlib.pyplot as plt

bikes_hour = pd.read_csv('hour.csv')

# Group by hour, get casual and registered means
hourly_by_type = bikes_hour.groupby('hr')[['casual', 'registered']].mean()

# Plot both
hourly_by_type.plot(figsize=(12, 6), linewidth=2)
plt.xlabel('Hour of Day')
plt.ylabel('Average Rentals')
plt.title('Hourly Patterns: Casual vs Registered Users')
plt.xticks(range(0, 24))
plt.grid(True, alpha=0.3)
plt.legend(title='User Type', loc='upper left')
plt.show()

Insight: The two groups follow different usage patterns. Registered rentals peak at commute hours, while casual rentals peak around midday.


Box Plots

Distribution Summary

import pandas as pd
import matplotlib.pyplot as plt

bikes = pd.read_csv('day.csv')

# Box plot by season
season_data = []
season_labels = []
for season in [1, 2, 3, 4]:
    season_data.append(bikes[bikes['season'] == season]['cnt'])
    season_labels.append(['Spring', 'Summer', 'Fall', 'Winter'][season-1])

plt.figure(figsize=(10, 6))
# Note: matplotlib 3.9+ renames labels= to tick_labels= (labels= is deprecated there)
plt.boxplot(season_data, labels=season_labels)
plt.ylabel('Daily Bike Rentals')
plt.title('Rental Distribution by Season (Box Plot)')
plt.grid(True, alpha=0.3, axis='y')
plt.show()

Box plot shows:

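  • Median (line inside the box)
  • Quartiles (box edges: 25th and 75th percentiles)
  • Whiskers (by default, the furthest points within 1.5 times the interquartile range of the box edges)
  • Outliers (dots beyond the whiskers)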
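The example above builds the box plot with matplotlib's plt.boxplot. A pandas-only version is also possible with kind='box', which draws one box per column. The sketch below is one way to get there, assuming a made-up DataFrame in place of day.csv: it pivots cnt into one column per season first.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for day.csv: three made-up days per season
bikes = pd.DataFrame({
    'season': [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    'cnt': [1200, 1500, 2100, 4200, 4800, 5100,
            5300, 5600, 6100, 3900, 4100, 4700],
})
season_names = {1: 'Spring', 2: 'Summer', 3: 'Fall', 4: 'Winter'}

# Spread 'cnt' into one column per season; cells for other seasons are NaN
by_season = (
    bikes.pivot(columns='season', values='cnt')
    .rename(columns=season_names)
)

# kind='box' draws one box per column and skips the NaN cells
ax = by_season.plot(kind='box', figsize=(10, 6))
ax.set_ylabel('Daily Bike Rentals')
ax.set_title('Rental Distribution by Season (Box Plot)')

tick_labels = [t.get_text() for t in ax.get_xticklabels()]
```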

Customizing Pandas Plots

Common Parameters

import pandas as pd
import matplotlib.pyplot as plt

bikes = pd.read_csv('day.csv')
bikes['dteday'] = pd.to_datetime(bikes['dteday'])

# Fully customized pandas plot
bikes.plot(
    x='dteday',
    y='cnt',
    kind='line',
    figsize=(14, 6),
    color='darkgreen',
    linewidth=2,
    linestyle='--',
    marker='o',
    markersize=3,
    alpha=0.7,
    grid=True,
    title='Daily Bike Rentals with Customization'
)
plt.xlabel('Date', fontsize=12)
plt.ylabel('Total Rentals', fontsize=12)
plt.show()

Available parameters:

  • figsize: (width, height) in inches
  • color: line/bar color
  • linewidth: line thickness
  • linestyle: '-', '--', '-.', ':'
  • marker: 'o', 's', '^', etc.
  • alpha: transparency 0-1
  • grid: True/False
  • title: plot title
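Most of these parameters are passed through to matplotlib, so they end up as properties of the line that gets drawn. The short sketch below uses a made-up four-value Series and reads the properties back from the resulting line object.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: four made-up rental counts
rentals = pd.Series([985, 801, 1349, 1562], name='cnt')

ax = rentals.plot(
    kind='line',
    color='darkgreen',
    linewidth=2,
    linestyle='--',
    marker='o',
    markersize=3,
    alpha=0.7,
    grid=True,
    title='Customized Line',
)

# The styling keywords are now properties of the matplotlib line
line = ax.lines[0]
style = (line.get_linestyle(), line.get_marker(), line.get_alpha())
```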

Practical Examples

Quick Exploratory Analysis

import pandas as pd
import matplotlib.pyplot as plt

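bikes = pd.read_csv('day.csv')
bikes['dteday'] = pd.to_datetime(bikes['dteday'])  # parse dates so the time axis is a real date axis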

# Create 2x2 grid of quick plots
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Plot 1: Rentals over time
bikes.plot(x='dteday', y='cnt', ax=axes[0, 0], color='steelblue', legend=False)
axes[0, 0].set_title('Daily Rentals Over Time')
axes[0, 0].set_xlabel('Date')
axes[0, 0].set_ylabel('Rentals')
axes[0, 0].grid(True, alpha=0.3)

# Plot 2: Distribution
bikes['cnt'].plot(kind='hist', bins=30, ax=axes[0, 1],
                  color='coral', edgecolor='black', alpha=0.7, legend=False)
axes[0, 1].set_title('Rental Distribution')
axes[0, 1].set_xlabel('Daily Rentals')
axes[0, 1].set_ylabel('Frequency')
axes[0, 1].grid(True, alpha=0.3, axis='y')

# Plot 3: Temperature vs Rentals
bikes.plot(kind='scatter', x='temp', y='cnt', ax=axes[1, 0],
           alpha=0.5, color='green', s=20)
axes[1, 0].set_title('Temperature vs Rentals')
axes[1, 0].set_xlabel('Temperature')
axes[1, 0].set_ylabel('Rentals')
axes[1, 0].grid(True, alpha=0.3)

# Plot 4: Seasonal averages
season_avg = bikes.groupby('season')['cnt'].mean()
season_avg.index = ['Spring', 'Summer', 'Fall', 'Winter']
season_avg.plot(kind='bar', ax=axes[1, 1], color='orange', edgecolor='black')
axes[1, 1].set_title('Average Rentals by Season')
axes[1, 1].set_ylabel('Average Rentals')
axes[1, 1].tick_params(axis='x', rotation=0)
axes[1, 1].grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()
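The grid above depends on the ax parameter, which tells pandas to draw on an existing matplotlib Axes instead of creating a new figure. A reduced two-panel sketch with made-up data shows the mechanism in isolation.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for day.csv
bikes = pd.DataFrame({
    'temp': [0.20, 0.35, 0.50, 0.65, 0.80],
    'cnt': [1200, 2500, 4100, 5600, 6000],
})

# Create the panels first, then hand each one to pandas via ax=
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

returned = bikes['cnt'].plot(kind='hist', bins=5, ax=axes[0])
bikes.plot(kind='scatter', x='temp', y='cnt', ax=axes[1])

axes[0].set_title('Rental Distribution')
axes[1].set_title('Temperature vs Rentals')
fig.tight_layout()
```

The value returned by .plot() is the same Axes that was passed in, so either handle can be used for further customization.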

Weather Analysis Dashboard

import pandas as pd
import matplotlib.pyplot as plt

bikes_hour = pd.read_csv('hour.csv')

# Weather variables
weather_cols = ['temp', 'atemp', 'hum', 'windspeed']
weather_names = ['Temperature', 'Feels-Like Temp', 'Humidity', 'Wind Speed']

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

for ax, col, name in zip(axes.flat, weather_cols, weather_names):
    bikes_hour.plot(kind='scatter', x=col, y='cnt', ax=ax,
                    alpha=0.3, s=10, color='steelblue')

    # Add correlation
    corr = bikes_hour[col].corr(bikes_hour['cnt'])
    ax.set_title(f'{name} vs Rentals (r = {corr:.3f})')
    ax.set_xlabel(name)
    ax.set_ylabel('Hourly Rentals')
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
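The loop above computes one correlation per panel. If only the numbers are needed, one possible shortcut is DataFrame.corrwith(), which returns all of them at once as a Series that can be plotted directly. The data here is made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for hour.csv
bikes_hour = pd.DataFrame({
    'temp': [0.20, 0.35, 0.50, 0.65, 0.80, 0.60],
    'hum': [0.80, 0.75, 0.60, 0.50, 0.45, 0.70],
    'windspeed': [0.10, 0.25, 0.15, 0.30, 0.20, 0.05],
    'cnt': [40, 120, 210, 330, 400, 250],
})
weather_cols = ['temp', 'hum', 'windspeed']

# Correlation of each weather column with cnt, in one call
corrs = bikes_hour[weather_cols].corrwith(bikes_hour['cnt'])

# Quick overview: one bar per weather variable
ax = corrs.plot.bar(rot=0, color='steelblue', edgecolor='black')
ax.set_ylabel('Correlation with Rentals')
ax.axhline(0, color='black', linewidth=1)
```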

Summary

You learned pandas plotting for quick visualizations:

  • .plot() method works on DataFrames and Series
  • kind parameter: 'line', 'bar', 'barh', 'hist', 'scatter', 'box'
  • Alternative syntax: .plot.line(), .plot.bar(), etc.
  • Automatic features: labels from column names, legends for multiple columns
  • Less code than pure matplotlib for standard plots
  • Combine with groupby for powerful aggregation + plotting
  • Still use matplotlib for final customization (plt.xlabel(), plt.title())
  • Best for: exploratory analysis, quick checks, simple visualizations

Next Steps: In the next lesson, you will learn to create subplot grids for comprehensive multi-panel visualizations.

Practice: Use pandas plotting to create a histogram of temperature and a scatter plot of humidity vs rentals. Add proper titles and labels.