Lesson 13 - Pandas Plot Method
Quick Plotting with Pandas
You have learned matplotlib plotting with plt.plot() and plt.hist(). Now you will learn pandas plotting, a faster, more convenient way to create visualizations directly from DataFrames.
By the end of this lesson, you will be able to:
- Use the .plot() method on DataFrames and Series
- Create line, bar, histogram, and scatter plots with one line
- Plot multiple columns simultaneously
- Customize pandas plots
- Choose between matplotlib and pandas plotting
- Use pandas for rapid exploratory analysis
Pandas plotting is built on matplotlib but requires less code for common visualizations.
Why Pandas Plotting?
Matplotlib vs Pandas
Matplotlib approach:
plt.plot(bikes['dteday'], bikes['cnt'])
plt.xlabel('Date')
plt.ylabel('Rentals')
plt.title('Bike Rentals Over Time')
Pandas approach:
bikes.plot(x='dteday', y='cnt', title='Bike Rentals Over Time')
Advantages of pandas plotting:
- Less code for standard plots
- Automatic labels from column names
- Easy to plot multiple columns
- Perfect for quick exploration
- Still uses matplotlib underneath (full customization available)
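To illustrate the last point, here is a minimal sketch. It uses a small made-up DataFrame (the values are placeholders, not rows from day.csv) to show that .plot() returns an ordinary matplotlib Axes object, so matplotlib methods can still be applied to the result.

```python
import pandas as pd
from matplotlib.axes import Axes

# Made-up stand-in for day.csv: five days of rental counts.
bikes = pd.DataFrame({
    'dteday': pd.date_range('2011-01-01', periods=5, freq='D'),
    'cnt': [900, 800, 1300, 1500, 1600],
})

# One pandas call draws the line and returns the Axes it drew on.
ax = bikes.plot(x='dteday', y='cnt', title='Bike Rentals Over Time')
print(isinstance(ax, Axes))

# Any matplotlib Axes method is available afterwards.
ax.set_ylabel('Rentals')
ax.grid(True, alpha=0.3)
```

Keeping the returned Axes in a variable is optional. The plt.xlabel() style used in this lesson acts on the same object.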
When to use pandas:
- Exploratory data analysis
- Quick visualizations during analysis
- Plotting directly from DataFrame operations
When to use matplotlib:
- Complex multi-panel figures
- Heavy customization needed
- Publication-quality graphics
Basic Line Plots
Single Column
import pandas as pd
import matplotlib.pyplot as plt
bikes = pd.read_csv('day.csv')
bikes['dteday'] = pd.to_datetime(bikes['dteday'])
# Pandas plotting
bikes.plot(x='dteday', y='cnt', figsize=(12, 6))
plt.xlabel('Date')
plt.ylabel('Total Rentals')
plt.title('Daily Bike Rentals (2011-2012)')
plt.grid(True, alpha=0.3)
plt.show()
What happened:
- .plot() created the line plot
- x and y specified columns
- figsize controlled size
- Still use plt.xlabel(), plt.title() for customization
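The separate pd.to_datetime() step can also be folded into the read itself. The sketch below assumes a CSV with the same dteday and cnt columns. The three rows are invented and are read from an in-memory string so the example runs without the data file.

```python
import io
import pandas as pd

# Invented rows in the layout of day.csv (only two of its columns).
csv_text = """dteday,cnt
2011-01-01,900
2011-01-02,800
2011-01-03,1300
"""

# parse_dates converts the column while reading,
# so no separate pd.to_datetime() call is needed.
bikes = pd.read_csv(io.StringIO(csv_text), parse_dates=['dteday'])
print(bikes['dteday'].dtype)

# With a real datetime column, pandas formats the date axis itself.
ax = bikes.plot(x='dteday', y='cnt', figsize=(12, 6))
```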
Multiple Columns
import pandas as pd
import matplotlib.pyplot as plt
bikes = pd.read_csv('day.csv')
bikes['dteday'] = pd.to_datetime(bikes['dteday'])
# Plot casual and registered rentals together
bikes.plot(x='dteday', y=['casual', 'registered'], figsize=(12, 6))
plt.xlabel('Date')
plt.ylabel('Bike Rentals')
plt.title('Casual vs Registered Users Over Time')
plt.grid(True, alpha=0.3)
plt.legend(title='User Type')
plt.show()
Key feature: Pass a list of columns to the y parameter → automatic legend and multiple lines!
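As a small check of where the legend entries come from, the following sketch (with made-up numbers in place of day.csv) inspects the lines that pandas draws. Each line carries its column name as a label, and leaving out y plots every numeric column.

```python
import pandas as pd

# Made-up values; only the column names follow day.csv.
bikes = pd.DataFrame({
    'dteday': pd.date_range('2011-01-01', periods=4, freq='D'),
    'casual': [330, 130, 120, 110],
    'registered': [650, 670, 1230, 1450],
})

ax = bikes.plot(x='dteday', y=['casual', 'registered'])

# One line per requested column, labelled with the column name.
labels = [line.get_label() for line in ax.get_lines()]
print(labels)

# Omitting y plots all numeric columns against x.
ax_all = bikes.plot(x='dteday')
```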
Bar Plots
Vertical Bars
import pandas as pd
import matplotlib.pyplot as plt
bikes = pd.read_csv('day.csv')
# Calculate mean by season
season_avg = bikes.groupby('season')['cnt'].mean()
season_avg.index = ['Spring', 'Summer', 'Fall', 'Winter']
# Pandas bar plot
season_avg.plot(kind='bar', figsize=(10, 6), color='steelblue', edgecolor='black')
plt.xlabel('Season')
plt.ylabel('Average Daily Rentals')
plt.title('Average Bike Rentals by Season')
plt.xticks(rotation=0)
plt.grid(True, alpha=0.3, axis='y')
plt.show()
Note: kind='bar' creates vertical bars. Alternative: .plot.bar()
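The accessor form mentioned in the note can be sketched as follows. The seasonal averages are placeholder numbers, not values computed from day.csv. The rot parameter is the pandas plotting option for tick-label rotation, used here in place of plt.xticks(rotation=0).

```python
import pandas as pd

# Placeholder averages, one per season.
season_avg = pd.Series(
    [2600.0, 5000.0, 5600.0, 4700.0],
    index=['Spring', 'Summer', 'Fall', 'Winter'],
    name='cnt',
)

# Same chart as season_avg.plot(kind='bar', ...), written with the accessor.
ax = season_avg.plot.bar(rot=0, color='steelblue', edgecolor='black')

# Each bar is a rectangle whose height is the plotted value.
heights = [bar.get_height() for bar in ax.patches]
print(heights)
```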
Horizontal Bars
import pandas as pd
import matplotlib.pyplot as plt
bikes = pd.read_csv('day.csv')
# Weather impact
weather_avg = bikes.groupby('weathersit')['cnt'].mean()
weather_avg.index = ['Clear', 'Mist', 'Light Rain/Snow']
# Horizontal bar plot
weather_avg.plot(kind='barh', figsize=(10, 6), color='coral', edgecolor='black')
plt.xlabel('Average Daily Rentals')
plt.ylabel('Weather Condition')
plt.title('Weather Impact on Bike Rentals')
plt.grid(True, alpha=0.3, axis='x')
plt.show()
Note: kind='barh' creates horizontal bars. Alternative: .plot.barh()
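Assigning a list to the index, as above, relies on every weather code being present and in sorted order. A slightly more defensive variant, sketched below with invented data in which code 3 never occurs, maps codes to names through a dictionary. The names dictionary is an illustrative helper, not part of the dataset.

```python
import pandas as pd

# Invented data: weather code 3 does not appear at all.
bikes = pd.DataFrame({
    'weathersit': [1, 1, 2, 1, 2, 1],
    'cnt': [5000, 5200, 4000, 4800, 4200, 5100],
})

# Hypothetical lookup from code to display name.
names = {1: 'Clear', 2: 'Mist', 3: 'Light Rain/Snow'}

# rename() relabels only the codes that are present.
weather_avg = bikes.groupby('weathersit')['cnt'].mean().rename(index=names)
print(weather_avg)

# Sorting first puts the largest average at the top of the chart.
ax = weather_avg.sort_values().plot.barh(color='coral', edgecolor='black')
```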
Histograms
Quick Distribution Analysis
import pandas as pd
import matplotlib.pyplot as plt
bikes_hour = pd.read_csv('hour.csv')
# Histogram with pandas
bikes_hour['cnt'].plot(kind='hist', bins=40, figsize=(10, 6),
                       color='skyblue', edgecolor='black', alpha=0.7)
plt.xlabel('Hourly Bike Rentals')
plt.ylabel('Frequency')
plt.title('Distribution of Hourly Rentals')
plt.grid(True, alpha=0.3, axis='y')
plt.show()
Note: kind='hist' creates a histogram. Alternative: .plot.hist()
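To make the role of bins concrete, the sketch below draws a histogram of randomly generated counts (a stand-in for hour.csv, not real rentals) and checks that the number of bars equals bins and that the bar heights add up to the number of rows.

```python
import numpy as np
import pandas as pd

# Random stand-in for the hourly counts; seeded so the result is repeatable.
rng = np.random.default_rng(0)
bikes_hour = pd.DataFrame({'cnt': rng.poisson(lam=180, size=500)})

ax = bikes_hour['cnt'].plot.hist(bins=40, color='skyblue',
                                 edgecolor='black', alpha=0.7)

# One rectangle per bin; the heights are frequencies.
n_bars = len(ax.patches)
total = sum(bar.get_height() for bar in ax.patches)
print(n_bars, total)
```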
Multiple Distributions
import pandas as pd
import matplotlib.pyplot as plt
bikes_hour = pd.read_csv('hour.csv')
# Plot casual and registered distributions
bikes_hour[['casual', 'registered']].plot(kind='hist', bins=40,
                                          alpha=0.6, figsize=(10, 6),
                                          edgecolor='black')
plt.xlabel('Hourly Rentals')
plt.ylabel('Frequency')
plt.title('Rental Distribution by User Type')
plt.legend(title='User Type')
plt.grid(True, alpha=0.3, axis='y')
plt.show()
Powerful feature: Select multiple columns and plot their distributions together!
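When overlapping histograms are hard to read, one option is to give each column its own panel with subplots=True. This is a sketch on randomly generated data. The distributions are arbitrary and only the column names follow hour.csv.

```python
import numpy as np
import pandas as pd

# Arbitrary random data under the hour.csv column names.
rng = np.random.default_rng(1)
bikes_hour = pd.DataFrame({
    'casual': rng.poisson(lam=35, size=300),
    'registered': rng.poisson(lam=150, size=300),
})

# subplots=True returns an array of Axes, one per column.
axes = bikes_hour.plot(kind='hist', bins=20, subplots=True,
                       sharex=True, figsize=(10, 6), edgecolor='black')
print(len(axes))
```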
Scatter Plots
Two Column Relationship
import pandas as pd
import matplotlib.pyplot as plt
bikes = pd.read_csv('day.csv')
# Scatter plot with pandas
bikes.plot(kind='scatter', x='temp', y='cnt', figsize=(10, 6),
           alpha=0.5, s=30, color='steelblue')
plt.xlabel('Normalized Temperature')
plt.ylabel('Daily Bike Rentals')
plt.title('Temperature vs Bike Rentals')
plt.grid(True, alpha=0.3)
plt.show()
Note: kind='scatter' requires both x and y. Alternative: .plot.scatter()
Color by Category
import pandas as pd
import matplotlib.pyplot as plt
bikes = pd.read_csv('day.csv')
# Color points by season
colors = {1: 'lightgreen', 2: 'gold', 3: 'orange', 4: 'lightblue'}
bike_colors = bikes['season'].map(colors)
bikes.plot(kind='scatter', x='temp', y='cnt', c=bike_colors,
           figsize=(10, 6), alpha=0.6, s=40, edgecolor='black')
plt.xlabel('Normalized Temperature')
plt.ylabel('Daily Bike Rentals')
plt.title('Temperature vs Rentals (colored by season)')
plt.grid(True, alpha=0.3)
plt.show()
Plotting from Grouped Data
Aggregation + Plotting
import pandas as pd
import matplotlib.pyplot as plt
bikes_hour = pd.read_csv('hour.csv')
# Group by hour and calculate mean
hourly_pattern = bikes_hour.groupby('hr')['cnt'].mean()
# Plot directly
hourly_pattern.plot(kind='line', figsize=(12, 6), linewidth=2, color='darkblue')
plt.xlabel('Hour of Day')
plt.ylabel('Average Bike Rentals')
plt.title('Average Hourly Bike Rental Pattern')
plt.xticks(range(0, 24))
plt.grid(True, alpha=0.3)
plt.show()
Workflow: GroupBy → Aggregate → Plot in one smooth pipeline!
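The same pipeline can be written as a single chained expression. The sketch below uses six invented rows so that the averages are easy to verify by hand.

```python
import pandas as pd

# Invented rows: two observations each for hours 0, 8 and 17.
bikes_hour = pd.DataFrame({
    'hr': [0, 0, 8, 8, 17, 17],
    'cnt': [40, 60, 300, 400, 450, 550],
})

# GroupBy -> aggregate -> plot without an intermediate variable.
ax = (
    bikes_hour
    .groupby('hr')['cnt']
    .mean()
    .plot(kind='line', marker='o', linewidth=2, color='darkblue')
)
ax.set_xlabel('Hour of Day')
ax.set_ylabel('Average Bike Rentals')

# The plotted y-values are the group means.
plotted = list(ax.get_lines()[0].get_ydata())
print(plotted)
```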
Multiple Aggregations
import pandas as pd
import matplotlib.pyplot as plt
bikes_hour = pd.read_csv('hour.csv')
# Group by hour, get casual and registered means
hourly_by_type = bikes_hour.groupby('hr')[['casual', 'registered']].mean()
# Plot both
hourly_by_type.plot(figsize=(12, 6), linewidth=2)
plt.xlabel('Hour of Day')
plt.ylabel('Average Rentals')
plt.title('Hourly Patterns: Casual vs Registered Users')
plt.xticks(range(0, 24))
plt.grid(True, alpha=0.3)
plt.legend(title='User Type', loc='upper left')
plt.show()
Insight: The two groups follow different usage patterns: registered users peak at commute hours, casual users peak midday.
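Several statistics of one column can be plotted the same way. As a sketch on invented data, .agg() with a list of function names yields one column per statistic, and .plot() then draws one line per column.

```python
import pandas as pd

# Invented rows: three observations each for hours 8 and 17.
bikes_hour = pd.DataFrame({
    'hr': [8, 8, 8, 17, 17, 17],
    'cnt': [100, 300, 500, 200, 400, 900],
})

# One row per hour, one column per statistic.
hourly_stats = bikes_hour.groupby('hr')['cnt'].agg(['mean', 'median'])
print(hourly_stats)

# Each statistic becomes its own line and legend entry.
ax = hourly_stats.plot(marker='o', linewidth=2)
stat_labels = [line.get_label() for line in ax.get_lines()]
```

A gap between mean and median at a given hour hints at a skewed distribution for that hour.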
Box Plots
Distribution Summary
import pandas as pd
import matplotlib.pyplot as plt
bikes = pd.read_csv('day.csv')
# Box plot by season
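season_names = ['Spring', 'Summer', 'Fall', 'Winter']
season_data = []
for season in [1, 2, 3, 4]:
    # Collect the daily counts for this season code
    season_data.append(bikes[bikes['season'] == season]['cnt'])
plt.figure(figsize=(10, 6))
plt.boxplot(season_data)
# Label the boxes via xticks; the boxplot keyword for this was renamed
# from labels to tick_labels in matplotlib 3.9
plt.xticks([1, 2, 3, 4], season_names)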
plt.ylabel('Daily Bike Rentals')
plt.title('Rental Distribution by Season (Box Plot)')
plt.grid(True, alpha=0.3, axis='y')
plt.show()
Box plot shows:
- Median (middle line)
- Quartiles (box edges)
- Whiskers (most extreme values within 1.5 × IQR of the box, by default)
- Outliers (dots beyond the whiskers)
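The example above builds the box plot with matplotlib's plt.boxplot(). A pandas-native route is DataFrame.boxplot() with its column and by arguments, sketched here on twelve invented rows (three per season). Passing an existing Axes through ax keeps the later customization explicit.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Invented counts: three days for each season code 1-4.
bikes = pd.DataFrame({
    'season': [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    'cnt': [2000, 2600, 3100, 4500, 5000, 5600,
            5200, 5700, 6100, 4100, 4700, 5300],
})

fig, ax = plt.subplots(figsize=(10, 6))

# pandas splits 'cnt' by 'season' and draws one box per group.
bikes.boxplot(column='cnt', by='season', ax=ax)

# Replace the automatic titles and the numeric tick labels.
fig.suptitle('')
ax.set_title('Rental Distribution by Season')
ax.set_xticklabels(['Spring', 'Summer', 'Fall', 'Winter'])
ax.set_xlabel('Season')
ax.set_ylabel('Daily Bike Rentals')
```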
Customizing Pandas Plots
Common Parameters
import pandas as pd
import matplotlib.pyplot as plt
bikes = pd.read_csv('day.csv')
bikes['dteday'] = pd.to_datetime(bikes['dteday'])
# Fully customized pandas plot
bikes.plot(
    x='dteday',
    y='cnt',
    kind='line',
    figsize=(14, 6),
    color='darkgreen',
    linewidth=2,
    linestyle='--',
    marker='o',
    markersize=3,
    alpha=0.7,
    grid=True,
    title='Daily Bike Rentals with Customization'
)
plt.xlabel('Date', fontsize=12)
plt.ylabel('Total Rentals', fontsize=12)
plt.show()
Available parameters:
- figsize: (width, height) in inches
- color: line/bar color
- linewidth: line thickness
- linestyle: '-', '--', '-.', ':'
- marker: 'o', 's', '^', etc.
- alpha: transparency, 0-1
- grid: True/False
- title: plot title
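Axis labels can also be passed to .plot() directly through the xlabel and ylabel parameters, which pandas has supported since version 1.1. The sketch below applies them to a short made-up hourly Series, so the whole plot is configured in one call.

```python
import pandas as pd

# Made-up hourly averages, indexed by hour of day.
hourly_pattern = pd.Series(
    [50.0, 350.0, 200.0, 500.0],
    index=pd.Index([0, 8, 12, 17], name='hr'),
    name='cnt',
)

# Title, grid and both axis labels set inside the pandas call.
ax = hourly_pattern.plot(
    kind='line',
    color='darkgreen',
    linestyle='--',
    marker='o',
    grid=True,
    title='Average Hourly Rentals',
    xlabel='Hour of Day',
    ylabel='Average Rentals',
)
```

On older pandas versions, use plt.xlabel() and plt.ylabel() as in the example above.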
Practical Examples
Quick Exploratory Analysis
import pandas as pd
import matplotlib.pyplot as plt
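bikes = pd.read_csv('day.csv')
# Convert dates so the time axis in Plot 1 is treated as dates, not strings
bikes['dteday'] = pd.to_datetime(bikes['dteday'])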
# Create 2x2 grid of quick plots
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
# Plot 1: Rentals over time
bikes.plot(x='dteday', y='cnt', ax=axes[0, 0], color='steelblue', legend=False)
axes[0, 0].set_title('Daily Rentals Over Time')
axes[0, 0].set_xlabel('Date')
axes[0, 0].set_ylabel('Rentals')
axes[0, 0].grid(True, alpha=0.3)
# Plot 2: Distribution
bikes['cnt'].plot(kind='hist', bins=30, ax=axes[0, 1],
                  color='coral', edgecolor='black', alpha=0.7, legend=False)
axes[0, 1].set_title('Rental Distribution')
axes[0, 1].set_xlabel('Daily Rentals')
axes[0, 1].set_ylabel('Frequency')
axes[0, 1].grid(True, alpha=0.3, axis='y')
# Plot 3: Temperature vs Rentals
bikes.plot(kind='scatter', x='temp', y='cnt', ax=axes[1, 0],
           alpha=0.5, color='green', s=20)
axes[1, 0].set_title('Temperature vs Rentals')
axes[1, 0].set_xlabel('Temperature')
axes[1, 0].set_ylabel('Rentals')
axes[1, 0].grid(True, alpha=0.3)
# Plot 4: Seasonal averages
season_avg = bikes.groupby('season')['cnt'].mean()
season_avg.index = ['Spring', 'Summer', 'Fall', 'Winter']
season_avg.plot(kind='bar', ax=axes[1, 1], color='orange', edgecolor='black')
axes[1, 1].set_title('Average Rentals by Season')
axes[1, 1].set_ylabel('Average Rentals')
axes[1, 1].tick_params(axis='x', rotation=0)
axes[1, 1].grid(True, alpha=0.3, axis='y')
plt.tight_layout()
plt.show()
Weather Analysis Dashboard
import pandas as pd
import matplotlib.pyplot as plt
bikes_hour = pd.read_csv('hour.csv')
# Weather variables
weather_cols = ['temp', 'atemp', 'hum', 'windspeed']
weather_names = ['Temperature', 'Feels-Like Temp', 'Humidity', 'Wind Speed']
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
for ax, col, name in zip(axes.flat, weather_cols, weather_names):
    bikes_hour.plot(kind='scatter', x=col, y='cnt', ax=ax,
                    alpha=0.3, s=10, color='steelblue')
    # Add correlation
    corr = bikes_hour[col].corr(bikes_hour['cnt'])
    ax.set_title(f'{name} vs Rentals (r = {corr:.3f})')
    ax.set_xlabel(name)
    ax.set_ylabel('Hourly Rentals')
    ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Summary
You learned pandas plotting for quick visualizations:
- .plot() method works on DataFrames and Series
- kind parameter: 'line', 'bar', 'barh', 'hist', 'scatter', 'box'
- Alternative syntax: .plot.line(), .plot.bar(), etc.
- Automatic features: labels from column names, legends for multiple columns
- Less code than pure matplotlib for standard plots
- Combine with groupby for powerful aggregation + plotting
- Still use matplotlib for final customization (plt.xlabel(), plt.title())
- Best for: exploratory analysis, quick checks, simple visualizations
Next Steps: In the next lesson, you will learn to create subplot grids for comprehensive multi-panel visualizations.
Practice: Use pandas plotting to create a histogram of temperature and a scatter plot of humidity vs rentals. Add proper titles and labels.