Lesson 1 - Deep Learning Fundamentals

Welcome to Deep Learning with PyTorch

This lesson introduces deep learning: what it is, where it outperforms classical machine learning, and why nearly everyone who builds neural networks reaches for a framework like PyTorch instead of writing the math from scratch. You will see the high-level training loop that every neural network follows, and you will meet the real dataset that runs through this entire module: a collection of Indian IPOs where the goal is to predict whether each one lists with a gain.

By the end of this lesson, you will be able to:

  • Explain what deep learning is and how it relates to classical machine learning
  • Describe where deep learning shines and where classical methods are still the better choice
  • Explain why a framework like PyTorch matters: autograd, GPU acceleration, and reusable layers
  • Trace the high-level neural network training loop from data to an optimizer step
  • Load the Indian IPO dataset, inspect its features and target, and frame the prediction problem

You should be comfortable with basic Python, pandas, and NumPy, and ideally have built at least one model with a classical library like scikit-learn. No prior deep learning experience is needed. Let’s begin.


What Is Deep Learning?

Deep learning is a branch of machine learning built around artificial neural networks: models loosely inspired by the way biological neurons connect and pass signals to one another. Like any machine learning model, a neural network learns patterns from data rather than following hand-written rules. What makes it deep is structure: it stacks many layers of simple computing units, and each layer transforms the data a little more before passing it to the next.

That stacking is the whole idea. A classical model like logistic regression takes your raw features and draws essentially one decision boundary through them. A deep network takes your raw features, reshapes them through a first layer into a new internal representation, reshapes that through a second layer, and so on. By the time the data reaches the final layer, the network is working with features it invented for itself, features that may capture interactions no human would have thought to write down.

This ability to build its own features is called representation learning, and it is the reason deep learning has set records in image recognition, speech, and language understanding. You do not tell the network “look for edges, then shapes, then faces.” You give it raw pixels and a training signal, and it discovers a useful hierarchy of features on its own.

A Neuron and a Network

The basic computing unit in a neural network is a neuron (also called a node). A neuron takes several input numbers, multiplies each by a learned weight, sums the results, and passes that sum through a small nonlinear function. That nonlinear step is what lets a stack of neurons represent curves and interactions instead of just straight lines.
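To make that description concrete, here is a minimal sketch of a single neuron in NumPy. The input values, weights, and choice of a sigmoid as the nonlinear function are illustrative assumptions, and the bias term is a standard addition that the description above leaves out.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative values only; in a real network the weights and bias are learned.
inputs = np.array([0.5, -1.0, 2.0])
weights = np.array([0.8, 0.2, -0.4])
bias = 0.1

weighted_sum = np.dot(inputs, weights) + bias   # multiply each input by its weight, then sum
activation = sigmoid(weighted_sum)              # the small nonlinear step

print("weighted sum:", round(float(weighted_sum), 3))   # -0.5
print("activation:", round(float(activation), 3))       # 0.378
```

Without the sigmoid, stacking neurons like this would only ever produce another weighted sum, which is why the nonlinear step matters.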

Neurons are organized into layers:

   Input layer        Hidden layers            Output layer
  (your features)   (learned representations)   (the prediction)

   x1  ---\        +--------+    +--------+
           >-----> | layer  | -> | layer  | ---->  gain? (0 / 1)
   x2  ---/        |   1    |    |   2    |
   ...             +--------+    +--------+

The input layer is just your raw features. The hidden layers in the middle are where the network learns its internal representations; “deep” simply means there is more than one of them. The output layer produces the final answer, which for our problem is a single number you can read as the probability that an IPO lists with a gain.

A network with a single hidden layer is usually called shallow; one with several is deep. More layers can capture more complex patterns, but they also cost more to train and need more data to learn from without memorizing. Choosing how deep to go is one of the trade-offs you will learn to navigate.
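The layer structure described above can be sketched in a few lines of NumPy. This is only an illustration: the layer sizes (8 and 4 hidden units), the random weights, and the ReLU and sigmoid functions are assumptions chosen for the example, not the architecture this module will use.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    """Keep positive values, replace negatives with zero."""
    return np.maximum(z, 0.0)

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes: 6 input features, hidden layers of 8 and 4 units, 1 output.
# The weights are random here; training is what would make them useful.
W1, b1 = rng.normal(size=(6, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 4)), np.zeros(4)
W3, b3 = rng.normal(size=(4, 1)), np.zeros(1)

x = rng.normal(size=(5, 6))      # a batch of 5 made-up examples

h1 = relu(x @ W1 + b1)           # first hidden representation, shape (5, 8)
h2 = relu(h1 @ W2 + b2)          # second hidden representation, shape (5, 4)
prob = sigmoid(h2 @ W3 + b3)     # one number between 0 and 1 per example, shape (5, 1)

print(h1.shape, h2.shape, prob.shape)
```

Each hidden layer reshapes the previous layer's output, and the final sigmoid turns the last layer's output into something you can read as a probability.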


Deep Learning vs. Classical Machine Learning

Deep learning is powerful, but it is not always the right tool. Knowing when to reach for it, and when a simpler model will serve you better, is a mark of an experienced practitioner.

Where Deep Learning Shines

Deep learning tends to win on problems with three traits:

  • High-dimensional, unstructured input. Images, audio, and text have thousands or millions of raw values per example, with no obvious hand-crafted features. Networks excel at turning that raw signal into useful representations.
  • Lots of data. Deep networks have many parameters to fit. The more examples you can feed them, the more of their capacity pays off.
  • Complex, nonlinear patterns. When the relationship between inputs and the answer involves rich interactions that resist simple rules, extra layers earn their keep.

Where Classical Methods Still Win

For many everyday problems, classical machine learning is faster, cheaper, and just as accurate, or more so:

  • Small or modest tabular datasets. When you have a few hundred or a few thousand rows of clean numeric columns, models like logistic regression, random forests, or gradient boosting are hard to beat.
  • A need for interpretability. Linear models and small decision trees can show you why they made a decision, through their coefficients and split rules. Deep networks are harder to explain.
  • Limited compute or data. Training a large network on a laptop with a tiny dataset usually just overfits.

Our dataset is deliberately small

The Indian IPO dataset you will use has only 319 rows. On a problem this size, a classical model would often match or beat a neural network. We use it anyway because it is small enough to train in seconds, which lets you focus on how PyTorch works without waiting on long training runs. The skills transfer directly to the large datasets where deep learning truly dominates.

The honest summary: deep learning is a specialized, heavy-duty tool. It is spectacular on the right problem and overkill on the wrong one. This module teaches you the tool itself; judgment about when to use it comes with practice.


Why Use a Framework Like PyTorch?

In principle, you could build a neural network with nothing but NumPy. You would write the layer math by hand, compute the predictions, measure the error, and then work out by hand how to nudge every weight to reduce that error. For a tiny network that is a useful exercise. For a real one with thousands of weights, it is hopeless: the bookkeeping for those nudges (the calculus of which weight should move and by how much) is enormous and error-prone.
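To get a feel for that bookkeeping, here is a small sketch that works out the gradient by hand for the simplest possible case: one linear unit with two weights and a mean squared error. The data, the weights, and the choice of loss are all made up for illustration. A numerical check confirms the hand-derived formula.

```python
import numpy as np

# Made-up data: 4 examples, 2 features, and a numeric target.
X = np.array([[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 1.5]])
y = np.array([1.0, 0.0, 2.0, -1.0])
w = np.array([0.1, -0.3])        # current weights (no bias, to keep the math short)

def loss(w):
    """Mean squared error of the linear prediction X @ w."""
    return np.mean((X @ w - y) ** 2)

# Gradient worked out by hand: d/dw mean((Xw - y)^2) = (2 / n) * X^T (Xw - y)
grad_by_hand = (2.0 / len(y)) * (X.T @ (X @ w - y))

# Numerical check: nudge each weight slightly and watch how the loss changes.
eps = 1e-6
grad_numeric = np.array([
    (loss(w + eps * np.eye(2)[i]) - loss(w - eps * np.eye(2)[i])) / (2 * eps)
    for i in range(2)
])

print(np.allclose(grad_by_hand, grad_numeric, atol=1e-5))   # True
```

This is manageable for two weights and one layer. Every extra layer adds another application of the chain rule, which is where doing it by hand stops being practical.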

This is why deep learning runs on frameworks. PyTorch is one of the most widely used, in both research and industry. It removes the painful parts and lets you describe what the network should compute while it handles how to train it. Three features do most of the heavy lifting.

1. Autograd: Automatic Differentiation

Training a network means repeatedly adjusting weights to reduce error. To know which direction to move each weight, you need its gradient, the derivative of the error with respect to that weight. PyTorch’s autograd engine computes every one of those gradients for you automatically, however many layers your network has, as long as it is built from differentiable operations. You write the forward calculation in plain Python, and PyTorch records it and works the derivatives out on demand. You will study exactly how this works in the next lesson.
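As a small preview, the sketch below asks autograd for the gradient of a one-weight "network". The numbers are arbitrary, and the squared-error loss is just an assumption for the example.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)   # a weight PyTorch should track
x = torch.tensor(3.0)
target = torch.tensor(10.0)

loss = (w * x - target) ** 2                # forward calculation in plain Python
loss.backward()                             # autograd works out d(loss)/dw

# By hand: d/dw (w*x - target)^2 = 2 * (w*x - target) * x = 2 * (6 - 10) * 3 = -24
print(w.grad)                               # tensor(-24.)
```

The negative gradient says that increasing w would reduce the loss, which matches intuition: the prediction 6 is below the target 10.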

2. GPU Acceleration

Neural networks are mostly large matrix multiplications, and graphics processors (GPUs) are built to do exactly that: run thousands of operations in parallel. PyTorch lets you move your data and your model onto a GPU with one short call each, and training then often runs many times faster without rewriting the rest of your code.
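A common pattern looks roughly like the sketch below. The tiny `nn.Linear` model and the random batch are placeholders; the point is only the `.to(device)` calls.

```python
import torch
from torch import nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(6, 1).to(device)      # move the model's weights to the device
batch = torch.randn(4, 6).to(device)    # move the data to the same device

output = model(batch)
print(output.shape, output.device.type)
```

On a machine without a GPU this runs unchanged on the CPU, which is why the same code works on a laptop and on a GPU server. The model and the data must live on the same device, or PyTorch raises an error.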

3. Reusable Layers and Building Blocks

PyTorch ships with ready-made components: layers, activation functions, loss functions, and optimizers. Instead of coding a fully connected layer from scratch, you ask for one and connect it. You assemble a network from these blocks the way you assemble a program from functions. You will start using them in Lesson 3.
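As a preview of what that looks like, here is a minimal sketch that assembles a small network from ready-made parts. The layer sizes, the ReLU activation, the loss, and the optimizer settings are illustrative assumptions, not the choices the later lessons will necessarily make.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(6, 8),   # fully connected layer: 6 inputs -> 8 hidden units
    nn.ReLU(),         # activation function
    nn.Linear(8, 1),   # output layer: one number per example
)

loss_fn = nn.BCEWithLogitsLoss()                           # ready-made loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # ready-made optimizer

# Count the learnable weights and biases: 6*8 + 8 + 8*1 + 1 = 65
n_params = sum(p.numel() for p in model.parameters())
print(n_params)   # 65
```

None of the layer math is written out here: each block already knows how to compute its output and how to expose its weights to the optimizer.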

Frameworks let you think at the right level

The point of PyTorch is not that it does something you could never do by hand. It is that it lets you reason about your model in terms of layers and loss, instead of in terms of derivatives and array indices. That higher altitude is what makes it possible to design and debug real networks at all.


The High-Level Training Loop

Almost every neural network in PyTorch is trained with the same repeating cycle. You do not need to write any of it yet; the goal here is to recognize the shape of the loop so the later lessons feel familiar. The diagram below shows the cycle and how the steps feed into one another.

[Figure: The PyTorch training loop. Data becomes tensors, the model makes predictions, the loss measures error, backpropagation computes gradients, and the optimizer updates the weights, over and over.]

Here is what each stage does, in plain language.

Step 1: Data to Tensors

A tensor is PyTorch’s core data structure, essentially a multi-dimensional array much like a NumPy array, but able to live on a GPU and track gradients. Your first job in any project is to load your data and convert your features and target into tensors. (Lesson 2 covers tensors in depth.)
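The conversion itself is short. The sketch below uses a made-up NumPy array rather than the course dataset, and the choice of `float32` is an assumption based on it being PyTorch's default floating-point type.

```python
import numpy as np
import torch

features = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])   # made-up 2 x 3 feature array
labels = np.array([1, 0])                                  # made-up binary labels

X_t = torch.tensor(features, dtype=torch.float32)
y_t = torch.tensor(labels, dtype=torch.float32)

print(X_t.shape, X_t.dtype)   # torch.Size([2, 3]) torch.float32
print(y_t.shape)              # torch.Size([2])
```

The tensor keeps the same shape as the array it came from, so the rows-by-features layout you know from pandas and NumPy carries over directly.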

Step 2: Forward Pass Through the Model

You feed a batch of input tensors into the model. The data flows forward through the layers, each one transforming it, until the output layer produces predictions. This is the forward pass, the network’s current best guess given its current weights.

Step 3: Compute the Loss

A loss function compares the model’s predictions to the true answers and returns a single number measuring how wrong the model is. A high loss means poor predictions; a low loss means good ones. The entire point of training is to make this number small.
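To see a loss function behave this way, here is a sketch of binary cross-entropy written in plain Python. Binary cross-entropy is a common loss for yes/no problems, but treating it as the loss for this example is an assumption, and the predictions below are invented.

```python
import math

def binary_cross_entropy(probs, targets):
    """Average binary cross-entropy; probs are predicted probabilities of class 1."""
    total = 0.0
    for p, t in zip(probs, targets):
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(targets)

targets = [1, 0, 1, 1]
good_preds = [0.9, 0.1, 0.8, 0.7]   # leans the right way on every example
bad_preds = [0.2, 0.8, 0.3, 0.4]    # leans the wrong way on every example

good_loss = binary_cross_entropy(good_preds, targets)
bad_loss = binary_cross_entropy(bad_preds, targets)
print(round(good_loss, 3), round(bad_loss, 3))
```

The predictions that lean toward the correct answers produce the smaller number, which is exactly the property training relies on.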

Step 4: Backpropagation

Backpropagation runs autograd backward through the network and computes the gradient of the loss with respect to every weight: how sensitive the loss is to that weight, and therefore which way the weight should move to reduce it. In PyTorch you trigger it by calling backward() on the loss.

Step 5: Optimizer Step

An optimizer takes those gradients and nudges every weight a small step in the direction that reduces the loss. After this step the model is usually slightly better than it was on the data it just saw.
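The simplest version of that nudge is plain gradient descent: subtract a small multiple of the gradient from the weight. The sketch below applies it to a one-weight toy problem in plain Python. The loss function and the step size (the learning rate) are assumptions chosen so the arithmetic is easy to follow.

```python
# Toy problem: minimize loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = 0.0
learning_rate = 0.1   # hypothetical step size

history = []
for step in range(25):
    grad = 2 * (w - 3)             # which way is uphill, and how steep
    w = w - learning_rate * grad   # step a little in the downhill direction
    history.append((w - 3) ** 2)   # record the loss after the update

print(round(w, 3))
print(history[0] > history[-1])    # True: the loss shrank
```

The weight moves steadily toward 3, the value that makes the loss zero. PyTorch's optimizers do the same kind of update for every weight in the network at once.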

Then the loop repeats. Each full pass over the training data is called an epoch, and a network typically trains for many epochs, the loss falling a little each time as the weights settle into values that capture the patterns in the data. You will write this exact loop in Lesson 4.
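As a preview only, here is a minimal sketch of the five steps in code. It runs on synthetic data rather than the IPO dataset, and the model, loss, optimizer, and learning rate are illustrative assumptions. It also includes one housekeeping call, `optimizer.zero_grad()`, that the five-step summary does not list: PyTorch accumulates gradients by default, so they are cleared before each pass.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Step 1: data to tensors (synthetic stand-in: 64 examples, 6 features)
X = torch.randn(64, 6)
y = (X[:, 0] + X[:, 1] > 0).float().unsqueeze(1)   # made-up binary target, shape (64, 1)

model = nn.Sequential(nn.Linear(6, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(50):
    optimizer.zero_grad()        # clear gradients left over from the previous pass
    logits = model(X)            # Step 2: forward pass
    loss = loss_fn(logits, y)    # Step 3: compute the loss
    loss.backward()              # Step 4: backpropagation
    optimizer.step()             # Step 5: optimizer step
    losses.append(loss.item())

print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

Because the whole synthetic dataset is fed in as a single batch, each pass through this loop is one epoch. With larger datasets the data is usually split into smaller batches inside each epoch.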

This loop is universal

Whether the network classifies photos, translates languages, or predicts IPO gains, the training loop is the same five steps: tensors, forward pass, loss, backpropagation, optimizer step. Learn it once and you can read almost any PyTorch training code.


The Dataset for This Module: Indian IPOs

You will learn deep learning best by applying it to one real problem from start to finish. Throughout this module, that problem is predicting IPO listing gains using the Indian IPO dataset.

When a company holds an Initial Public Offering (IPO), its shares are offered to investors at a fixed issue price and then begin trading on the stock exchange. An IPO “lists with a gain” when its opening market price is higher than the issue price, so early investors are immediately in profit. Predicting that outcome ahead of time is genuinely useful, and it is a clean binary classification problem: each IPO either lists with a gain (1) or it does not (0).
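The definition of the target can be written as a one-line comparison. The sketch below uses three invented IPOs, and the column names `issue_price`, `opening_price`, and `listed_with_gain` are hypothetical: they are not claimed to match the columns of the course dataset, which already ships with its target computed.

```python
import pandas as pd

# Hypothetical prices for three made-up IPOs.
ipos = pd.DataFrame({
    "issue_price": [100.0, 250.0, 75.0],
    "opening_price": [120.0, 240.0, 75.0],
})

# 1 if the opening price is strictly above the issue price, else 0
ipos["listed_with_gain"] = (ipos["opening_price"] > ipos["issue_price"]).astype(int)
print(ipos["listed_with_gain"].tolist())   # [1, 0, 0]
```

Note that an IPO opening exactly at its issue price counts as 0 under this definition, because there is no gain.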

You can download the dataset and load it with pandas.

import pandas as pd

# download: https://datatweets.com/datasets/indian_ipo.csv
df = pd.read_csv("indian_ipo.csv")

print("Shape:", df.shape)
# Output: Shape: (319, 10)

The dataset has 319 rows and 10 columns. Each row is one IPO, and one column records whether that IPO listed with a gain.

The Features and the Target

You will work with six numeric features as inputs and one binary target as the answer. The subscription columns describe demand: how many times over the offer was applied for by each class of investor.

Column              Type   Meaning
------------------  -----  -------------------------------------------------
Issue_Size          float  Total size of the offering
Subscription_QIB    float  Times subscribed by Qualified Institutional Buyers
Subscription_HNI    float  Times subscribed by High Net-worth Individuals
Subscription_RII    float  Times subscribed by Retail Individual Investors
Subscription_Total  float  Overall times the issue was subscribed
Issue_Price         float  Price per share at which the IPO was offered
Listing_Gains       int    Target: 1 if the IPO listed with a gain, else 0

The intuition behind these features is appealing: strong demand before listing (high subscription numbers) often signals that the market is eager for the stock, which can push the opening price above the issue price. Whether that intuition holds is exactly what a model can test.

How Balanced Is the Target?

Before modeling anything, always check how the answers are distributed. If one class overwhelmingly dominates, accuracy becomes a misleading score: a model could look good simply by always guessing the majority class.

print(df["Listing_Gains"].value_counts())
# Output:
# Listing_Gains
# 1    174
# 0    145
# Name: count, dtype: int64

print("gain rate:", round(df["Listing_Gains"].mean(), 3))
# Output: gain rate: 0.545

About 54.5 percent of these IPOs listed with a gain (174 out of 319). That is a comfortably balanced target, which is good news: it means accuracy is a reasonable first metric, and a model has to genuinely learn something to beat the roughly 55 percent you would get by always guessing “gain.” A picture makes the balance clear.

[Figure: Bar chart of the count of IPOs that listed with a gain versus those that did not. The Indian IPO target is fairly balanced: slightly more IPOs list with a gain than without.]
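The baseline mentioned above is simple arithmetic on the two class counts. The short sketch below works it out in plain Python, using the counts reported by value_counts().

```python
gains, no_gains = 174, 145      # class counts reported above
total = gains + no_gains        # 319 IPOs in all

always_gain_accuracy = gains / total          # accuracy of always predicting 1
always_no_gain_accuracy = no_gains / total    # accuracy of always predicting 0

print(round(always_gain_accuracy, 3), round(always_no_gain_accuracy, 3))   # 0.545 0.455
```

Any model you train later should be compared against the 0.545 figure: an accuracy at or below it means the model has learned nothing beyond the class balance.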

Separating Features from the Target

The final preparation step you will revisit throughout the module is splitting the table into the feature matrix X and the target vector y. This mirrors the convention you have seen in classical machine learning.

feature_cols = [
    "Issue_Size",
    "Subscription_QIB",
    "Subscription_HNI",
    "Subscription_RII",
    "Subscription_Total",
    "Issue_Price",
]

X = df[feature_cols]            # six numeric features
y = df["Listing_Gains"]        # the binary target

print("X shape:", X.shape)
print("Gain examples:", int(y.sum()))
# Output:
# X shape: (319, 6)
# Gain examples: 174

That is the problem framed and the data understood. In the lessons ahead, you will turn this X and y into tensors, build a small neural network to map the features to a gain prediction, train it with the loop you just saw, and measure how well it does.

Set expectations honestly

Predicting IPO outcomes from a few hundred rows is genuinely hard, and there is no guarantee these six features carry a strong signal. Do not expect a near-perfect model. The aim of this module is to teach you to build, train, and evaluate neural networks in PyTorch correctly, not to find a money-making trading strategy.


Practice Exercises

Now it is your turn. Try these before checking the hints. They use only pandas, so you can run them right after loading the dataset.

Exercise 1: Inspect the Feature Ranges

Print summary statistics for the six feature columns to see how different their scales are. Which feature has by far the largest values, and why might that matter for a neural network?

import pandas as pd
df = pd.read_csv("indian_ipo.csv")  # download: https://datatweets.com/datasets/indian_ipo.csv

# Your code here

Hint

Use df[feature_cols].describe() to see the count, mean, standard deviation, min, quartiles, and max of each column, where feature_cols is the list of six feature names defined earlier in this lesson. You will notice the columns live on very different scales, which is why a later lesson scales the features before training: a network learns more easily when its inputs share a similar range.

Exercise 2: Compare the Gain Rate by Demand

Group the IPOs by whether their Subscription_Total is above the median, and compute the gain rate within each group. Does higher overall demand line up with a higher chance of listing with a gain?

# Your code here (reuse df from above)

Hint

Compute the median with median = df["Subscription_Total"].median(), build a boolean Series high_demand = df["Subscription_Total"] > median, then use df.groupby(high_demand)["Listing_Gains"].mean(). Comparing the two group means tells you whether demand carries any signal about the outcome.

Exercise 3: Decide Deep Learning vs. Classical

This is a thinking exercise, not a coding one. For each scenario below, decide whether deep learning or a classical model is the better starting point, and write one sentence explaining why: (a) classifying 2 million product photos, (b) predicting churn from 800 rows of customer billing data, (c) deciding which model to use on our 319-row IPO dataset for a quick baseline.

Hint

Match each scenario to the trade-offs from this lesson. Lots of unstructured data favors deep learning (a); small clean tabular data favors classical models (b); and for a quick baseline on a tiny dataset, a classical model is the sensible first move, even in a deep learning course (c).


Summary

Congratulations! You now understand what deep learning is, when to use it, why frameworks exist, and what the training loop looks like, and you have met the dataset you will model for the rest of this module. Let’s review.

Key Concepts

What Deep Learning Is

  • Deep learning uses neural networks: layers of simple neurons that each multiply inputs by learned weights, sum them, and apply a nonlinear function
  • Hidden layers learn internal representations of the data; “deep” means more than one hidden layer
  • Representation learning lets a network build its own features instead of relying on hand-crafted ones

When to Use It

  • Deep learning shines on large, high-dimensional, unstructured data (images, audio, text) with complex nonlinear patterns
  • Classical machine learning is often better for small or modest tabular datasets, for interpretability, and when compute or data is limited

Why a Framework

  • Autograd computes the gradients needed for training automatically
  • GPU acceleration runs the heavy matrix math in parallel for big speedups
  • Reusable layers let you assemble networks from ready-made building blocks

The Training Loop

  • Convert data to tensors, run a forward pass to get predictions, compute the loss, run backpropagation to get gradients, and take an optimizer step to update weights
  • One full pass over the data is an epoch; networks train for many epochs

The Dataset

  • The Indian IPO dataset has 319 rows, six numeric features, and a binary target Listing_Gains
  • The target is balanced (a 0.545 gain rate), so accuracy is a reasonable first metric
  • The goal for the whole module: predict whether an IPO lists with a gain

Why This Matters

Every deep learning project you ever build will follow the same arc you just previewed: understand the data and the goal, turn it into tensors, push it through a network, measure the error, and let the optimizer improve the weights one step at a time. The architectures and datasets change, but that loop does not.

Just as important is the judgment you started developing here. Deep learning is a remarkable tool, but it is not the answer to every problem. Knowing that a 319-row tabular dataset is exactly the kind of problem where a classical model might win, and choosing to use it anyway because it lets you learn PyTorch quickly, is the kind of deliberate decision that separates someone who can run a tutorial from someone who can build real systems. The next lessons give you the mechanics; keep this big picture in mind as you go.


Next Steps

You now have the conceptual map. In the next lesson, you will get hands-on with the foundation everything else is built on: tensors and the autograd engine that makes training possible.

Continue to Lesson 2 - Tensors and Autograd in PyTorch

Learn PyTorch's core data structure and the automatic differentiation engine that powers training.



Keep Building Your Skills

You have taken your first step into deep learning, and you did it the right way: by understanding the problem and the tool before touching the math. The training loop you previewed here is the heartbeat of every neural network, and the Indian IPO problem will give you a concrete place to practice it end to end. Carry the big picture with you, because once you can see how data, model, loss, and optimizer fit together, the individual PyTorch pieces in the coming lessons will click into place fast.