Lesson 2 - Columns, Data Types, and Functions

Welcome to Columns, Data Types, and Functions

In Lesson 1 you learned to pull columns out of a table. Now you will go a level deeper: understanding what kind of data each column holds and transforming that data inside your queries. SQL lets you do arithmetic on numbers, reshape text, round values, and combine fields — all without ever changing the underlying database.

This is where queries start to feel powerful. Instead of just reading what is stored, you compute new columns on the fly: sales tax, profit margin, price per unit, formatted addresses, and more.

By the end of this lesson, you will be able to:

  • Identify SQL data types and explain why they matter
  • Inspect a table’s metadata using PRAGMA table_info()
  • Perform arithmetic on numeric columns to derive new fields
  • Understand integer division and fix it with CAST
  • Transform numbers and text with ROUND(), UPPER(), and LOWER()
  • Combine fields with the concatenation operator and add constant columns

You should already be comfortable with SELECT, FROM, LIMIT, and aliases. Let’s begin.

Data for this lesson

Database: superstore.db — retail orders from a Superstore (SQLite).

Tables used: orders

Open it in any SQLite client — DB Browser for SQLite, the sqlite3 CLI, or Python’s sqlite3 — to run every query yourself.


Understanding Data Types

For consistency and efficiency, a SQL column holds only one type of data. An integer column holds only integers, a text column holds only text, and so on. These categories are formally called storage classes, though everyone (including us) usually just calls them data types.

The Types SQLite Uses

SQLite works with a small set of types:

  • INTEGER — whole numbers, like the values in the quantity field.
  • REAL — floating-point (decimal) numbers, like the values in the sales and profit fields.
  • TEXT — strings of characters, like order_date, order_id, and product_name.

A surprising detail: SQLite has no dedicated date type. Dates such as order_date are stored as TEXT.

Another subtle one: postal_code is stored as TEXT, not a number. US ZIP codes can begin with a zero — 06057 is a real code in Connecticut. If it were stored as a number, that leading zero would vanish and the data would be wrong. This is a great reminder that the “right” data type depends on how the values are used, not just whether they look numeric.

Here is the orders table’s REAL-valued fields previewed together:

SELECT sales, discount, profit
  FROM orders
 LIMIT 5;
salesdiscountprofit
16.4480.25.5512
3.540.8-5.487
11.7840.24.2717
272.7360.2-64.7748
19.5360.24.884

Inspecting Metadata with PRAGMA

Guessing a column’s type by eye is unreliable. Fortunately, SQLite can tell you directly with a special command:

PRAGMA table_info(returns);

Unlike SELECT, which retrieves data from a table, PRAGMA table_info() retrieves metadata — information about the table. It returns one row per column:

cidnametypenotnulldflt_valuepk
0order_idTEXT0NULL0
1returnedTEXT0NULL0

The returns table is small — just two columns: the order_id and a returned flag. The name and type columns of the metadata are the ones you will care about most. The others (notnull, dflt_value, pk) describe constraints that go beyond this module. To check the types in the orders table, you would run PRAGMA table_info(orders); — a reliable way to confirm, for example, that postal_code really is TEXT.


Doing Math on Numeric Columns

A core analyst task is deriving new columns from existing ones using arithmetic. SQL supports +, -, *, and / directly inside a SELECT.

Calculating a New Field

Suppose you want to estimate 7% sales tax on every order. You multiply the sales column by 0.07:

SELECT order_id,
       sales,
       sales * 0.07
  FROM orders
 LIMIT 5;
order_idsalessales * 0.07
US-2023-10380016.4481.15136
US-2023-1123263.540.2478
US-2023-11232611.7840.82488
US-2023-112326272.73619.09152
US-2023-14181719.5361.36752

Note that you use 0.07, not 7% — SQL cannot work with percent signs, so you convert percentages to decimals. The new column is automatically named sales * 0.07, which is ugly. Fix it with an alias, just like in Lesson 1:

SELECT order_id,
       sales,
       sales * 0.07 AS sales_tax
  FROM orders
 LIMIT 5;

Now the computed column is cleanly labeled sales_tax. You can build any derived field this way — for example, profit margin is profit / sales.

The Integer Division Trap

Division in SQL has a quirk that catches almost everyone once. Try dividing two whole numbers:

SELECT 4 / 2 AS four_div_two;
four_div_two
2

That is correct. But now:

SELECT 3 / 2 AS three_div_two;
three_div_two
1

You expected 1.5, but got 1. This is integer division: when both operands are integers, SQL throws away the fractional part. To get a true decimal result, at least one operand must be a REAL number. You force that with the CAST function, whose syntax is CAST(value AS new_datatype):

SELECT CAST(3 AS REAL) / 2 AS three_div_two;
three_div_two
1.5

Now you get the answer you wanted. Keep this in mind whenever you divide one integer column by another — for instance, dividing quantity by 2 to check whether customers buy even or odd amounts:

SELECT product_id,
       quantity,
       CAST(quantity AS REAL) / 2 AS even_or_odd
  FROM orders
 LIMIT 5;

Without the CAST, every result would be rounded down to a whole number and the decimal .5 clues would disappear.


Functions on Numbers: ROUND()

In addition to arithmetic operators, SQL provides functions that transform values. The first is ROUND(), which rounds a number to a chosen number of decimal places.

Cleaning Up Decimals

The sales-tax query from earlier produced messy numbers like 19.09152 — far too many digits for dollars and cents. ROUND() fixes that. Its syntax is:

ROUND(value, decimal_places)

The first argument is the number to round; the second is how many decimal places to keep (omit it to round to a whole number). Apply it to both sales and the tax calculation:

SELECT order_id,
       ROUND(sales, 2)         AS rounded_sales,
       ROUND(sales * 0.07, 2)  AS rounded_sales_tax
  FROM orders
 LIMIT 5;
order_idrounded_salesrounded_sales_tax
US-2023-10380016.451.15
US-2023-1123263.540.25
US-2023-11232611.780.82
US-2023-112326272.7419.09
US-2023-14181719.541.37

Much cleaner. You can wrap any expression in ROUND(), including a division — for example, ROUND(sales / quantity, 2) gives a tidy price per unit.


Functions on Text: UPPER() and LOWER()

SQL also transforms text. Two of the most common functions are UPPER(), which converts a string to all uppercase, and LOWER(), which converts it to all lowercase.

Normalizing and Formatting Text

These functions shine in two situations: normalizing inconsistent data (when the same value arrives as “Texas”, “texas”, and “TEXAS”) and formatting output for a report. Here is UPPER() applied to the ship_mode column:

SELECT ship_mode             AS original_ship_mode,
       UPPER(ship_mode)      AS ship_mode_upper
  FROM orders
 LIMIT 5;
original_ship_modeship_mode_upper
Standard ClassSTANDARD CLASS
Standard ClassSTANDARD CLASS
Standard ClassSTANDARD CLASS
Standard ClassSTANDARD CLASS
Standard ClassSTANDARD CLASS

LOWER() works the same way in reverse. For example, to standardize customer names that people typed inconsistently online, you would write SELECT LOWER(customer_name) AS customer_name_lower FROM orders;. As always, these functions change only the query’s output — the original orders data is untouched.


Combining and Adding Columns

The final tools in this lesson let you stitch text together and invent constant columns.

Concatenating Fields

To concatenate means to join pieces of text into one. SQLite uses the double-pipe operator (||). Suppose you want to combine city and state_province into a single “Houston, Texas” style label:

SELECT city || ', ' || state_province AS location
  FROM orders
 LIMIT 5;
location
Houston, Texas
Naperville, Illinois
Naperville, Illinois
Naperville, Illinois
Philadelphia, Pennsylvania

Notice the literal text ', ' in the middle — a comma and a space inside quotes. You can chain as many pieces as you like. To build a store name like “Superstore Dallas”, you would write 'Superstore ' || city (the trailing space inside the quotes keeps the words apart).

Adding a Constant Column

Sometimes you want a column that shows the same value for every row — a flat fee, a fixed salary, a flag. Just put the value in the SELECT clause and alias it:

SELECT sales, 2 AS promotional_discount
  FROM orders
 LIMIT 5;
salespromotional_discount
16.4482
3.542
11.7842
272.7362
19.5362

Every row gets a 2. This is handy for adding a fixed shipping cost or a standard salary figure (for example, SELECT *, 51000 AS salary FROM people;) to a result set.

Tying It All Together

You now have everything you need to build a genuinely useful query. Imagine your supervisor wants each order’s address plus a total cost that adds 7% tax and a flat $4.99 shipping fee on top of sales:

SELECT city || ', ' || state_province || ' ' || postal_code   AS address,
       sales,
       ROUND(sales * 0.07, 2)                                  AS tax,
       4.99                                                    AS shipping_cost,
       ROUND(sales + 4.99 + sales * 0.07, 2)                   AS total_cost
  FROM orders
 LIMIT 10;
addresssalestaxshipping_costtotal_cost
Houston, Texas 7709516.4481.154.9922.59
Naperville, Illinois 605403.540.254.998.78
Naperville, Illinois 6054011.7840.824.9917.6
Naperville, Illinois 60540272.73619.094.99296.82
Philadelphia, Pennsylvania 1914319.5361.374.9925.89
Henderson, Kentucky 424202573.82180.174.992758.98
Henderson, Kentucky 424205.480.384.9910.85
Athens, Georgia 3060512.780.894.9918.66
Henderson, Kentucky 42420609.9842.74.99657.67
Henderson, Kentucky 4242031.122.184.9938.29

This single query uses concatenation (the address), a text-and-number mix, ROUND() (tax and total), a constant column (shipping), and arithmetic (total cost). One important detail: inside total_cost you must repeat the raw calculations rather than reuse the tax or shipping_cost aliases — you will learn exactly why aliases cannot be reused like that in the next lesson on filtering.


Practice Exercises

Use the orders table for each exercise.

Exercise 1: Profit Margin

Write a query that returns order_id, sales, profit, and a calculated field called profit_margin (profit divided by sales). Limit the results to 8 rows.

Hint

Use the / operator: profit / sales AS profit_margin. Both columns are REAL, so integer division is not a problem here.

Exercise 2: Price Per Unit, Rounded

Write a query returning order_id, sales, quantity, and a price_per_unit field that divides sales by quantity and rounds the result to 2 decimal places. Limit it to 10 rows.

Hint

Put the division inside ROUND(): ROUND(sales / quantity, 2) AS price_per_unit.

Exercise 3: Build a Store Label

Write a query that returns order_id, region, state_province, and a new column local_store that concatenates the word “Superstore” with the city, with one space between them (e.g. “Superstore Dallas”). Limit to 10 rows.

Hint

Use the || operator and put the trailing space inside the quotes: 'Superstore ' || city AS local_store.


Summary

You moved beyond simply reading data to inspecting and transforming it. You learned how SQL classifies data into types, how to read a table’s metadata, how to compute new numeric fields (and avoid the integer-division trap), and how to reshape numbers and text with built-in functions.

Key Concepts

  • Data types (storage classes) — INTEGER, REAL, and TEXT; SQLite stores dates and ZIP codes as TEXT.
  • PRAGMA table_info(table) — returns metadata: each column’s name, type, and constraints.
  • Arithmetic operators+, -, *, / build derived columns; convert percentages to decimals.
  • Integer division — dividing two integers drops the fraction; use CAST(value AS REAL) to fix it.
  • ROUND(value, places) — rounds a number to a set number of decimal places.
  • UPPER() and LOWER() — convert text to uppercase or lowercase for normalization or formatting.
  • Concatenation (||) — joins strings, including literal text, into one column.
  • Constant columns — put a literal value in SELECT to repeat it on every row.
  • Scalar functions change the query output but never the stored data.

Why This Matters

Raw data almost never arrives in the exact shape your analysis needs. Real columns are messy: inconsistent capitalization, too many decimal places, values split across fields, or units that need converting. The functions in this lesson — ROUND, UPPER, LOWER, CAST, and concatenation — are your everyday toolkit for cleaning and reshaping data right inside a query, without touching the source. Mastering them means you can hand off polished, report-ready results instead of raw rows, which is exactly what makes an analyst valuable.


Next Steps

Continue to Lesson 3 - Filtering Rows with Numbers

Narrow results with comparison operators, handle NULLs, and combine conditions with BETWEEN, IN, AND, and OR

Back to Module Overview

Return to the Getting Started with SQL module overview


Continue Building Your Skills

You can now compute, round, and reshape data on the fly — a huge leap from just reading rows. These transformation skills will appear in nearly every query you write from here on. Keep experimenting: try combining a ROUND(), a concatenation, and a constant column in a single query and watch how much cleaner your results become.