Understanding NULL Values in Databases
A NULL value represents the absence of data — not zero, not an empty string, not a space. It means “no value was provided.” Optional columns, incomplete forms, or deferred data entry commonly produce NULLs.
Key Distinction
- 0 is a numeric value.
- '' (empty string) is a textual value.
- NULL means truly unknown / not supplied.
Testing for NULL Values
You cannot use =, !=, <, etc., with NULL. Use IS NULL or IS NOT NULL.
Syntax
-- Rows where email is missing
SELECT id, name, email
FROM Users
WHERE email IS NULL;
-- Rows where email exists
SELECT id, name, email
FROM Users
WHERE email IS NOT NULL;
Why special operators?
NULL participates in three‑valued logic (TRUE/FALSE/UNKNOWN). email = NULL yields UNKNOWN, not TRUE, so the row is excluded — hence dedicated operators.
Handling NULL Safely
- COALESCE(expr, fallback): first non‑NULL value.
- NULLIF(a,b): returns NULL if a = b (avoid divide by zero).
- Set defaults: use sensible DEFAULT constraints for common columns.
- Avoid ambiguous meaning: never store empty string when you mean unknown; use NULL.
-- Display a fallback label when nickname is NULL
SELECT id, COALESCE(nickname, 'No nickname') AS display_name
FROM Profiles;
-- Prevent division by zero using NULLIF
SELECT total / NULLIF(count_items, 0) AS avg_value
FROM Metrics;
Aggregation Functions: COUNT, SUM, AVG, MIN, MAX
Aggregate functions summarize many rows into a single result. They ignore NULL (except COUNT(*)) to keep stats accurate.
-- Basic aggregates
SELECT
COUNT(*) AS total_rows,
COUNT(price) AS priced_rows, -- skips NULL price
SUM(price) AS total_revenue,
AVG(price) AS avg_price,
MIN(price) AS cheapest,
MAX(price) AS most_expensive
FROM Products;
NULL prices are excluded from COUNT(price), SUM, AVG, etc. Use COALESCE(price,0) to treat missing values as zero if business rules allow.
GROUP BY: Forming Squads
Use GROUP BY to aggregate per category.
-- Total revenue per category
SELECT category, SUM(price) AS revenue
FROM Products
GROUP BY category;
-- Average price per (category, brand)
SELECT category, brand, AVG(price) AS avg_price
FROM Products
GROUP BY category, brand;
Every non-aggregated column in the SELECT list must appear in the GROUP BY list.
HAVING: Filtering Groups
HAVING filters aggregated results after grouping (like WHERE for groups).
-- Only categories with revenue over 1000
SELECT category, SUM(price) AS revenue
FROM Products
GROUP BY category
HAVING SUM(price) > 1000
ORDER BY revenue DESC;
-- Categories averaging price over 50
SELECT category, AVG(price) AS avg_price
FROM Products
GROUP BY category
HAVING AVG(price) > 50;
Avoid HAVING for row-level filters; apply those in WHERE first (e.g., WHERE active = 1).
Challenge: NULLs + Aggregations
Given a table Orders(order_id, customer_id, total_amount, discount, status) where discount may be NULL:
- Calculate total revenue treating NULL discount as 0.
- Show average discount per status (grouping by status).
- List statuses with at least 50 orders and average total_amount > 120.
-- 1) Revenue with discount fallback
SELECT SUM(total_amount - COALESCE(discount,0)) AS net_revenue
FROM Orders;
-- 2) Average discount per status
SELECT status, AVG(COALESCE(discount,0)) AS avg_discount
FROM Orders
GROUP BY status;
-- 3) High-performing statuses
SELECT status,
COUNT(*) AS order_count,
AVG(total_amount) AS avg_total
FROM Orders
GROUP BY status
HAVING COUNT(*) >= 50 AND AVG(total_amount) > 120
ORDER BY avg_total DESC;
Stretch: Exclude cancelled orders in step 3 (WHERE status != 'cancelled') before grouping.
More Resources
← Back to Database Home