Handling NULL Values & Aggregations

Learn how NULL works, how to test it safely, and level up with GROUP BY, HAVING, ORDER BY, and aggregation functions.

Tap a topic to jump to its card.

Understanding NULL Values in Databases

A NULL value represents the absence of data — not zero, not an empty string, not a space. It means “no value was provided.” Optional columns, incomplete forms, or deferred data entry commonly produce NULLs.

Key Distinction

Testing for NULL Values

You cannot use =, !=, <, etc., with NULL. Use IS NULL or IS NOT NULL.

Syntax

-- Rows where email is missing
SELECT id, name, email
FROM Users
WHERE email IS NULL;

-- Rows where email exists
SELECT id, name, email
FROM Users
WHERE email IS NOT NULL;

Why special operators?

NULL participates in three‑valued logic (TRUE/FALSE/UNKNOWN). email = NULL yields UNKNOWN, not TRUE, so the row is excluded — hence dedicated operators.

Handling NULL Safely

-- Display a fallback label when nickname is NULL
SELECT id, COALESCE(nickname, 'No nickname') AS display_name
FROM Profiles;

-- Prevent division by zero using NULLIF
SELECT total / NULLIF(count_items, 0) AS avg_value
FROM Metrics;

Aggregation Functions: COUNT, SUM, AVG, MIN, MAX

Aggregate functions summarize many rows into a single result. They ignore NULL (except COUNT(*)) to keep stats accurate.

-- Basic aggregates
SELECT
  COUNT(*)           AS total_rows,
  COUNT(price)       AS priced_rows,     -- skips NULL price
  SUM(price)         AS total_revenue,
  AVG(price)         AS avg_price,
  MIN(price)         AS cheapest,
  MAX(price)         AS most_expensive
FROM Products;

NULL prices are excluded from COUNT(price), SUM, AVG, etc. Use COALESCE(price,0) to treat missing values as zero if business rules allow.

GROUP BY: Forming Squads

Use GROUP BY to aggregate per category.

-- Total revenue per category
SELECT category, SUM(price) AS revenue
FROM Products
GROUP BY category;

-- Average price per (category, brand)
SELECT category, brand, AVG(price) AS avg_price
FROM Products
GROUP BY category, brand;

Every non-aggregated column in the SELECT list must appear in the GROUP BY list.

HAVING: Filtering Groups

HAVING filters aggregated results after grouping (like WHERE for groups).

-- Only categories with revenue over 1000
SELECT category, SUM(price) AS revenue
FROM Products
GROUP BY category
HAVING SUM(price) > 1000
ORDER BY revenue DESC;

-- Categories averaging price over 50
SELECT category, AVG(price) AS avg_price
FROM Products
GROUP BY category
HAVING AVG(price) > 50;

Avoid HAVING for row-level filters; apply those in WHERE first (e.g., WHERE active = 1).

Challenge: NULLs + Aggregations

Given a table Orders(order_id, customer_id, total_amount, discount, status) where discount may be NULL:

  1. Calculate total revenue treating NULL discount as 0.
  2. Show average discount per status (grouping by status).
  3. List statuses with at least 50 orders and average total_amount > 120.
-- 1) Revenue with discount fallback
SELECT SUM(total_amount - COALESCE(discount,0)) AS net_revenue
FROM Orders;

-- 2) Average discount per status
SELECT status, AVG(COALESCE(discount,0)) AS avg_discount
FROM Orders
GROUP BY status;

-- 3) High-performing statuses
SELECT status,
       COUNT(*) AS order_count,
       AVG(total_amount) AS avg_total
FROM Orders
GROUP BY status
HAVING COUNT(*) >= 50 AND AVG(total_amount) > 120
ORDER BY avg_total DESC;

Stretch: Exclude cancelled orders in step 3 (WHERE status != 'cancelled') before grouping.

More Resources

← Back to Database Home