Handling NULL Values

Null values in data.

Introduction

In SQL, NULL represents the absence of a value, not to be confused with a zero, an empty string, or other placeholder values. Understanding how to handle NULL values is crucial for database professionals because NULLs can affect query results, logic, and even performance.

This section will walk through how to work with NULLs effectively, covering key functions, operators, and techniques to ensure your SQL queries behave as expected when NULLs are involved.

What is NULL in SQL?

In SQL, NULL represents "unknown" or "no data." When a column in a database table has no value assigned, SQL uses NULL to represent this lack of information.

Example: In a Customer table, if a customer's phone is not provided, the value will be stored as NULL.

SELECT customer_id, phone 
FROM Customers;

Dealing with NULLs in Queries

a. Checking for NULL: IS NULL and IS NOT NULL

NULL cannot be compared using standard equality operators (= or !=). Instead, SQL provides IS NULL and IS NOT NULL to check for NULL values.

Example: Find all customers without a phone number.

SELECT customer_id 
FROM Customers
WHERE phone IS NULL;

Example: Find all customers who have a phone number.

SELECT customer_id 
FROM Customers
WHERE phone IS NOT NULL;

b. Handling NULL in WHERE Clauses

If a column contains NULLs, queries might not behave as expected without proper NULL handling. For example, the condition WHERE column != 'value' will not return rows where the column is NULL because NULL != 'value' evaluates to unknown.

Example:

SELECT * 
FROM Orders
WHERE Status <> 'Shipped';  -- NULL rows will not be returned

In this case, you should explicitly include NULL handling:

SELECT * 
FROM Orders
WHERE Status <> 'Shipped' OR Status IS NULL;

Using COALESCE and ISNULL

a. COALESCE() Function

The COALESCE() function returns the first non-NULL value from a list of expressions. It's particularly useful when you want to substitute NULL values with a default.

Example: Replace NULL phone numbers with 'N/A'.

SELECT customer_id, COALESCE(phone, 'N/A') AS phone
FROM Customers;

b. ISNULL() Function (SQL Server specific)

Similar to COALESCE(), ISNULL() is used to replace NULL with a specified value but only takes two arguments (the expression and the replacement).

Example: Replace NULL values in a salary column with 0.

SELECT EmployeeID, ISNULL(Salary, 0) AS Salary
FROM Employees;

NULL and Aggregate Functions

Aggregate functions such as COUNT(), SUM(), and AVG() behave differently when dealing with NULL values.

a. COUNT()

  • COUNT(column) will ignore NULL values.
  • COUNT(*) includes NULLs since it counts all rows.

Example:

SELECT COUNT(phone), COUNT(*)
FROM Customers;

b. SUM() and AVG()

NULL values are ignored in SUM() and AVG().

Example:

SELECT SUM(total_sum), AVG(total_sum)
FROM Orders;

If any rows have NULL for the salary, they won’t be included in the result.

Handling NULL in Joins

When performing LEFT JOIN or RIGHT JOIN operations, NULLs often appear in the result when there is no matching record in the joined table.

Example:

SELECT Orders.order_id, Customers.customer_name
FROM Orders
LEFT JOIN Customers ON Orders.customer_id = Customers.customer_id;

In this case, if an order doesn’t have a corresponding customer, customer_name will be NULL in the result set.

NULL with String Functions

Some string functions return NULL if any argument is NULL.

Example: Concatenating a NULL value with a string using + results in NULL.

SELECT 'Customer: ' + customer_name 
FROM Customers;

If customer_name is NULL, the result will be NULL. You can avoid this by using COALESCE():

SELECT 'Customer: ' + COALESCE(customer_name, 'Unknown')
FROM Customers;

NULL in Conditional Logic

SQL's conditional logic can become tricky when NULLs are involved. The CASE expression is often used to handle NULLs.

Example: Use a CASE statement to label NULL values as "Unknown".

SELECT customer_id, 
       CASE 
         WHEN phone IS NULL THEN 'Unknown' 
         ELSE phone 
       END AS phone
FROM Customers;

NULL and Unique Constraints

Columns defined as UNIQUE can contain only one NULL value. This means if a column allows NULLs, multiple rows can contain NULL in that column, but only one will be allowed.

Example: Suppose Email is a unique field:

CREATE TABLE Customers (
    customer_id INT PRIMARY KEY,
    Email VARCHAR(255) UNIQUE
);

Here, multiple customers can have NULL in the Email column, but two non-NULL emails cannot be the same.

Conclusion

Handling NULL values in SQL requires a good understanding of how NULL interacts with different SQL functions and operations. From querying data to using aggregates, and from conditional logic to joins, handling NULLs correctly can ensure your queries return accurate and meaningful results.

Asked in Interview

  • What is a NULL value in SQL?
  • How do you check for NULL values in SQL?
  • What is the use of the ISNULL() or COALESCE() function?
  • What is the difference between = NULL and IS NULL?
  • How do NULL values affect aggregate functions like SUM(), AVG(), and MAX()?
  • How do you handle NULLs in string concatenation?
  • How do NULL values affect JOIN operations?
  • How do NULL values affect set operators like UNION, INTERSECT, and EXCEPT?
  • Can a UNIQUE constraint include NULL values?