String Functions
Extract, search, format, and modify text.
Introduction
String manipulation is an essential aspect of database management, especially when working with data cleaning, data preperation, or data manipulation. MS SQL provides a rich set of string functions to process and transform text data efficiently. These functions allow you to extract, search, format, and modify text within SQL queries.
In this section, we’ll cover common string functions with examples..
Tables
Customers
Orders
Common String Functions
LEN()
Returns the number of characters in a string, excluding trailing spaces.
Example: Retrieve the length of each customer’s first name.
SELECT first_name, LEN(first_name) AS NameLength
FROM Customers;
SUBSTRING()
Extracts a specific part of a string, starting at a defined position, for a defined length.
Example: Extract the first three characters of the customer's name (e.g., initials).
SELECT first_name, SUBSTRING(first_name, 1, 3) AS Initials
FROM Customers;
CHARINDEX()
andPATINDEX()
Finds the starting position of a substring within a string. It returns 0
if the substring is not found.
Example: Find the position of the word "Inc" in company names.
SELECT first_name, CHARINDEX('joh', first_name) AS IncPosition
FROM Customers;
PATINDEX(pattern, string) returns the position of a pattern match in a string. It allows for wildcard characters, unlike CHARINDEX().
LEFT()
andRIGHT()
LEFT()
extracts characters from the beginning of a string, and RIGHT()
extracts characters from the end.
Example: Extract the first five characters (abbreviation) from a customer’s name.
SELECT first_name, LEFT(first_name, 5) AS Abbreviation
FROM Customers;
Example: Extract the last four digits from a customer’s phone number.
SELECT Phone, RIGHT(Phone, 4) AS LastFourDigits
FROM Customers;
UPPER()
andLOWER()
Converts the text to uppercase or lowercase.
Example: Display all customer names in uppercase for consistency.
SELECT UPPER(first_name) AS first_nameCaps
FROM Customers;
Example: Convert product codes to lowercase for standardization.
SELECT LOWER(product_code) AS StandardizedCode
FROM Products;
REPLACE()
Replaces all occurrences of a substring with another substring.
Example: Replace all instances of '-' with ' ' in phone.
SELECT first_name, REPLACE(phone, '-', ' ') AS Updatedphone
FROM Customers;
LTRIM()
andRTRIM()
Removes leading spaces with LTRIM()
and trailing spaces with RTRIM()
.
Example: Remove extra spaces from customer names.
SELECT LTRIM(RTRIM(first_name)) AS TrimmedName
FROM Customers;
CONCAT()
Combines two or more strings into a single string.
Example: Combine the first and last name of customers into a full name.
SELECT first_name, last_name, CONCAT(first_name, ' ', last_name) AS full_name
FROM Customers;
REVERSE()
Reverses the order of characters in a string.
Example: Display customer names in reverse order (useful for unique identifiers or testing).
SELECT first_name, REVERSE(first_name) AS reversed_name
FROM Customers;
STRING_SPLIT()
STRING_SPLIT() is used to split a string into multiple rows based on a specified delimiter.
Example: Suppose you have a list of Products stored in a single column as a comma-separated string. You want to split the products into individual rows.
SELECT value
FROM STRING_SPLIT('P1,P2,P3', ',');
STRING_AGG()
STRING_AGG() is used to concatenate string values from multiple rows into a single string, separated by a specified delimiter.
Example: Suppose you have a table of Customers where each customer is associated with multiple orders. You want to concatenate the order IDs for each customer.
SELECT customer_id, STRING_AGG(order_id, ', ') AS Orders
FROM Orders
GROUP BY customer_id;
STRING_AGG()
ignores NULL values. It concatenates only the non-NULL values.
Practical Use Cases of string function
- Standardizing Names
Ensure that all customer names are stored in a consistent format by converting them to uppercase and removing extra spaces.
SELECT UPPER(LTRIM(RTRIM(first_name))) AS Standardizedfirst_name
FROM Customers;
- Searching by Partial Strings
Often, a search might involve a partial string. Use CHARINDEX()
to find customers whose names contain specific keywords.
SELECT first_name
FROM Customers
WHERE CHARINDEX('Smith', first_name) > 0;
- Masking Sensitive Information
Display only partial information for sensitive data, such as showing only the last four digits of a customer’s social security number.
SELECT CONCAT('***-**-', RIGHT(SSN, 4)) AS MaskedSSN
FROM Customers;
- Data Cleanup
Use REPLACE()
to clean up common data entry errors, such as converting common abbreviations.
UPDATE Customers
SET Address = REPLACE(Address, 'St.', 'Street')
WHERE CHARINDEX('St.', Address) > 0;
Conclusion
String functions in MS SQL provide powerful tools to manipulate, clean, and format text-based data. They are essential for standardizing customer data, improving search functionality, and cleaning up inconsistent or incorrectly entered data. By mastering these functions, you can enhance your SQL queries' performance and usability, particularly when handling complex text fields in a datasets.
Asked in Interview
- What is the difference between CHARINDEX() and PATINDEX()?
- How do you concatenate two strings in SQL?
- How do you extract a substring from a string in SQL?
- How do you remove characters from a string?
- How do you remove leading and trailing spaces from a string?
- How do you find the position of a substring in a string in SQL?
- Can you convert a row into column using string function?