String Functions

Extract, search, format, and modify text.

Introduction

String manipulation is an essential aspect of database management, especially when working with data cleaning, data preperation, or data manipulation. MS SQL provides a rich set of string functions to process and transform text data efficiently. These functions allow you to extract, search, format, and modify text within SQL queries.

In this section, we’ll cover common string functions with examples..

Common String Functions

  1. LEN()

Returns the number of characters in a string, excluding trailing spaces.

Example: Retrieve the length of each customer’s first name.

SELECT first_name, LEN(first_name) AS NameLength
FROM Customers;
  1. SUBSTRING()

Extracts a specific part of a string, starting at a defined position, for a defined length.

Example: Extract the first three characters of the customer's name (e.g., initials).

SELECT first_name, SUBSTRING(first_name, 1, 3) AS Initials
FROM Customers;
  1. CHARINDEX() and PATINDEX()

Finds the starting position of a substring within a string. It returns 0 if the substring is not found.

Example: Find the position of the word "Inc" in company names.

SELECT first_name, CHARINDEX('Inc', first_name) AS IncPosition
FROM Customers
WHERE CHARINDEX('Inc', first_name) > 0;

PATINDEX(pattern, string) returns the position of a pattern match in a string. It allows for wildcard characters, unlike CHARINDEX().

  1. LEFT() and RIGHT()

LEFT() extracts characters from the beginning of a string, and RIGHT() extracts characters from the end.

Example: Extract the first five characters (abbreviation) from a customer’s name.

SELECT first_name, LEFT(first_name, 5) AS Abbreviation
FROM Customers;

Example: Extract the last four digits from a customer’s phone number.

SELECT Phone, RIGHT(Phone, 4) AS LastFourDigits
FROM Customers;
  1. UPPER() and LOWER()

Converts the text to uppercase or lowercase.

Example: Display all customer names in uppercase for consistency.

SELECT UPPER(first_name) AS first_nameCaps
FROM Customers;

Example: Convert product codes to lowercase for standardization.

SELECT LOWER(product_code) AS StandardizedCode
FROM Products;
  1. REPLACE()

Replaces all occurrences of a substring with another substring.

Example: Replace all instances of 'LLC' with 'Limited' in company names.

SELECT first_name, REPLACE(first_name, 'LLC', 'Limited') AS UpdatedName
FROM Customers;
  1. LTRIM() and RTRIM()

Removes leading spaces with LTRIM() and trailing spaces with RTRIM().

Example: Remove extra spaces from customer names.

SELECT LTRIM(RTRIM(first_name)) AS TrimmedName
FROM Customers;
  1. CONCAT()

Combines two or more strings into a single string.

Example: Combine the first and last name of customers into a full name.

SELECT CONCAT(first_name, ' ', last_name) AS full_name
FROM Customers;
  1. REVERSE()

Reverses the order of characters in a string.

Example: Display customer names in reverse order (useful for unique identifiers or testing).

SELECT first_name, REVERSE(first_name) AS reversed_name
FROM Customers;
  1. FORMAT()

Formats a string according to a specified format, often used with dates or numbers.

Example: Format customer phone numbers as (###) ###-####.

SELECT FORMAT(Phone, '(###) ###-####') AS FormattedPhone
FROM Customers;
  1. STRING_SPLIT() STRING_SPLIT() is used to split a string into multiple rows based on a specified delimiter.

Example: Suppose you have a list of Products stored in a single column as a comma-separated string. You want to split the products into individual rows.

SELECT value
FROM STRING_SPLIT('P1,P2,P3', ',');
  1. STRING_AGG()

STRING_AGG() is used to concatenate string values from multiple rows into a single string, separated by a specified delimiter.

Example: Suppose you have a table of Customers where each customer is associated with multiple orders. You want to concatenate the order IDs for each customer.

SELECT CustomerID, STRING_AGG(OrderID, ', ') AS Orders
FROM Orders
GROUP BY CustomerID;

STRING_AGG() ignores NULL values. It concatenates only the non-NULL values.

Practical Use Cases of string function

  1. Standardizing Names

Ensure that all customer names are stored in a consistent format by converting them to uppercase and removing extra spaces.

SELECT UPPER(LTRIM(RTRIM(first_name))) AS Standardizedfirst_name
FROM Customers;
  1. Searching by Partial Strings

Often, a search might involve a partial string. Use CHARINDEX() to find customers whose names contain specific keywords.

SELECT first_name 
FROM Customers
WHERE CHARINDEX('Smith', first_name) > 0;
  1. Masking Sensitive Information

Display only partial information for sensitive data, such as showing only the last four digits of a customer’s social security number.

SELECT CONCAT('***-**-', RIGHT(SSN, 4)) AS MaskedSSN
FROM Customers;
  1. Data Cleanup

Use REPLACE() to clean up common data entry errors, such as converting common abbreviations.

UPDATE Customers 
SET Address = REPLACE(Address, 'St.', 'Street')
WHERE CHARINDEX('St.', Address) > 0;

Conclusion

String functions in MS SQL provide powerful tools to manipulate, clean, and format text-based data. They are essential for standardizing customer data, improving search functionality, and cleaning up inconsistent or incorrectly entered data. By mastering these functions, you can enhance your SQL queries' performance and usability, particularly when handling complex text fields in a datasets.

Asked in Interview

  • What is the difference between CHARINDEX() and PATINDEX()?
  • How do you concatenate two strings in SQL?
  • How do you extract a substring from a string in SQL?
  • How do you remove characters from a string?
  • How do you remove leading and trailing spaces from a string?
  • How do you find the position of a substring in a string in SQL?
  • Can you convert a row into column using string function?