Regular Expressions

Contact Us or call 1-877-932-8228
Regular Expressions

Regular Expressions

  • A regular expression is a text string that uses specific characters to describe a search pattern.
  • They are a powerful tool for searching and processing text data in a concise manner.
  • Each regular expression represents a set of possible strings.
  • It consists of character literals and metacharacters.

There are a number of special characters that can be used when constructing a regular expression.

The letters from a to z can be represented as a-z|A-Z or [[:alpha:]].

Parenthesis can be used to represent grouping.

Repitition of characters or subpatterns can be characterized by numbers or symbols. For instance, a * represents zero or more occurences of a string.

Groups that are matched can be refenced by their position (\1 for instance) which is useful when used in conjunction with replacing groups within strings.

The three regular expression functions perform in a similar manner to their counterparts REPLACE, INSTR and LIKE.

  • REGEXP_REPLACE
  • REGEXP_INSTR
  • REGEXP_LIKE

Code Sample:

SQL-Functions-Character/Demos/regexp_instr.sql
SELECT * 
FROM locations 
WHERE REGEXP_INSTR(street_address,'[0-9]') = 0;

SELECT * 
FROM locations 
WHERE REGEXP_INSTR(street_address,'Street| St') > 0;

The locations' table street_address column contains relatively unstructured data which is often addressed effectively using regular expressions. The first query returns rows where the street address does not contain a digit between zero and nine. The second query returns locations with the word "Street" or the abbreviation "St".

Code Sample:

SQL-Functions-Character/Demos/regexp_like.sql
SELECT * 
FROM locations 
WHERE street_address LIKE '%-%'
OR street_address LIKE '%(%' 
OR street_address LIKE '%,%';

SELECT * 
FROM locations 
WHERE REGEXP_LIKE(street_address,'-|\(|,');

SELECT * 
FROM locations 
WHERE REGEXP_LIKE(street_address,'[[:digit:]]$');

Regular expressions can be useful for describing strings to match in a terse manner. The first query illustrates how one might use a series of LIKE clauses to find all columns that contain either a dash, left-parenthesis or a comma. The next query returns the same rows, but is much shorter. The pipe is used for alternation, and the backslash is required as an escape sequence. The last query returns records that have a digit as the last character in the street_address column.

Code Sample:

SQL-Functions-Character/Demos/regexp_replace.sql
SELECT REGEXP_REPLACE(street_address, '[[:alpha:]]', 'x') AS result
FROM locations;

SELECT REGEXP_REPLACE(phone_number, '[[:digit:]]', '9') AS result
FROM employees;

These two queries use the REGEXP_REPLACE function to effectively obscure some portion of the data returned. In the first query, all alpha characters in the street_address are replaced by x's. In the second query, all numeric characters in the phone number are replaced with nines. For a listing of Oracle Regular Expressions go to : Regular Expressions

Next