What is Regex? Common Regular Expressions for SEOs
Regular expressions, or simply “regex”, are sequences of characters that define search patterns. They let you run text-based searches that go beyond exact matches, such as complex search strings, partial matches, or case-insensitive searches.
For marketers, regex comes in handy across many data analytics use cases, such as creating advanced filters in Google Analytics and Search Console or running custom crawls in Screaming Frog.
Regex helps you analyze entire sets of data that, at first glance, may appear to have little in common, and lets you see only the data you want to see.
In this article, let’s look at some common regex operators that can boost your data analysis for SEO!
Common Regex Operators
Here’s a quick overview of popular regex operators used in the search marketing world and their descriptions.
Regex Operator | Description |
. | The dot is a wildcard that represents any single character.
For example, the regex 1. matches a 1 followed by any one character, so you can use it to filter for the numbers 10-19 (it would also match strings such as 1a). |
* | An asterisk after a character represents zero or more instances of that character.
For example, the regex he*llo will match hllo, hello and heeello. |
+ | A plus sign following a character represents one or more instances of that character.
For example, the regex he+llo will match hello and heeello, but not hllo. |
.* | Combining the dot and asterisk lets you match zero or more of any character in a string.
For example, if you run an online shop called Alligator, you can use the regex .*Alligator.* to return all queries that mention your brand. |
.+ | Combining the dot and plus sign lets you match one or more of any character in a string.
For example, the regex aus.+ will match austria, australia, aussie and aust, but not aus on its own. |
| | The pipe means “or”.
For example, you can use the regex pizza|fries if you’d like to obtain all data that relate to pizza, fries, or both. |
^ | The caret denotes the beginning of a string.
For example, use the regex ^box if you’re looking for data that starts with “box”, such as box or boxes. |
$ | The dollar sign denotes the end of a string.
For example, use the regex box$ if you’re looking for data that ends with “box”, such as box or gift box. |
() | Parentheses allow you to group characters together and nest them within a longer regex.
For example, the regex /products/(sneakers|shoes)/ will return product pages for sneakers or shoes. |
? | The question mark indicates that the character before it is optional.
For example, if users tend to misspell your brand name Alligator, you can use the regex All?igator|Alligg?ator to include common misspelled variations of your brand, such as Aligator and Alliggator. |
\ | What if you want to filter for a pattern that includes a special character that also happens to be a regex operator? You can add a \ before the special character to cancel its function as a regex operator and treat it as a literal character.
For example, you can use 123\.45\.678\.90 so that each dot is read as a literal dot rather than a wildcard when filtering for the specific IP address 123.45.678.90. |
{} | Curly brackets let you specify how many times the preceding character repeats.
For example, the regex xyz{2} will match the “z” character exactly two times and return results for xyzz, while the regex xyz{2,4} will match the “z” character at least two but no more than four times, and will return results for xyzz, xyzzz and xyzzzz. |
[] | Square brackets represent a character set and let you match one out of several characters placed between square brackets. Since it will only match a single character in the set, the sequence of characters within the square bracket is irrelevant.
For example, [abc] would return any string that has any of the characters a, b, or c present, and [123] would return any string with the characters 1, 2, or 3 present. |
- | Hyphens are often used within square brackets to define a range of characters.
For example, [a-z] would match any lowercase letter. [A-Z] matches any uppercase letter. [0-9] would match any digit from 0 to 9. [0-1][0-9] would match any string that includes a two-digit number from 00 to 19, while [a-zA-Z] would return any string that includes any letter from a to z, whether uppercase or lowercase. |
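To see how the operators in the table behave, here is a minimal sketch that runs several of the table's examples through Python's built-in `re` module. Python is only a stand-in here: the tools discussed in this article may use a different regex engine, so edge cases can differ. The helper name `matches` and the extra sample strings (such as "halo" and "gift box") are assumptions added for illustration.

```python
import re


def matches(pattern: str, candidates: list[str]) -> list[str]:
    """Return the candidates in which the pattern is found anywhere (partial match)."""
    return [text for text in candidates if re.search(pattern, text)]


# Asterisk: zero or more of the preceding character.
star = matches(r"he*llo", ["hllo", "hello", "heeello", "halo"])

# Plus sign: one or more of the preceding character.
plus = matches(r"he+llo", ["hllo", "hello", "heeello"])

# Question mark and pipe: optional characters plus an "or" between two variants.
brand = matches(r"All?igator|Alligg?ator",
                ["Aligator", "Alligator", "Alliggator", "Crocodile"])

# Caret and dollar sign: anchor the match to the start or end of the string.
starts = matches(r"^box", ["box", "boxes", "gift box"])
ends = matches(r"box$", ["box", "boxes", "gift box"])

# Curly brackets: ^ and $ are added so the whole string must fit the pattern.
repeats = matches(r"^xyz{2,4}$", ["xyz", "xyzz", "xyzzz", "xyzzzz", "xyzzzzz"])

# Backslash: an escaped dot is a literal dot, an unescaped dot is a wildcard.
samples = ["123.45.678.90", "123x45x678x90"]
escaped = matches(r"123\.45\.678\.90", samples)
unescaped = matches(r"123.45.678.90", samples)

# Parentheses: group alternatives inside a longer pattern.
pages = matches(r"/products/(sneakers|shoes)/",
                ["/products/sneakers/air", "/products/shoes/", "/products/hats/"])

for name, result in [("star", star), ("plus", plus), ("brand", brand),
                     ("starts", starts), ("ends", ends), ("repeats", repeats),
                     ("escaped", escaped), ("unescaped", unescaped),
                     ("pages", pages)]:
    print(f"{name}: {result}")
```

Note that the unescaped pattern also accepts the string with "x" in place of the dots, which is why escaping matters when you filter for something like an IP address.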
How to Use Regex
For SEOs, there are several platforms where we can deploy regex to help us optimize our workflow, including Google Analytics, Google Search Console, Screaming Frog, and Looker Studio (formerly Google Data Studio).
Using Regex in Google Analytics
SEOs can create advanced filters in Google Analytics using regex. This lets you exclude specific results and view those you are interested in seeing.
Common use cases are the exclusion of data from specific IP addresses and the filtering of URLs to analyze the performance of selected subfolders.
While you can use the pipe character (“|”) to create an “or” expression, regex has no equally simple operator for “and”. However, you can add another filter in Google Analytics to stack the expressions so that they are processed as a single logical “and” statement.
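The idea of stacking filters to get an "and" can be sketched in Python. This is not Google Analytics code or its API. It only illustrates the logic, assuming each stacked filter is a regex that must be found in the value. The function name `passes_all_filters` and the sample page paths are hypothetical.

```python
import re


def passes_all_filters(value: str, patterns: list[str]) -> bool:
    """Return True only if every pattern is found in the value (a logical "and")."""
    return all(re.search(pattern, value) for pattern in patterns)


# Hypothetical page paths, used only for illustration.
page_paths = [
    "/blog/seo/regex-guide",
    "/blog/ppc/bidding",
    "/products/seo-audit",
    "/blog/seo/",
]

# Filter 1: the path starts with /blog/. Filter 2: the path mentions "seo".
stacked_filters = [r"^/blog/", r"seo"]

blog_and_seo = [path for path in page_paths
                if passes_all_filters(path, stacked_filters)]
print(blog_and_seo)
```

Only the paths that satisfy both filters are kept, whereas a single regex such as `^/blog/|seo` would keep any path that satisfies either one.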
Using Regex in Google Search Console
In 2021, Google Search Console (GSC) began supporting the use of regex to help users filter data. This is particularly helpful for including or excluding specific types of queries, like queries that contain variations of your brand name.
While GSC limits regex filters to 4,096 characters, that is relatively generous, since you can usually condense patterns to shorten the regex.
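As a rough illustration of condensing a pattern, the sketch below compares a spelled-out list of brand variations with a shorter pattern that uses optional characters, and checks both against the 4,096-character limit mentioned above. It uses Python's `re` module as a stand-in for GSC's filter. The sample queries and variable names are assumptions for illustration.

```python
import re

GSC_REGEX_CHAR_LIMIT = 4096  # the limit stated in this article

# Spelled-out variations of the brand name, joined with "or".
verbose_pattern = r"alligator|aligator|alliggator"

# Condensed version using optional characters. It is slightly broader,
# since it also accepts "aliggator".
condensed_pattern = r"all?igg?ator"

# Hypothetical search queries, used only for illustration.
queries = [
    "alligator shoes",
    "aligator store",
    "buy alliggator boots",
    "crocodile shoes",
]


def branded_queries(pattern: str, candidates: list[str]) -> list[str]:
    """Return the queries that contain a match for the pattern."""
    return [query for query in candidates if re.search(pattern, query)]


verbose_hits = branded_queries(verbose_pattern, queries)
condensed_hits = branded_queries(condensed_pattern, queries)

print(verbose_hits == condensed_hits)
print(len(verbose_pattern), len(condensed_pattern))
print(len(condensed_pattern) <= GSC_REGEX_CHAR_LIMIT)
```

On these sample queries both patterns select the same branded queries, but the condensed one uses fewer characters, which matters more as the list of variations grows.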
Using Regex in Screaming Frog
Screaming Frog crawls can take up a lot of time, especially for larger websites. If you would rather crawl only specific subfolders, subdomains, or pages, regex comes in handy in the Exclude and Include features within the tool.
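The include and exclude logic can be sketched in Python as follows. This is not Screaming Frog's own code, and the exact matching rules of the tool may differ. The sketch assumes that each pattern must match the full URL, which is why the patterns end in `.*`. The function name `select_urls` and the example.com URLs are hypothetical.

```python
import re


def select_urls(urls: list[str], include: list[str], exclude: list[str]) -> list[str]:
    """Keep URLs that fully match an include pattern and no exclude pattern."""
    selected = []
    for url in urls:
        included = any(re.fullmatch(pattern, url) for pattern in include)
        excluded = any(re.fullmatch(pattern, url) for pattern in exclude)
        if included and not excluded:
            selected.append(url)
    return selected


# Hypothetical URLs, used only for illustration.
urls = [
    "https://example.com/blog/regex-for-seo",
    "https://example.com/blog/tag/seo",
    "https://example.com/products/shoes",
    "https://shop.example.com/cart",
]

# Include only the /blog/ subfolder on the main domain (dots are escaped).
include_patterns = [r"https://example\.com/blog/.*"]

# Exclude tag archive pages anywhere on the site.
exclude_patterns = [r".*/tag/.*"]

to_crawl = select_urls(urls, include_patterns, exclude_patterns)
print(to_crawl)
```

Testing patterns against a handful of known URLs like this before starting a long crawl can help catch a pattern that is too broad or too narrow.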
Using Regex in Looker Studio
If you have already integrated Google Analytics and Google Search Console data into Looker Studio, you can use regex to filter data in the Looker Studio dashboard and create custom reports.
***
Implementing the right regex in your reporting can help you better analyze the performance of each of your search campaigns.
Now that you understand the basics of regex and its usefulness, it’s time to try it out! If you’re still unfamiliar with regex, you can always test your expressions to see if they work the way you want them to.
This article has been updated by Helena Xiao in 2024.