Regex for regular people

Christoph Geypen
3 min read · Mar 28, 2017

Regular expressions can be useful outside of programming, from creating segments in Google Analytics to web scraping or processing data in a spreadsheet. They are well worth learning.

Below I have created a guide to get you started with regular expressions quickly.

Cheat sheet

To get you started it is handy to have a cheat sheet ready so you can easily reference the syntax when needed. I like this one from DaveChild.

What you need to know

1. Most regex tools work line by line. If you search for something in an HTML file, the regex is tried against line 1, then against the next line, and so on. For programmers this will sound logical. For a marketer who is only familiar with Ctrl+F, this is important to know.

1> this is a searchquery
2> searchquery
3> blabla
The regex “searchquery” matches in lines 1 and 2, but not in line 3.
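
For readers who want to try this outside a text editor, here is a minimal Python sketch of the same line-by-line search. It uses Python's built-in `re` module; the variable names are only illustrative and not part of any tool mentioned in this article.

```python
import re

# The three sample lines from the example above.
lines = ["this is a searchquery", "searchquery", "blabla"]

pattern = re.compile(r"searchquery")

# Try the regex against each line on its own and remember which lines match.
matching_line_numbers = [
    number
    for number, line in enumerate(lines, start=1)
    if pattern.search(line)
]

print(matching_line_numbers)  # [1, 2]
```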

2. In regex you define every character in the query you are looking for:

searchquery is itself a regex: it looks for the
letters s e a r c h q u e r y in exactly that order.

3. You can use wildcards, groups or character ranges (see cheat sheet) if a character can vary. The dot is a common wildcard that stands for any single character.

wildcard: .earchquery matches aearchquery, cearchquery, 4earchquery and searchquery (literally any character can take the place of the dot).

groups: grouping is done with parentheses, with a pipe character separating the alternatives.
(beer|wine)bottle matches beerbottle or winebottle.

character range: a character range is a controlled wildcard. You are not sure which character will appear, but you know it falls within a range. [a-z] matches any single lowercase letter from a to z. [az] matches the letter a or the letter z (like a group, but for a single character).
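
The three building blocks above can be checked with a short Python sketch. The test words beyond those named in the text (such as milkbottle) are made up for illustration, and `re.fullmatch` is used so that the whole word has to fit the pattern.

```python
import re

# Wildcard: the dot stands for exactly one character, so a word that is
# missing its first letter does not fit.
candidates = ["aearchquery", "cearchquery", "4earchquery", "searchquery", "earchquery"]
wildcard_hits = [w for w in candidates if re.fullmatch(r".earchquery", w)]

# Group: parentheses plus a pipe mean "one of these alternatives".
bottles = ["beerbottle", "winebottle", "milkbottle"]
group_hits = [w for w in bottles if re.fullmatch(r"(beer|wine)bottle", w)]

# Character range: [a-z] accepts any single lowercase letter.
range_hits = re.findall(r"[a-z]", "bZ9y")

# Character set: [az] accepts only the letter a or the letter z.
set_hits = re.findall(r"[az]", "abcz")

print(wildcard_hits)  # ['aearchquery', 'cearchquery', '4earchquery', 'searchquery']
print(group_hits)     # ['beerbottle', 'winebottle']
print(range_hits)     # ['b', 'y']
print(set_hits)       # ['a', 'z']
```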

4. Next you can define a quantifier. A quantifier defines how many times a character can appear.

* is a common one. .* (dot star) means any character, repeated zero or more times. The * is greedy: it grabs as many characters as it can and only gives some back when that is needed for the rest of the pattern to match. So s.*query matches searchquery (the .* stops just before the q of query). Watch out, because sasdfdsfsadsdafsdquery will also match.
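
To see how far .* is willing to stretch, here is a small Python sketch. The third test string is invented for this example; it shows that the match runs all the way to the last occurrence of query, not the first.

```python
import re

pattern = re.compile(r"s.*query")

# Both words from the text fit the pattern from start to end.
matches_short = pattern.fullmatch("searchquery") is not None
matches_long = pattern.fullmatch("sasdfdsfsadsdafsdquery") is not None

# With two occurrences of "query" in the text, .* keeps going
# until the last one it can reach.
greedy_match = pattern.search("searchquery, then another query").group()

print(matches_short)  # True
print(matches_long)   # True
print(greedy_match)   # searchquery, then another query
```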

5. You can play with anchors to define where in the line your searchquery has to appear.

^ means beginning of line, $ means end of line. Take this line:
balblalab searchquery
^searchquery will not match since it’s not the beginning of the line.
searchquery$ will match.
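
The same anchor check, written as a minimal Python sketch with the sample line from above:

```python
import re

line = "balblalab searchquery"

# ^ ties the pattern to the start of the line, $ ties it to the end.
starts_with_query = re.search(r"^searchquery", line) is not None
ends_with_query = re.search(r"searchquery$", line) is not None

print(starts_with_query)  # False
print(ends_with_query)    # True
```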

6. A quantifier can also state exactly how many times the element before it must be repeated.

1{3} matches 111 (the 1 repeated exactly 3 times)

Once you get the hang of the basics, you can hone your skills further as needed.
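
A quick Python sketch of the {3} quantifier. The inputs 11 and 1111 are extra test cases added here for illustration.

```python
import re

pattern = re.compile(r"1{3}")

# Exactly three 1s fit the whole pattern, two do not.
three_ones = pattern.fullmatch("111") is not None
two_ones = pattern.fullmatch("11") is not None

# When searching inside a longer run, only the first three 1s are taken.
first_hit = pattern.search("1111").group()

print(three_ones)  # True
print(two_ones)    # False
print(first_hit)   # 111
```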

Testing and learning

Regular expressions can be tested easily on https://regex101.com/. Besides offering documentation, it lets you enter your own regular expressions and test them.

Another way of testing is within Google Sheets, using the formula =REGEXEXTRACT(). You give it the cell from which you want to extract a certain string, followed by the regex, for example =REGEXEXTRACT(A1, "searchquery") for a cell A1.

Lastly, you can learn regex using this interactive course: https://regexone.com/

Have fun, and if you have any questions, just let me know!
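
If you would rather experiment in Python than in a spreadsheet, the sketch below roughly imitates the idea of extracting the first match from a piece of text. The function name `regex_extract` and the sample text are made up for this example; it is not how Google Sheets works internally, and it does not cover every detail of the Sheets formula.

```python
import re
from typing import Optional


def regex_extract(text: str, pattern: str) -> Optional[str]:
    """Return the first part of text that matches pattern, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group() if match else None


# Hypothetical cell content: pull out the first run of digits.
order_number = regex_extract("order 12345 shipped", r"[0-9]+")
no_match = regex_extract("blabla", r"searchquery")

print(order_number)  # 12345
print(no_match)      # None
```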

Christoph Geypen

Christoph's specialty lies in online marketing and market research. In his free time he likes discovering new things, going out and strength training.