Search & Replace using Regular Expressions

This is the third part of my blog series on regular expressions. In the first two parts I introduced basic and complex regular expressions. Now it is time to use them in practice. This post shows you how to use regular expressions in search and replace operations. The upcoming parts will explain the usage of regular expressions in multiple programming languages.

Search & Replace

If you work with large text files, you may run into one specific problem a lot: replacing a string that varies a little from one occurrence to the next. One example could be a log file. Another one could be a big .csv file. Maybe you want to change the format of all dates. You cannot list every possible date in a simple replace operation. Imagine a .json file with thousands of entries like this:

[
    {
        "date": "20171231"
    },
    {
        "date": "20160131"
    },
    {
        "date": "20050615"
    },
    {
        "date": "19990101"
    },
    {
        "date": "19630523"
    },
    {
        "date": "19780811"
    }
]

Assume you want to change the date format to mm/dd/yyyy. That is a task where regular expressions excel. I chose to use Visual Studio Code for this type of problem. The standard search window supports regular expressions. Just activate them by clicking the marked symbol in the screenshot (the .* button in the search box).

Now you need a regular expression that uses groups and matches the substrings you want to edit. I chose the following RegEx to find the dates:

/"((19|20)[0-9]{2})(0[0-9]|1[0-2])([0-2][0-9]|3[0-1])"/

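As you can see, I limited the possible year range to 1900 through 2099, as I am fairly certain that my document won't contain any dates outside that range. The slashes only delimit the expression, so leave them out when you type it into the Visual Studio Code search box. The following screenshot shows how my RegEx worked in Visual Studio Code.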

Activate regular expressions in the Visual Studio Code search

The regular expression matched all the dates in the .json file.
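If you want to check the pattern outside an editor, here is a minimal sketch in Python (my choice for illustration, nothing in this post depends on it). It runs the same expression, without the surrounding slashes, over a shortened copy of the sample data and prints what each group captures. The variable names are made up for this example.

```python
import re

# The pattern from above, without the enclosing slashes.
DATE_PATTERN = re.compile(r'"((19|20)[0-9]{2})(0[0-9]|1[0-2])([0-2][0-9]|3[0-1])"')

# Shortened copy of the sample data.
sample = '''[
    { "date": "20171231" },
    { "date": "19630523" }
]'''

matches = list(DATE_PATTERN.finditer(sample))
for m in matches:
    # group(1) = year, group(2) = century, group(3) = month, group(4) = day
    print(m.group(0), "->", m.group(1), m.group(2), m.group(3), m.group(4))
```

Note that the pattern is deliberately loose: it would also accept "00" as a month or day, or a date like February 31. That is fine for clean data like mine, but it is not a date validator.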

To change the date format you need to replace the matches with other content. But since you want to reuse the existing date you have to reference the groups (or submatches). You can do this with the dollar sign ($) followed by the group number: $1 stands for the first group, $2 for the second, and so on. A great way to test if your replacement strategy works is an online RegEx editor like regexr.com. It offers several tools including a replace tool. The following screenshot shows my replacement strategy.

Use tools like regexr.com to test your replacement strategy
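Since the screenshot is not reproduced as text, take the following as an assumption on my side: a replacement string such as "$3/$4/$1" (quotes included, because the pattern consumes them too) would produce mm/dd/yyyy. The same test on a single line can be sketched in Python, where the re module writes a group reference as \g<3> (or \3) instead of $3.

```python
import re

pattern = r'"((19|20)[0-9]{2})(0[0-9]|1[0-2])([0-2][0-9]|3[0-1])"'

# Assumed replacement: month/day/year, wrapped in the quotes the pattern consumed.
# Python spells the editor's $3 as \g<3>.
replacement = r'"\g<3>/\g<4>/\g<1>"'

line = '"date": "20171231"'
result = re.sub(pattern, replacement, line)
print(result)  # "date": "12/31/2017"
```

Trying the replacement on one known line first makes it easy to spot a swapped group before touching the real file.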

As you can see in the picture, the replacement works. I reference the groups 1, 3 and 4 to get the year, the month and the day. Group 2 only holds the century (19 or 20), so the replacement does not need it. Now I can use this replacement strategy to edit my .json file. See how Visual Studio Code executes the replacement.

The file in Visual Studio Code before ...

... and after the replace operation
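If the same change ever has to be repeated outside the editor, a small script can do it as well. The following Python sketch is only one possible way and again assumes a month/day/year replacement. It rewrites every date in a JSON string, counts the replacements with re.subn, and checks that the result still parses as JSON. The sample string and variable names are hypothetical.

```python
import json
import re

pattern = re.compile(r'"((19|20)[0-9]{2})(0[0-9]|1[0-2])([0-2][0-9]|3[0-1])"')

# Compact stand-in for the contents of the .json file.
before = '[{"date": "20171231"}, {"date": "20160131"}, {"date": "19990101"}]'

# subn returns the new text and the number of replacements made.
after, count = pattern.subn(r'"\3/\4/\1"', before)

# The quotes were kept, so the result should still be valid JSON.
dates = [entry["date"] for entry in json.loads(after)]
print(count, dates)
```

Comparing the count with the number of entries you expected is a cheap way to notice dates the pattern missed.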

With this knowledge you can speed up a lot of tasks involving large text files without a single line of code. This concludes the first post about regular expressions in practice. If you have any ideas on how to improve this post, just leave a comment or tweet at me.