Regex: Remove lines that match a condition
Lines to remove | Regex |
---|---|
Empty lines (line break only) | ^$(\r\n|\r|\n)? |
Blank lines, including lines with only whitespace | ^\s*?$(\r\n|\r|\n)? |
Lines not containing a fixed keyword | ^(?!.*keyword).*$(\r\n|\r|\n)? |
Lines containing none of several fixed keywords | ^(?!.*(keyword1|keyword2)).*$(\r\n|\r|\n)? |
Lines not starting with a fixed keyword | ^(?!keyword).*$(\r\n|\r|\n)? |
Lines not ending with a fixed keyword | ^(?!.*keyword$).*$(\r\n|\r|\n)? |
Lines containing a keyword | ^.*keyword.*$(\r\n|\r|\n)? |
Lines containing any of several keywords | ^.*(keyword1|keyword2).*$(\r\n|\r|\n)? |
Lines starting with a fixed keyword | ^keyword.*$(\r\n|\r|\n)? |
Lines ending with a fixed keyword | ^.*keyword$(\r\n|\r|\n)? |
These regular expressions delete the lines of a text that match a given condition: for example blank lines, lines that contain a keyword, or lines that do not contain a keyword.
I often use them for log analysis. Search for one of the regular expressions in the table above and replace every match with nothing, and all the unwanted lines are removed in one pass.
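Every pattern in the table ends with `(\r\n|\r|\n)?`. The following Python sketch suggests why: without that trailing group only the characters of the line are deleted, and an empty line stays behind. The sample text and variable names are made up for illustration. The sketch assumes Python's `re` module with `re.MULTILINE`, so that `^` and `$` anchor at every line, as they typically do in a text editor's search box.

```python
import re

# Hypothetical sample text; the middle line is the one to remove.
text = "keep\ndrop me\nkeep\n"

# Without the trailing group, the line's characters are removed
# but its line break is left in place.
without_break = re.sub(r"^.*drop.*$", "", text, flags=re.MULTILINE)

# With the trailing group, the line break is consumed as well.
with_break = re.sub(r"^.*drop.*$(\r\n|\r|\n)?", "", text, flags=re.MULTILINE)

print(repr(without_break))  # 'keep\n\nkeep\n'
print(repr(with_break))     # 'keep\nkeep\n'
```

The group is optional (`?`) so that the last line of a file still matches when it has no line break after it.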
How to use regex
- Open the file in a text editor that supports regex search.
- Enter a regex from the table as the search string.
- Leave the replacement string empty.
- Run “Replace All”.
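The same steps can be sketched outside an editor. This is a minimal Python version, not the only way to do it: the function name `remove_matching_lines` and the sample text are hypothetical, and `re.MULTILINE` is assumed as the stand-in for an editor's line-by-line matching.

```python
import re


def remove_matching_lines(text: str, pattern: str) -> str:
    """Replace every match of `pattern` with an empty string ("Replace All")."""
    # re.MULTILINE makes ^ and $ match at the start and end of each line.
    return re.sub(pattern, "", text, flags=re.MULTILINE)


# Pattern taken from the table: blank lines, including whitespace-only lines.
BLANK_LINES = r"^\s*?$(\r\n|\r|\n)?"

sample = "first\n\n   \nsecond\n"
cleaned = remove_matching_lines(sample, BLANK_LINES)
print(repr(cleaned))  # 'first\nsecond\n'
```

Any other pattern from the table can be passed in the same way, with `keyword` replaced by the text to look for.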
For example, suppose you have the following logs.
INFO 2018-01-31 15:00:00.000 1234/process1 message
ERROR 2018-01-31 15:00:00.000 1234/process1 message
INFO 2018-01-31 15:00:00.000 4321/process2 message
INFO 2018-01-31 15:00:00.000 1234/process1 message
INFO 2018-01-31 15:00:00.000 4321/process2 message
INFO 2018-01-31 15:00:00.000 1234/process1 message
INFO 2018-01-31 15:00:00.000 4321/process2 message
INFO 2018-01-31 15:00:00.000 4321/process3 message
INFO 2018-01-31 15:00:00.000 4321/process3 message
INFO 2018-01-31 15:00:00.000 1234/process1 message
If you only want to see process1, the lines for process2 and process3 get in the way. To remove them, use the regular expression from the table for lines that do not contain a fixed keyword, with process1 as the keyword.
^(?!.*process1).*$(\r\n|\r|\n)?
Running a bulk replace with this regular expression leaves the following.
INFO 2018-01-31 15:00:00.000 1234/process1 message
ERROR 2018-01-31 15:00:00.000 1234/process1 message
INFO 2018-01-31 15:00:00.000 1234/process1 message
INFO 2018-01-31 15:00:00.000 1234/process1 message
INFO 2018-01-31 15:00:00.000 1234/process1 message
All log lines except those for process1 are gone. Congratulations!
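The example above can be reproduced in a few lines of Python. This is a sketch under the assumption that the log uses `\n` line breaks and that `re.MULTILINE` plays the role of the editor's per-line matching. The plain line filter is only there as a cross-check and is not part of the original method.

```python
import re

LOG = """\
INFO 2018-01-31 15:00:00.000 1234/process1 message
ERROR 2018-01-31 15:00:00.000 1234/process1 message
INFO 2018-01-31 15:00:00.000 4321/process2 message
INFO 2018-01-31 15:00:00.000 1234/process1 message
INFO 2018-01-31 15:00:00.000 4321/process2 message
INFO 2018-01-31 15:00:00.000 1234/process1 message
INFO 2018-01-31 15:00:00.000 4321/process2 message
INFO 2018-01-31 15:00:00.000 4321/process3 message
INFO 2018-01-31 15:00:00.000 4321/process3 message
INFO 2018-01-31 15:00:00.000 1234/process1 message
"""

# "Lines not containing fixed keyword", with process1 as the keyword.
NOT_PROCESS1 = r"^(?!.*process1).*$(\r\n|\r|\n)?"

# Replace every match with an empty string.
filtered = re.sub(NOT_PROCESS1, "", LOG, flags=re.MULTILINE)

# Cross-check: keep only the lines that mention process1, without regex.
expected = "".join(
    line for line in LOG.splitlines(keepends=True) if "process1" in line
)

print(filtered, end="")
print(filtered == expected)
```

The negative lookahead `(?!.*process1)` is what inverts the match: a line is selected for deletion only if `process1` appears nowhere on it.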
For more information on metacharacters, see “Regex: Metacharacters”.
I also made this into a tool: Text Analytics Assistant.