Regex: Remove lines that match the conditions

Summary                                   Regex
Lines with only line breaks               ^$(\r\n|\r|\n)?
Lines with only line breaks and spaces    ^\s*?$(\r\n|\r|\n)?
Lines not containing fixed keyword        ^(?!.*keyword).*$(\r\n|\r|\n)?
Lines not containing fixed keywords       ^(?!.*(keyword1|keyword2)).*$(\r\n|\r|\n)?
Lines not starting with fixed keyword     ^(?!keyword).*$(\r\n|\r|\n)?
Lines not ending with fixed keyword       ^(?!.*keyword$).*$(\r\n|\r|\n)?
Lines containing keyword                  ^.*keyword.*$(\r\n|\r|\n)?
Lines containing keywords                 ^.*(keyword1|keyword2).*$(\r\n|\r|\n)?
Lines starting with fixed keyword         ^keyword.*$(\r\n|\r|\n)?
Lines ending with fixed keyword           ^.*keyword$(\r\n|\r|\n)?
* All newline codes (CRLF, CR, LF) are covered.
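
As a rough illustration of why every pattern ends with (\r\n|\r|\n)?, here is a minimal sketch in Python. Python and its re module are assumptions chosen for the example, and the sample text is made up. Without the trailing group, the match on an empty line is zero-width, so replacing it with an empty string removes nothing. With the group, the line break is removed together with the line. Python's $ only treats \n as a line end in re.MULTILINE mode, so the sketch uses LF-terminated text.

```python
import re

# Made-up sample text with one empty line and one whitespace-only line.
text = "first\n\nsecond\n   \nthird\n"

# Without the trailing newline group, the match on an empty line is zero-width,
# so replacing it with "" leaves the text unchanged.
unchanged = re.sub(r"^$", "", text, flags=re.MULTILINE)

# "Lines with only line breaks": the optional group also consumes the line break.
no_empty = re.sub(r"^$(\r\n|\r|\n)?", "", text, flags=re.MULTILINE)

# "Lines with only line breaks and spaces": whitespace-only lines are removed too.
no_blank = re.sub(r"^\s*?$(\r\n|\r|\n)?", "", text, flags=re.MULTILINE)

print(repr(no_empty))
print(repr(no_blank))
```

Other regex engines may treat \r and the multiline anchors differently, so the behavior in a given text editor should be checked on a small sample first.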

These regular expressions delete the lines of a text that match a given condition, such as blank lines, lines containing a keyword, or lines that do not contain a keyword.

I often use them for log analysis. Search for any regular expression in the table above, then run a bulk replace with an empty string to remove all the unnecessary lines.

How to use regex

  1. Open the file in a text editor that supports regex search.
  2. Set a regex from the table as the search string.
  3. Set an empty string as the replacement string.
  4. Execute “Replace All”.
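
The same four steps can also be sketched in code. The following is a minimal sketch, assuming Python's re module rather than a text editor. The function name remove_matching_lines, the sample text, and the keyword DEBUG are hypothetical and only serve the illustration.

```python
import re

def remove_matching_lines(text: str, pattern: str) -> str:
    """Delete every match of `pattern`, like "Replace All" with an empty replacement."""
    # re.MULTILINE makes ^ and $ match at every line, not only at the ends of the text.
    return re.sub(pattern, "", text, flags=re.MULTILINE)

# Hypothetical input; "DEBUG" stands in for the fixed keyword.
sample = "DEBUG a\nINFO b\nDEBUG c\n"

# "Lines start with fixed keyword" from the table, with keyword = DEBUG.
cleaned = remove_matching_lines(sample, r"^DEBUG.*$(\r\n|\r|\n)?")
print(cleaned, end="")
```

Passing a different pattern from the table to the same function covers the other rows. Only the search string changes, and the replacement is always the empty string.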

For example, suppose you have the following logs.

INFO  2018-01-31 15:00:00.000 1234/process1 message
ERROR 2018-01-31 15:00:00.000 1234/process1 message
INFO  2018-01-31 15:00:00.000 4321/process2 message
INFO  2018-01-31 15:00:00.000 1234/process1 message
INFO  2018-01-31 15:00:00.000 4321/process2 message
INFO  2018-01-31 15:00:00.000 1234/process1 message
INFO  2018-01-31 15:00:00.000 4321/process2 message
INFO  2018-01-31 15:00:00.000 4321/process3 message
INFO  2018-01-31 15:00:00.000 4321/process3 message
INFO  2018-01-31 15:00:00.000 1234/process1 message

If you only want to see process1, the lines for process2 and process3 get in the way. Use the “Lines not containing fixed keyword” regex from the table, with process1 as the keyword, to solve this problem.

^(?!.*process1).*$(\r\n|\r|\n)?

Running a bulk replace with this regular expression and an empty replacement string gives the following result.

INFO  2018-01-31 15:00:00.000 1234/process1 message
ERROR 2018-01-31 15:00:00.000 1234/process1 message
INFO  2018-01-31 15:00:00.000 1234/process1 message
INFO  2018-01-31 15:00:00.000 1234/process1 message
INFO  2018-01-31 15:00:00.000 1234/process1 message

All log lines except those for process1 are gone. Congratulations!
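
For readers who prefer a script over an editor, the log example can be reproduced with a short sketch. It assumes Python's re module and LF line endings, neither of which this article prescribes. The variable names are chosen for illustration.

```python
import re

# The ten log lines from the example, LF-terminated.
log = "\n".join([
    "INFO  2018-01-31 15:00:00.000 1234/process1 message",
    "ERROR 2018-01-31 15:00:00.000 1234/process1 message",
    "INFO  2018-01-31 15:00:00.000 4321/process2 message",
    "INFO  2018-01-31 15:00:00.000 1234/process1 message",
    "INFO  2018-01-31 15:00:00.000 4321/process2 message",
    "INFO  2018-01-31 15:00:00.000 1234/process1 message",
    "INFO  2018-01-31 15:00:00.000 4321/process2 message",
    "INFO  2018-01-31 15:00:00.000 4321/process3 message",
    "INFO  2018-01-31 15:00:00.000 4321/process3 message",
    "INFO  2018-01-31 15:00:00.000 1234/process1 message",
]) + "\n"

# "Lines not containing fixed keyword" with keyword = process1.
# The negative lookahead (?!.*process1) rejects any line that contains process1,
# so only the other lines match and get deleted.
pattern = re.compile(r"^(?!.*process1).*$(\r\n|\r|\n)?", re.MULTILINE)

filtered = pattern.sub("", log)
print(filtered, end="")
```

Because the lookahead is anchored at the start of each line, a line is either removed whole, including its line break, or kept untouched.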

For more information on metacharacters, see:
Regex: Metacharacters

I also turned this into a tool:
Text Analytics Assistant

Hirota Yano / Japan / Programmer
I am publishing a web tool I created as a hobby. It is free of charge, so please feel free to use it.
© Hirota Yano