Text Analytics Assistant (Filtering, Sort, Remove Duplicate) / Tool

This browser-based tool manipulates text line by line to assist with analysis, for example of logs and other data. It supports line counting, filtering, sorting, duplicate line removal, and more.

  • The output updates automatically when you enter text or change settings.
  • Keywords: You can enter multiple keywords, one per line. In that case, a line matches if it contains any of the keywords (OR).
  • Remove blank lines: Removes lines that are empty or contain only whitespace.
  • Remove duplicate lines: Removes duplicate lines, keeping only the first occurrence.
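The two removal options above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the function names `splitLines`, `removeBlankLines`, and `removeDuplicateLines` are hypothetical, and it assumes that "blank" means empty or whitespace-only.

```javascript
// Hypothetical helpers; the names are illustrative, not taken from the tool.

// Split text into lines, accepting \r\n, \r, or \n as line breaks.
function splitLines(text) {
  return text.split(/\r\n|\r|\n/);
}

// Drop lines that are empty or contain only whitespace.
function removeBlankLines(lines) {
  return lines.filter((line) => line.trim() !== "");
}

// Keep the first occurrence of each line and drop later repeats.
function removeDuplicateLines(lines) {
  const seen = new Set();
  return lines.filter((line) => {
    if (seen.has(line)) return false;
    seen.add(line);
    return true;
  });
}

const input = "apple\n\nbanana\n   \napple\ncherry";
const result = removeDuplicateLines(removeBlankLines(splitLines(input)));
console.log(result.join("\n"));
// → apple
//   banana
//   cherry
```

Using a `Set` keeps the original line order, which matches the "except for the first occurrence" behavior described above.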

Filtering by Keyword

This is useful when you want to narrow down logs and other data for analysis.

For example, suppose you have log text like this and you want to see only the lines for process1.

INFO  2019-01-31 15:00:00.000 1234/process1 message
ERROR 2019-01-31 15:00:00.000 1234/process1 message
INFO  2019-01-31 15:00:00.000 4321/process2 message
INFO  2019-01-31 15:00:00.000 1234/process1 message
INFO  2019-01-31 15:00:00.000 4321/process2 message
INFO  2019-01-31 15:00:00.000 4321/process3 message
INFO  2019-01-31 15:00:00.000 1234/process1 message

Select “contain” and enter “process1” as a keyword…

INFO  2019-01-31 15:00:00.000 1234/process1 message
ERROR 2019-01-31 15:00:00.000 1234/process1 message
INFO  2019-01-31 15:00:00.000 1234/process1 message
INFO  2019-01-31 15:00:00.000 1234/process1 message

All lines except those containing process1 have disappeared. Yay!

By the way, filtering is implemented with regular expressions. In the “only lines that contain the keywords” case, lines that do not contain any of the keywords are removed using the following regular expression.

/^(?!.*(keyword1|keyword2)).*$(\r\n|\r|\n)?/gm

This regular expression is explained in the following article.
Regex: Remove lines that match the conditions
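As a rough sketch, the regular expression above could be built from a keyword list like this. The function names `escapeRegExp`, `buildRemovalRegex`, and `keepLinesContaining` are hypothetical, and escaping the keywords so they match literally is an assumption; the tool itself may treat keywords differently.

```javascript
// Escape regex metacharacters so each keyword is matched literally.
// (Assumption: the tool may handle special characters differently.)
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build the pattern shown above: it matches every whole line that does
// NOT contain any keyword, together with its trailing line break.
function buildRemovalRegex(keywords) {
  const alternation = keywords.map(escapeRegExp).join("|");
  return new RegExp(`^(?!.*(${alternation})).*$(\\r\\n|\\r|\\n)?`, "gm");
}

// Keep only the lines that contain at least one keyword (OR).
function keepLinesContaining(text, keywords) {
  return text.replace(buildRemovalRegex(keywords), "");
}

const log = [
  "INFO  2019-01-31 15:00:00.000 1234/process1 message",
  "ERROR 2019-01-31 15:00:00.000 1234/process1 message",
  "INFO  2019-01-31 15:00:00.000 4321/process2 message",
  "INFO  2019-01-31 15:00:00.000 4321/process3 message",
  "INFO  2019-01-31 15:00:00.000 1234/process1 message",
].join("\n");

const filtered = keepLinesContaining(log, ["process1"]);
console.log(filtered); // prints only the three process1 lines
```

The negative lookahead `(?!.*(...))` rejects a line as soon as any keyword appears anywhere in it, so only lines without a keyword are matched and replaced with an empty string.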

Sort order

Ascending/Descending

  • Ascending: Sorts from smallest to largest, or in the defined order of the data.
  • Descending: Sorts from largest to smallest, or in the reverse of the defined order of the data.

Lines are sorted by character code (UTF-16) order. In ascending order, this gives “symbols (half), numbers (half), alphabet (half), kana and kanji (full), symbols (full), numbers (full), alphabet (full)”, where “half” and “full” denote half-width and full-width characters.

+ (half)
- (half)
1 (half)
2 (half)
A (half)
B (half)
あ (full)
い (full)
ア (full)
イ (full)
亜 (full)
腕 (full)
＋ (full)
－ (full)
１ (full)
２ (full)
Ａ (full)
Ｂ (full)
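The ordering above can be reproduced with a short sketch. In JavaScript, `Array.prototype.sort()` without a comparator compares strings by UTF-16 code unit values, which is assumed here to match what the tool does; the function names `sortAscending` and `sortDescending` are hypothetical.

```javascript
// The default sort compares strings by UTF-16 code unit values.
function sortAscending(lines) {
  return [...lines].sort(); // copy first so the input array is not mutated
}

function sortDescending(lines) {
  return [...lines].sort().reverse();
}

// Full-width characters are written as escapes to avoid ambiguity:
// \uFF21 = full-width A, \uFF11 = full-width 1, \uFF0B = full-width +
const sample = ["\uFF21", "あ", "A", "\uFF11", "亜", "1", "\uFF0B", "+", "ア"];

console.log(sortAscending(sample).join(" "));
// → + 1 A あ ア 亜 ＋ １ Ａ
```

Half-width ASCII characters have the lowest code values, kana and kanji sit in the middle, and the full-width forms (U+FF00 and above) come last.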

Character count (Asc/Desc)

Sort by number of characters (not bytes).
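A minimal sketch of sorting by character count is shown below. It assumes a "character" is one Unicode code point, counted with `Array.from`; whether the tool counts code points or UTF-16 code units is not stated. The names `charCount` and `sortByCharCount` are hypothetical.

```javascript
// Count characters as Unicode code points, not bytes.
// (Assumption: the tool's exact definition of "character" is not stated.)
function charCount(line) {
  return Array.from(line).length;
}

// Sort a copy of the lines by character count.
function sortByCharCount(lines, descending = false) {
  const sign = descending ? -1 : 1;
  return [...lines].sort((a, b) => sign * (charCount(a) - charCount(b)));
}

const words = ["banana", "あい", "kiwi", "a"];

console.log(sortByCharCount(words).join(", "));
// → a, あい, kiwi, banana
```

Here “あい” has 2 characters but takes 6 bytes in UTF-8, so it sorts before “kiwi” (4 characters, 4 bytes); a byte-based sort would place it after.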

Reverse

Reverses the current order of the lines in the input.

Hirota Yano / Japan / Programmer
I am publishing a web tool I created as a hobby. It is free of charge, so please feel free to use it.
© Hirota Yano