Text Analytics Assistant (Filtering, Sort, Remove Duplicate) / Tool
This browser-based tool manipulates text line by line to assist with log and data analysis. It supports line counting, filtering, sorting, duplicate line removal, and more.
- The output updates automatically whenever you enter text or change a setting.
- Keywords: You can enter multiple keywords, one per line. In that case they are combined with OR, so a line matches if it contains any of them.
- Remove blank lines: Remove lines that are empty or contain only whitespace.
- Remove duplicate lines: Remove duplicate lines except for the first occurrence.
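The two removal options above can be sketched in a few lines of JavaScript. This is only an illustration under assumptions: the function names `splitLines`, `removeBlankLines`, and `removeDuplicateLines` are hypothetical and are not taken from the tool's source.

```javascript
// Hypothetical helpers; the names are illustrative, not the tool's actual code.

// Split text into lines, accepting \r\n, \r, or \n as line breaks.
function splitLines(text) {
  return text.split(/\r\n|\r|\n/);
}

// Drop lines that are empty or contain only whitespace.
function removeBlankLines(lines) {
  return lines.filter((line) => line.trim() !== "");
}

// Keep the first occurrence of each line and drop later repeats.
function removeDuplicateLines(lines) {
  const seen = new Set();
  return lines.filter((line) => {
    if (seen.has(line)) return false;
    seen.add(line);
    return true;
  });
}

const input = "a\n\nb\r\n   \na\nc";
const result = removeDuplicateLines(removeBlankLines(splitLines(input)));
console.log(result.join("\n"));
```

Here the blank line, the whitespace-only line, and the second "a" are dropped, leaving a, b, and c in their original order.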
Filtering by Keyword
This is useful when you want to narrow down logs and other data for analysis.
For example, suppose you have the following log text and want to see only the lines for process1.
INFO 2019-01-31 15:00:00.000 1234/process1 message
ERROR 2019-01-31 15:00:00.000 1234/process1 message
INFO 2019-01-31 15:00:00.000 4321/process2 message
INFO 2019-01-31 15:00:00.000 1234/process1 message
INFO 2019-01-31 15:00:00.000 4321/process2 message
INFO 2019-01-31 15:00:00.000 4321/process3 message
INFO 2019-01-31 15:00:00.000 1234/process1 message
Select “contain” and enter “process1” as the keyword:
INFO 2019-01-31 15:00:00.000 1234/process1 message
ERROR 2019-01-31 15:00:00.000 1234/process1 message
INFO 2019-01-31 15:00:00.000 1234/process1 message
INFO 2019-01-31 15:00:00.000 1234/process1 message
All lines except those for process1 have disappeared. Yay!
By the way, filtering is implemented with regular expressions. In the “only lines that contain the keywords” mode, every line that does not contain any of the keywords is removed using the following regular expression.
/^(?!.*(keyword1|keyword2)).*$(\r\n|\r|\n)?/gm
Regular expressions are introduced in the following article.
Regex: Remove lines that match the conditions
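As a rough sketch of how the filtering expression above could be built from a keyword list, the following JavaScript assembles the same pattern and deletes the non-matching lines. The function names `escapeRegExp` and `keepLinesContaining` are hypothetical, and escaping the keywords so they match literally is an assumption; the tool may treat keywords differently.

```javascript
// Escape regex metacharacters so each keyword is matched literally.
// (Assumption: the tool may instead treat keywords as raw patterns.)
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Keep only lines that contain at least one keyword (OR), by deleting
// every line that contains none of them, together with its line break.
function keepLinesContaining(text, keywords) {
  const alternatives = keywords.map(escapeRegExp).join("|");
  const pattern = new RegExp(
    "^(?!.*(" + alternatives + ")).*$(\\r\\n|\\r|\\n)?",
    "gm"
  );
  return text.replace(pattern, "");
}

const log = [
  "INFO 2019-01-31 15:00:00.000 1234/process1 message",
  "INFO 2019-01-31 15:00:00.000 4321/process2 message",
  "ERROR 2019-01-31 15:00:00.000 1234/process1 message",
  "INFO 2019-01-31 15:00:00.000 4321/process3 message",
  "INFO 2019-01-31 15:00:00.000 1234/process1 message",
].join("\n");

console.log(keepLinesContaining(log, ["process1"]));
```

The negative lookahead `(?!.*(...))` succeeds only on lines where no keyword appears, so those are the lines that get replaced with an empty string. Because an empty line contains no keyword either, this pattern also removes blank lines.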
Ascending/Descending
- Ascending: Sort from smallest to largest, or in the defined order of the data.
- Descending: Sort from largest to smallest, or in the reverse of the defined order of the data.
Lines are sorted by character code (UTF-16) order. In ascending order, this gives “symbols (half), numbers (half), alphabet (half), characters (full), symbols (full), numbers (full), alphabet (full)”, where “half” and “full” mean half-width and full-width characters.
+ (half)
- (half)
1 (half)
2 (half)
A (half)
B (half)
あ (full)
い (full)
ア (full)
イ (full)
亜 (full)
腕 (full)
＋ (full)
－ (full)
１ (full)
２ (full)
Ａ (full)
Ｂ (full)
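The ordering listed above can be reproduced with JavaScript's default sort, which compares strings by UTF-16 code unit when no comparator is given. This is a minimal sketch; whether the tool itself calls `sort()` this way is an assumption.

```javascript
// A shuffled mix of half-width and full-width characters.
const lines = ["亜", "Ａ", "あ", "B", "１", "ア", "2", "＋", "+", "A", "1"];

// With no comparator, Array.prototype.sort compares strings by UTF-16 code unit.
const ascending = [...lines].sort();

// Descending is simply the reverse of the ascending result.
const descending = [...ascending].reverse();

console.log(ascending.join(" "));
console.log(descending.join(" "));
```

A locale-aware comparison such as `localeCompare` would generally produce a different order, because it does not follow raw code unit values.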
Character count (Asc/Desc)
Sort by number of characters (not bytes).
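The difference between counting characters and counting bytes can be sketched as follows. The helper names `charCount` and `sortByCharCount` are hypothetical, and counting Unicode code points is an assumption; the tool may count UTF-16 code units instead.

```javascript
// Count characters as Unicode code points rather than bytes.
// (Assumption: the tool may count UTF-16 code units via .length instead.)
function charCount(line) {
  return Array.from(line).length;
}

// Return a new array sorted by character count, ascending by default.
function sortByCharCount(lines, descending = false) {
  const sign = descending ? -1 : 1;
  return [...lines].sort((a, b) => sign * (charCount(a) - charCount(b)));
}

const sample = ["abcd", "あい", "xyz", "a"];
console.log(sortByCharCount(sample).join(", "));
console.log(sortByCharCount(sample, true).join(", "));
```

In this sample, “あい” counts as 2 characters and sorts before “xyz”, even though it occupies 6 bytes in UTF-8.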
Reverse
Reverses the order of the lines in the current input.