Re: [opensuse] grep - how to tell which patterns are NOT in the searched file

27 Jun 2014

      On June 26, 2014 1:01:00 PM EDT, Anton Aylward <opensuse@antonaylward.com> wrote:
...
On 06/26/2014 12:40 PM, Greg Freemyer wrote:
...
All,
I know "grep -F -f pattern_file <files-to-search>"
will give me a list of all lines with a pattern in them.
I want to know which of the patterns had no matches.
I tried "grep -F --count -f pattern_file <files-to-search>" but it
gives a cumulative count for all patterns.
I want a count per pattern so I can find the patterns with 0 count.
Do I have to write a bash loop and call grep once for each pattern?
It depends.
You could try two variants.
The first is "-v".  What's not there.
The second is look at the return/exit code for a FAILure to find any of
the patterns.
It depends on a few details about the search you haven't mentioned.
Such as "many matches except for one" vs "none of the patterns".
The other thing you can do is not use grep.
My first thought is to use awk, but sometimes that has awk-ward syntax.
My second thought is that this is an excellent example for perl to show
off, but that is predicated on you being familiar enough with perl.  I was
once but not now.
http://www.theunixschool.com/2012/09/grep-vs-awk-examples-for-pattern-search...
This is interesting but a bit lame in places.
http://www.unix.com/shell-programming-and-scripting/183001-multiple-pattern-...
It doesn't mention that you can use hash-indexed arrays to keep a count of
which strings get matched.
This looks close
http://unix.stackexchange.com/questions/50491/the-simplest-method-to-count-l...
I did it the brute force way, but it is likely a recurring need.

I had 4 GB of text logs and about 100 patterns, and I needed to see which patterns were not mentioned in the logs.  A grep for all the patterns produced about 1100 lines, so on average 10 or so lines per pattern.

I created a shell script with one grep line per pattern to see which ones found hits.  I scanned the results and found that 6 of the patterns had no hits.
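
For illustration, the one-grep-per-pattern approach can be written as a loop rather than one hand-written line per pattern.  This is only a sketch under assumptions: the function name unmatched_patterns is hypothetical, the pattern file holds one fixed string per line, and at least one file to search is given.  It relies on grep's exit status (non-zero when nothing matches) and still reads the logs once per pattern.

```shell
#!/bin/sh
# unmatched_patterns is a hypothetical helper name, not an existing tool.
# Usage: unmatched_patterns pattern_file file...
# Prints every pattern that matches no line in the given files.
unmatched_patterns() {
    pattern_file=$1
    shift
    while IFS= read -r pat; do
        # Skip empty lines: an empty pattern would match every line.
        [ -n "$pat" ] || continue
        # -q: no output, only the exit status matters
        # -F: treat the pattern as a fixed string
        # -e: keeps patterns that start with "-" from being read as options
        grep -qF -e "$pat" -- "$@" || printf '%s\n' "$pat"
    done < "$pattern_file"
}
```

With -q grep can stop at the first hit, so patterns that do occur are cheap; the patterns with no hits are the ones that cost a full scan of the logs.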

In a more general sense I would like to get:

Pattern 1: count of lines containing pattern

For each pattern.  Grep can do that if I call grep once per pattern, but that is really wasteful of resources.

It surprises me there is no way to do this in a single grep call.
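
One way to get close with a single grep call is to have grep print only the matched strings (-o) and compare them with the pattern list.  This is a sketch, assuming GNU grep (-o and -h are not in older POSIX grep), fixed-string patterns with no empty lines, and a hypothetical function name never_matched.  One caveat: when patterns overlap (say "foo" and "foobar"), grep -o reports only the longest leftmost match, so the shorter pattern can be wrongly listed as unmatched.

```shell
#!/bin/sh
# never_matched is a hypothetical helper name, not an existing tool.
# Usage: never_matched pattern_file file...
# Prints the patterns that grep never reported as a match.
never_matched() {
    pattern_file=$1
    shift
    seen=$(mktemp) || return 1
    # -o: print each matched string on its own line
    # -h: never prefix output with the file name
    grep -ohF -f "$pattern_file" -- "$@" | sort -u > "$seen"
    # comm -23 keeps the lines that appear only in the first input,
    # i.e. the patterns that are missing from the list of seen matches.
    sort -u "$pattern_file" | comm -23 - "$seen"
    rm -f "$seen"
}
```

Replacing "sort -u" with "sort | uniq -c" on the grep output would give a count of occurrences per matched string, which is close to, but not the same as, a count of lines per pattern.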

As to awk, I've written a lot of awk scripts in the past so that is likely what I will try first.
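
A minimal awk sketch of the per-pattern count, using an array indexed by the pattern string.  Assumptions: the patterns are fixed strings, one per line, the pattern file is not empty, and count_per_pattern is a hypothetical name.  It reads the logs only once, but tests every pattern against every line, so it may well be slower than grep's multi-pattern matcher on 4 GB of input.

```shell
#!/bin/sh
# count_per_pattern is a hypothetical helper name, not an existing tool.
# Usage: count_per_pattern pattern_file file...
# Prints "pattern: N" for every pattern, where N is the number of
# lines containing it; patterns that never match are printed with 0.
count_per_pattern() {
    pattern_file=$1
    shift
    awk '
        # First file (NR == FNR): load the patterns, each count starts at 0.
        # This test assumes the pattern file is not empty.
        NR == FNR { if ($0 != "") count[$0] = 0; next }
        # Remaining files: index() is a plain substring test, like grep -F.
        { for (p in count) if (index($0, p) > 0) count[p]++ }
        # The order of "for (p in count)" is unspecified; pipe to sort if needed.
        END { for (p in count) printf "%s: %d\n", p, count[p] }
    ' "$pattern_file" "$@"
}
```

Piping the output through grep ': 0$' would then list just the patterns with no hits.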

Greg
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.