awk if conditional, multiple lines

david_seeds · January 18, 2012, 6:08pm

Hi experts,

I have a file with hundreds of lines that looks something like this:

Cake	1	3	4	2	3	1	3
Onion	3	2	4	1	3	4	2
Apple	2	3	4	4	1	2	1
Orange	4	3	4	1	2	4	1
Cake	3	1	4	2	4	2	1
Onion	2	4	2	1	4	2	1
Apple	4	3	3	3	3	2	1
Orange	2	1	2	2	2	4	1
Cake	4	3	4	3	1	3	3
...

I'm trying to list all the row numbers except those of "bad" lines, where a line is bad if it is either:
a) a row in which the last column exceeds some threshold, for instance >1.5
OR
b) one of the next 3 rows after a row that is excluded in a)

For instance, in the sample above, the desired output would be:

6,7,8

So far, I'm able to get the "bad" lines with:

awk 'BEGIN {ORS=","}
{k=1.7}
{if($8>k)
print NR, NR+1, NR+2, NR+3
}' inputfile

but I'm a little lost on how to get the opposite, the "good" lines. This approach can also be redundant (the same row number can be printed more than once), and it prints row numbers past the end of the file if one of the last rows meets the "bad" line criteria. It seems like the wrong direction to go.

Would I have to go through the file twice, once to flag "bad" lines and again to print "good" lines (by excluding "bad" lines)?

Hope this makes sense. Thanks for the help!

jim_mcnamara · January 18, 2012, 8:12pm

awk 'BEGIN{ok=0}
       $NF>1.7 {ok=3; next}        # bad row: skip it and the next 3 rows
       {if(ok>0) {ok--; next}      # still within 3 rows of a bad row
         else { print $0}  }' inputfile

Assuming I got what you want, try that.
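
If you want the row numbers themselves, comma-separated as in the question, here is a minimal sketch of the same single-pass countdown idea. The variable names k, skip, left and sep are made up for illustration, and the sample file is just the nine rows from the question (written with spaces here, which awk's default field splitting treats the same as tabs). No second pass over the file is needed.

```shell
# Recreate the sample rows from the question in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
Cake 1 3 4 2 3 1 3
Onion 3 2 4 1 3 4 2
Apple 2 3 4 4 1 2 1
Orange 4 3 4 1 2 4 1
Cake 3 1 4 2 4 2 1
Onion 2 4 2 1 4 2 1
Apple 4 3 3 3 3 2 1
Orange 2 1 2 2 2 4 1
Cake 4 3 4 3 1 3 3
EOF

# k    = threshold for the last column (assumed value, taken from the question)
# skip = how many rows after a bad row are also excluded
result=$(awk -v k=1.5 -v skip=3 '
    $NF > k   { left = skip; next }                   # bad row: restart the countdown
    left > 0  { left--; next }                        # row inside the exclusion window
              { printf "%s%s", sep, NR; sep = "," }   # good row: append its number
    END       { print "" }                            # finish the line
' "$input")

echo "$result"    # prints 6,7,8
rm -f "$input"
```

Because the threshold test comes first, a bad row that falls inside an earlier exclusion window restarts the countdown instead of being counted down, which is why rows 3 to 5 are still excluded after row 2.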

david_seeds · January 18, 2012, 9:20pm

Ah, brilliant, works great. I didn't think of using "--"!

Thanks!