awk: print records that do not match a specific pattern

How do I use awk to print only the records that do not match a pattern? For example, my file has the 5 records below, and I need all lines where $1 is 10 or 20, $2 is 10 or 20, and $3 is greater than "130302":

10  20  1303252348212B030 
20  10  1303242348212B030 
40  34  1303252348212B030 
10  20  1303032348212B030 
10  20  1303022348212B030 

I have tried the following:

awk '( $1 ~/[12]0/  ||  $2 ~/[12]0 ) && $3 ! ~/130302/ { print $0 }'  my file

I think awk is confused by the digits that follow 130302 (130302XXXXXX).

First of all, I didn't clearly understand your requirement.

But I noticed a few syntax errors in your code. Here it is with those fixed:

awk '( $1 ~ /[12]0/  ||  $2 ~ /[12]0/ ) && $3 !~ /130302/ { print $0 }' myfile

As far as I know, there is nothing wrong with the slash / or with the negated match ! ~.
My request is to get the following output:

10  20  1303252348212B030 
20  10  1303242348212B030 
10  20  1303032348212B030 

from the input below:

10  20  1303252348212B030 
20  10  1303242348212B030 
40  34  1303252348212B030 
10  20  1303032348212B030 
10  20  1303022348212B030

$ awk '($1 ~ /[12]0/ || $2 ~ /[12]0/) && substr ($3, 0, 6) > 130302 { print }' input
10 20 1303252348212B030
20 10 1303242348212B030
10 20 1303032348212B030

OK, so there is nothing wrong with the slash:

awk '$2 ~ /[12]0' myfile
 syntax error The source line is 1.
 The error context is
                $2 ~ >>>  /[12]0 <<<

And there is nothing wrong with regexp not matching:

awk '$2 ! ~/[12]0' myfile
 syntax error The source line is 1.
 The error context is
                $2 ! >>>  ~ <<<

Then I hope you have some explanation for the behavior above?


Can you explain this part?

substr ($3, 0, 6) > 130302


Try this:

awk '$2 ~ /[12]0/' myfile

and this:

awk '$2 ! ~/[12]0/' myfile

Do you still get an error message?

Try nawk instead of awk. On some systems, plain awk is very, very old.


nawk is not found on AIX.

I think there is some confusion. I was merely trying to correct the syntax error in the original post:

If you check, the OP left out a closing slash and wrote ! ~ with a blank space instead of !~ .

So this has nothing to do with nawk or an old awk.


Maybe simply type this in?

awk '($1=="10" || $1=="20") && ($2=="10" || $2=="20") && $3>"130302" {print}' myfile

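Or, because the string comparison $3>"130302" is also true for a longer value that merely begins with 130302, modify it a bit like this?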

awk '($1=="10" || $1=="20") && ($2=="10" || $2=="20") && $3>="130303" {print}' myfile

Sure, substr is a function that returns a "substring".

$3 - obviously, that's the field to work on.

0 - starting position. Properly, this should have been 1, since awk numbers string positions starting from 1. Some awk implementations treat 0 as 1, but others (GNU awk, for example) then return one character fewer, so 1 is the correct and portable starting position.

6 - length of substring to return.
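
As a quick check of those three arguments, here is a minimal sketch that pulls the first six characters out of a sample $3 value, using 1 as the starting position. The function name first_six is made up for illustration and is not from the thread.

```shell
# Hypothetical helper: print the first six characters of the third field.
first_six() {
    awk '{ print substr($3, 1, 6) }'
}

# One record taken from the sample input in the first post.
printf '10  20  1303252348212B030\n' | first_six
```

With that record the command should print 130325, which is the part that then gets compared against 130302.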

-------------------

! ~ normally needs to be written together, as !~ , to work correctly. My GNU awk requires that, anyway.
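
A small sketch of the operator written as one token, run on two records from the first post. The function name not_130302 is invented for illustration, and anchoring the pattern with ^ is my assumption about the intent (match only at the start of the field), not something the original command did.

```shell
# Hypothetical helper: keep records whose third field does NOT start with 130302.
# Note that !~ is a single operator with no space inside it.
not_130302() {
    awk '$3 !~ /^130302/ { print }'
}

printf '%s\n' \
    '10  20  1303252348212B030' \
    '10  20  1303022348212B030' | not_130302
```

Only the 130325 record should come out, since the second record matches the pattern and is therefore rejected by !~ .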

Hope this helps.


But how do I do the following? I want to filter out any record that has "263" in $4 and whose $6 is before 130304.

2    9647701612350         9647701168456         262       23        1303031257462B0300 1303031259182B0300 92        9647701146402  0
5    9647706046060         9647801139306         263       32        1303031255312B0300 1303031259182B0300 227       9647701146402  0
6    9647706046060         9647801139306         263       32        130325255312B0300 1303251259182B0300 227       9647701146402  0
8    9647701612350         9647701168456         22       262        1303111257462B0300 1303111259182B0300 92        9647701146402  0
9    9647701612350         9647701168456         262      700        1303131257462B0300 1303131259182B0300 92        9647701146402  0
10   9647701612350         9647701168456         22       263        1303031257462B0300 1303031259182B0300 92        9647701146402  0
12   9647706046060         9647801139306         263       32        130303255312B0300 1303032259182B0300 227       9647701146402  0

The output must be:

2    9647701612350         9647701168456         262       23        1303031257462B0300 1303031259182B0300 92        9647701146402  0
6    9647706046060         9647801139306         263       32        130325255312B0300 1303251259182B0300 227       9647701146402  0
8    9647701612350         9647701168456         22       262        1303111257462B0300 1303111259182B0300 92        9647701146402  0
9    9647701612350         9647701168456         262      700        1303131257462B0300 1303131259182B0300 92        9647701146402  0
10   9647701612350         9647701168456         22       263        1303031257462B0300 1303031259182B0300 92        9647701146402  0

Here are two equivalent ways:

$ awk '$4 == 263 && substr ($6, 1, 6) < 130304 { next } { print }' input
2    9647701612350         9647701168456         262       23        1303031257462B0300 1303031259182B0300 92        9647701146402  0
6    9647706046060         9647801139306         263       32        130325255312B0300 1303251259182B0300 227       9647701146402  0
8    9647701612350         9647701168456         22       262        1303111257462B0300 1303111259182B0300 92        9647701146402  0
9    9647701612350         9647701168456         262      700        1303131257462B0300 1303131259182B0300 92        9647701146402  0
10   9647701612350         9647701168456         22       263        1303031257462B0300 1303031259182B0300 92        9647701146402  0
$ awk '$4 != 263 || substr ($6, 1, 6) >= 130304 { print }' input
2    9647701612350         9647701168456         262       23        1303031257462B0300 1303031259182B0300 92        9647701146402  0
6    9647706046060         9647801139306         263       32        130325255312B0300 1303251259182B0300 227       9647701146402  0
8    9647701612350         9647701168456         22       262        1303111257462B0300 1303111259182B0300 92        9647701146402  0
9    9647701612350         9647701168456         262      700        1303131257462B0300 1303131259182B0300 92        9647701146402  0
10   9647701612350         9647701168456         22       263        1303031257462B0300 1303031259182B0300 92        9647701146402  0
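
The two forms are equivalent by De Morgan's law: skipping a record when A && B holds is the same as printing it when !A || !B holds. Here is a minimal sketch that checks this on three shortened records. The records are made up (only fields 1, 4 and 6 carry real values, the x fields are filler), and the function names are hypothetical.

```shell
# Made-up, shortened records: field 4 is the code, field 6 the timestamp.
sample() {
    printf '%s\n' \
        '2 x x 262 23 1303031257462B0300' \
        '5 x x 263 32 1303031255312B0300' \
        '6 x x 263 32 130325255312B0300'
}

# Form 1: skip the unwanted records, print everything else.
skip_form() {
    awk '$4 == 263 && substr($6, 1, 6) < 130304 { next } { print }'
}

# Form 2: print when the negated condition holds.
print_form() {
    awk '$4 != 263 || substr($6, 1, 6) >= 130304 { print }'
}

sample | skip_form
sample | print_form
```

Both pipelines should print records 2 and 6 and drop record 5, which is the only one with 263 in $4 and a date prefix below 130304.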

Thanks. I managed to solve it using the code below. Could you take a look and tell me whether it is correct?

awk !'($4 ~/263/  && $6 <=130304) {print }'  input  

Here is the input you most recently posted:

$ cat input
2    9647701612350         9647701168456         262       23        1303031257462B0300 1303031259182B0300 92        9647701146402  0
5    9647706046060         9647801139306         263       32        1303031255312B0300 1303031259182B0300 227       9647701146402  0
6    9647706046060         9647801139306         263       32        130325255312B0300 1303251259182B0300 227       9647701146402  0
8    9647701612350         9647701168456         22       262        1303111257462B0300 1303111259182B0300 92        9647701146402  0
9    9647701612350         9647701168456         262      700        1303131257462B0300 1303131259182B0300 92        9647701146402  0
10   9647701612350         9647701168456         22       263        1303031257462B0300 1303031259182B0300 92        9647701146402  0
12   9647706046060         9647801139306         263       32        130303255312B0300 1303032259182B0300 227       9647701146402  0

Here is what the code you just posted produces (I removed the stray ! character):

$ awk '($4 ~/263/  && $6 <=130304) {print }'  input
5    9647706046060         9647801139306         263       32        1303031255312B0300 1303031259182B0300 227       9647701146402  0
12   9647706046060         9647801139306         263       32        130303255312B0300 1303032259182B0300 227       9647701146402  0

Does it produce those two lines when you run it? If those two lines are the ones you want, in a sense it's correct. On the other hand, the output differs from what you posted as expected output in your previous post, which had five output lines. In that sense, it seems not correct.

When you said "filter out" before, the five expected output lines suggested you meant "exclude", "not print". Maybe the definition has changed, and you want to "print" if $4 == 263 and $6 <= 130304. Perhaps the problem is getting better defined. It would not hurt to post another input file, and the expected output, if you would like.

Since $6 is such a big field, I think it makes more sense to compare with the substr of $6 that just has the first six numbers, and not depend on some odd way it might compare 130304 with the much longer field.
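
To illustrate the point, here is a sketch with one made-up record whose sixth field begins with exactly 130304. Because the field contains a letter, awk compares it as a string, so the whole field sorts after the shorter "130304" while its six-character prefix compares as equal. The record and the function names are hypothetical.

```shell
# Made-up record: field 6 starts with exactly 130304.
rec='7 x x 263 32 1303040000002B0300'

# Compare the whole field against 130304 (string comparison).
whole_field() {
    awk '{ print (($6 <= 130304) ? "match" : "no match") }'
}

# Compare only the first six characters.
prefix_only() {
    awk '{ print ((substr($6, 1, 6) <= 130304) ? "match" : "no match") }'
}

printf '%s\n' "$rec" | whole_field
printf '%s\n' "$rec" | prefix_only
```

The first command should report "no match" and the second "match", so the two tests disagree on a record dated exactly 130304.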
