GAWK case insensitive comparison

Hi :o
I'm working on Windows with gawk, and I have two files to compare.
While searching for a script to do a text comparison, I came across this line:

gawk "{if(NR==FNR){A[$0]}else{B[$0]}}END{for(x in A){if(!(x in B))print x>\"1not2.txt\"}for(x in B){if(!(x in A))print x>\"2not1.txt\"}}" x1 x2 

This works perfectly for what I need, but I want to do a case-insensitive comparison. For example:

File1:

ABCD

File2:

abcd

These two files should not be reported as different. I tried to use the IGNORECASE=1 option, but I am too new to gawk to get it working. I would appreciate it if anyone could help me.

Thanks in advance.

How did you use it?

 
gawk -v IGNORECASE=1 "....bla...bla..."

I tried many ways. For example, this still creates both 1not2.txt and 2not1.txt:

gawk -v IGNORECASE=1 "{if(NR==FNR){A[$0]}else{B[$0]}}END{for(x in A){if(!(x in B))print x>\"1not2.txt\"}for(x in B){if(!(x in A))print x>\"2not1.txt\"}}" x1 x2
type 1not2.txt
ABCD
type 2not1.txt
abcd
 

Moved to windows forum...

The value of IGNORECASE does not affect array subscripting.

You should use something like this:

{
  _0 = tolower($0)
  if (NR == FNR)
    A[_0]
  else
    B[_0]
}
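
To see why the case folding is needed, here is a minimal sketch (not from the original post) showing that array subscripts are looked up as exact strings. It is quoted for a POSIX shell; under cmd.exe the program would need the double-quote escaping used in the commands above. Plain awk is used because tolower() is standard, and gawk accepts the same program.

```shell
result=$(awk 'BEGIN {
    A["ABCD"]            # raw line used as the subscript
    print (("abcd" in A) ? "found" : "missing")
    B[tolower("ABCD")]   # case folded before it becomes the subscript
    print ((tolower("abcd") in B) ? "found" : "missing")
}')
printf '%s\n' "$result"
```

The first lookup misses and the second one hits, because only the second array was filled with lower-cased keys.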

Thank you. I'm afraid it's not working, or I'm doing something wrong. I get this when I try to run it:

gawk " {  _0 = tolower($0)  if (NR == FNR)   A[_0]  else     B[_0]  }END{for(x in A){if(!(x in B))print x>\"1not2.txt\"}for(x in B){if(!(x in A))print x>\"2not1.txt\"}}" x1 x2
 
gawk:  {  _0 = tolower($0)  if (NR == FNR)   A[_0]  else     B[_0]  }END{for(x in A){if(!(x in B))print x>"1not2.txt"}for(x in B){if(!(x in A))print x>"2not1.txt"}}
gawk:                       ^ syntax error
gawk:  {  _0 = tolower($0)  if (NR == FNR)   A[_0]  else     B[_0]  }END{for(x in A){if(!(x in B))print x>"1not2.txt"}for(x in B){if(!(x in A))print x>"2not1.txt"}}
gawk:                                               ^ syntax error
errcount: 2

You may notice that my code was formatted differently: the newlines matter, because awk ends a statement at a newline or a semicolon.
If you want to write it on a single line, you'll need to use semicolons:

_0 = tolower($0);  if (NR == FNR)   A[_0];  else     B[_0] ...

I'd recommend using a script file, instead of supplying the code
on the command line (especially when using Windows ...).
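
As a rough sketch of the script-file approach (the file name compare.awk and the sample contents of x1 and x2 are made up for illustration), the whole program could live in its own file, where no command-line quoting is needed. The here-document below is POSIX shell and is only one way to create the files; on Windows the same program text can be saved with any editor.

```shell
# Save the awk program in a file; compare.awk is a hypothetical name.
cat > compare.awk <<'EOF'
{
    key = tolower($0)       # fold case so ABCD and abcd share one subscript
    if (NR == FNR)
        A[key]              # lines of the first file
    else
        B[key]              # lines of the second file
}
END {
    for (x in A) if (!(x in B)) print x > "1not2.txt"
    for (x in B) if (!(x in A)) print x > "2not1.txt"
}
EOF

# Made-up sample input: ABCD and abcd differ only in case.
printf 'ABCD\nonly1\n' > x1
printf 'abcd\nOnly2\n' > x2

awk -f compare.awk x1 x2    # gawk -f compare.awk x1 x2 works the same way
cat 1not2.txt 2not1.txt
```

Note that the output files hold the lower-cased lines, since the folded key is what gets stored in the arrays.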


I understand now. I had everything in a script, but I was not using the ; separator.

This is the script:

 
 
del 1not2.txt
del 2not1.txt
gawk "{  _0 = tolower($0);  if (NR == FNR)    A[_0];  else     B[_0]  }END{for(x in A){if(!(x in B))print x>\"1not2.txt\"}for(x in B){if(!(x in A))print x>\"2not1.txt\"}}" x1 x2
type 1not2.txt
type 2not1.txt

And it's working perfectly.
Thank you very much, Radoulov!