Gawk on Windows: Joining lines only if 1st field matches

Hi..
i have two files::
file_1::

mOnkey
huMAn

file_2::

Human:hates:banana
i:like:***
Monkey:loves:banana
dogs:kill:cats

desired output::

Monkey:loves:banana
Human:hates:banana

so only when the 1st field matches from both files print it from file_2 ((case-sensitive))

i also would like to know how to match different fields..
file_1::

AA:11
BB:22

file_2::

first:AA:2011
second:BB:2012

desired output::

first:AA:2011:11
second:BB:2012:22

so only if field1 from file_1 matches with field2 from file_2 print whole file_2 joined with field2 from the same line in file_1

i hope u understand my cheesy english : P THANKx in advance

You say it's case sensitive, but the case of your example data doesn't always match. Do you actually mean case insensitive?

hmmm.. no i actually meant to follow file_2..
note:: it would be really great if some1 can explain both awk and join commands

join requires the files to be in the same order, so doesn't sound like what you need.

awk is an entire self-contained programming language for dealing with flat files, with powerful abilities to select columns, use associative arrays, and match regular expressions. In some ways it's even more powerful than perl, or at least more convenient -- columns are built deeply into the language in a way perl's mostly abandoned. But perl is a general-purpose language, and awk is not...

$ awk -F":" 'NR==FNR{A[$1]=$0; next}; $1 in A { print A[$1] }' file2 file1
Monkey:loves:banana
Human:hates:banana

$

@Corona68::
thanks.. but doesn't work with my files... nor with the examples given above..
i get no output at all..
im using gawk with windows.. maybe thats why??
i converted the quotes::

gawk -F":" "NR==FNR{A[$1]=$0; next}; $1 in A { print A[$1] }" 11.txt 22.txt

He used the files in reverse order. You mean, match case insensitive, but print case from file2...?

gawk -F":" "NR==FNR{a[tolower($1)]++;next}tolower($1) in a" 11.txt 22.txt
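
For anyone following along, here is a minimal sketch of that case-insensitive match, written for a POSIX shell (hence single quotes rather than the double quotes cmd.exe needs). The sample files are recreated from the question in a temporary directory, and the array name seen is only illustrative. Note that matching lines come out in file_2's order, not file_1's.

```shell
# Hypothetical sample files, named after the ones in the question.
dir=$(mktemp -d)
printf 'mOnkey\nhuMAn\n' > "$dir/file_1"
printf 'Human:hates:banana\ni:like:***\nMonkey:loves:banana\ndogs:kill:cats\n' > "$dir/file_2"

# Pass 1 (NR==FNR holds only while the first file is being read):
#   remember each lower-cased key from file_1, then skip to the next line.
# Pass 2: a pattern with no action prints the line, so a file_2 line is
#   printed when its lower-cased first field was remembered in pass 1.
out=$(awk -F: 'NR==FNR { seen[tolower($1)] = 1; next }
               tolower($1) in seen' "$dir/file_1" "$dir/file_2")
printf '%s\n' "$out"
```

Because only the comparison is lower-cased, the printed lines keep the case used in file_2.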

edit: I believe gawk, even when compiled for Windows, does not handle DOS line endings well. Try this next to strip them first (the quotes inside the program have to be escaped for cmd.exe):

gawk -F":" "1{sub(/\r$/,\"\")}NR==FNR{a[tolower($1)]++;next}tolower($1) in a" 11.txt 22.txt

@neutronscott::
actually the 1st one worked very well.. THANKS
now please.. could u try the other request

try

awk -F: 'FNR==NR{a[$1]=$2;next}$2 in a{print $0 FS a[$2]}' file_1 file_2

@neutronscott::
i dont know how it works but it does =D.. i guess i'll have to study it
thanks a bunch


@neutronscott::
would u mind explaining the last command briefly??
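
A commented sketch of what that command does, assuming a POSIX shell and the sample data from the question (recreated here in a temporary directory; on Windows the same awk program applies, only the quoting differs):

```shell
# Hypothetical sample files, named after the ones in the question.
dir=$(mktemp -d)
printf 'AA:11\nBB:22\n' > "$dir/file_1"
printf 'first:AA:2011\nsecond:BB:2012\n' > "$dir/file_2"

out=$(awk -F: '
  # First file only: FNR (per-file line number) equals NR (overall line number).
  # Store field 2 of file_1 under the key in field 1, then skip to the next line.
  FNR == NR { a[$1] = $2; next }
  # Second file: when field 2 is a stored key, print the whole line,
  # the field separator (FS is ":"), and the value stored for that key.
  $2 in a   { print $0 FS a[$2] }
' "$dir/file_1" "$dir/file_2")
printf '%s\n' "$out"
```

To match on other columns, change $1 in the first rule (the key taken from file_1) and $2 in the second rule (the field looked up in file_2).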

That would have been nice to know, yes. I've spent a lot of time fighting with gawk in Windows because of CMD's ridiculous quoting problems.

LOL... im used to it..
i just wanted him to explain because i have a lot of similar cases.. and i dont like asking too much

Hi,

I am learning awk and have started looking into the one-liners shown in this forum. I have a question about this one: I expanded the one-liner to understand it better, but it throws an error when I try to run it. Please help me.

bash-3.00# cat nawktest
#!/usr/bin/nawk -f
BEGIN {FS=":"}
if (FNR==NR)
{
a[$1]=$2;
next;
}
for ($2 in a)
{
print $0 FS a[$2];
}
bash-3.00# ./nawktest test.txt test1.txt
/usr/bin/nawk: syntax error at source line 3
 context is
         >>> if <<<  (FNR==NR)
/usr/bin/nawk: bailing out at source line 3
bash-3.00#

@chidori::
where did u get the "if" from?? there isn't any if in the one liner..

@chidori: you need to put an extra opening brace before the if and a closing one at the end.

finally this has been mastered::
this could be useful to some1::

gawk -F: "FNR==NR{a[($1)]=$2;next}$1 in a{print $0 FS a[$1]}"
                      ^ matching field file1             ^ matching field file2
                           ^ the field we want to append from file1
                                   ^ matching field file2(again)

done this with many files.. and its SUPER! =D

@M@LIK, the parentheses are superfluous ( FNR==NR{a[$1]=$2 .. ).

So to be clear, this is gawk running straight on windows? This could never function on Unix since there should be single quotes around the awk statements, otherwise the shell will interpret the variables...

Moving to windows forum

@Scrutinizer::
O.o
it was about awk not windows or dos??
yea.. on windows you have to use double quotes.. unix single quote.. thats a minor thing
and yes its working very well.. u can try it on unix also after replacing the quotes.. the rest is the SAME
-_-

I wouldn't call cmd.exe a "shell" but it is fun putting quotes inside of your quotes. I suggest using files and gawk -f input.awk for any awk which involves a string on Windows. :)
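
A minimal sketch of the program-file approach, assuming a POSIX shell for the demo. The file name join.awk and the CRLF sample data are made up for illustration. On Windows the program file would be written in an editor and run as gawk -f join.awk file_1 file_2, so no awk string ever passes through cmd.exe quoting.

```shell
# Hypothetical sample files with DOS (CRLF) line endings.
dir=$(mktemp -d)
printf 'AA:11\r\nBB:22\r\n' > "$dir/file_1"
printf 'first:AA:2011\r\nsecond:BB:2012\r\n' > "$dir/file_2"

# The quoted heredoc delimiter stops the shell from expanding $1, $2, ...
cat > "$dir/join.awk" <<'EOF'
BEGIN     { FS = ":" }
          { sub(/\r$/, "") }        # strip a trailing carriage return, if any
FNR == NR { a[$1] = $2; next }      # first file: remember field 2 by field 1
$2 in a   { print $0 FS a[$2] }     # second file: append the remembered value
EOF

out=$(awk -f "$dir/join.awk" "$dir/file_1" "$dir/file_2")
printf '%s\n' "$out"
```

Inside the file, the empty string in sub() is written as plain "" with no escaping.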

@Scrutinizer: I do think it should still be in the awk forum, since the original post wasn't pertaining to cmd.exe quoting issues.

@ scrutinizer

have done it but still getting the error..

#!/usr/bin/nawk -f
BEGIN {FS=":"}
{
if (FNR==NR)
{
a[$1]=$2;
next;
}
}

for ($2 in a) {
print $0 FS a[$2];
}

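Try this (the membership test has to be an if; for ($2 in a) is not valid awk):

#!/usr/bin/nawk -f
BEGIN {FS=":"}
{
  if (FNR==NR)
  {
    # first file: store field 2 under the key in field 1
    a[$1]=$2;
    next;
  }
  # second file: print the line plus the stored value when field 2 is a known key
  if ($2 in a) {
    print $0 FS a[$2];
  }
}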

indenting the code helps a lot by the way ...