awk for comparing two files

Jaymz · October 17, 2011, 3:25pm

So I have file1 like this:

joe 123
jane 456

and then file2 like this:

123 left right
456 up down
joe ding dong
jane flip flop

What I need to do is look up both col1 and col2 of file1 in col1 of file2 and generate a new file that has lines like this:

joe 123 ding dong left right
jane 456 flip flop up down

What I have so far is this:

# this grabs the 'left' value
awk 'NR==FNR{a[$1]=$2;next} {print $1,$2,a[$2]}' file2 file1 >> file3
# this grabs the 'right' value
awk 'NR==FNR{a[$1]=$3;next} {print $1,$2,$3,a[$2]}' file2 file3 >> file4

which generates this line:

joe 123 left right

But I am stuck on grabbing joe's 'ding dong'. Also, am I doing this right with multiple awk commands, or can it be made more compact?

radoulov · October 17, 2011, 4:03pm

awk 'END {
  for (i = 0; ++i <= FNR;) 
    print r[i]
  }
NR == FNR {
  f[$1] = substr($0, index($0, $2)) 
  next
  }
{ 
  for (i = 0; ++i <= NF;)
    $i in f && r[FNR] = (r[FNR] ? r[FNR] : $0) \
      FS f[$i]
  }' file2 file1
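
A more spelled-out sketch of the same lookup idea, run against the sample data from the first post. The scratch directory and the names `rest`, `key` and `line` are made up for illustration, and plain `if` statements stand in for the `&&` shorthand. Two assumptions differ from the script above: the key is stripped with `sub()` rather than located with `index()` (which finds the first occurrence of `$2`, even if that text also appears inside `$1`), and every line of file1 is printed, matched or not.

```shell
#!/bin/sh
# Sample data from the first post, written to a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'joe 123' 'jane 456' > "$dir/file1"
printf '%s\n' '123 left right' '456 up down' \
              'joe ding dong' 'jane flip flop' > "$dir/file2"

awk '
# First file (file2): store everything after the key, indexed by the key.
NR == FNR {
  key = $1
  sub(/^[^ \t]+[ \t]+/, "")   # drop the key and the blanks after it from $0
  rest[key] = $0
  next
}
# Second file (file1): append the stored text for every field that is a key.
{
  line = $0
  for (i = 1; i <= NF; i++)
    if ($i in rest)
      line = line FS rest[$i]
  print line
}' "$dir/file2" "$dir/file1" > "$dir/file3"

cat "$dir/file3"
# joe 123 ding dong left right
# jane 456 flip flop up down
```

Because the fields of file1 are scanned left to right, the text looked up for col1 ('ding dong') lands before the text for col2 ('left right'), which matches the requested layout.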

Jaymz · October 17, 2011, 4:33pm

Thanks. Sorry for not putting the outputs in code tags as well.
Where do I place an IF check to skip lines in file1 where both columns are the same, like this:

joe joe (skip)
joe 123 (keep)
123 123 (skip)

radoulov · October 17, 2011, 4:38pm

awk 'END {
  for (i = 0; ++i <= FNR;) 
    if (i in r) print r[i]
  }
NR == FNR {
  f[$1] = substr($0, index($0, $2)) 
  next
  }
$1 != $2 { 
  for (i = 0; ++i <= NF;)
    $i in f && r[FNR] = (r[FNR] ? r[FNR] : $0) \
      FS f[$i]
  }' file2 file1
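
To see the `$1 != $2` guard on its own: a pattern placed in front of a block (or used with no block at all) decides which lines get processed, so no separate IF is needed. A quick check with the three sample lines, piped in here instead of being read from file1 (the variable name `kept` is just for illustration):

```shell
#!/bin/sh
# A pattern with no action block prints every line for which it is true.
kept=$(printf '%s\n' 'joe joe' 'joe 123' '123 123' | awk '$1 != $2')
echo "$kept"
# joe 123
```

The comparison is done on the fields as awk reads them, so `joe joe` and `123 123` are both dropped and only `joe 123` gets through to the rest of the script.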

Jaymz · October 17, 2011, 4:51pm

I think I'm starting to love awk! It's awesome.

rdcwayx · October 17, 2011, 8:25pm

Another way.

awk 'NR==FNR{c[$1 FS $2];next} {a[$1]=a[$1] FS $2 FS $3}
    END{for ( i in c) {split(i,x,FS);if (x[1]!=x[2]) print i,a[x[1]],a[x[2]]}}' file1 file2

Jaymz · October 23, 2011, 2:51am

Thanks, rdcwayx, both solutions worked fine.

Now I want to loop through both files and, for each row, check whether file1's 3rd column equals file2's 4th column. If it does, the 5th column from file1 should be appended to the end of every row it matched in file2. Example:

file1:
john doe 123 left right

file2:
address1 phone1 email1 123
address2 phone2 email2 123

output:
address1 phone1 email1 123 right
address2 phone2 email2 123 right

ctsgnb · October 23, 2011, 7:27am

Out of respect for the people who are helping you, please give all your requirements at once, so they can provide an accurate answer from the beginning.

Jaymz · October 23, 2011, 12:03pm

ctsgnb, as you can see from the dates, the first issue was a week ago, and this new thing I have to do is a different one, even though it may seem similar to you.

Anyone, please?

ctsgnb · October 23, 2011, 1:27pm

nawk 'NR==FNR{A[$3]=$5;next}{print $0 (($4 in A)?FS A[$4]:z)}' file1 file2

$ cat f1
john doe 123 left right
$ cat f2
address1 phone1 email1 123
address2 phone2 email2 123
$ nawk 'NR==FNR{A[$3]=$5;next}{print $0 (($4 in A)?FS A[$4]:z)}' f1 f2
address1 phone1 email1 123 right
address2 phone2 email2 123 right

rdcwayx · October 23, 2011, 7:31pm

awk 'NR==FNR{a[$3]=$5;next}{print $0,a[$4]}' file1 file2
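
A self-contained sketch of this second lookup, for anyone who wants to try it on the sample data. The scratch directory, the array name `last` and the third row of file2 (key 999, which has no partner in file1) are additions for illustration only. The extra row shows what happens to a non-matching line when the column is appended only on a match, so the line comes out unchanged with no trailing separator.

```shell
#!/bin/sh
# Sample data from the question, plus one hypothetical non-matching row (999).
dir=$(mktemp -d)
printf '%s\n' 'john doe 123 left right' > "$dir/file1"
printf '%s\n' 'address1 phone1 email1 123' \
              'address2 phone2 email2 123' \
              'address3 phone3 email3 999' > "$dir/file2"

awk '
NR == FNR  { last[$3] = $5; next }     # file1: map column 3 to column 5
$4 in last { $0 = $0 FS last[$4] }     # file2: append only when column 4 is known
{ print }
' "$dir/file1" "$dir/file2" > "$dir/out"

cat "$dir/out"
# address1 phone1 email1 123 right
# address2 phone2 email2 123 right
# address3 phone3 email3 999
```

Testing with `in` before reading the array also keeps awk from creating an empty entry for every unmatched key.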

Jaymz · October 23, 2011, 10:40pm

Thanks to both of you.