Wrong result returned from script

Hi Gurus,

I need a script to compare two files. The sample files look like this:
list:

cde,file4
cde,file5
def,file6
def,file7
def,file8
abc,file1
abc,file2
abc,file3
acd,file9
acd,file10

tmp

file1
file2
file3
file4
file5
file6

I want to compare the file names in tmp with the second column of list. If a file name is in both files, print the entire record from list; if a file name exists only in list, print "Not Found" followed by the record. For the two files above, I expect to get the result below:

cde,file4
cde,file5
def,file6
Not Found def,file7
Not Found def,file8
abc,file1
abc,file2
abc,file3
Not Found acd,file9
Not Found acd,file10

I tried the script below, but the output is wrong.

awk -F"," 'NR==FNR{a[$1];next}{if (a[$2])print a[$1],$0;else print "Not Found", $0;}'  tmp list

The output is as below:

Not Found cde,file4
Not Found cde,file5
Not Found def,file6
Not Found def,file7
Not Found def,file8
Not Found abc,file1
Not Found abc,file2
Not Found abc,file3
Not Found acd,file9
Not Found acd,file10

I hope the experts in the forum can take a look and tell me what I did wrong and how to fix it.

Thanks in advance

Replace if (a[$2]) with if ($2 in a)
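
To see why this matters: a[$1] in the first block only creates the key, with an empty value, so testing the value with if (a[$2]) is always false, while ($2 in a) tests whether the key exists. Note also that a[$1] in the print statement is empty, so it is probably best dropped. A minimal sketch, assuming a scratch directory and cut-down copies of the sample files (the directory and the variable names by_value and by_key are made up for this illustration):

```shell
# Recreate a small part of the sample data in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' file4 file6 > "$dir/tmp"
printf '%s\n' cde,file4 cde,file5 def,file6 > "$dir/list"

# a[$1] without an assignment creates the key with an empty value,
# so a test on the value is always false ...
by_value=$(awk -F, 'NR==FNR{a[$1];next} {r = a[$2] ? "hit" : "miss"; print r, $0}' "$dir/tmp" "$dir/list")

# ... while "in" asks whether the key itself exists.
by_key=$(awk -F, 'NR==FNR{a[$1];next} {r = ($2 in a) ? "hit" : "miss"; print r, $0}' "$dir/tmp" "$dir/list")

printf 'value test:\n%s\n\nkey test:\n%s\n' "$by_value" "$by_key"
rm -rf "$dir"
```

The value test reports "miss" for every record, which matches the all "Not Found" output in the question; the key test reports "hit" for file4 and file6 only.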


Correct your code like this

awk -F, 'FNR==NR{_[$1];next}{print ($2 in _)?$0:"Not Found" OFS $0}' tmp list
cde,file4
cde,file5
def,file6
Not Found def,file7
Not Found def,file8
abc,file1
abc,file2
abc,file3
Not Found acd,file9
Not Found acd,file10

Thanks for your reply.

When I use the code below, I get an error:

awk -F"," 'NR==FNR{a[$1];next}{if ($2 in a) print a[$1],$0;else print "Not Found", $0;}'  tmplist srclist
awk: syntax error near line 1
awk: illegal statement near line 1
awk: illegal statement near line 1

My server is SunOS 5.10 Generic_144488-17 sun4v sparc SUNW,SPARC-Enterprise-

Would you please take a look?

Thanks in advance

---------- Post updated at 11:56 AM ---------- Previous update was at 11:52 AM ----------

Thanks for your reply.

When I use the code you provided, I get an error:
awk: syntax error near line 1
awk: illegal statement near line 1
My server is SunOS 5.10 Generic_144488-17 sun4v sparc SUNW,SPARC-Enterprise-

By the way, would you please briefly explain what "_[$1]" and "($2 in _)?" mean?

Thanks in advance

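_ is simply the name of the array (any valid name, such as a, would work). Referencing _[$1] creates an element whose key is the first field of the current line, and ($2 in _) is true when the second field exists as a key in that array.

? is the ternary operator, used here instead of if/else: condition ? value-if-true : value-if-false

For details, see the "?:" article on Wikipedia.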
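
As an illustration of the equivalence, here is a small sketch that runs the same decision both ways on two inline records. The BEGIN block, which seeds the array with "file1", and the variable names with_if and with_ternary are assumptions made for this example only:

```shell
# Same decision written twice: once with if/else, once with the ternary operator.
with_if=$(printf '%s\n' abc,file1 acd,file9 |
    awk -F, 'BEGIN{_["file1"]} {if ($2 in _) print $0; else print "Not Found" OFS $0}')

with_ternary=$(printf '%s\n' abc,file1 acd,file9 |
    awk -F, 'BEGIN{_["file1"]} {print (($2 in _) ? $0 : "Not Found" OFS $0)}')

printf '%s\n' "$with_ternary"
```

Both variables end up holding the same two lines. Wrapping the whole ternary expression in parentheses is a cautious choice that avoids any ambiguity in how print parses a leading parenthesis.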

If you are running this on a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of awk.
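
If the same script has to run on both Solaris and other systems, one option is to pick the awk binary at run time. This is only a sketch: the variable name AWK and the search order are assumptions, and it falls back to plain awk when none of the candidates is installed:

```shell
# Prefer a POSIX-capable awk if one is installed; otherwise keep plain awk.
AWK=awk
for candidate in /usr/xpg4/bin/awk /usr/xpg6/bin/awk nawk; do
    if command -v "$candidate" >/dev/null 2>&1; then
        AWK=$candidate
        break
    fi
done

# Use "$AWK" wherever the script would otherwise call awk.
check=$("$AWK" 'BEGIN { a["k"]; if ("k" in a) print "ok" }')
printf '%s uses: %s\n' "$check" "$AWK"
```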

---------- Post updated at 11:49 AM ---------- Previous update was at 11:43 AM ----------

It was supposed to be:

awk -F","  'NR==FNR{a[$1];next}{if ($2 in a){print $0}else {print "Not Found", $0}}' tmplist srclist

---------- Post updated at 01:30 PM ---------- Previous update was at 01:18 PM ----------

Hi Gurus,

I have two files like the ones below:
srclist:

cde,file4
cde,file5
def,file6
def,file7
def,file8
abc,file1
abc,file2
abc,file3
acd,file9
acd,file10

tmplist

Oct,9,11:42,file1
Oct,9,11:42,file2
Oct,9,11:43,file3
Oct,9,11:42,file4

I need to find the file names that exist in srclist but not in tmplist.
I use the code below:

nawk -F"," 'NR==FNR{a[$4]=$0;next}{if ($2 in a) print $0,"," "Found";else print $0,"," "Not Found";}'  tmplist srclist

and I got the result below:

cde,file4  ,Found
cde,file5 ,Not Found
def,file6 ,Not Found
def,file7 ,Not Found
def,file8 ,Not Found
abc,file1  ,Found
abc,file2  ,Found
abc,file3  ,Found
acd,file9 ,Not Found
acd,file10 ,Not Found

I have one issue: I also need to print the first 3 columns from tmplist. I tried the code below, but it doesn't work.

nawk -F"," 'NR==FNR{a[$4]=$0;next}{if ($2 in a) print $0,a[$1],"," "Found";else print $0,"," "Not Found";}'  tmplist srclist

As I understand it, a[$1] should be the first column of the first file, but it isn't.
How can I print the first file's columns in my code?

Any input is really appreciated.

Thanks in advance

Is this what you are looking for?

$ nawk -F"," 'NR==FNR{a[$4]=$3;next}{if ($2 in a) print $0","a[$2] ",""Found";else print $0",""----""," "Not Found";}' tmplist srclist
cde,file4,11:42,Found
cde,file5,----,Not Found
def,file6,----,Not Found
def,file7,----,Not Found
def,file8,----,Not Found
abc,file1,11:42,Found
abc,file2,11:42,Found
abc,file3,11:43,Found
acd,file9,----,Not Found
acd,file10,----,Not Found

Thanks for your reply.
The result is pretty close to what I expected. I need to get the date portion as well:

cde,file4,Oct,9,11:42,Found

One more question: what does a[$2] mean in the command below?
print $0","a[$2] ",""Found"

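a[$2] is whatever value you stored in the first block under the key that matches $2 of the second file. In your own code,
'NR==FNR{a[$4]=$0;next}', that value is the whole tmplist row ($0), and it is looked up with $2 of the second file.

So to get the date as well, use this: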

awk -F"," 'NR==FNR{a[$4]=$0;next}{if ($2 in a) print $0","a[$2] ",""Found";else print $0",""---------""," "Not Found";}' tmplist srclist
cde,file4,Oct,9,11:42,file4,Found
cde,file5,---------,Not Found
def,file6,---------,Not Found
def,file7,---------,Not Found
def,file8,---------,Not Found
abc,file1,Oct,9,11:42,file1,Found
abc,file2,Oct,9,11:42,file2,Found
abc,file3,Oct,9,11:43,file3,Found
acd,file9,---------,Not Found
acd,file10,---------,Not Found
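
To make the lookup concrete, here is a small sketch with one row per file. It shows why a[$1] comes back empty in the second block: the array is keyed by the tmplist file name, so only $2 of srclist can match a key. The scratch directory and the names by_first and by_second are made up for this illustration:

```shell
dir=$(mktemp -d)
printf '%s\n' 'Oct,9,11:42,file4' > "$dir/tmplist"
printf '%s\n' 'cde,file4' > "$dir/srclist"

lookup=$(awk -F, '
    NR==FNR { a[$4] = $0; next }    # key: tmplist file name, value: whole tmplist row
    {
        by_first  = ($1 in a) ? a[$1] : "(no such key)"   # key "cde" was never stored
        by_second = ($2 in a) ? a[$2] : "(no such key)"   # key "file4" was stored
        print "a[$1] -> " by_first
        print "a[$2] -> " by_second
    }' "$dir/tmplist" "$dir/srclist")

printf '%s\n' "$lookup"
rm -rf "$dir"
```

In the second block, $1 and $2 always refer to the current srclist record; the fields of tmplist are only reachable through whatever was saved in the array while tmplist was being read.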

Thank you very much.
There is one extra column in the result:

abc,file1,Oct,9,11:42,file1,Found

I only need the date and time, as below:

abc,file1,Oct,9,11:42,Found

Thanks in advance. You are great.

Use this code:

$ awk -F"," 'NR==FNR{a[$4]=$0;next}{print ($2 in a)?$0 FS substr(a[$2],1,length(a[$2])-length($2)-1) FS "Found" :$0 FS "Not Found"}'  tmplist srclist
cde,file4,Oct,9,11:42,Found
cde,file5,Not Found
def,file6,Not Found
def,file7,Not Found
def,file8,Not Found
abc,file1,Oct,9,11:42,Found
abc,file2,Oct,9,11:42,Found
abc,file3,Oct,9,11:43,Found
acd,file9,Not Found
acd,file10,Not Found
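
The length arithmetic in that substr call can be checked on a single row. The sketch below hard-codes one stored row and one file name (the names row, name and keep are illustrative). It starts the substring at position 1 because awk counts characters from 1; with a start of 0, some implementations (gawk, for one) return one character fewer:

```shell
trimmed=$(awk 'BEGIN {
    row  = "Oct,9,11:42,file4"              # the value stored as a[$2]
    name = "file4"                          # $2 of the matching srclist record
    keep = length(row) - length(name) - 1   # 17 - 5 - 1 = 11: drop the name and the comma before it
    print substr(row, 1, keep)
}')
printf '%s\n' "$trimmed"
```

This prints Oct,9,11:42, which is the stored row without the trailing ",file4".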

This may be what you want:

awk -F"," 'NR==FNR {a[$4]=$1FS$2FS$3;next}{printf "%s,", $0; if (a[$2]) printf "%s ", a[$2]; else printf "not "; print "found"}'  tmplist srclist
cde,file4,Oct,9,11:42 found
cde,file5,not found
def,file6,not found
def,file7,not found
def,file8,not found
abc,file1,Oct,9,11:42 found
abc,file2,Oct,9,11:42 found
abc,file3,Oct,9,11:43 found
acd,file9,not found
acd,file10,not found

EDIT: or even

awk -F"," 'NR==FNR {a[$4]=$1FS$2FS$3;next}{printf "%s,%s %s\n", $0, a[$2]?a[$2]:"not", "found"}'  tmplist srclist

Thanks to both of you for such valuable input.
One more question:
I need to get the groups in which all files are found. A group is identified by the first column (cde, abc). From the result above, I want to get the following:

abc,file1,Oct,9,11:42 found
abc,file2,Oct,9,11:42 found
abc,file3,Oct,9,11:43 found

For group "cde", there is one file missing, so I don't want it in the output.

Thanks in advance.
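
One possible approach, offered only as a sketch and not as an answer given in this thread: read srclist twice, first to flag every group that has a missing file, then to print the records of the groups that were never flagged. The array names seen and missing, the pass counter, and the scratch directory are all assumptions for this example, and counting passes with FNR == 1 assumes none of the input files is empty:

```shell
dir=$(mktemp -d)
printf '%s\n' Oct,9,11:42,file1 Oct,9,11:42,file2 Oct,9,11:43,file3 Oct,9,11:42,file4 > "$dir/tmplist"
printf '%s\n' cde,file4 cde,file5 def,file6 abc,file1 abc,file2 abc,file3 acd,file9 > "$dir/srclist"

complete=$(awk -F, '
    FNR == 1 { pass++ }                                   # a new input file starts
    pass == 1 { seen[$4] = $1 FS $2 FS $3; next }         # tmplist: date and time per file name
    pass == 2 { if (!($2 in seen)) missing[$1]; next }    # 1st read of srclist: flag incomplete groups
    !($1 in missing) { print $0 FS seen[$2] " found" }    # 2nd read: print complete groups only
' "$dir/tmplist" "$dir/srclist" "$dir/srclist")

printf '%s\n' "$complete"
rm -rf "$dir"
```

With this sample data only the three abc records are printed, because cde, def and acd each have at least one file that is not in tmplist.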

---------- Post updated at 05:18 PM ---------- Previous update was at 04:08 PM ----------

I found the code below in this forum, and I think it will help me achieve my goal. I have some questions about it:

awk -F , '! a[$1 FS $2] {b[$1]++;a[$1 FS $2]++}END {for (i in b) print i","b[i]}' infile
  1. What does "! a[$1 FS $2]" mean?
  2. If I want to print out the entire record, what should I change?

It currently prints:

abc,1
cde,2

I want to get:

abc,Oct,9,11:42, 1
cde,Oct,9,11:42, 2

Thanks in advance.
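
On the first question: "! a[$1 FS $2]" is a pattern that is true only when the group,file pair has not been seen yet, so the action runs once per distinct pair. b[$1] therefore counts distinct files per group, and the END block has to print b[i], the count for each group. A small sketch with made-up inline records (the data and the variable name counts are assumptions, and sort is added only because for (i in b) visits keys in no fixed order):

```shell
counts=$(printf '%s\n' abc,file1 abc,file1 cde,file4 cde,file5 |
    awk -F, '
        !a[$1 FS $2] {        # true only the first time this group,file pair appears
            b[$1]++           # count distinct files per group
            a[$1 FS $2]++     # mark the pair as seen
        }
        END { for (i in b) print i "," b[i] }' | sort)
printf '%s\n' "$counts"
```

The duplicate abc,file1 record is counted once, so the result is abc,1 and cde,2.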