awk - compare records of 1 file with 3 files

Abhiraj_Singh · February 18, 2014, 8:36am

Hi. I want to compare the records in one file with those in three other files, and print the records of file1 that are not present in any of the other files. For example:

file1     file2      file3      file4
1         1           5          7
2         2           6          9
3
4
5
6
7
8
9

The output should contain:

3
4
8

nawk 'NR==FNR{a[$0]++;next} !a[$0] or !a[$0] or !a[$0]' file1 file2 file3 file4

I am using this code, but I don't think it is correct. The problem arose when the file I wanted to compare against was huge and nawk ran out of space, giving an "out of space in tostring" error. So I have now split that file into 3 parts and am comparing against those. Also, can we use different names for the arrays in this script to avoid a similar situation? I am using a Solaris system.

Yoda · February 18, 2014, 9:29am

I noticed that the logic you used is wrong. You should compare the other way around:

awk 'FILENAME!="file1"{A[$1];next}!($1 in A)' file2 file3 file4 file1
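To try this on the sample data from the first post, here is a minimal sketch. The scratch directory and the `result` variable are only illustrative, and on Solaris the command would presumably be run as nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk:

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 1 2 3 4 5 6 7 8 9 > file1
printf '%s\n' 1 2 > file2
printf '%s\n' 5 6 > file3
printf '%s\n' 7 9 > file4

# Load the first field of every line of file2..file4 as a key of A,
# then print the lines of file1 whose first field is not a key of A.
# file1 has to be listed last so that A is complete before it is read.
result=$(awk 'FILENAME!="file1"{A[$1];next}!($1 in A)' file2 file3 file4 file1)
printf '%s\n' "$result"
```

With the sample data this prints 3, 4 and 8, one per line.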

Abhiraj_Singh · February 18, 2014, 11:16am

Thanks, Yoda. Just a query: will the array keep growing across all the files, or will it start from the beginning when the second file is searched?

Akshay_Hegde · February 18, 2014, 11:31am

!a[$0] or !a[$0] or !a[$0] --> if you use this, the result will be unexpected, and it uses more memory as well. I hope you are reading the answers; I answered you on the same topic yesterday, here:

Nawk Problem - nawk out of space in tostring on Akshay Hegde - Shell Programming and Scripting - Unix Linux Forums
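On the memory point: in awk, merely referencing an element such as a[$0] creates it, while the `in` operator only tests membership. A small sketch of the difference, where the array names a and b and the counters are just illustrative:

```shell
counts=$(awk 'BEGIN {
  t = !a["x"]        # referencing a["x"] creates an empty element in a
  u = !("y" in b)    # the in operator tests membership and adds nothing to b
  for (k in a) na++
  for (k in b) nb++
  print na + 0, nb + 0
}')
printf '%s\n' "$counts"
```

This prints "1 0": the lookup through a[...] left one element behind, the `in` test left none. On a huge input, every failed !a[$0] test adds one more element to the array.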

Well, in Yoda's solution the array is filled only while file2, file3 and file4 are read, so it keeps growing across those three files. For file1, which is read last, awk just checks whether each record's index is present in the array and adds nothing to it.
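To watch how the array behaves across the files, one option is to print the number of keys at each file boundary. This is only a sketch on the sample data from the first post; the counter n and the variable prev are hypothetical additions to Yoda's program:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 1 2 3 4 5 6 7 8 9 > file1
printf '%s\n' 1 2 > file2
printf '%s\n' 5 6 > file3
printf '%s\n' 7 9 > file4

sizes=$(awk '
  # At the first line of each new file, report the size after the previous one.
  FNR == 1 && NR > 1 { print prev ": " n " keys" }
  { prev = FILENAME }
  # Same loading rule as before, plus a count of distinct keys.
  FILENAME != "file1" { if (!($1 in A)) n++; A[$1]; next }
  END { print prev ": " n " keys" }
' file2 file3 file4 file1)
printf '%s\n' "$sizes"
```

On the sample data the count goes 2, 4, 6 after file2, file3 and file4, and is still 6 after file1, so the array never restarts between files and is not touched while file1 is checked.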