Comparing 2 files and returning the unique lines in the first file

Hi,

I have 2 files

file1
********
01-05-09|java.xls|
02-05-08|c.txt|
08-01-09|perl.txt|
01-01-09|oracle.txt|
********

file2
********
01-02-09|windows.xls|
02-05-08|c.txt|
01-05-09|java.xls|
08-02-09|perl.txt|
01-01-09|oracle.txt|
********

I have to compare these 2 files and return the lines of file1 that do not appear in file2.

Output Expected
*****
08-01-09|perl.txt|
*****

I have tried the diff and comm commands, but neither gives exactly this output.

Thanks

awk 'NR==FNR{a[$0];next}!($0 in a)' file2 file1

Regards
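
For what it's worth, comm can also produce this, as long as both inputs are sorted first (comm expects sorted input, which may be why it did not seem to work). A minimal sketch, assuming the sample data from the first post and that the order of the output lines does not matter:

```shell
# Recreate the sample files from the first post in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' '01-05-09|java.xls|' '02-05-08|c.txt|' \
    '08-01-09|perl.txt|' '01-01-09|oracle.txt|' > "$tmpdir/file1"
printf '%s\n' '01-02-09|windows.xls|' '02-05-08|c.txt|' '01-05-09|java.xls|' \
    '08-02-09|perl.txt|' '01-01-09|oracle.txt|' > "$tmpdir/file2"

# comm compares line by line and expects both inputs to be sorted.
sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# -2 drops lines found only in file2, -3 drops lines common to both,
# which leaves the lines found only in file1.
only_in_file1=$(comm -23 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$only_in_file1"
```

Unlike the awk command, this prints the lines in sorted order rather than in the order they have in file1.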

Neither 08-02-09|perl.txt| nor 01-02-09|windows.xls| is in file1, so why don't you want those two as output as well?

Or do you want to find lines in file1 where the 2nd field matches one in file2 but the rest of the line does not? In that case you could do something like:

  sort file2 >file2.sorted
  # 1. Find all lines in file1 that contain a second field from file2
  awk -F\| '{ print $2 }' file2 | sort -u |
  grep -F -f - file1 |
  # 2. Sort them (comm needs sorted input), then print the lines not in file2
  sort |
  comm -1 -3 file2.sorted -

Hi franklin,

It's giving a syntax error.

awk: syntax error near line 1
awk: bailing out near line 1

Thanks.

Use nawk, gawk or /usr/xpg4/bin/awk on Solaris.

Regards
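
If the same script has to run on Solaris and on other systems, one option is to pick the awk at run time. A rough sketch, assuming a POSIX shell that provides command -v; the variable name AWK is made up for this example:

```shell
# Try the awk variants named above, most specific first, and fall back to
# plain awk. AWK is just a variable name chosen for this sketch.
AWK=
for candidate in /usr/xpg4/bin/awk nawk gawk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
        AWK=$candidate
        break
    fi
done

# Quick self-check: the "in" test must parse with the chosen awk.
# The array a is empty here, so the input line should be printed back.
check=$(printf 'x\n' | "$AWK" '!($0 in a)')
printf 'using %s, check printed: %s\n' "$AWK" "$check"
```

The comparison can then be run as "$AWK" 'NR==FNR{a[$0];next}!($0 in a)' file2 file1.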

hi franklin, care to explain what the awk statement means? thanks

NR==FNR

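True only while the first file on the command line (file2) is being read, because only then does the overall line number NR equal the per-file line number FNR.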

{a[$0];next}

Create an element in the array a with the whole line ($0) as its index, then skip to the next line.

Action for the second file:

!($0 in a)

Print the line if $0 is not an index in a, i.e. if the line does not occur in file2.

Regards
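
The same one-liner, spread over several lines with comments, in case that is easier to follow. This is only a sketch: the array is renamed to seen for readability, and the sample files from the first post are recreated in a scratch directory.

```shell
# Recreate the sample files from the first post in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' '01-05-09|java.xls|' '02-05-08|c.txt|' \
    '08-01-09|perl.txt|' '01-01-09|oracle.txt|' > "$tmpdir/file1"
printf '%s\n' '01-02-09|windows.xls|' '02-05-08|c.txt|' '01-05-09|java.xls|' \
    '08-02-09|perl.txt|' '01-01-09|oracle.txt|' > "$tmpdir/file2"

result=$(awk '
    # NR counts lines across all files, FNR restarts at 1 for each file,
    # so NR == FNR holds only while the first file named (file2) is read.
    NR == FNR {
        seen[$0]    # referencing the element is enough to create the index
        next        # skip the remaining rules for this line
    }
    # Reached only for the second file named (file1).
    # A pattern without an action prints the line when the pattern is true.
    !($0 in seen)
' "$tmpdir/file2" "$tmpdir/file1")

printf '%s\n' "$result"
```

One thing to be aware of: if file2 is empty, NR == FNR is also true while file1 is read, so nothing is printed at all.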

that makes sense.. thanks

I have the same kind of problem. I have 2 files, time.out and finaltime.out.
finaltime.out

.
.
.
20090124,00:02:26,00:04:19,86752,00:01:37,00:02:50,01:07:52,150818, ,
20090125,00:02:31,00:00:11,2776,00:01:38,00:01:31,00:56:36,108938, ,
20090126,00:02:33,00:00:35,8187,00:01:42,00:01:32,02:02:08,321055, ,
20090127,00:02:33,00:03:04,62153,00:01:39,00:02:25,01:39:55,266355, ,

time.out

20090125,00:02:36,00:04:25,61615,00:01:45,00:02:24,01:34:25,257607, ,

I want to compare time.out with finaltime.out: if column 1 is the same in both files, the line in finaltime.out has to be replaced with the line from time.out. If the line from time.out is not present in finaltime.out, it has to be appended as a new line at the end of finaltime.out. Can someone tell me how to do this?

Thanks....
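
One way this might be done with the same NR==FNR idiom from earlier in the thread. This is a sketch, not a tested solution for the real files: merge_times is a made-up helper name, it assumes comma-separated lines with the key in column 1, a non-empty time.out, and a nawk/gawk/POSIX awk (see the Solaris note above). Only the four lines shown in the post are used as sample data.

```shell
# merge_times is a hypothetical helper name, not an existing command.
# Arguments: file with the new rows (time.out), file to update (finaltime.out).
# It writes the merged result to standard output.
merge_times() {
    awk -F, '
        # First file (time.out): remember each line under its column 1 key.
        NR == FNR { new[$1] = $0; next }
        # Second file: if the key has a new line, print that one instead.
        $1 in new { print new[$1]; used[$1] = 1; next }
        # Otherwise keep the existing line unchanged.
        { print }
        # Append the new lines whose key never showed up in the second file.
        # The order of a for-in loop is unspecified in awk.
        END { for (k in new) if (!(k in used)) print new[k] }
    ' "$1" "$2"
}

# Sample data taken from the post, in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/finaltime.out" <<'EOF'
20090124,00:02:26,00:04:19,86752,00:01:37,00:02:50,01:07:52,150818, ,
20090125,00:02:31,00:00:11,2776,00:01:38,00:01:31,00:56:36,108938, ,
20090126,00:02:33,00:00:35,8187,00:01:42,00:01:32,02:02:08,321055, ,
20090127,00:02:33,00:03:04,62153,00:01:39,00:02:25,01:39:55,266355, ,
EOF
cat > "$tmpdir/time.out" <<'EOF'
20090125,00:02:36,00:04:25,61615,00:01:45,00:02:24,01:34:25,257607, ,
EOF

# Merge into a variable first, then overwrite finaltime.out with the result.
merged=$(merge_times "$tmpdir/time.out" "$tmpdir/finaltime.out")
printf '%s\n' "$merged" > "$tmpdir/finaltime.out"
cat "$tmpdir/finaltime.out"
```

With this sample the 20090125 line is replaced in place. A date that is not yet in finaltime.out would be added at the end by the END block.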