Comparison of two files in awk

Hi,
I have two files, file1 and file2, delimited by semicolons,
and I want to compare column 2 and column 3 of file1 to column 3 and column 4 of file2.

file1
--------
abc;cef;155.67;143_34;
def;fgh;146.55;123.3;
frg;hff;134.67;;
yyy;fgh;134.78;35_45;

file 2
---------
abc;cef;155.09;;
abc;cef;155.67;143_34;
asd;;;123;
def;fgh;145.6;123.3;
def;fgh;146.55;123.3;
frg;hff;134.67;;

Successfile1
------------
abc;cef;155.67;143_34;
def;fgh;146.55;123.3;

Failfile1
-----------
frg;hff;134.67;;
yyy;fgh;134.78;35_45;

Can anyone help me with a script?

Hi Jerome,

First of all, what I see is that col2 of file1 is text and col3 of file2 is a number,
so how are you going to compare them?

Still, you can do something along these lines:

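#!/usr/bin/ksh

# col2 of file1 against col3 of file2 (">" so reruns start from empty files)
cut -d";" -f2 file1 > tmpf1.txt
cut -d";" -f3 file2 > tmpf2.txt

diff tmpf1.txt tmpf2.txt
echo

# col3 of file1 against col4 of file2, in separate temp files
cut -d";" -f3 file1 > tmpf3.txt
cut -d";" -f4 file2 > tmpf4.txt

diff tmpf3.txt tmpf4.txt

rm tmpf[0-9].txt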

:smiley:

Sorry girish,

I gave the column info wrongly.
It's a comparison of col3 and col4 of file1 with col3 and col4 of file2.

Perhaps this is what you want, but I'm not sure if I've understood you :slight_smile:

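#!/bin/bash

# One "col3;col4" word per record (assumes the fields contain no whitespace)
comp1=($(cut -d\; -f 3,4 text1.txt))
comp2=($(cut -d\; -f 3,4 text2.txt))

for str in "${comp1[@]}"; do
   i=0
   while (( i < ${#comp2[@]} )); do
      # Compare against element i; a bare ${comp2} is always element 0
      if [[ $str == "${comp2[$i]}" ]]; then
         grep -F -- "$str" text1.txt
      fi
      (( i += 1 ))
   done
done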

Regards.
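
Since the thread title asks for awk, the same col3/col4 lookup can be sketched with awk's two-file idiom: read file2 first and remember each col3;col4 pair, then print the file1 records whose pair was remembered. This is only a sketch, not a definitive answer; the scratch directory and the inlined copies of the sample data from the first post are there just to make it self-contained, and the `seen` and `matched` names are arbitrary.

```shell
# Work in a scratch directory with the sample data from the first post
dir=$(mktemp -d) && cd "$dir" || exit 1
cat > file1 <<'EOF'
abc;cef;155.67;143_34;
def;fgh;146.55;123.3;
frg;hff;134.67;;
yyy;fgh;134.78;35_45;
EOF
cat > file2 <<'EOF'
abc;cef;155.09;;
abc;cef;155.67;143_34;
asd;;;123;
def;fgh;145.6;123.3;
def;fgh;146.55;123.3;
frg;hff;134.67;;
EOF

# NR==FNR is true only while awk reads the first file named (file2):
# store its "col3;col4" pairs, then print file1 records whose pair was stored.
matched=$(awk -F';' 'NR==FNR { seen[$3 FS $4]; next } ($3 FS $4) in seen' file2 file1)
printf '%s\n' "$matched"
```

On the sample data this prints the abc, def and frg records. The expected output in the first post lists frg as a failure even though file2 contains the same col3/col4 pair; if an empty col4 should never count as a match, that rule would have to be added, and it is not assumed here.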

Hi Grial,
Thanks for your quick response.

The script works when comparing two columns, i.e. col3 and col4 of the two files.

If I try to compare only col3 of the two files,
I get redundant records.

E.g.:
My file1 consists of 100 records and
file2 consists of 238 records. If I compare file1 and file2, I get 116 records as output
on the console. Can you suggest how to rectify this?

Again, I don't know if I've understood. Do you mean you could have duplicate records in file2? Or do you want only the first occurrence? If this is the case, try:

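#!/bin/bash

# One "col3;col4" word per record (assumes the fields contain no whitespace)
comp1=($(cut -d\; -f 3,4 text1.txt))
comp2=($(cut -d\; -f 3,4 text2.txt))

for str in "${comp1[@]}"; do
   i=0
   while (( i < ${#comp2[@]} )); do
      # Compare against element i; a bare ${comp2} is always element 0
      if [[ $str == "${comp2[$i]}" ]]; then
         grep -F -- "$str" text1.txt
         break   # stop after the first match in text2.txt
      fi
      (( i += 1 ))
   done
done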

Hi Grial,
Thanks again for your kind response. Let me explain more clearly.
I compared col3 of file1 with col3 of file2.
With the latest script you sent, I got duplicates of file1 records.
One more thing: neither file will contain duplicate records.
I just want to check a column (or columns) of file1 against file2.
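
One plausible cause of the extra records, given how the script prints its matches: the grep on text1.txt prints every line that contains the value, and it runs once per file1 record, so two records that share a col3 value are each printed twice. The sketch below reproduces that with hypothetical data (the three file1 records and the single file2 record are invented for illustration) and contrasts it with an awk lookup on the col3 field, which prints each file1 record at most once.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
# Hypothetical data: the first two file1 records share the col3 value 134.67
printf '%s\n' 'a;b;134.67;x;' 'c;d;134.67;y;' 'e;f;99;z;' > file1
printf '%s\n' 'q;r;134.67;w;' > file2

# grep-style lookup: for every col3 of file1 found in col3 of file2,
# print all file1 lines containing that value
loop_out=$(cut -d';' -f3 file1 | while IFS= read -r key; do
   if cut -d';' -f3 file2 | grep -qxF -- "$key"; then
      grep -F -- "$key" file1
   fi
done)

# awk lookup on the field itself: one decision per file1 record
awk_out=$(awk -F';' 'NR==FNR { seen[$3]; next } $3 in seen' file2 file1)

loop_count=$(printf '%s\n' "$loop_out" | wc -l | tr -d ' ')
awk_count=$(printf '%s\n' "$awk_out" | wc -l | tr -d ' ')
echo "grep loop: $loop_count lines, awk: $awk_count lines"
```

With this data the grep loop prints 4 lines for 2 matching records, while awk prints 2.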

Hmm... it's still not clear to me. Let me see if I understand now.

  • You want to check one-to-one, or
  • two-to-two?

Please give me another example to make it clear :slight_smile:

Yes.
It's a one-to-one mapping between the files.

file1
-------
a;a;c;
d;f;g;
3;7;8;

file2
------
4;7;8;
3;4;7
a;a;c;
d;f;g;

success file1
-----------
a;a;c;
d;f;g;

fail file1
--------
3;7;8;

I want to get the success and fail records of file1 in different files.

I don't need any information from file2. (You can treat it as a mapping file.)

OK.
Is it always col3-to-col3, or col3-to-colX, or colX-to-colY?
In your example you are comparing whole lines with whole lines...
Hmm:

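#!/bin/bash

# Start with empty result files
: > success.txt
: > fail.txt
# One array element per line (assumes the lines contain no whitespace)
comp1=($(cat text1.txt))
comp2=($(cat text2.txt))

for str in "${comp1[@]}"; do
   i=0
   FOUND=no
   while (( i < ${#comp2[@]} )); do
      # Compare against element i; a bare ${comp2} is always element 0
      if [[ $str == "${comp2[$i]}" ]]; then
         # Write the record itself; grep would print every line containing it
         printf '%s\n' "$str" >> success.txt
         FOUND=yes
         break
      fi
      (( i += 1 ))
   done
   if [[ $FOUND == no ]]; then
      printf '%s\n' "$str" >> fail.txt
   fi
done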

This compares whole lines.
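
The whole-line comparison above can also be sketched as a single awk pass, which avoids the nested loop. The sketch assumes the one-to-one example data posted above (inlined here so it runs on its own) and reuses the success.txt and fail.txt names; the `seen` array name is arbitrary.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
# Example data from the post above; note that "3;4;7" has no trailing semicolon
printf '%s\n' 'a;a;c;' 'd;f;g;' '3;7;8;' > file1
printf '%s\n' '4;7;8;' '3;4;7' 'a;a;c;' 'd;f;g;' > file2

# First pass (file2): remember every line.
# Second pass (file1): route each record to success.txt or fail.txt.
awk 'NR==FNR { seen[$0]; next }
     { print > (($0 in seen) ? "success.txt" : "fail.txt") }' file2 file1

cat success.txt
cat fail.txt
```

For this data success.txt holds `a;a;c;` and `d;f;g;`, and fail.txt holds `3;7;8;`. If every file1 record matches, fail.txt is simply never created.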

Hi,

As I said, it's a one-to-one mapping, but the columns can be dynamic.
It can be colX (file1) to col1 (file2), or colX,colY (file1) to col1,col2 (file2). Is that possible with the current script?

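while IFS= read -r line; do
   first=$(printf '%s\n' "$line" | cut -d ";" -f1)
   third=$(printf '%s\n' "$line" | cut -d ";" -f3)
   result=0   # reset for every file1 record
   while IFS= read -r var; do
      # Equality test; the ~ operator would treat the values as regexes and match substrings
      result=$(printf '%s\n' "$var" | awk -F";" -v first="$first" -v third="$third" '{ if ($1 == first && $3 == third) print 1; else print 0 }')
      if [[ $result -eq 1 ]]; then
         break
      fi
   done < file2

   if [[ $result -eq 1 ]]; then
      printf '%s\n' "$line" >> found
   else
      printf '%s\n' "$line" >> notfound
   fi
done < file1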
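
For the dynamic-column case, one possible sketch passes the column lists to awk as variables, so the same script covers colX-to-col1 and colX,colY-to-col1,col2. The variable names `cols1` and `cols2` and the helper `key()` are hypothetical, chosen for this illustration; the data is the one-to-one example posted earlier, here comparing col2 of file1 with col1 of file2.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'a;a;c;' 'd;f;g;' '3;7;8;' > file1
printf '%s\n' '4;7;8;' '3;4;7' 'a;a;c;' 'd;f;g;' > file2

# cols1: comma-separated columns of file1; cols2: the columns of file2 they map to
awk -F';' -v cols1='2' -v cols2='1' '
# Build a lookup key from the listed columns of the current record
function key(spec,    n, idx, i, k) {
    n = split(spec, idx, ",")
    k = ""
    for (i = 1; i <= n; i++)
        k = k (i > 1 ? SUBSEP : "") $(idx[i])
    return k
}
NR == FNR { seen[key(cols2)]; next }    # first file named: file2
{ print > ((key(cols1) in seen) ? "success.txt" : "fail.txt") }
' file2 file1

cat success.txt
cat fail.txt
```

With `cols1='2'` and `cols2='1'` only `a;a;c;` lands in success.txt, because `a` is the only col2 value of file1 that appears as col1 in file2. Setting both to `'1,2'` would compare the first two columns pairwise instead.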

Hi Sukumar,
This is a simple solution for your problem.

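#!/bin/bash

while IFS= read -r var1
do
   # Columns 3 and 4 of the current file1 record, e.g. "155.67;143_34"
   colf1=$(printf '%s\n' "$var1" | cut -d";" -f 3,4)
   # Fixed-string search anywhere in a file2 line;
   # keep the file1 record itself, not the matching file2 lines
   if grep -qF -- "$colf1" file2; then
      printf '%s\n' "$var1" >> sucessfile1
   else
      printf '%s\n' "$var1" >> failfile1
   fi
done < file1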