Comparing two files using awk

Hi All,

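I am a newbie to awk.

How can I compare a substring of column 1 of file1 with column 2 of file2, and get the contents of file1 plus column 3 of file2 (together with the matched value, as in the output below) as output?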

file1
-----
kumarfghh,23,12000,5000
rajakumar,24,14000,2500
rajeshchauhan,25,16000,2600
manoj,26,17000,2300

file2
-----
123,kumar,US,
123,sukumar,UK
123,raj,Germany
40,rajesh,Australia
40,jerome,swiss
40,rakesh,india

output
-----------------------
kumarfghh,23,12000,5000,kumar,US
rajakumar,24,14000,2500,raj,Germany
rajeshchauhan,25,16000,2600,rajesh,Australia
manoj,26,17000,2300,,

Note: if nothing matches, I should get empty (null) values by default.

Please help me, as I have been trying very hard to achieve this.

Try this awk program called "eg.awk" ...

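        BEGIN {
                # Read and write comma-separated fields
                OFS = FS = ","
        }

        # First file on the command line (file2): map column 2 -> column 3
        NR == FNR       {
                b[$2] = $3
                next
        }

        # Second file (file1): find the longest key that $1 starts with
        {
                e = ""
                for (x in b) {
                        # x is used as a regular expression here
                        if (match($1, x)) {
                                # keep x only if it matches at the start
                                # and is longer than the best key so far
                                if (RSTART == 1 && RLENGTH > length(e)) {
                                        e = x
                                }
                        }
                }
                # With no match, e and b[e] are both empty
                print $0, e, b[e]
        }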

...run it like this...

awk -f eg.awk file2 file1

...which gives...

kumarfghh,23,12000,5000,kumar,US
rajakumar,24,14000,2500,raj,Germany
rajeshchauhan,25,16000,2600,rajesh,Australia
manoj,26,17000,2300,,

Thanks a lot, ygor.
Great script, it works fine.
As I am a newbie to awk, can you explain it to me, ygor?

Thanks and Regards,
sukumar

Which part don't you understand?

Hi ygor,

I have understood the script, but there is one more issue: it takes a long time when file1 and file2 have more than 5000 records.

Since we are loading file2 into an array, what is the size limit of the array? (I ask because file2 may grow in the future.)

Can you give some tips on how to improve the performance?

Regards,
Sukumar.
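
One thing that may help, offered as a sketch and not as a measured fix: the loop in eg.awk tests every file2 key against every file1 line, so the work grows with (lines in file1) x (lines in file2). Since a match has to start at the first character of column 1, an alternative is to look up the prefixes of column 1 directly in the array, longest first, which costs at most one lookup per character of column 1. The names prefix.awk and out.txt below are made up for this example, and unlike match() this compares keys as literal strings, not as regular expressions. On the array question: awk arrays are associative, and as far as I know the common implementations (gawk, mawk) limit them only by available memory, not by a fixed element count.

```shell
# Sample data copied from the first post
cat > file1 <<'EOF'
kumarfghh,23,12000,5000
rajakumar,24,14000,2500
rajeshchauhan,25,16000,2600
manoj,26,17000,2300
EOF

cat > file2 <<'EOF'
123,kumar,US,
123,sukumar,UK
123,raj,Germany
40,rajesh,Australia
40,jerome,swiss
40,rakesh,india
EOF

# prefix.awk is a hypothetical name; same idea as eg.awk, different lookup
cat > prefix.awk <<'EOF'
BEGIN {
        OFS = FS = ","
}

# file2: map column 2 -> column 3
NR == FNR {
        b[$2] = $3
        next
}

# file1: try each prefix of $1, longest first, as an array key
{
        e = ""
        for (n = length($1); n > 0; n--) {
                p = substr($1, 1, n)
                if (p in b) {           # "in" tests the key without creating it
                        e = p
                        break           # first hit is the longest prefix
                }
        }
        print $0, e, (e == "" ? "" : b[e])
}
EOF

awk -f prefix.awk file2 file1 > out.txt
cat out.txt
```

On the sample data this prints the same four lines as eg.awk. Whether it is fast enough for your real files is something only a timing run on them can show.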