Performance issue with awk script.

Hi,

The awk script below takes about 1 hour to fetch just 11 records (columns). There are about 48000 records. The script file name is take_first_uniq.sh

#!/bin/ksh  

if [ $# -eq 2 ] 
then  

while read line 
do 
first=`echo $line | awk -F"|" '{print $1$2$3}'`
while read line2
do
second=`echo $line2 | awk -F"|" '{print $7$13$14}'`
if [ ${first} == ${second} ] 
then 
echo $line2
fi 
done < $2

done < $1  
fi 

I call the script this way:

ksh take_first_uniq.sh file_3uniq_fields.out file_sort_all_fields.out > file_uniq_master.out 

Please suggest how I can improve the performance. I'm new to awk scripting.

Thanks,
RRVARMA

Try something like the following (which is untested, since you did not post a sample of your data files).

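#!/bin/ksh

[[ $# != 2 ]] && exit 1

# Let read split each line on "|" instead of starting awk for every line.
while IFS="|" read -r k1 k2 k3 rest
do
    first="${k1}${k2}${k3}"
    while IFS="|" read -r v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13 v14 rest
    do
        # Rejoin the fields with "|" so the output keeps the input delimiter.
        [[ ${first} == "${v7}${v13}${v14}" ]] &&
            print -r -- "${v1}|${v2}|${v3}|${v4}|${v5}|${v6}|${v7}|${v8}|${v9}|${v10}|${v11}|${v12}|${v13}|${v14}${rest:+|${rest}}"
    done < "$2"
done < "$1"

exit 0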

This is not an AWK script; it is a shell script that includes some AWK code.
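
For comparison, the same lookup could probably be done in a single awk process, which avoids starting a new awk for every line of both files. Below is a rough sketch, not tested on your real data. It assumes the key is fields 1-3 of the first file and fields 7, 13 and 14 of the second, and that the first file is not empty. take_first_uniq is just a name made up here for the wrapper function. Unlike the nested loops, it prints matches in the order of the second file, and each matching line only once.

```shell
#!/bin/ksh
# Sketch only: take_first_uniq is a hypothetical wrapper name.
# $1 = file with the 3-field keys, $2 = file with the full records.
take_first_uniq() {
    awk -F'|' '
        # While reading the first file, remember each key (fields 1-3).
        NR == FNR { keys[$1 FS $2 FS $3]; next }
        # For the second file, print lines whose fields 7, 13 and 14
        # form a key that was seen in the first file.
        ($7 FS $13 FS $14) in keys
    ' "$1" "$2"
}
```

It would be called much like the original script: take_first_uniq file_3uniq_fields.out file_sort_all_fields.out > file_uniq_master.out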

If you post sample input and the desired output we could try to help ...

Hi fpmurphy & radoulov,

Thanks for the feedback.

These are the sample records for the first file, file_3uniq_fields.out:

1TVAO|OVEPT|VO
1TVAO|OVPDM|VO
6NFXE|17CLP|DH
6NFXE|NRZO4|EQ
6NFXE|SMOSA|EQ
ACA15|11X1W|DX
ACA15|1LN88|DX
ACA15|1LNSK|DX
ACA15|1LNVX|DX
ACA15|1LNVX|FD

And here are the sample records for the second file, file_sort_all_fields.out:

1TVAO|S3zS033306|4577777770|4513201000|AJBFGJ|CB10|1TVAO|S3WS033306|4513101000|4513201000|AJBFGJ|CB10|OVEPT|VO|430300|430300|430300|009|IC    |Z|N|Y|IS
1TVAO|S3zS033306|4515685200|4513201000|AJBFGJ|CB10|1TVAO|S3WS033306|4513101000|4513201000|AJBFGJ|CB10|OVPDM|VO|430300|430300|430300|009|IC    |Z|N|Y|IS
6NFXE|S3Sr021401|4522451000|4511201000|B7BXHT|CB10|6NFXE|S3SN021401|4511101000|4511201000|B7BXHT|CB10|17CLP|DH|******|6670NI|410402|011|LQ    |Z|A|Y|IS
AGRJE|NA|NA|NA|NA|NA|6NFXE|S3SN021401|4511101000|4511201000|B7BXHT|CB10|NRZO4|EQ|402100|6670DC|410402|001|EQ|Z|U|Y|VT
6NFXE|S3Sz021401|4522201000|4511201000|B7BXHT|CB10|6NFXE|S3SN021401|4511101000|4511201000|B7BXHT|CB10|SMOSA|EQ|******|6670NI|410402|016|EQ    |Z|U|Y|IS
ACA15|S3Bz100120|4522201000|4511201000|AEBDHZ|CB10|ACA15|S3BW100120|4511101000|4511201000|AEBDHZ|CB10|11X1W|DX|410312|410312|410312|011|LQ    |Z|A|Y|IS
ACA15|S3BW100120|4512541000|4511201000|AEBDHZ|CB10|ACA15|S3BW100120|4511101000|4511201000|AEBDHZ|CB10|1LN88|DX|410312|410312|410312|A14|IOC   |Z|N|Y|IS
ARCXE|NA|NA|NA|NA|NA|ACA15|S3BW200120|4511101000|4511201000|AEBDHZ|CB10|1LN88|DX|410312|420100|420100|A14|IOC   |Z|N|Y|IS
ACA15|NA|NA|NA|NA|NA|ACA15|NA|NA|NA|NA|NA|1LNSK|DX|410312|410312|410312|A14|TC    |Z|N|Y|IS
ACA15|NA|NA|NA|NA|NA|ACA15|NA|NA|NA|NA|NA|1LNVX|DX|410312|410312|410312|009|IOC   |Z|N|Y|IS

Thanks,
RRVARMA

... and what should the desired output (file_uniq_master.out) look like?