awk changes to make it faster

I have the script below, which picks each number from one file, searches for it in another file, and prints the matching lines.

But it is very slow when run on a huge file. Can we modify it to use awk?

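#! /bin/ksh
# For every number in file2, rescan all of file1 (one nawk run per number).
while read line1
do
    echo "$line1"
    a=`echo $line1`
    if [ $a -ge 0 ]
    then
        echo "$num"   # note: num is never set, so this prints an empty line
        cat file1|nawk -v "c=$line1" '$1 ~ c' >> message.log
    fi
done < file2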


file2 contains the following data:

cat file2
1234
5678
9100
1324

file1 has strings that contain these numbers.

Can you post some of file1 and an example of how you would like the output to look?

The first four lines should be printed; the last two should be ignored:


1234,0130020036210801,61400900240,144.135.15.1,50501,8,9550,106A,177200093,144.135.15.212,telstra.internet,mnc001.mcc505.gprs,33,10.237.103.20,0,1,0,0,0,2413,36436,20121002232313,115914,
5678,0124300042019104,61432228629,149.135.133.97,50501,8,6550,2120,80847635,144.135.14.68,telstra.internet,mnc001.mcc505.gprs,33,10.195.135.22,0,1,0,0,0,1962,19782,20121002234954,116855,
9100,0131760091070905,61427989363,149.135.131.65,50501,8,3950,4557,83767434,144.135.14.67,telstra.internet,mnc001.mcc505.gprs,33,100.82.223.99,0,1,0,0,0,235,3324,20121002233018,117271,0,
1324,3524240501157114,61427252411,149.135.133.97,50501,8,A050,0D9E,178201226,144.135.15.212,telstra.internet,mnc001.mcc505.gprs,33,10.239.140.179,0,1,0,0,0,2288,48700,20121002231512,1171
2222,0131760091070905,61427989363,149.135.131.65,50501,8,3950,4557,83767434,144.135.14.67,telstra.internet,mnc001.mcc505.gprs,33,100.82.223.99,0,1,0,0,0,235,3324,20121002233018,117271,0,
2154,3524240501157114,61427252411,149.135.133.97,50501,8,A050,0D9E,178201226,144.135.15.212,telstra.internet,mnc001.mcc505.gprs,33,10.239.140.179,0,1,0,0,0,2288,48700,20121002231512,1171

Try this:

awk -F, 'NR==FNR{a[$0];next} $1 in a' file2 file1
1234,0130020036210801,61400900240,144.135.15.1,50501,8,9550,106A,177200093,144.135.15.212,telstra.internet,mnc001.mcc505.gprs,33,10.237.103.20,0,1,0,0,0,2413,36436,20121002232313,115914,
5678,0124300042019104,61432228629,149.135.133.97,50501,8,6550,2120,80847635,144.135.14.68,telstra.internet,mnc001.mcc505.gprs,33,10.195.135.22,0,1,0,0,0,1962,19782,20121002234954,116855,
9100,0131760091070905,61427989363,149.135.131.65,50501,8,3950,4557,83767434,144.135.14.67,telstra.internet,mnc001.mcc505.gprs,33,100.82.223.99,0,1,0,0,0,235,3324,20121002233018,117271,0,
1324,3524240501157114,61427252411,149.135.133.97,50501,8,A050,0D9E,178201226,144.135.15.212,telstra.internet,mnc001.mcc505.gprs,33,10.239.140.179,0,1,0,0,0,2288,48700,20121002231512,1171

Jotne's solution assumes the key is the first comma-separated field on each line of file1.
nawk wants the condition in parentheses: ($1 in a).
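
That assumption matters because the original loop and the suggested one-liner do not test quite the same thing. A minimal sketch, using a made-up record and plain awk rather than nawk: with the default whitespace splitting, a line with no spaces is entirely $1, so `$1 ~ c` matches the number anywhere in the line, while the comma-separated version only looks at the first field.

```shell
# Hypothetical record: the key 1234 appears in field 2, not field 1.
line='9999,1234,4'

# Original loop style: default whitespace splitting, regex match on $1.
# The line has no spaces, so $1 is the whole line.
loose=$(printf '%s\n' "$line" | awk -v c=1234 '$1 ~ c')

# Suggested style: comma splitting, comparison against field 1 only.
strict=$(printf '%s\n' "$line" | awk -F, -v c=1234 '$1 == c')

printf 'loose=[%s] strict=[%s]\n' "$loose" "$strict"
```

Here the loose form prints the record and the strict form prints nothing. If the numbers can legitimately appear elsewhere in the line, the one-liner would need adjusting.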

So you are saying it should be this?

awk -F, 'NR==FNR{a[$0];next} ($1 in a)' file2 file1

It gives the same result.

This is how I understand it:
Print a line from file1 if it starts with one of the numbers listed in file2, that is, if its first comma-separated field is one of those numbers.
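
To make that reading concrete, here is the same one-liner spread out with comments. It is only a sketch: the sample records are shortened, made-up stand-ins for the real data, the array is renamed from a to keys for readability, and the scratch directory comes from mktemp, which is assumed to be available.

```shell
# Hypothetical sample data in a scratch directory (shortened records).
dir=$(mktemp -d)
printf '%s\n' 1234 5678 > "$dir/file2"
printf '%s\n' '1234,aaa,1' '2222,bbb,2' '5678,ccc,3' > "$dir/file1"

result=$(awk -F, '
    # While reading the first file (file2), NR and FNR are equal.
    # Store each line as an array index, then skip to the next line.
    NR == FNR { keys[$0]; next }

    # For the second file (file1): a pattern with no action prints the
    # line when field 1 is one of the stored keys.
    ($1 in keys)
' "$dir/file2" "$dir/file1")

printf '%s\n' "$result"
rm -rf "$dir"
```

Each file is read only once and each lookup is a single array test, which is why this is so much faster than starting a new nawk process over all of file1 for every number in file2.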


Thank you so much, it's working great.