How to use a loop to run an awk command on multiple files in a folder?

sajmar · April 9, 2015, 12:31pm

Dear folks

I have two data sets, named "final.map" and "1.geno", with the following structure:
final.map:

gi|358485511|ref|NC_006088.3| 2044
gi|358485511|ref|NC_006088.3| 2048
gi|358485511|ref|NC_006088.3| 2187
gi|358485511|ref|NC_006088.3| 17654
gi|358485511|ref|NC_006088.3| 17666

1.geno:

gi|358485511|ref|NC_006088.3| 2048   G C 0 1 1
gi|358485511|ref|NC_006088.3| 17654 A G 1 1 2
gi|358485511|ref|NC_006088.3| 17666 A G 0 1 1
gi|358485511|ref|NC_006088.3| 17785 G A 0 1 1
gi|358485511|ref|NC_006088.3| 30347 G C 1 1 2

Now I am trying to run the command below:

awk -f example.awk 1.geno final.map > 1.dat

In this command, "example.awk" contains the following awk program:
NR==FNR{a[$1,$2]=$5" "$6" "$7;next}{print $1,$2,a[$1,$2]?a[$1,$2]:"0 0 0"}
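For anyone reading the one-liner for the first time, here is an annotated sketch of the same logic, run on a shortened copy of the sample data in a scratch directory. The scratch directory and the two-line inputs are only for illustration; the ternary is parenthesized for readability, otherwise the program is unchanged.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is overwritten (illustrative setup).
cd "$(mktemp -d)" || exit 1

# Shortened sample inputs taken from the data shown above.
printf '%s\n' 'gi|358485511|ref|NC_006088.3| 2044' \
              'gi|358485511|ref|NC_006088.3| 2048' > final.map
printf '%s\n' 'gi|358485511|ref|NC_006088.3| 2048 G C 0 1 1' > 1.geno

awk '
    # NR==FNR is true only while the first input file (1.geno) is being read.
    # Remember columns 5-7, keyed by sequence ID ($1) and position ($2).
    NR == FNR { a[$1, $2] = $5 " " $6 " " $7; next }

    # Every later line comes from final.map: print ID, position, and the
    # remembered values, or "0 0 0" if that position was not in 1.geno.
    { print $1, $2, (a[$1, $2] ? a[$1, $2] : "0 0 0") }
' 1.geno final.map > 1.dat

cat 1.dat
```

The order of the file arguments matters: the .geno file must come first so that its values are stored before final.map is read.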

The awk command produces "1.dat", which looks like this:

gi|358485511|ref|NC_006088.3| 2044   0 0 0
gi|358485511|ref|NC_006088.3| 2048   0 1 1
gi|358485511|ref|NC_006088.3| 2187   0 0 0
gi|358485511|ref|NC_006088.3| 17654 1 1 2
gi|358485511|ref|NC_006088.3| 17666 0 1 1

My problem is that I have around 300 *.geno files, and I want to produce a matching *.dat file from each of them with this awk command. I know that a shell loop could help, but I think I am using the loop incorrectly.

Could anyone give me an idea of how to avoid running awk separately for each file and instead process them all at once with a loop?

RudiC · April 9, 2015, 12:45pm

You forgot to mention which OS and shell you are using. With Bourne-type shells, this might work:

ls *.geno | while read FN; do awk -f example.awk $FN final.map > ${FN/geno/dat}; done
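A variant that may be a little more robust, as a sketch only: it loops over the glob directly instead of parsing ls, quotes the file names, and uses the POSIX ${f%.geno} suffix removal, since ${FN/geno/dat} needs bash, ksh93 or zsh. The scratch directory, the shortened inputs and the file 2.geno are made up for the demonstration; in a real run only the for loop is needed, started in the folder that holds the .geno files, example.awk and final.map.

```shell
#!/bin/sh
# Demo setup in a throwaway directory; 2.geno is a made-up second input.
cd "$(mktemp -d)" || exit 1

cat > example.awk <<'EOF'
NR==FNR{a[$1,$2]=$5" "$6" "$7;next}{print $1,$2,a[$1,$2]?a[$1,$2]:"0 0 0"}
EOF
printf '%s\n' 'gi|358485511|ref|NC_006088.3| 2044' \
              'gi|358485511|ref|NC_006088.3| 2048' > final.map
printf '%s\n' 'gi|358485511|ref|NC_006088.3| 2048 G C 0 1 1' > 1.geno
printf '%s\n' 'gi|358485511|ref|NC_006088.3| 2044 A G 1 1 2' > 2.geno

# The shell expands *.geno to the matching names; no ls or pipe is needed.
for f in *.geno; do
    [ -e "$f" ] || continue      # nothing matched: skip the unexpanded pattern
    # ${f%.geno} strips the suffix, so 1.geno is written to 1.dat
    awk -f example.awk "$f" final.map > "${f%.geno}.dat"
done

cat 1.dat 2.dat
```

Each .geno file still gets its own awk run, but the loop starts all of them from a single command.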

sajmar · April 9, 2015, 1:02pm

Thank you so much, RudiC. This command works well.