Copy values from columns matching those in a second file

shoaibjameel123 · September 10, 2011, 11:25am

Hi All,

I have two sets of files.

Set 1: 100 text files with extension .txt, named 1.txt, 2.txt, 3.txt and so on up to 100.txt

Set 2: One big file with extension .dat

The text files have some records in columns like this:

0.7316431 82628
0.7248189 82577
0.7248182 81369
0.7222999 83490
0.71819735 81613
0.7173147 83027
0.7161552 83105
0.7161268 83822
0.7161214 80988
0.7157952 83798
0.7155143 81649
0.7151216 83717

All the .txt files have the same layout but contain different numbers.

My big .DAT file looks like this:

0.047589 11
0.021992 12
0.029547 13
0.030269 14
0.022525 15
0.021238 16
0.023595 17
0.028851 18
0.731 82628
0.724 82577
0.724 81369
0.72 83490
0.79 81613
0.77 83027
0.74 83105
0.73 83822
0.714 80988
0.7952 83798
0.743 81649
0.7216 83717

Now, the important observation is that every number in column 2 of the 100 .txt files is sure to be found in column 2 of the .dat file.

I want to:

Read the .txt files one by one.
For each line of a .txt file, find the line in the .dat file whose second column holds the same number, then write the first column of the .txt line followed by the first column of the matching .dat line. The result for 1.txt goes to 1.res, for 2.txt to 2.res, and so on up to 100.res. For 1.txt, the 1.res file should look like this:

0.7316431 0.731 
0.7248189 0.724 
0.7248182 0.724 
0.7222999 0.72 
0.71819735 0.79 
0.7173147 0.77 
0.7161552 0.74 
0.7161268 0.73 
0.7161214 0.714 
0.7157952 0.7952 
0.7155143 0.743 
0.7151216 0.7216

As you can see above, wherever the number in the second column of the .txt file matched a number in the second column of the big .dat file, I took only the first column of that .dat line and wrote it next to the first column of the .txt line in 1.res.

I am using Linux with BASH.

I've been trying some code that I found in this forum, but after failing I came back here. This is what I wrote, but it does not do anything except return blank output.

for txt in *.txt
do
  num=`echo $txt | cut -f1 -d"."`
  awk 'NR==FNR{a[$1]=$1;next}{print a[$0]}' big_file.dat $num.txt >> $num.res
done

yazu · September 10, 2011, 12:10pm

Try:

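for txt in *.txt; do
  awk '
     NR == FNR { a[$2]=$1 }          # big_file.dat: remember column 1 under the column 2 key
     NR != FNR { print $1, a[$2] }   # .txt file: print its column 1 and the value looked up by column 2
  ' big_file.dat "$txt" > "${txt%txt}res" # bash/ksh/zsh
done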
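
As far as I can tell, your version printed blanks because a[$1]=$1 stores the first column of big_file.dat as the key, while a[$0] looks up the whole line of the .txt file, so the lookup never finds anything and awk prints an empty string for every line. The key has to be the second column in both files. If you want to check the idea before running it on all 100 files, here is a small self-contained sketch. The scratch directory from mktemp and the shortened sample files are only assumptions for the demo; the rows themselves are taken from your post.

```shell
# Work in a throwaway directory so no real files are touched (demo assumption).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Shortened copies of the sample data from the question.
printf '%s\n' '0.047589 11' '0.731 82628' '0.724 82577' '0.79 81613' > big_file.dat
printf '%s\n' '0.7316431 82628' '0.7248189 82577' '0.71819735 81613' > 1.txt

for txt in *.txt; do
  awk '
    NR == FNR { a[$2] = $1; next }   # first file (big_file.dat): key = column 2, value = column 1
    { print $1, a[$2] }              # later file (.txt): column 1 plus the value found via column 2
  ' big_file.dat "$txt" > "${txt%.txt}.res"
done

cat 1.res
```

With these three sample rows, 1.res should contain 0.7316431 0.731, 0.7248189 0.724 and 0.71819735 0.79, one pair per line. Using > instead of >> also means that running the loop again overwrites the .res files rather than appending to them.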