awk read in file1, gsub in file2, print to file3

msmehaffey · December 8, 2013, 6:38pm

I'm trying to use awk to do the following. I have file1 with many lines, each containing 5 fields that describe one sample. I have file2, a template config file with placeholders (var1 to var5) that should be replaced by the values from file1. For each row in file1, I would like to substitute that row's values for the placeholders in file2 and print the result to its own file3:
file1

sample1 300 150 100 23
sample2 320 150 90 20
sample3 340 160 95 21

file2
...

<general>
execute=var1
range=var2
sub=var3
mu=var4
sigma=var5
</general>

...

file3.1
...
<general>

execute=sample1
range=300
sub=150
mu=100
sigma=23

</general>
...

and I would end up with file3.2 and file3.3 as well.

I am close with:

awk 'NR==FNR{a1=$1; a2=$2; a3=$3; next} {gsub(/var1/,a1); gsub(/var2/,a2); gsub(/var3/,a3)}1' file1 file2

But it only prints to stdout, and only the values from the last record of file1 are substituted.

Thanks for any help!

Yoda · December 8, 2013, 7:20pm

Here is an awk approach:

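awk '
        # First file (file2, the template): store each line in A[1..c]
        NR == FNR {
                A[++c] = $0
                next
        }
        # Second file (file1): write one output file per row.
        # Assumes the template is an opening tag, NF "key=value" lines, a closing tag.
        {
                f = "file3." (++k)
                print A[1] > f                  # opening tag
                for ( i = 1; i <= NF; i++ )
                {
                        # replace everything after "=" with field i
                        sub ( /=.*/, "=" $i, A[i+1] )
                        print A[i+1] > f
                }
                print A[c] > f                  # closing tag
                close(f)
        }
' file2 file1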
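
For reference, the original attempt kept only the last record because the NR==FNR block overwrote a1, a2 and a3 on every row of file1, so only the final row was left by the time file2 was read. A minimal sketch that keeps the gsub idea, but reads the template first and replaces the placeholders by name, could look like this. The variable names T, n, out and line are illustrative only, and it assumes the field values contain no & or backslash, which are special in a gsub replacement.

```shell
# Sample inputs copied from the question
cat > file1 <<'EOF'
sample1 300 150 100 23
sample2 320 150 90 20
sample3 340 160 95 21
EOF

cat > file2 <<'EOF'
<general>
execute=var1
range=var2
sub=var3
mu=var4
sigma=var5
</general>
EOF

awk '
    # First file (the template): keep every line in T[1..n]
    NR == FNR { T[++n] = $0; next }
    # Second file (the values): one output file per row
    {
        out = "file3." FNR
        for (l = 1; l <= n; l++) {
            line = T[l]          # work on a copy so T stays unchanged
            # Highest field first, so var1 cannot eat part of a
            # longer name such as var10
            for (i = NF; i >= 1; i--)
                gsub("var" i, $i, line)
            print line > out
        }
        close(out)
    }
' file2 file1
```

Unlike the positional version above, this does not depend on where the placeholders sit in the template, at the cost of trying every placeholder on every line.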

msmehaffey · December 17, 2013, 2:54pm

Thank you for your quick response, and my apologies for taking so long to reply! Your code works just as advertised for my example, but my example was too simple. The real file2 has more context around the placeholders. Here it is:

<general>
type=all
execute=var1
num_threads=8
</general>

<detection>
split=1
window=var2
step=var3
</detection>

<filter>
filter=1
order=1
final_score_threshold=0.4
mu_length=var4
sigma_length=var5
</filter>

<colorcode>
grey = 2,2
green = 5,5
</colorcode>

Inserting var1-var5 was the best option I could think of for placeholders in the template file, but whatever works.

Thanks again!

Yoda · December 17, 2013, 4:25pm

Please always use code tags for posting code fragments or data samples.

Try this awk program:

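awk '
        # First file (file2, the template): store each line in A[1..c]
        NR == FNR {
                A[++c] = $0
                next
        }
        # Second file (file1): walk the template once per row.
        # Output goes to stdout; redirect it if you need a file.
        {
                for ( i = 1; i <= NF; i++ )
                {
                        while ( ++j <= c )
                        {
                                # any line containing "var" counts as a placeholder line
                                if ( A[j] ~ /var/ )
                                {
                                        t = A[j]        # copy, so the template stays intact
                                        sub ( /=.*/, "="$i, t )
                                        print t
                                        ++i             # next placeholder takes the next field
                                }
                                else
                                {
                                        print A[j]
                                }
                        }
                }
                j = 0   # rewind the template for the next row
        }
' file2 file1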

msmehaffey · December 17, 2013, 4:36pm

Beautiful, that did the trick! Thanks!!

RudiC · December 18, 2013, 6:30am

Given that your varN placeholders appear in ascending order in file2, you don't need to test every placeholder against every template line; just look for the next expected one:

awk     'NR==FNR        {T[++MAX]=$0; next}             # load the template
                        {vi=1; Ln=0; FN="file3."FNR
                         while (++Ln <= MAX)            # <= so the last template line is kept
                           if (match (T[Ln], "var"vi)) {
                                   print substr (T[Ln], 1, RSTART-1) $vi > FN
                                   vi++
                           }
                           else print T[Ln] > FN
                         close (FN)
                        }
        ' file2 file1
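
As a sketch of the same single-pass idea on a shortened copy of the expanded template: it uses RSTART and RLENGTH from match() to keep any text that follows the placeholder on the line, and it loops up to and including the last template line. The lowercase names (max, vi, ln, pre, post, out) are illustrative; it still assumes the placeholders appear in ascending order, one per line.

```shell
cat > file1 <<'EOF'
sample1 300 150 100 23
sample2 320 150 90 20
EOF

# Shortened copy of the expanded template from the thread
cat > file2 <<'EOF'
<general>
type=all
execute=var1
num_threads=8
</general>

<detection>
split=1
window=var2
step=var3
</detection>
EOF

awk '
    NR == FNR { T[++max] = $0; next }    # load the template
    {
        vi = 1                           # next placeholder / field to use
        out = "file3." FNR
        for (ln = 1; ln <= max; ln++) {
            if (match(T[ln], "var" vi)) {
                pre  = substr(T[ln], 1, RSTART - 1)       # text before varN
                post = substr(T[ln], RSTART + RLENGTH)    # text after varN
                print pre $vi post > out
                vi++
            } else
                print T[ln] > out
        }
        close(out)
    }
' file2 file1
```

Each row of file1 costs one pass over the template with a single match() per line, instead of one substitution attempt per placeholder per line.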

msmehaffey · January 2, 2014, 3:55pm

Thanks RudiC, works like a charm. Happy New Year!