Incomplete replacement/substitution awk

radudownload · April 29, 2013, 1:44pm

Hi!

(I'm just a newbie in awk & bash scripts) I have a script which replaces the first column of an input file with values looked up in a table inside the same file. The input and the desired output files are shown below.

~ cat input.file
 random text
 Fe  1.33 23.23 3.33
 C  21.03 23.23 3.33
 Cu  0.00  0.00 0.00
 random text
%block ChemicalSpeciesLabel
 1 26  Fe
 3  6  C
 6 29  Cu
%endblock ChemicalSpeciesLabel
 random text

~ cat output.file
 26  1.33 23.23 3.33
 6  21.03 23.23 3.33
 29  0.00  0.00 0.00

The problem is that my script currently produces an "incomplete" replacement (see the first column of the third line):

~ cat output.file
 26  1.33 23.23 3.33
 6  21.03 23.23 3.33
 6u  0.00  0.00 0.00

I found some pieces of code on the internet that do this with awk and adapted them for my job.

 awk '{ printf("o%-2s\n",$1) }' input.file > input.file.1

 sed '/%block.*ChemicalSpeciesLabel/I,/endblock.*ChemicalSpeciesLabel/I!d;/ChemicalSpeciesLabel/Id' input.file |awk '{printf("o%-3s %7s\n", $3, $2)}' > rules

 for file in $(ls -1 input.file.1)
  do
    awk 'NR==FNR {a[$1]=$2;next} {for ( i in a) gsub(i,a[i])}1' rules $file >temp.file
  done

I've modified the first two awk commands to add an extra space character, so that the replacement done by the third awk would work correctly (replace "C " with 6 and "Cu" with 29), but I still get the incomplete replacement. I think the problem is in the third awk, but I really don't know how to rewrite it.

Thank you!
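
For reference, the "6u" result can be reproduced in isolation. This is only a sketch of one likely cause, assuming the rules file holds padded keys such as "oC" followed by spaces: awk's default field splitting drops that padding when the rules are read, and gsub() then treats the bare key as a regular expression that also matches the start of "oCu". The variable names key, partial and exact are made up for this illustration.

```shell
# 1. Field splitting drops the padding: the key read from the rules
#    line is "oC", not "oC " with trailing spaces.
key=$(printf 'oC         6\n' | awk '{ print "[" $1 "]" }')
echo "$key"        # [oC]

# 2. Used as a regular expression, "oC" also matches the start of "oCu".
partial=$(printf 'oCu\n' | awk '{ gsub("oC", "6") } 1')
echo "$partial"    # 6u

# 3. Comparing the whole first field leaves "oCu" untouched.
exact=$(printf 'oCu\n' | awk '$1 == "oC" { $1 = "6" } 1')
echo "$exact"      # oCu
```

Because the order of a "for (i in a)" loop is unspecified in awk, whether "oC" is tried before "oCu" can differ between implementations, so the loop-plus-gsub approach may appear to work on one system and fail on another.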

shamrock · April 29, 2013, 3:54pm

Try this script...

awk '{
   if ($1 ~ "^[A-Z]([a-z])?") {
      f=$1
      $1=""
      x[f]=$0
   }
   if ($NF ~ "^[A-Z]([a-z])?$") {
      l=$NF
      $NF=""
      y[l]=$(NF-1)
   }
} END {for (i in x) print y[i], x[i]}' inputfile

Don_Cragun · April 29, 2013, 4:07pm

Making a completely different set of wild assumptions about what might be included in "random text", you could try this awk script:

awk ' 
#{printf("FNR=%d, NR=%d, t=%d, $0=%s\n", FNR, NR, t, $0)}
FNR==NR && /^%block/ {
        # Following lines contain translation table.
        t = 1
        next 
}     
FNR==NR && /^%endblock/ {
        # We have found the end of the translation table.
        t = 0
        next
}
t {     # Add entry to translation table.
        tt[$3] = $2
        next
}
FNR!=NR && $1 in tt {
        # This is the second time through the file and we have found a line
        # with its 1st field set to a value in our translation table.
        # Translate the value and print the line.
        # Note that we use sub() here rather than $1 = tt[$1] to keep the
        # original spacing from the input lines.
        sub($1, tt[$1])
        print
}' input.file input.file > output.file

As always, if you're using a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of awk.
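
The comment in the script about using sub() rather than assigning to $1 can be checked with a small experiment. This is just an illustrative sketch on one sample line from input.file; the variable names line, assigned and subbed are made up here.

```shell
line=' Cu  0.00  0.00 0.00'

# Assigning to $1 makes awk rebuild $0 from its fields, joined by a
# single space (the default OFS), so the original spacing is lost.
assigned=$(printf '%s\n' "$line" | awk '{ $1 = "29" } 1')

# sub() edits $0 in place, so the leading space and the double spaces
# between columns are kept.
subbed=$(printf '%s\n' "$line" | awk '{ sub($1, "29") } 1')

printf '[%s]\n[%s]\n' "$assigned" "$subbed"
# [29 0.00 0.00 0.00]
# [ 29  0.00  0.00 0.00]
```

One caveat: sub() treats its first argument as a regular expression. That is harmless for element symbols like Fe or Cu, but a first field containing characters such as "." or "*" would need escaping or a different approach.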