awk processing of a data file with a variable number of fields

Hi!

I need to post-process some data files which have a variable (and periodic) number of fields per line. For example, I need to square (data -> data*data) the following data file:

 -5.34281E-28 -3.69822E-29  8.19128E-29  9.55444E-29  8.16494E-29  6.23125E-29
  4.42106E-29  2.94592E-29  1.84841E-29  1.09271E-29  6.08599E-30  3.19287E-30
  1.57732E-30  7.33449E-31  3.20866E-31  1.31982E-31  5.10059E-32  1.85021E-32
  6.29190E-33  2.00262E-33  5.95280E-34  1.64748E-34
 -5.34281E-28 -3.69822E-29  8.19128E-29  9.55444E-29  8.16494E-29  6.23125E-29
  4.42106E-29  2.94592E-29  1.84841E-29  1.09271E-29  6.08599E-30  3.19287E-30
  1.57732E-30  7.33449E-31  3.20866E-31  1.31982E-31  5.10059E-32  1.85021E-32
  6.29190E-33  2.00262E-33  5.95280E-34  1.64748E-34
 -5.34281E-28 -3.69822E-29  8.19128E-29  9.55444E-29  8.16494E-29  6.23125E-29
  4.42106E-29  2.94592E-29  1.84841E-29  1.09271E-29  6.08599E-30  3.19287E-30
  1.57732E-30  7.33449E-31  3.20866E-31  1.31982E-31  5.10059E-32  1.85021E-32
  6.29190E-33  2.00262E-33  5.95280E-34  1.64748E-34

The data is laid out in 6 columns, but depending on the requested precision, some lines may have fewer columns.

For the processing, some beginner-level knowledge of awk suffices. In my case I use something like

awk '{printf("%12.8G %12.8G %12.8G %12.8G %12.8G %12.8G\n", $1*$1, $2*$2, $3*$3, $4*$4, $5*$5, $6*$6)}' initial.data > final.data

which produces something like this

2.8545619E-55 1.3676831E-57 6.7097068E-57 9.1287324E-57 6.6666245E-57 3.8828477E-57
1.9545772E-57 8.6784446E-58 3.4166195E-58 1.1940151E-58 3.7039274E-59 1.0194419E-59
2.4879384E-60 5.3794744E-61 1.0295499E-61 1.7419248E-62 2.6016018E-63 3.423277E-64
3.9588006E-65 4.0104869E-66 3.5435828E-67 2.7141904E-68            0            0 

Question #1: how can one eliminate the "0"s which awk produces? I've tried sed

sed 's/        0    /             /g' <final.data >almost.final.data

but I can't remove the last 0 from each smaller line (i.e. fewer columns with real data); in this case I obtain something like this:

2.8545619E-55 1.3676831E-57 6.7097068E-57 9.1287324E-57 6.6666245E-57 3.8828477E-57
1.9545772E-57 8.6784446E-58 3.4166195E-58 1.1940151E-58 3.7039274E-59 1.0194419E-59
2.4879384E-60 5.3794744E-61 1.0295499E-61 1.7419248E-62 2.6016018E-63 3.423277E-64
3.9588006E-65 4.0104869E-66 3.5435828E-67 2.7141904E-68                         0

Question #2: How can I stop awk from processing non-existent data in a column? (In my case, the 5th and 6th "fields" of every 4-column line.)

I thank you for your help!

You could explicitly check for every 4th line:

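awk '
NR%4==0 { printf("%12.8G %12.8G %12.8G %12.8G\n", $1*$1, $2*$2, $3*$3, $4*$4) }                              # 4-column lines
NR%4!=0 { printf("%12.8G %12.8G %12.8G %12.8G %12.8G %12.8G\n", $1*$1, $2*$2, $3*$3, $4*$4, $5*$5, $6*$6) }'  # 6-column lines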

(EDIT: Or check NF==4/NF==6, as was suggested in a short-lived post :))
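
For what it's worth, the NF-based check might look roughly like the sketch below. It assumes every line has either 4 or 6 fields (any other line is silently dropped), and the two-line sample held in the shell variable is made up purely for illustration:

```shell
# Hypothetical two-line sample: one 6-field line and one 4-field line.
sample=' 1 2 3 4 5 6
 7 8 9 10'

# Pick the printf format from the number of fields actually present (NF).
squared=$(printf '%s\n' "$sample" | awk '
NF==6 { printf("%12.8G %12.8G %12.8G %12.8G %12.8G %12.8G\n", $1*$1, $2*$2, $3*$3, $4*$4, $5*$5, $6*$6) }
NF==4 { printf("%12.8G %12.8G %12.8G %12.8G\n", $1*$1, $2*$2, $3*$3, $4*$4) }')

printf '%s\n' "$squared"
```

Unlike the NR%4 test, this does not depend on where the short lines fall in the file, only on how many fields each line has.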

Or just loop around the number of fields you actually have in each line:

awk '
{
   for (i=1;i<=NF;i++) {
      printf ("%12.8G ", $i*$i)
   }
   printf "\n"
}'

Try also

awk '{print  $1*$1, $2*$2, $3*$3, $4*$4, $5?$5*$5:"", $6?$6*$6:""}' OFMT="%14.8G" file

Yet another possibility:

 awk '{for(i=1; i<=NF; i++) $i*=$i}1' CONVFMT="%14.8G" file

Thank you guys!

I've only implemented Scrutinizer's suggestion, and it works well for me.

For CarloM: (I'm not an awk expert, but ...) can you please tell me where in your second snippet I should state which file is to be processed? Thank you!

You can redirect stdin as you did with your sed command, or just specify the filename as in the other suggestions.
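
To make the two options concrete, here is a small sketch using the field loop from above. The file names sample.data, by_name.data and by_stdin.data are placeholders, and the sample file is written out first only so that the snippet can be run on its own:

```shell
# sample.data is a hypothetical file name, written here only so the sketch runs on its own.
printf '%s\n' ' 1 2 3 4 5 6' ' 7 8 9 10' > sample.data

# The field loop, kept in a shell variable to avoid repeating it.
prog='{ for (i=1;i<=NF;i++) printf("%12.8G ", $i*$i); printf "\n" }'

# 1) File name given as the last argument to awk.
awk "$prog" sample.data > by_name.data

# 2) Same program, input supplied by redirecting stdin.
awk "$prog" < sample.data > by_stdin.data
```

Both invocations should produce identical output; which one to use is mostly a matter of taste.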