Problems with awk (fatal error) and paste (two variables into one column-by-column)

vgbraymond · October 26, 2014, 11:08pm

Hello, I have a script that extracts columns of useful numbers from a data file and manipulates them with awk commands. I have two problems with it.

There are two lines assigning a number to $BaseForAveraging. If I use the first one (currently commented out) and comment out the second one, a later awk command fails with the following error:

awk: cmd. line:1: fatal: cannot open file `{averaged=$3 / BaseForAveraging
                printf "%.10f \t %.14f \n",$2,averaged}' for reading (No such file or directory)

However, if I use the second line instead and leave the first one commented out, I do not see the above error.
With either of the two lines, I see the same number, 1000, when I echo $BaseForAveraging.

My question is... what's wrong with my first line?

In the line assigning a value to $Data, I get the following error:

extract_direct_signal.sh: command substitution: line 23: syntax error near unexpected token `('
extract_direct_signal.sh: command substitution: line 23: `paste <(echo "$Data") <(echo "$Temp"))'

My question is ... what's the problem with the command?

I have included a data file that the script works with, so you can try running it if that is useful.
There are many commented-out lines at the end because I want to set them aside until the above bugs are cleared.
I hope you can help me. Thank you very much for your time.

Raymond

===My code===

#!/bin/bash

#PWLFiles=($(ls $1/*.pwl))	#list the PWL file
PWLFile=SampleDataFile.pwl.txt

#for PWLFile in $PWLFiles
#do

	LineNumbers=$(grep -hn "Direct signal" $PWLFile | cut -d ":" -f1)	#Extract the Line numbers in which the word "Direct signal" is contained
	Data=

	for LineNumber in $LineNumbers
	do	
		echo $LineNumber

#		BaseForAveraging=$(sed -n "6p" $PWLFile | sed 's/Number of signal records:  //g')		
		BaseForAveraging=1000

		Temp=$(sed -n "$((LineNumber + 9)),+999p" $PWLFile | sed 's/ (//g' | awk -v BaseForAveraging=${BaseForAveraging} '{averaged=$3 / BaseForAveraging
		printf "%.10f \t %.14f \n",$2,averaged}')

		Data=$(paste <(echo "$Data") <(echo "$Temp"))

		echo "$Data"
	done

#	echo "$Data" | awk '{for (j=2;j<=NF;j=j+6)
#	{
#		for (i=j;i<j+6;i=i+2) total=total+$i
#	}
#	printf "%.10f \t %.14f \n", $1, total}{total=0}
#	for (num in total) printf "%.14f \t", num
#	delete total
#	printf "\n"' #> $PWLFile.signal
		

#	echo $PWLFile
#done

neutronscott · October 27, 2014, 1:21am

You need to quote every variable expansion. Most likely your $BaseForAveraging ends up with whitespace in it, so awk sees the part after the whitespace as the program to execute and the next parameter (your actual awk code) as a file to open for reading.
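To make the word-splitting effect concrete, here is a minimal sketch. It assumes the sed-derived value keeps the leading space of the sample header line (` Number of signal records:  1000`), and `count_args` is a hypothetical helper introduced only for this demonstration; it just reports how many arguments a command would receive.

```shell
#!/bin/bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Header line as it appears in the sample input (note the leading space).
header=' Number of signal records:  1000'

# Same sed expression as the commented-out line in the script.
# The label is removed, but the leading space survives: " 1000".
BaseForAveraging=$(printf '%s\n' "$header" | sed 's/Number of signal records:  //g')

# Unquoted: "BaseForAveraging= 1000" is split at the space into two words,
# so awk would get an empty -v value, "1000" as its program text,
# and the real program as a file name to read.
unquoted=$(count_args -v BaseForAveraging=${BaseForAveraging})

# Quoted: the whole assignment stays one argument.
quoted=$(count_args -v "BaseForAveraging=${BaseForAveraging}")

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
```

This would also explain why an unquoted `echo $BaseForAveraging` prints a clean 1000 in both cases: the leading space is dropped by the same splitting.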

There are a lot of bad practices going on here. Please give us sample input and sample output and you can likely do this a lot cleaner with just 1 invocation of awk.

vgbraymond · October 27, 2014, 2:37am

Hi Neutronscott,
Thanks for your reply.

I have a data file, in which there are 48 repetitions of the following block of data.
In each block, there are 2 parts: "Direct signal" and "Cross-talk" (I have bolded them).
I then extract two sets of information (I underlined them) under only the "Direct signal" section.

Number of signal records
Data (1st column: time // 2nd column: data value)

Data consists of two columns. I divide each value in the second column by the number of signal records.
Therefore, for each block, I will get 2 columns of data. It is known that the 1st column (time) of each block is the same.
The last step is to add up the 2nd columns of every three consecutive blocks and tabulate the result. I will get 48/3 = 16 summed columns. Finally, I will put the first (time) column back in front.
This is what I am trying to implement.

By quoting the variables, do I have to replace every $var with "$var"?

Input:

% Created 16/09/14 At 16.04.46 < none > SIGNAL   "Direct signal, group   1     "

  Group 1 consists of:
     Wire 1 with label W at (x,y)=(0,0) and at 1600 V
 Number of signal records:  1000
 Units used: time in second, current in Ampere.
 .STIMULUS signal PWL
 + TIME_SCALE_FACTOR =  0.100E-11
 + VALUE_SCALE_FACTOR =  0.100E-11
 + (  0.00000000E+00   0.00000000E+00
 +     0.30000002E-09   0.00000000E+00
### 996 more rows of data here
 +     0.29940000E-06   0.00000000E+00
 +     0.29970002E-06   0.00000000E+00 )
% Created 16/09/14 At 16.04.46 < none > SIGNAL   "Cross-talk, group   1        "

  Group 1 consists of:
     Wire 1 with label W at (x,y)=(0,0) and at 1600 V
 Number of signal records:  1000
 Units used: time in second, current in Ampere.
 .STIMULUS signal PWL
 + TIME_SCALE_FACTOR =  0.100E-11
 + VALUE_SCALE_FACTOR =  0.100E-11
 + (  0.00000000E+00   0.00000000E+00
 +     0.30000002E-09   0.00000000E+00
### 996 more rows of data here
 +     0.29940000E-06   0.00000000E+00
 +     0.29970002E-06   0.00000000E+00 )

Output:

0.00000000E+00   0.00000000000E+00 ###14 more columns here  0.00000000000E+00 
0.30000002E+00   0.00000000001E+00 ###14 more columns here  0.00000000001E+00 
### 996 more rows of data here
0.29940000E+00   0.00000000005E+00 ###14 more columns here  0.00000000005E+00 
0.29970002E+00   0.00000000002E+00 ###14 more columns here  0.00000000002E+00

RudiC · October 27, 2014, 4:57am

You could try this to get at the (two in your sample file) data columns (unfortunately all zeroes in your sample file); I didn't quite understand what averaging you wanted to achieve, so I leave it up to you. Give it a try:

awk     '/Direct signal/        {L=1; RCnt++}
         /Cross-talk/           {L=0}
         /Number of/            {NoS=$NF}
         L && / \+ \(/          {SoL=NR; EoL=NR+NoS-1}
         NR<=EoL                {sub (/\)/,_); TM[NR-SoL]=$(NF-1); DT[NR-SoL,RCnt]=$NF/NoS}
         END                    {for (i=0; i<NoS; i++)
                                        {printf "%14.8E", TM[i]
                                         for (j=1; j<=RCnt; j++) printf "\t%14.8E", DT[i,j]
                                         printf "\n"
                                        }
                                }
        ' /tmp/SampleDataFile.pwl.txt
0.00000000E+00  0.00000000E+00  0.00000000E+00
3.00000020E-10  0.00000000E+00  0.00000000E+00
6.00000050E-10  0.00000000E+00  0.00000000E+00
9.00000020E-10  0.00000000E+00  0.00000000E+00
1.20000010E-09  0.00000000E+00  0.00000000E+00
1.50000000E-09  0.00000000E+00  0.00000000E+00
1.80000000E-09  0.00000000E+00  0.00000000E+00
2.10000020E-09  0.00000000E+00  0.00000000E+00
2.40000020E-09  0.00000000E+00  0.00000000E+00
2.70000000E-09  0.00000000E+00  0.00000000E+00
3.00000000E-09  0.00000000E+00  0.00000000E+00
3.30000030E-09  0.00000000E+00  0.00000000E+00
. . .

vgbraymond · October 28, 2014, 1:56am

Hi RudiC,

Thanks for your reply! Your code is far neater than mine. I have to spend some time on understanding it first.

Thanks,
Raymond

vgbraymond · November 19, 2014, 9:12pm

Thank you so much for the code you offered above. It is neat and very useful to me. I now understand it with the help of some online tutorials. awk is great.

Raymond

Chubler_XL · November 19, 2014, 10:24pm

Didn't the original description call for 17 columns of output:

0.00000000E+00   0.00000000000E+00 ###14 more columns here  0.00000000000E+00 
0.30000002E+00   0.00000000001E+00 ###14 more columns here  0.00000000001E+00

I can't see how the above awk code is doing what you describe.

vgbraymond · November 19, 2014, 10:43pm

I think the following line does that.
As awk scans through the lines, RCnt will eventually reach 17.

for (j=1; j<=RCnt; j++) printf "\t%14.8E", DT[i,j]

Chubler_XL · November 19, 2014, 11:37pm

OK, I see now. It may need some tweaking to do the summing of 3 consecutive blocks; I believe it will currently display all 48 blocks.
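One possible tweak, as a rough sketch only: post-process the table so that every three consecutive data columns are added together. It assumes the input has the time in column 1 and one column per "Direct signal" block after it; the function name `sum_groups`, the variable `G`, and the tiny test table are made up for illustration and are not from the posted code.

```shell
#!/bin/bash
# Sketch: collapse every G consecutive data columns into their sum.
# Assumed input layout: column 1 = time, columns 2..NF = one value per block.
sum_groups() {
    awk -v G="${1:-3}" '
    {
        printf "%14.8E", $1                  # keep the time column as is
        for (j = 2; j <= NF; j += G) {       # j = first column of each group
            total = 0
            for (k = j; k < j + G && k <= NF; k++) total += $k
            printf "\t%14.8E", total         # one summed column per group
        }
        printf "\n"
    }'
}

# Made-up table: a time column and six data columns (two groups of three).
result=$(printf '0 1 2 3 4 5 6\n1 10 20 30 40 50 60\n' | sum_groups 3)
echo "$result"
```

With 48 blocks in the real file this would give 16 summed columns plus the time column, i.e. the 17 columns asked for. The same inner loop could instead be moved into the END block of the awk program above, which would avoid the extra pass.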

vgbraymond · November 19, 2014, 11:42pm

Oh... you are right...