Problems with awk (fatal error) and paste (two variables into one column-by-column)

Hello, I have a script extracting columns of useful numbers from a data file, and manipulating the numbers with awk commands. I have problems with my script...

  1. There are two lines assigning a number to $BaseForAveraging. If I use the first line (currently commented out) and comment out the second one, a later awk command fails with the following error:
awk: cmd. line:1: fatal: cannot open file `{averaged=$3 / BaseForAveraging
                printf "%.10f \t %.14f \n",$2,averaged}' for reading (No such file or directory)

However, if I use the second line instead and leave the first one commented out, I do not see the above error.
With either of the two lines, I see the same number, 1000, when I echo $BaseForAveraging.

My question is... what's wrong with my first line?

  2. In the line assigning a value to $Data, I get the following error:
extract_direct_signal.sh: command substitution: line 23: syntax error near unexpected token `('
extract_direct_signal.sh: command substitution: line 23: `paste <(echo "$Data") <(echo "$Temp"))'

My question is ... what's the problem with the command?

I have included a data file that the script works with, so you can try running it if that is useful.
Many lines at the end are commented out because I want to set them aside until the above bugs are cleared.
I hope you could help me. Thank you very much for your time...

Raymond

===My code===

#!/bin/bash

#PWLFiles=($(ls $1/*.pwl))	#list the PWL file
PWLFile=SampleDataFile.pwl.txt

#for PWLFile in $PWLFiles
#do

	LineNumbers=$(grep -hn "Direct signal" $PWLFile | cut -d ":" -f1)	#Extract the Line numbers in which the word "Direct signal" is contained
	Data=

	for LineNumber in $LineNumbers
	do	
		echo $LineNumber

#		BaseForAveraging=$(sed -n "6p" $PWLFile | sed 's/Number of signal records:  //g')		
		BaseForAveraging=1000

		Temp=$(sed -n "$((LineNumber + 9)),+999p" $PWLFile | sed 's/ (//g' | awk -v BaseForAveraging=${BaseForAveraging} '{averaged=$3 / BaseForAveraging
		printf "%.10f \t %.14f \n",$2,averaged}')

		Data=$(paste <(echo "$Data") <(echo "$Temp"))

		echo "$Data"
	done

#	echo "$Data" | awk '{for (j=2;j<=NF;j=j+6)
#	{
#		for (i=j;i<j+6;i=i+2) total=total+$i
#	}
#	printf "%.10f \t %.14f \n", $1, total}{total=0}
#	for (num in total) printf "%.14f \t", num
#	delete total
#	printf "\n"' #> $PWLFile.signal
		

#	echo $PWLFile
#done

You need to quote every variable expansion. Likely your $BaseForAveraging ends up with whitespace in it, so the unquoted expansion is split into separate words; awk then sees the stray word as the program to execute and the next parameter (your actual awk code) as a file to open for reading.
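As a small illustration of the word splitting described above, the sketch below assumes (hypothetically) that the sed pipeline leaves a leading space in front of the number. The helper function `count_args` is made up for this example and is not part of the original script; it only reports how many arguments a command would receive.

```shell
#!/bin/bash
# Assumed value: what the sed pipeline might produce, with a leading space.
BaseForAveraging=" 1000"

# Hypothetical helper: print the number of arguments it was called with.
count_args() { echo "$#"; }

# Unquoted: the shell splits the expansion on whitespace, so the command
# sees two words, "BaseForAveraging=" and "1000".
unquoted_count=$(count_args BaseForAveraging=${BaseForAveraging})

# Quoted: the whole assignment stays one word, leading space included.
quoted_count=$(count_args BaseForAveraging="${BaseForAveraging}")

echo "unquoted: $unquoted_count argument(s)"
echo "quoted:   $quoted_count argument(s)"
```

If that assumption holds, awk would receive `BaseForAveraging=` as the -v assignment and `1000` as its program, which would fit the error message. Writing `-v BaseForAveraging="$BaseForAveraging"` keeps it as one argument.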

There are a lot of bad practices going on here. Please give us sample input and sample output and you can likely do this a lot cleaner with just 1 invocation of awk.

Hi Neutronscott,
Thanks for your reply.

I have a data file, in which there are 48 repetitions of the following block of data.
In each block, there are 2 parts: "Direct signal" and "Cross-talk".
I then extract two sets of information from the "Direct signal" section only.

  1. Number of signal records
  2. Data (1st column: time // 2nd column: data value)

Data consists of two columns. I divide each value in the second column by the number of signal records.
Therefore, for each block, I will get 2 columns of data. It is known that the 1st column (time) of each block is the same.
The last step is to add the 2nd columns of every three consecutive blocks up, and tabulate the result. I will get 48/3 = 16 "added" columns. I will finally insert a first column back.
This is what I am trying to implement.

By quoting the variables, do I have to replace every $var with "$var"?

Input:

% Created 16/09/14 At 16.04.46 < none > SIGNAL   "Direct signal, group   1     "

  Group 1 consists of:
     Wire 1 with label W at (x,y)=(0,0) and at 1600 V
 Number of signal records:  1000
 Units used: time in second, current in Ampere.
 .STIMULUS signal PWL
 + TIME_SCALE_FACTOR =  0.100E-11
 + VALUE_SCALE_FACTOR =  0.100E-11
 + (  0.00000000E+00   0.00000000E+00
 +     0.30000002E-09   0.00000000E+00
### 996 more rows of data here
 +     0.29940000E-06   0.00000000E+00
 +     0.29970002E-06   0.00000000E+00 )
% Created 16/09/14 At 16.04.46 < none > SIGNAL   "Cross-talk, group   1        "

  Group 1 consists of:
     Wire 1 with label W at (x,y)=(0,0) and at 1600 V
 Number of signal records:  1000
 Units used: time in second, current in Ampere.
 .STIMULUS signal PWL
 + TIME_SCALE_FACTOR =  0.100E-11
 + VALUE_SCALE_FACTOR =  0.100E-11
 + (  0.00000000E+00   0.00000000E+00
 +     0.30000002E-09   0.00000000E+00
### 996 more rows of data here
 +     0.29940000E-06   0.00000000E+00
 +     0.29970002E-06   0.00000000E+00 )

Output:

0.00000000E+00   0.00000000000E+00 ###14 more columns here  0.00000000000E+00 
0.30000002E+00   0.00000000001E+00 ###14 more columns here  0.00000000001E+00 
### 996 more rows of data here
0.29940000E+00   0.00000000005E+00 ###14 more columns here  0.00000000005E+00 
0.29970002E+00   0.00000000002E+00 ###14 more columns here  0.00000000002E+00 

You could try this to get at the (two in your sample file) data columns (unfortunately all zeroes in your sample file); I didn't quite understand what averaging you wanted to achieve, so I leave it up to you. Give it a try:

awk     '/Direct signal/        {L=1; RCnt++}
         /Cross-talk/           {L=0}
         /Number of/            {NoS=$NF}
         L && / \+ \(/          {SoL=NR; EoL=NR+NoS-1}
         NR<=EoL                {sub (/\)/,_); TM[NR-SoL]=$(NF-1); DT[NR-SoL,RCnt]=$NF/NoS}
         END                    {for (i=0; i<NoS; i++)
                                        {printf "%14.8E", TM[i]
                                         for (j=1; j<=RCnt; j++) printf "\t%14.8E", DT[i,j]
                                         printf "\n"
                                        }
                                }
        ' /tmp/SampleDataFile.pwl.txt
0.00000000E+00  0.00000000E+00  0.00000000E+00
3.00000020E-10  0.00000000E+00  0.00000000E+00
6.00000050E-10  0.00000000E+00  0.00000000E+00
9.00000020E-10  0.00000000E+00  0.00000000E+00
1.20000010E-09  0.00000000E+00  0.00000000E+00
1.50000000E-09  0.00000000E+00  0.00000000E+00
1.80000000E-09  0.00000000E+00  0.00000000E+00
2.10000020E-09  0.00000000E+00  0.00000000E+00
2.40000020E-09  0.00000000E+00  0.00000000E+00
2.70000000E-09  0.00000000E+00  0.00000000E+00
3.00000000E-09  0.00000000E+00  0.00000000E+00
3.30000030E-09  0.00000000E+00  0.00000000E+00
. . .

Hi RudiC,

Thanks for your reply! Your code is far neater than mine. I have to spend some time on understanding it first.

Thanks,
Raymond

Thank you so much for offering the code above. It is so neat and useful for me. I have understood it with the help of some online tutorials. awk is great.

Raymond

Didn't the original description call for 17 columns of output:

0.00000000E+00   0.00000000000E+00 ###14 more columns here  0.00000000000E+00 
0.30000002E+00   0.00000000001E+00 ###14 more columns here  0.00000000001E+00 

I can't see how the above awk code is doing what you describe.

I think the following line does.
As awk scans through the lines, RCnt will increase up to 17 finally.

for (j=1; j<=RCnt; j++) printf "\t%14.8E", DT[i,j]

OK, I see now. It may need some tweaking to do the summing of 3 consecutive blocks; I believe it will currently display all 48 blocks.

Oh... you are right...
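One possible way to do that tweak, sketched as a separate post-processing filter rather than a change to the awk program itself. It assumes the per-block table (time in column 1, then one column per "Direct signal" block) is piped in. The function name `sum_groups` and its group-size parameter are made up for this illustration.

```shell
#!/bin/bash
# sum_groups N: read a table whose first column is the time and whose
# remaining columns are per-block values; print the time followed by the
# sum of every N consecutive value columns (N defaults to 3).
sum_groups() {
    awk -v G="${1:-3}" '
    {
        printf "%s", $1                         # keep the time column unchanged
        for (j = 2; j <= NF; j += G) {          # j is the first column of each group
            total = 0
            for (k = j; k < j + G && k <= NF; k++)
                total += $k                     # add up the columns in this group
            printf "\t%14.8E", total
        }
        printf "\n"
    }'
}

# Tiny made-up example: one time value and six block values.
printf '0.0 1 2 3 4 5 6\n' | sum_groups 3
```

The example line prints the time followed by two sums (6 and 15). With 48 block columns and a group size of 3, the same filter would give 16 summed columns after the time column, e.g. by appending `| sum_groups 3` to the awk command.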