Problem with variable type

Dear All,

I am trying to write a script to calculate the geometric centre of selected residues of a protein structure. Say I have a PDB file (a text file containing the x, y, and z coordinates of the atoms of a protein). I want to extract the X coordinates of the atoms belonging to the residues of interest, get the max and min values of those X coordinates, and then calculate the X coordinate of the geometric centre as follows: X(cent) = (X(max) + X(min))/2. The same goes for Y(cent) and Z(cent). To do this I need something that reads the X coordinates of the atoms of interest from the X coordinate column of the PDB file and finds X(max) and X(min). The way I do this is to read these values and put them in a file (say temp.dat). Then I sort the file in ascending and descending order on the different columns (i.e., the x, y, and z columns) and extract the max and min for X, Y and Z, as you can see below:

****************************************************

#!/bin/csh
echo -n "name of file (full path; incl. extension):"
set INFILE = $<
echo -n "name of output file (full path):"
set OUTFILE = $<
echo -n "Specify the starting residue number:"
set START = $<
echo -n "Specify the last residue number:"
set END = $<
@ END ++
@ START --
#
#
#
#
awk ' ($1 == "ATOM" && ( $5 > '$START' ) && ( $5 < '$END' )){{printf("%10.3f%10.3f%10.3f\n", $6, $7, $8)}};' $INFILE >>! $OUTFILE
#
sort $OUTFILE | set Xmin = `head -1 | awk '{printf("%10.3f\n", $1)};'`

sort -n +1 $OUTFILE | set Ymin = `head -1 | awk '{printf("%10.3f\n", $2)};'`

sort -n +2 $OUTFILE | set Zmin = `head -1 | awk '{printf("%10.3f\n", $3)};'`

sort -r $OUTFILE | set Xmax = `head -1 | awk '{printf("%10.3f\n", $1)};'`

sort -rn +1 $OUTFILE | set Ymax = `head -1 | awk '{printf("%10.3f\n", $2)};'`

sort -rn +2 $OUTFILE | set Zmax = `head -1 | awk '{printf("%10.3f\n", $3)};'`

echo $Xmin $Xmax $Ymin $Ymax $Zmin $Zmax
exit 1
**************************************************

The problem is that I can't use Xmin, Xmax, Ymin, ... to calculate X(cent), Y(cent), and Z(cent) with the "expr" command. For example, the following command gives this error:

>expr $Xmin - $Xmin

non-numeric argument

I guess there is something wrong with these variables: they seem to be character strings, not numbers. I would appreciate any input.

Cheers, Siavoush

For starters, expr only works with integers, so a value with a decimal point such as 12.345 is rejected as a non-numeric argument.

I'm not a csh expert, but your syntax looks wrong. I would have expected that you need to do something like:

set Xmin = `sort -n $OUTFILE | head -1 | awk '{printf("%10.3f\n", $1)};'`

Your algorithm is extremely inefficient. You should not sort the file at all. Initialize things by setting your min to a very large number and your max to a very small number (negative, but with a large absolute value). Then loop reading each value. If the value is less than the current min, it becomes the new min. If the value is larger than the current max, it becomes the new max. You should be able to get all 6 extrema with one pass of the unsorted file.
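
As a rough sketch of that single pass, assuming the three-column file of x, y, z values that the awk extraction in the original script writes. The function name extrema is made up for illustration, the wrapper uses sh function syntax rather than csh, and instead of starting from a very large and a very small number it seeds the six values from the first data line, which is a small variation on the same idea:

```shell
#!/bin/sh
# extrema FILE: hypothetical helper, not part of the original script.
# Prints "xmin xmax ymin ymax zmin zmax" for a whitespace-separated
# file holding x, y, z in columns 1-3, reading the file only once.
extrema() {
    awk '
        NF < 3 { next }                  # skip blank or short lines
        !seen {                          # first data line seeds all six values
            for (i = 1; i <= 3; i++) { min[i] = $i + 0; max[i] = $i + 0 }
            seen = 1
            next
        }
        {
            for (i = 1; i <= 3; i++) {
                v = $i + 0               # force a numeric comparison
                if (v < min[i]) min[i] = v
                if (v > max[i]) max[i] = v
            }
        }
        END {
            if (seen)
                printf("%.3f %.3f %.3f %.3f %.3f %.3f\n", min[1], max[1], min[2], max[2], min[3], max[3])
        }
    ' "$1"
}
```

Calling extrema temp.dat prints the six extrema on one line, in the same order as the echo at the end of the original script, with no sorting involved.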

I would definitely switch languages, probably to C. You really want a language with built-in floating point support.

But you can do floating-point arithmetic by using bc. You need to echo in expressions and read the result. Try this command:
echo 1.5 + .5 | bc
to see what I mean.
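
Applied to the centre calculation, a minimal sketch might look like this. It is written in sh syntax for illustration, the function name midpoint is made up, and it assumes the centre is taken as half the sum of the two extremes:

```shell
#!/bin/sh
# midpoint MIN MAX: hypothetical helper, not part of the original script.
# Pipes the expression to bc, which does the decimal arithmetic that
# expr cannot.
midpoint() {
    # scale=3 tells bc to keep three decimal places in the division.
    echo "scale=3; ($1 + $2) / 2" | bc
}
```

For example, midpoint 10.000 20.000 prints 15.000. In csh the same idea should work with backticks, along the lines of set Xcent = `echo "scale=3; ($Xmin + $Xmax) / 2" | bc`.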

Dear Perderabo,

Many thanks for your help. The bc command worked for me and the script is running OK. If I have some free time, I will rework it in C according to your suggestions.

Cheers, Siavoush