[BASH] Problem with a sed -n statement

Hey all,

So I've been banging my head against this for a few hours and I can't see what's wrong with it. Each part seems to work fine on its own when entered at the command line, but it falls down when pulled together.

I'm writing a script to translate fractional atomic coordinates to Cartesian coordinates. The maths itself is simple, but the input format is giving me trouble. The code should read in a .xyz file, which has 2 comment lines (here the atom count and a title) and then one line per atom in the form atom x y z, like so

48
Optimisation video
H 1.0000 1.0000 1.0000
C 1.2000 1.3000 1.3333
.
.
.
48
Optimisation Video
H 1.2000 1.2000 1.2000
C 2.0000 2.0000 2.0000
.
.
.

This block (two comment lines, then the atom lines) is repeated for a number of frames ($NoFrames), building up an animation of how the atoms move.

Ultimately I need to pull out each atom line and disregard the comment lines. I thought what I'd written (below) should do that, but for some reason my sed -n statement always pulls out the comment lines as well:

# $NoFrames and $NoAtoms are defined earlier in the code and do not change
#
# Start of the systematic loop, outer deals with frames, inner with atoms
#
FrameNumber=0 # Used to know which line the sed statement should strip out
FrameCounter=1 # Keeps a track of which frame we're analysing
AtomCounter=1 # Keeps a track of which atom (within each frame) we're analysing
while [[ $FrameCounter -le $NoFrames ]]; do
        while [[ $AtomCounter -le $NoAtoms ]]; do
                let "AdjustedAtomCounter=(($FrameNumber*($NoAtoms+2))+($AtomCounter+2))" # This line should allow me to skip the first two lines per frame (the comment lines)
                sed -n "${AdjustedAtomCounter}p" *.xyz
                let "AtomCounter=$AtomCounter+1"
        done
        echo $NoAtoms >> Cartesian.xyz
        echo "Cartesian GeomOpt Video" >> Cartesian.xyz
        let FrameCounter=FrameCounter+1
        let FrameNumber=FrameNumber+1
        let AtomCounter=1
done
rm *.txt
exit

I've stripped all the code back to this. If I can get this part to work, I can send the values off to be processed and then written to the output file (Cartesian.xyz). Any help would be greatly appreciated. I hope I've made the problem clear, but please tell me if I need to elaborate on anything.

Cheers!

From your text, it seems that given input like this:

comment
comment
data
data
data
data
comment
comment
data
data
data
data

You want an output file (Cartesian.xyz) to contain:

Cartesian GeomOpt Video
transformed data
transformed data
transformed data
transformed data
Cartesian GeomOpt Video
transformed data
transformed data
transformed data
transformed data

I also assume that the transformation can be applied to each line independently (i.e. you don't need the whole frame's worth of data to do the transformation). If that is true, might I suggest something along the lines of the following script to read and transform the data. Looking at your code, it seems to me you are doing a lot of unnecessary re-reading of the input file: each pass through the loop starts again from the top of the file to skip past what has already been processed.

#!/usr/bin/env ksh

# converts coordinates from one form to another and
# echoes the transformed values (the arithmetic is only a placeholder)
function transform
{ 
    echo "$(( $1 * 300 ))  $(( $2 * 200 )) $(( $3 /42 ))"
}

natoms=5       # constant number of atoms/frame
nframes=2       # number of frames in the input data
data=t37.data   # name of the data file (my test file was t37.data)

for (( j = 0; j < nframes; j++ ))         # for each frame
do
    read junk       # read and ignore comment lines 
    read junk

    echo "Cartesian GeomOpt Video"  # this appears to mark the start of each frame in the new file
    for (( i = 0; i < natoms; i++ ))     # for each atom in the frame
    do
        read atom_letter x y z           # read its data 
        echo "$atom_letter $(transform $x $y $z)"   # transform and write
    done
done <$data >Cartesian.xyz   # all echoes in the loop write to Cartesian.xyz

Again, this was just a guess at what your goal is, and a big assumption that each atom's coords can be translated independently of the others in the frame.
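
If that assumption does not hold and a whole frame has to be handed over at once, the two header lines of each frame can still be dropped in a single pass by line-number arithmetic. A minimal sketch, assuming every frame is exactly natoms + 2 lines long; strip_headers is a made-up helper name, and awk is used here rather than sed:

```shell
#!/usr/bin/env bash
# Print only the atom lines of a multi-frame .xyz file, reading it once.
# Assumption: every frame is exactly natoms + 2 lines long.
strip_headers() {
    local natoms=$1 file=$2
    # (NR - 1) % (natoms + 2) is the 0-based position within the frame;
    # positions 0 and 1 are the two header lines, so they are skipped.
    awk -v n="$natoms" '(NR - 1) % (n + 2) >= 2' "$file"
}

# Demo on a throwaway two-frame file with two atoms per frame.
demo=$(mktemp)
printf '%s\n' 2 "Optimisation video" "H 1.0000 1.0000 1.0000" "C 1.2000 1.3000 1.3333" \
              2 "Optimisation video" "H 1.2000 1.2000 1.2000" "C 2.0000 2.0000 2.0000" > "$demo"
strip_headers 2 "$demo"    # prints the four atom lines only
rm -f "$demo"
```

The modulo test does the same job as the AdjustedAtomCounter formula in the question, but without starting a new sed for every atom.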

As for the transformation, I stuck in a dummy function to illustrate it. I don't know if you have an external programme that needs to be called, or if it's a function in your script that computes the transformation. Regardless, the assumption that I made was that the process/function echoes the changed x, y, z values, which can then be echoed by the statement in the inner for loop.
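
Since the thread is tagged bash, here is a rough bash port of the same loop. Bash arithmetic is integer-only, so this sketch hands the placeholder maths to awk. The factors are the same dummy values as in the ksh script, and transform and convert_frames are illustrative names, not anything from your script. Unlike the ksh version, it also writes the atom-count line before each frame title, as the loop in the question does.

```shell
#!/usr/bin/env bash
# Placeholder transform: bash cannot do floating-point arithmetic itself,
# so awk does it. The factors 300, 200 and 42 are dummy values.
transform() {
    awk -v x="$1" -v y="$2" -v z="$3" \
        'BEGIN { printf "%.4f %.4f %.4f\n", x * 300, y * 200, z / 42 }' </dev/null
}

# Read frames from stdin and write the converted frames to stdout.
convert_frames() {
    local natoms=$1 nframes=$2 j i junk atom x y z
    for (( j = 0; j < nframes; j++ )); do
        read -r junk                     # first header line, ignored
        read -r junk                     # second header line, ignored
        echo "$natoms"
        echo "Cartesian GeomOpt Video"
        for (( i = 0; i < natoms; i++ )); do
            read -r atom x y z           # one atom line
            echo "$atom $(transform "$x" "$y" "$z")"
        done
    done
}
```

It would be called as convert_frames "$NoAtoms" "$NoFrames" < input.xyz > Cartesian.xyz, where input.xyz stands in for the real file name. Starting one awk per atom is slow on long trajectories, so for large files a single awk script over the whole input may be the better choice.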

Hope this makes some sense, and that I'm not too far off the mark.

With regard to your sed statement:

sed -n "${AdjustedAtomCounter}p" *.xyz

In this case sed treats all the input files as one concatenated input stream, so line numbers carry on from one file to the next. If you print line 2, for example, you get only line 2 of the first file; if the files are each 100 lines long, the second line of file 2 is line 102 to sed.
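
A small demonstration of that numbering, using throwaway files in a temporary directory (the file names and the nth_line helper are made up for the example). Note also that the output file in the question, Cartesian.xyz, would itself match *.xyz once it exists, so naming the one input file explicitly looks like the safer option.

```shell
#!/usr/bin/env bash
# Show that sed numbers lines across all of its input files as one stream.
dir=$(mktemp -d)
printf '%s\n' a1 a2 a3 > "$dir/a.xyz"
printf '%s\n' b1 b2 b3 > "$dir/b.xyz"

sed -n '2p' "$dir"/*.xyz        # prints a2 only
sed -n '5p' "$dir"/*.xyz        # prints b2: line 2 of b.xyz is line 5 of the stream

# Naming a single input file keeps the line numbers unambiguous.
nth_line() {
    sed -n "${1}p" "$2"
}
nth_line 2 "$dir/b.xyz"         # prints b2
rm -rf "$dir"
```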

Agama, Scrutinizer, thanks for your posts. The bulk of the program involves stripping away ASCII formatting from the output file of a scientific package and is all written in bash, so I'll translate that concept into bash and get back to you if there are any major issues. Thanks very much for taking the time to help!