Columns in text file merged, how to add a space in between? (coordinates)

Hello!

I have a problem with a text file containing atom coordinates. After a conversion step in some software, the structure of the text file got a little messed up: some columns merged because the numbers got too large. My text file has 119,000 lines like:

 POPC  [number]  [x-coordinate]  [y-coordinate]  [z-coordinate]  1.00  0.00  C

If the text file looks normal (see below), I can use "awk" to extract each coordinate column (columns 3, 4 and 5) into a new text file and process them further in Excel. (I calculate the center of mass of all atoms.)
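
For reference, an awk extraction of this kind could look roughly like the sketch below. The file names sample.txt and xyz.txt are placeholders chosen for illustration, and the approach only works while the coordinates are separated by whitespace.

```shell
# Two well-formed lines taken from the sample in this post.
cat > sample.txt <<'EOF'
 POPC  204     -40.966  -5.183  -0.747  1.00  0.00           C
 POPC  205      -0.942  15.531  -1.471  1.00  0.00           C
EOF

# awk splits each line on runs of whitespace, so fields 3, 4 and 5
# are the x, y and z coordinates.
awk '{ print $3, $4, $5 }' sample.txt > xyz.txt
cat xyz.txt
```

On a line where two columns have merged, awk sees one field fewer, so $4 and $5 no longer hold y and z.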

Ordinary part of the text file (note that there are spaces between the coordinates, that's what I need):

 POPC  204     -40.966  -5.183  -0.747  1.00  0.00           C   
 POPC  205      -0.942  15.531  -1.471  1.00  0.00           C   
 POPC  206      -3.101   9.362   0.404  1.00  0.00           C   
 POPC  207     -54.098 -18.036  -0.198  1.00  0.00           C   
 POPC  208     -19.860  -6.746  -0.441  1.00  0.00           C   
 POPC  209      18.190   6.699  -2.129  1.00  0.00           C   
 POPC  210     -34.468  12.623  -0.444  1.00  0.00           C   
 POPC  211     -49.057  16.806   3.756  1.00  0.00           C   
 POPC  212     -48.038   0.885   1.784  1.00  0.00           C   
 POPC  213     -19.421   4.976   1.459  1.00  0.00           C   
 

As I already mentioned, some lines merged and can look like this:

 POPC    1      24.021  43.473-103.486  1.00  0.00           C   
 POPC    2      36.132 -62.075 -97.909  1.00  0.00           C   
 POPC    3      90.078  38.715-103.089  1.00  0.00           C   
 POPC    4      14.122  41.935-101.416  1.00  0.00           C   
 POPC    5      24.143 -49.821-100.995  1.00  0.00           C   

The number of digits before the merged columns also varies from line to line, so a simple positional approach does not work:

 POPC   78      91.105   5.477-100.390  1.00  0.00           C   
 POPC    1      24.021  43.473-103.486  1.00  0.00           C   
 POPC  203     -21.043 104.805-100.465  1.00  0.00           C

Does anyone know how to either insert a space in the right place or still extract the numbers, even though they are no longer separated into columns?
I am new to UNIX, and I can already see myself entering spaces by hand for 5-10 hours.

Additional Information:
-The numbers after "POPC" run from 1 to 239, repeated 501 times.
-sample_coordinates.txt attached
-The order of the lines must not change (only within each repeating POPC 1-239 block does the order not matter)
-This is no homework or course assignment.

Thank you very much!! :)

Hi, try:

sed 's/\([0-9]\)-/\1 -/g' file

or simply:

sed 's/-/ -/g' file

Thank you so much, it looks like it worked! I really need to learn the sed command, it's so powerful!
Do you just add a space before the last "-" in the line? I'm curious about how this worked.

Hi, you're welcome :)

The first one replaces every digit that is directly followed by a minus sign with that same digit (using the back reference \1), a space and a minus sign.

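The second one just puts a space before every minus sign in the file, including those that already have one in front of them. The extra spaces do no harm, since awk splits on any run of whitespace.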
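
To see both commands at work, here is a small sketch that applies them to three lines from the question and then extracts the coordinates with awk. The file names (merged.txt, fixed.txt and so on) are made up for this example.

```shell
# Three lines taken from the question; the first and third have merged y/z columns.
cat > merged.txt <<'EOF'
 POPC    1      24.021  43.473-103.486  1.00  0.00           C
 POPC    2      36.132 -62.075 -97.909  1.00  0.00           C
 POPC  203     -21.043 104.805-100.465  1.00  0.00           C
EOF

# Variant 1: insert a space only where a digit is directly followed by "-".
sed 's/\([0-9]\)-/\1 -/g' merged.txt > fixed.txt

# With the columns separated again, awk can pull out x, y and z.
awk '{ print $3, $4, $5 }' fixed.txt > xyz_fixed.txt
cat xyz_fixed.txt

# Variant 2: put a space before every "-"; awk ignores the extra spaces,
# so the extracted coordinates are the same.
sed 's/-/ -/g' merged.txt | awk '{ print $3, $4, $5 }' > xyz_fixed2.txt
```

Both variants rely on the second of the two merged numbers being negative, which is the case in all the sample lines above. Two positive numbers that run together would have no minus sign between them to mark the boundary.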