Dear All,
I would like to split a file of the following format into multiple files based on the number in the 6th column (numbers 1, 2, 3...):
ATOM 1 N GLY A 1 -3.198 27.537 -5.958 1.00 0.00 N
ATOM 2 CA GLY A 1 -2.199 28.399 -6.617 1.00 0.00 C
ATOM 3 C GLY A 1 -2.168 29.706 -5.855 1.00 0.00 C
ATOM 4 O GLY A 2 -3.205 30.358 -5.782 1.00 0.00 O
ATOM 5 H1 GLY A 2 -3.280 26.649 -6.428 1.00 0.00 H
ATOM 6 HA2 GLY A 3 -1.220 27.923 -6.579 1.00 0.00 H
ATOM 7 HA3 GLY A 3 -2.492 28.588 -7.649 1.00 0.00 H
ATOM 8 N SER A 3 -1.051 30.010 -5.194 1.00 0.00 N
ATOM 9 CA SER A 4 -1.141 30.319 -3.777 1.00 0.00 C
ATOM 10 C SER A 4 0.107 31.009 -3.229 1.00 0.00 C
ATOM 11 O SER A 5 1.081 31.273 -3.935 1.00 0.00 O
ATOM 12 CB SER A 5 -1.242 29.003 -2.978 1.00 0.00 C
ATOM 13 OG SER A 5 -2.210 28.079 -3.427 1.00 0.00 O
ATOM 14 H SER A 5 -0.165 29.571 -5.395 1.00 0.00 H
ATOM 15 HA SER A 5 -2.021 30.936 -3.581 1.00 0.00 H
ATOM 16 HB2 SER A 6 -0.271 28.504 -2.981 1.00 0.00 H
ATOM 17 HB3 SER A 6 -1.481 29.244 -1.942 1.00 0.00 H
I would like to get separate, overlapping files based on the 6th column, where file N contains every line whose 6th column is N or N+1:
File 1:
ATOM 1 N GLY A 1 -3.198 27.537 -5.958 1.00 0.00 N
ATOM 2 CA GLY A 1 -2.199 28.399 -6.617 1.00 0.00 C
ATOM 3 C GLY A 1 -2.168 29.706 -5.855 1.00 0.00 C
ATOM 4 O GLY A 2 -3.205 30.358 -5.782 1.00 0.00 O
ATOM 5 H1 GLY A 2 -3.280 26.649 -6.428 1.00 0.00 H
File 2:
ATOM 4 O GLY A 2 -3.205 30.358 -5.782 1.00 0.00 O
ATOM 5 H1 GLY A 2 -3.280 26.649 -6.428 1.00 0.00 H
ATOM 6 HA2 GLY A 3 -1.220 27.923 -6.579 1.00 0.00 H
ATOM 7 HA3 GLY A 3 -2.492 28.588 -7.649 1.00 0.00 H
ATOM 8 N SER A 3 -1.051 30.010 -5.194 1.00 0.00 N
File 3:
ATOM 6 HA2 GLY A 3 -1.220 27.923 -6.579 1.00 0.00 H
ATOM 7 HA3 GLY A 3 -2.492 28.588 -7.649 1.00 0.00 H
ATOM 8 N SER A 3 -1.051 30.010 -5.194 1.00 0.00 N
ATOM 9 CA SER A 4 -1.141 30.319 -3.777 1.00 0.00 C
ATOM 10 C SER A 4 0.107 31.009 -3.229 1.00 0.00 C
File 4:
ATOM 9 CA SER A 4 -1.141 30.319 -3.777 1.00 0.00 C
ATOM 10 C SER A 4 0.107 31.009 -3.229 1.00 0.00 C
ATOM 11 O SER A 5 1.081 31.273 -3.935 1.00 0.00 O
ATOM 12 CB SER A 5 -1.242 29.003 -2.978 1.00 0.00 C
ATOM 13 OG SER A 5 -2.210 28.079 -3.427 1.00 0.00 O
ATOM 14 H SER A 5 -0.165 29.571 -5.395 1.00 0.00 H
ATOM 15 HA SER A 5 -2.021 30.936 -3.581 1.00 0.00 H
File 5:
ATOM 11 O SER A 5 1.081 31.273 -3.935 1.00 0.00 O
ATOM 12 CB SER A 5 -1.242 29.003 -2.978 1.00 0.00 C
ATOM 13 OG SER A 5 -2.210 28.079 -3.427 1.00 0.00 O
ATOM 14 H SER A 5 -0.165 29.571 -5.395 1.00 0.00 H
ATOM 15 HA SER A 5 -2.021 30.936 -3.581 1.00 0.00 H
ATOM 16 HB2 SER A 6 -0.271 28.504 -2.981 1.00 0.00 H
ATOM 17 HB3 SER A 6 -1.481 29.244 -1.942 1.00 0.00 H
I would be very grateful if you could please write me a few lines of bash/awk/sed/csplit code that goes through the file and outputs multiple files. The file format given above (PDB) is used to describe 3D protein structures.
I thank you for your help in advance.
Thanks,
Tomas
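
One possible approach, as a minimal sketch rather than a definitive answer: read the file twice with awk, first to find the smallest and largest value in the 6th column, then to write each line into the file for its own number and the file for the number before it. This reproduces the overlapping layout in the example (file N holds numbers N and N+1). The function name split_windows, the default output prefix file_ and the .pdb suffix are made up for illustration. The sketch assumes the columns are whitespace-separated as in the sample (real PDB files are fixed-width, so columns can run together), and that the 6th column holds consecutive integers.

```shell
#!/bin/sh
# Hypothetical helper: split_windows INPUT [PREFIX]
# Writes PREFIX<N>.pdb containing the lines whose 6th column is N or N+1.
split_windows() {
    input=$1
    prefix=${2:-file_}   # assumed default output prefix

    # The input is passed twice: NR == FNR is true only during the first read.
    awk -v prefix="$prefix" '
        NR == FNR {
            # First pass: record the smallest and largest 6th-column value.
            r = $6 + 0
            if (NR == 1 || r < min) min = r
            if (NR == 1 || r > max) max = r
            next
        }
        {
            # Second pass: a line with number r belongs to window r-1
            # (as its second half) and to window r (as its first half).
            r = $6 + 0
            if (r > min) print > (prefix (r - 1) ".pdb")
            if (r < max) print > (prefix r ".pdb")
        }
    ' "$input" "$input"
}
```

Calling split_windows input.pdb (input.pdb being a placeholder name) on the sample above should produce file_1.pdb through file_5.pdb. Each output file stays open until awk exits, so for structures with many hundreds of residues some awk implementations may hit their open-file limit; in that case the finished files would need to be closed with close() as the number advances.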