I am trying to update an older program on a small cluster. It uses individual files to send jobs to each node. However, the newer database comes as one large file containing over 10,000 records, so I need to split it into one file per record. It looks like this:
HMMER3/b [3.0 | March 2010]
NAME 1-cysPrx_C
ACC PF10417.4
DESC C-terminal domain of 1-Cys peroxiredoxin
LENG 40
ALPH amino
RF no
CS yes
MAP yes
.....more data...
0.00103 6.88015 * 0.61958 0.77255 0.00000 *
//
HMMER3/b [3.0 | March 2010]
NAME 120_Rick_ant
ACC PF12574.3
DESC 120 KDa Rickettsia surface antigen
LENG 255
ALPH amino
RF no
CS no
MAP yes
DATE Tue Sep 27 11:43:56 2011
NSEQ 7
... etc..
Each record starts with HMMER3/b and ends with //.
I would like each individual file to be named after the ACC field, such as PF10417.4 or PF10417 (it doesn't matter whether the part after the . is kept).
Any clues?
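One possible approach, as a minimal sketch in Python rather than a tested solution: read the file line by line, collect lines until the // terminator, and write the collected record to a file named after its ACC value. The function name split_hmm_records, the keep_version flag, and the .hmm extension are all hypothetical choices made for illustration. The sketch assumes every record has an ACC line and silently ignores any trailing lines that are not closed by //.

```python
from pathlib import Path


def split_hmm_records(db_path, out_dir, keep_version=True):
    """Write each '//'-terminated record to its own file named after ACC.

    Assumption: every record contains a line whose first field is ACC.
    Returns the list of paths written, in file order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    record = []  # lines of the record currently being read
    acc = None   # accession of the current record, once seen

    with open(db_path) as handle:
        for line in handle:
            record.append(line)
            fields = line.split()
            if len(fields) >= 2 and fields[0] == "ACC":
                acc = fields[1]
            elif line.strip() == "//":
                # End of record: flush it to disk under its accession.
                if acc is None:
                    raise ValueError("record without an ACC line")
                name = acc if keep_version else acc.split(".")[0]
                target = out_dir / (name + ".hmm")  # extension is an assumption
                target.write_text("".join(record))
                written.append(target)
                record, acc = [], None
    return written
```

Calling split_hmm_records("big_database.hmm", "split") (both names are placeholders) would give files such as split/PF10417.4.hmm, while keep_version=False would drop the suffix and give split/PF10417.hmm. Because only one record is held in memory at a time, the size of the full database should not be a problem.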