Hi all,
I have a file that I want to split into several files based on a condition. The file contains several records, and I want one record per file. Each record ends with a line containing only //, so the file should be split at that marker. Each split file should be named after the value in the ID field (for example, AACL2_BRAJA in the record given below). I also want the main file to be left as it is.
Input
ID AACL2_BRAJA Reviewed; 334 AA.
AC Q89GR3;
DT 30-NOV-2010, integrated into UniProtKB/Swiss-Prot.
DT 01-JUN-2003, sequence version 1.
DT 11-JUL-2012, entry version 42.
DE RecName: Full=Amino acid--[acyl-carrier-protein] ligase 2;
RP FUNCTION, CATALYTIC ACTIVITY, SUBSTRATE SPECIFICITY, COFACTOR, KINETIC
RP PARAMETERS, AND SUBUNIT.
RC STRAIN=USDA 110;
RX PubMed=20663952; DOI=10.1073/pnas.1007470107;
RA Mocibob M., Ivic N., Bilokapic S., Maier T., Luic M., Ban N.,
RA Weygand-Durasevic I.;
RT "Homologs of aminoacyl-tRNA synthetases acylate carrier proteins and
RT provide a link between ribosomal and nonribosomal peptide synthesis.";
RL Proc. Natl. Acad. Sci. U.S.A. 107:14585-14590(2010).
CC -!- FUNCTION: Catalyzes the ATP-dependent activation of L-glycine and
CC its transfer to the phosphopantetheine prosthetic group covalently
CC attached to the vicinal carrier protein blr6284 of yet unknown
CC function. May participate in nonribosomal peptide synthesis or
CC related processes. L-alanine is a poor substrate whereas L-serine
MNLAIVEAPA DSTPPPADPL DHLADALFHE MGSPGVYGRT ALYEDVVERI AAVISRNREP
NTEVMRFPPV MNRAQLERSG YLKSFPNLLG CVCGLHGIES EIDAAISRFD AGGDWTESLS
PADLVLSPAA CYPLYPIAAS RGPVPAAGWS FDVAADCFRR EPSRHLDRLQ SFRMREFVCI
GSADHVSAFR ERWIIRAQKI ARDLGLTFRI DHANDPFFGR VGQMMAVSQK QLSLKFELLV
PLRSEERPTA CMSFNYHRDH FGTTWGIVDA AGEPAHTACV AFGMDRLAVA MFHTHGKDVA
LWPIAVRDLL GLAQTDRGAP SAFEEYRCAK EAGS
//
Expected output
Split files named after their ID (in this case, AACL2_BRAJA.txt)
Files separated wherever there is a //
Any help would be much appreciated. Thanks in advance.
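One possible approach, as a minimal Python sketch rather than a definitive solution: read the input line by line, take the second whitespace-separated field of the ID line as the file name, and write out the collected lines each time a // line is reached. The function name split_records and the output_dir parameter are made up for illustration. The sketch assumes that every record has an ID line before its //, and that the ID is safe to use as a file name. The input is only read, never written, so the main file stays as it is.

```python
from pathlib import Path


def split_records(input_path, output_dir="."):
    """Write each '//'-terminated record to <ID>.txt and return the paths written.

    split_records and output_dir are hypothetical names used for this sketch.
    """
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    record_lines = []  # lines of the record currently being read
    record_id = None   # ID of the record currently being read

    # The input file is opened read-only, so it is left unchanged.
    with open(input_path) as handle:
        for line in handle:
            record_lines.append(line)

            # Assumption: the ID is the second field of the first "ID" line.
            fields = line.split()
            if record_id is None and len(fields) > 1 and fields[0] == "ID":
                record_id = fields[1]

            # A line containing only "//" closes the current record.
            if line.strip() == "//":
                if record_id is not None:
                    out_path = out_dir / f"{record_id}.txt"
                    out_path.write_text("".join(record_lines))
                    written.append(out_path)
                # A record without an ID line is skipped, not written.
                record_lines = []
                record_id = None

    # Lines after the last "//" (an unterminated record) are not written.
    return written
```

Calling split_records("input.txt", "split") (both names are placeholders) would write split/AACL2_BRAJA.txt for the record above. If two records share the same ID, the later one overwrites the earlier file, so that case would need extra handling.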