How to use the split command in the Unix shell with a condition

Hi all,
I have a file that I want to split into several files based on a condition. The file contains several records, and I want one record per file. Each record ends with a line containing //, so the file should be split at that marker. I want each split file to be named after the value in the ID field (for example, AACL2_BRAJA in the record given below). The original file should remain unchanged.

Input
  ID   AACL2_BRAJA             Reviewed;         334 AA.
AC   Q89GR3;
DT   30-NOV-2010, integrated into UniProtKB/Swiss-Prot.
DT   01-JUN-2003, sequence version 1.
DT   11-JUL-2012, entry version 42.
DE   RecName: Full=Amino acid--[acyl-carrier-protein] ligase 2;
RP   FUNCTION, CATALYTIC ACTIVITY, SUBSTRATE SPECIFICITY, COFACTOR, KINETIC
RP   PARAMETERS, AND SUBUNIT.
RC   STRAIN=USDA 110;
RX   PubMed=20663952; DOI=10.1073/pnas.1007470107;
RA   Mocibob M., Ivic N., Bilokapic S., Maier T., Luic M., Ban N.,
RA   Weygand-Durasevic I.;
RT   "Homologs of aminoacyl-tRNA synthetases acylate carrier proteins and
RT   provide a link between ribosomal and nonribosomal peptide synthesis.";
RL   Proc. Natl. Acad. Sci. U.S.A. 107:14585-14590(2010).
CC   -!- FUNCTION: Catalyzes the ATP-dependent activation of L-glycine and
CC       its transfer to the phosphopantetheine prosthetic group covalently
CC       attached to the vicinal carrier protein blr6284 of yet unknown
CC       function. May participate in nonribosomal peptide synthesis or
CC       related processes. L-alanine is a poor substrate whereas L-serine
MNLAIVEAPA DSTPPPADPL DHLADALFHE MGSPGVYGRT ALYEDVVERI AAVISRNREP
     NTEVMRFPPV MNRAQLERSG YLKSFPNLLG CVCGLHGIES EIDAAISRFD AGGDWTESLS
     PADLVLSPAA CYPLYPIAAS RGPVPAAGWS FDVAADCFRR EPSRHLDRLQ SFRMREFVCI
     GSADHVSAFR ERWIIRAQKI ARDLGLTFRI DHANDPFFGR VGQMMAVSQK QLSLKFELLV
     PLRSEERPTA CMSFNYHRDH FGTTWGIVDA AGEPAHTACV AFGMDRLAVA MFHTHGKDVA
     LWPIAVRDLL GLAQTDRGAP SAFEEYRCAK EAGS
//
Expected output
Split files named after their ID (in this case, AACL2_BRAJA.txt)
A new file started after each // line

Any help would be much appreciated. Thanks in advance.

I think this will do what you want:

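awk '
    # An ID line starts a new record: switch to a new output file.
    $1 == "ID" {
        if( fname )
            close( fname );     # release the previous record's file
        fname = $2 ".txt";      # e.g. AACL2_BRAJA.txt
    }
    # Write every line, including the closing //, to the current file.
    { print >fname; }
' input-file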

It assumes that "ID" appears as the first token only on the lines that identify the next file name, and that the file begins with such a line.
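If the input might contain lines before the first ID line, a slightly more defensive variant could look like the sketch below. It is only an illustration: sample.dat and the second record (DEMO2_TEST) are made-up test data, and it assumes each record ends with a line that is exactly //.

```shell
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-record sample; real entries have many more lines.
cat > sample.dat <<'EOF'
ID   AACL2_BRAJA             Reviewed;         334 AA.
AC   Q89GR3;
//
ID   DEMO2_TEST              Reviewed;         10 AA.
AC   X00000;
//
EOF

awk '
    # An ID line starts a new record: name the output file after field 2.
    $1 == "ID" { fname = $2 ".txt" }

    # Write the line only once a file name is known, so stray lines
    # before the first ID are skipped instead of causing an error.
    fname != "" { print > fname }

    # A line that is exactly "//" ends the record: close the file and
    # forget its name until the next ID line.
    fname != "" && $0 == "//" { close(fname); fname = "" }
' sample.dat

ls
```

awk only reads sample.dat here, so the main file stays as it is. Closing each file at // rather than at the next ID line keeps at most one output file open at a time.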


Thanks. Worked like a wonder.