Want to grep records from a file in alphabetical order and split them into separate files

Hi All,

I have one file containing thousands of table names in a single column. I want to split that file into multiple files, e.g. one file containing the table names starting with A, another containing those starting with B, and so on through Z.

I tried the command below, but it did not work.

for i in {A..Z}; do grep ^$i ; done< erg.txt

With this logic it only prints the tables starting with A and then stops, whereas it should print all of them in sequence; I could then add some logic to save them alphabetically. Please advise.

Here, erg.txt is the name of the main file. A sample looks like this:

user@86340-hostname:~$ cat erg.txt | head
ABD_DET
ABS_MSTR
ABSC_DET
ABSCC_DET
ABSD_DET
ABSI_MSTR
ABSL_DET
ABSP_DET
ABSPLI_REF
ABSR_DET
awk 'NF{if(f) close(f);f=substr($1,1,1)".txt";print >>f}' erg.txt

This proposal is nice and short and solves the problem, but if the input file (or large chunks of it) is sorted, it performs many unnecessary file open/close operations. Try a small adaptation:

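awk '
NF      {TMP=substr($1,1,1)".txt"            # target file name from the first character
         if (FN && FN != TMP) close (FN)     # close the old file only when the name changes
         FN=TMP
         print >> FN                         # append the whole line to the current file
        }
' file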

Thanks, it worked like magic!

Could you please explain the logic as well? I have never used FN in awk, so this will be a learning experience for me.

It extracts the first character of the first field in every non-empty line and stores it, with the string constant ".txt" appended, in a temporary string variable TMP. If this differs from the old file name held in the variable FN, the old file is closed. Then TMP is assigned to FN, and the entire line is appended to that file. FN is not a built-in awk variable, just a name chosen here for the current output file.
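To watch that logic at work, here is a small self-contained sketch. The scratch directory, the file name sample.txt, and the sample names in it are made up for illustration; only the awk program comes from the post above.

```shell
# Work in a throwaway directory so the generated files do not clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: two A names, one B name, a blank line, then another A name.
printf '%s\n' ABD_DET ABS_MSTR BAL_DET '' ACC_REF > sample.txt

awk '
NF {
    TMP = substr($1, 1, 1) ".txt"       # target file for this line
    if (FN && FN != TMP) close(FN)      # first letter changed: close the previous file
    FN = TMP
    print >> FN                         # append, so reopening A.txt keeps earlier lines
}
' sample.txt

cat A.txt
cat B.txt
```

With this input, A.txt should end up holding the three A names and B.txt the single B name; the blank line is skipped by the NF condition. Because the output is appended, rerunning the command in the same directory adds the lines again, so remove old output files first.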

BTW, a small adaptation of your own code snippet would have made it work:

for i in {A..Z}; do grep ^$i <erg.txt >$i.txt ; done

, although it would have opened and read erg.txt 26 times.

Plus, it would have created 26 files, A.txt to Z.txt, whether or not grep found any matching lines in erg.txt.
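The empty-file issue can be handled inside the loop itself. The sketch below is one possible way, not the only one; the scratch directory and the three sample names are assumptions for illustration. It also shows why the original attempt stopped after A: with "done < erg.txt" the whole loop shares a single standard input, the first grep reads it to the end, and the remaining 25 greps see nothing. Giving each grep its own read of the file avoids that.

```shell
# Scratch directory and sample data are assumptions for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' ABD_DET BAL_DET ABS_MSTR > erg.txt

# Letters are spelled out so the loop does not depend on bash brace expansion.
for i in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z; do
    # Passing erg.txt as an argument gives every grep a fresh read of the file.
    # grep exits non-zero when nothing matches, so the empty file is removed.
    grep "^$i" erg.txt > "$i.txt" || rm -f "$i.txt"
done

ls
```

Only A.txt and B.txt should remain next to erg.txt here. The file is still read 26 times, so for large inputs the awk approach is the cheaper one.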

Thanks guys! I appreciate your responses.