Hello all, newbie here. I've searched the forum and found many "how to split a text file" topics, but none that match what I'm looking for.
I have a large text file (~15 MB). It contains a variable number of "paragraphs" (for lack of a better word), each of variable length. A paragraph might be 2 lines long, 2000 lines long, or anything in between. Each paragraph begins with the same string of text on its first line and is preceded by a blank line. There may also be blank lines scattered within a paragraph. The "paragraph start" string appears ONLY at the start of each paragraph and never anywhere else.
I need a script that will read this huge text file and save each paragraph as a separate text file with a unique name.
For example, if our big file contains:
Paragraph start. sdfgsdfgsdfggggggggggggggggggggggggggggggggg
dddddddddddddfgsddddddddddddddddddddddddddd
gfdsssssssssssssssssssssssssssssssssssssssssssssssss
33333333333333333333333333333333333333333333333
Paragraph start. gfdsdfgsdfgsdfgsdfdssssssssssssssssssssssssssssffffff
fgfdssssssssssssssssssssssssssssssssssssssssssssssss
gfdsssssssssssssssssssssssssssssssssssssssssssssssss
gfdsssssssssssssssssssssssssssssssssssssssssssssssss
gfdsdrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
gsssssssssssssssssssssssssssssssssssssssssssssssssss
kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll
Paragraph start. gfdsdfggggggggggggggggggggggggggggggggggggg
5555555555555555555555555555555555555555555555
I need it to read this big file, and produce the following separate text files:
Output file 1:
Paragraph start. sdfgsdfgsdfggggggggggggggggggggggggggggggggg
dddddddddddddfgsddddddddddddddddddddddddddd
gfdsssssssssssssssssssssssssssssssssssssssssssssssss
33333333333333333333333333333333333333333333333
Output file 2:
Paragraph start. gfdsdfgsdfgsdfgsdfdssssssssssssssssssssssssssssffffff
fgfdssssssssssssssssssssssssssssssssssssssssssssssss
gfdsssssssssssssssssssssssssssssssssssssssssssssssss
gfdsssssssssssssssssssssssssssssssssssssssssssssssss
gfdsdrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
gsssssssssssssssssssssssssssssssssssssssssssssssssss
kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll
Output file 3:
Paragraph start. gfdsdfggggggggggggggggggggggggggggggggggggg
5555555555555555555555555555555555555555555555
It seems like a simple problem, but it is beyond the reach of my modest shell scripting skills.
Thanks in advance!
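
One possible approach, offered as a minimal sketch rather than a definitive answer: let awk start a new output file every time it sees a line beginning with the marker. This assumes the marker is the literal text "Paragraph start." at the very beginning of a line, as in the example above. The function name split_paragraphs and the paragraph_0001.txt naming scheme are made up for illustration and can be changed freely.

```shell
#!/bin/sh
# Sketch: write each "paragraph" of a big text file to its own file.
# Assumptions (not from the original post): the marker is the literal
# text "Paragraph start." at the start of a line, and output files are
# named paragraph_0001.txt, paragraph_0002.txt, ... (hypothetical names).

split_paragraphs() {
    # $1 = input file, $2 = output directory (created if missing)
    mkdir -p "$2" || return 1
    awk -v dir="$2" '
        /^Paragraph start\./ {
            # A new paragraph begins: close the previous file so we
            # never hold many files open at once, then pick a new name.
            if (out != "") close(out)
            n++
            out = sprintf("%s/paragraph_%04d.txt", dir, n)
        }
        # Copy every line (blank ones included) to the current file.
        # Lines before the first marker are skipped.
        out != "" { print > out }
    ' "$1"
}
```

Usage would look like `split_paragraphs bigfile.txt out`, where bigfile.txt and out are placeholder names. Because blank lines are copied through unchanged, the blank line that precedes each paragraph ends up at the end of the previous paragraph's file; if that matters, it would need to be trimmed separately.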