Grepping text block by block using a for loop

Hi buddies,
I need your help once again.

I have a file containing groups of lines, each of which starts with one fixed pattern and ends with another fixed pattern.
I want to use these fixed starting and ending patterns to select the groups, one at a time.

The input file is as follows.

Hi welcome
blah blah blah
blah blah blah
Bye**
Hi welcome
blah  blah
blah  blah
blah  blah
blah  blah
blah  blah
Bye**
Hi welcome
blah 
blah 
Bye**
Hi welcome
blah blah blah
blah blah blah
blah blah blah
blah blah blah
blah blah blah
Bye**
Hi welcome
blah blah blah
Bye**

I tried using awk '/Hi welcome/,/Bye**/' inputfile.txt to select the text from "Hi welcome" to "Bye**". However, it selects the complete document, maybe because the whole document also starts with "Hi welcome" and ends with "Bye**". What I am trying to do is get it block by block (from "Hi welcome" to "Bye**" is one block) into a temp file, using a for loop.

Please help. It is a little urgent.

Thank you.
Anu.

Try this....

awk '{if($0 ~ /^Hi welcome/){ s=$0}else{if($0 !~ /Bye\*\*/){if(s != "") { s=s"\n"$0}}else{s=s"\n"$0; print s"\n"}}}' file > temp_file

Hi Pamu,
The solution is not working. The output file is generated, but its size is zero.

What is the output of this?

awk '{if($0 ~ /^Hi welcome/){ s=$0}else{if($0 !~ /Bye\*\*/){if(s != "") { s=s"\n"$0}}else{s=s"\n"$0; print s"\n"}}}' file

I don't think you can use awk to process parts of files in a loop, at least not without additional measures. Try this suggestion to create several .tmp files that you can loop through afterwards:

awk     '/^Hi welcome/ {++fn}
         {print > (fn ".tmp")}
         /Bye\*\*/ {close (fn".tmp")}
        ' infile

You could even leave out the /Bye.../ line if there are not too many files open at once...
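As an illustration of looping through the .tmp files afterwards, here is a minimal sketch. The scratch directory, the tiny sample input and the "has N lines" report are assumptions made up for the example; only the awk split itself comes from the suggestion above (with the output file name parenthesised, which some awk versions require).

```shell
# Work in a scratch directory so the numbered .tmp files do not clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-block sample; use your real input file instead.
printf '%s\n' 'Hi welcome' 'a' 'Bye**' 'Hi welcome' 'b' 'Bye**' > infile

# Split: every "Hi welcome" starts a new numbered file, "Bye**" closes it.
awk '/^Hi welcome/ { ++fn }
     fn            { print > (fn ".tmp") }
     /Bye\*\*/     { if (fn) close(fn ".tmp") }
    ' infile

# Loop over the pieces, one block per iteration.
count=0
for f in *.tmp; do
    count=$((count + 1))
    printf '%s has %s lines\n' "$f" "$(wc -l < "$f")"
done
```

Note that the shell expands *.tmp in lexical order (1.tmp, 10.tmp, 2.tmp, ...), so if the order of the blocks matters, count upwards and build the file names instead.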

Hi Pamu, once I pressed the "Enter" key, it returned to the $ prompt without printing any output on screen.

Hi RudiC, the solution that you have given is working fine, but I have a file with more than a million records to process, so it is difficult to follow your suggestion. Is there any better way of doing it?

Please help
Anu.

Well, here's a solution to use in a for loop (in bash!). It will not perform well on large files, as awk scans through the entire file on every iteration!

for ((i=1;i<=5;i++))
do
echo Block: $i
awk     '/^Hi welcome/ {++fn}
     {if (fn==blockno) print}
    ' blockno=$i test
done

Redirect the output if you're happy with the result.
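One way the rescanning might be trimmed is to let awk stop as soon as the wanted block has passed. The following is only a sketch under assumptions: get_block is a hypothetical helper name, the sample input is made up, and a POSIX while loop stands in for the bash for loop so that it also runs in plain sh.

```shell
# Hypothetical two-block sample; use your real input file instead.
infile=$(mktemp)
printf '%s\n' 'Hi welcome' 'a' 'Bye**' 'Hi welcome' 'b' 'c' 'Bye**' > "$infile"

# Print block number $1 of file $2 and stop reading once it has passed.
get_block() {
    awk -v blockno="$1" '
        /^Hi welcome/ { ++fn }     # a new block starts here
        fn > blockno  { exit }     # past the wanted block: stop scanning
        fn == blockno { print }    # inside the wanted block
    ' "$2"
}

i=1
while [ "$i" -le 2 ]; do
    echo "Block: $i"
    get_block "$i" "$infile"
    i=$((i + 1))
done
```

awk still starts from the top of the file for every block, so this only shortens each pass; on very large inputs a single-pass approach remains the cheaper option.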

Hi RudiC,

Using a for loop will take too much time, because awk reads the file from the start on every iteration. Instead of a for loop, we can use awk directly.

Try this:

awk '{if($0 ~ /^Hi welcome/){a++; s=$0}else{if($0 !~ /Bye\*\*/){if(s != "") { s=s"\n"$0}}else{s=s"\n"$0; print "Block : "a"\n"s"\n"}}}' file

Another approach:

awk '{if($0 ~ /Bye\*\*/){a++;s=s"\n"$0;print "Block : "a"\n"s"\n";s=""}else{if(s){s=s"\n"$0}else{s=$0}}}' file

If there are no escape sequences (backslashes) in the "blah blah blah" lines, try this (written in bash):

par=""
while IFS= read line; do
par="${par}${line}\n"
if [ "$line" = "Bye**" ]; then
 par=${par%\\n}
 echo -e "${par}" >tmpfile
 cat tmpfile # or do whatever you like with it
 par=""
fi
done <inputfile

--
Bye
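If the data may contain backslashes after all, a variant of the read loop above could use read -r and printf '%s' so that nothing is interpreted as an escape sequence. This is a sketch under assumptions: process_block is a hypothetical placeholder for whatever should happen to each block, and the sample input (with a literal \t in it) is made up.

```shell
# Hypothetical sample; the second line contains a literal backslash-t.
infile=$(mktemp)
printf '%s\n' 'Hi welcome' 'a\tb' 'Bye**' 'Hi welcome' 'c' 'Bye**' > "$infile"

# Hypothetical handler; replace its body with the real per-block work.
process_block() {
    printf 'block %s:\n' "$1"
    cat "$2"
}

tmpfile=$(mktemp)
n=0
par=""
while IFS= read -r line; do        # -r keeps backslashes as they are
    par="${par}${line}
"
    if [ "$line" = 'Bye**' ]; then
        n=$((n + 1))
        printf '%s' "$par" > "$tmpfile"   # written verbatim, no echo -e
        process_block "$n" "$tmpfile"
        par=""
    fi
done < "$infile"
```

A shell read loop handles one line at a time, so on a file with a million records it will likely be noticeably slower than a single awk pass.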

Dear RudiC,

Sorry, it may sound foolish, but may I know where to enter the input file name and what the output file name will be? Also, I am wondering where the "Bye**" pattern is placed in the script.

Sorry, the input filename I used is "test"; please replace it with your own. The output is sent to stdout; you can use the redirection ">" to send it to e.g. "tempfile" or whatever filename you like. We don't need the "Bye**" pattern if the input file is structured as you posted: "Bye**" immediately followed by the next "Hi welcome".
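If the input were less tidy, for example with stray lines between a "Bye**" and the next "Hi welcome", the end pattern would be needed again. A minimal sketch of that case, assuming a made-up sample file and a hypothetical helper name get_block_strict; the last line also shows the redirection to a file:

```shell
# Hypothetical sample with a stray line between the two blocks.
infile=$(mktemp)
printf '%s\n' 'Hi welcome' 'a' 'Bye**' 'stray line' 'Hi welcome' 'b' 'Bye**' > "$infile"

# Print block number $1 of file $2, from "Hi welcome" up to and including "Bye**".
get_block_strict() {
    awk -v blockno="$1" '
        /^Hi welcome/            { ++fn; inblock = 1 }   # block starts
        inblock && fn == blockno { print }               # only lines inside the wanted block
        /^Bye\*\*/               { if (fn == blockno) exit; inblock = 0 }   # block ends
    ' "$2"
}

# Send block 2 to a file instead of the screen.
get_block_strict 2 "$infile" > "${infile}.block"
```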

Wow Len, RudiC and Pamu,

Thanks for your efforts and prompt help which was badly needed.
Special thanks to Len, it worked exactly how I wanted it to work.

Thank you once again.
God bless you all
Take care
Anu.

@pamu: I recognize this and commented on it, but the requestor asked to supply the blocks into a for loop: