Using awk to extract data between text patterns

Hello all,
I have a file (filename.txt) with some data (in two columns X and Y) which looks like this:

##########
'Header1'
'Sub-header1'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

'Sub-header2'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

'Sub-header3'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

#######
'Header2' 
'Sub-header1'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

'Sub-header2'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

'Sub-header3'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

...and so on...

The three 'Sub-headers' under each header are the same every time. What I want is to extract the data that sits between the 'Sub-headers'. What I am doing right now is applying the following command:

awk '/Sub-header1/ {getline;getline}{j++}j==1{flag=1;next} /Sub-header2/ {i++}i==1{flag=0} flag {print}' filename.txt > ofile.txt

I am using {getline;getline} to skip the 'Sub-header1' and 'X Y' lines. It does skip those two lines, but it also prints 'Header1' (this is something I really don't get) along with the data I wanted.
The reason I want just the data is that I want to plot it with Python (but that's another story). I would also like to get rid of the blank line at the bottom of the data set I am extracting. I tried using the blank line (\/n) as the second pattern instead of 'Sub-header2', but it didn't work.
I've been told not to "abuse" the getline command, since it can give unexpected results unless you really understand what it does. I also found the option of using 'c&&!--c;/Sub-header1/ {c=3} etc... to skip to the third line after the pattern (Sub-header1), but this gives me something even more unexpected.
Hopefully someone has followed me to this point :),
Thank you very much!

What do you actually want to print?
The following prints all sections following /Sub-header1/; it stops printing when it meets an empty line, /^$/:

awk '/Sub-header1/ {getline;getline;flag=1} /^$/ {flag=0} flag {print}' filename.txt

Without getline:

awk '/Sub-header1/ {flag=1;c=3} /^$/ {flag=0} flag && !(c && --c) {print}' filename.txt
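
As for why your original command prints 'Header1': the bare {j++} has no pattern in front of it, so it runs on every line and j counts lines, not matches. j==1 is therefore true on the very first line of the file (the ########## line), which sets flag before 'Header1' is even read. A minimal sketch to see this, assuming a cut-down sample file (sample.txt and its numbers are made up for the illustration):

```shell
# Hypothetical cut-down sample file, modelled on the layout in the question.
cat > sample.txt <<'EOF'
##########
'Header1'
'Sub-header1'
X Y
1.00 2.000
3.00 4.000

'Sub-header2'
X Y
5.00 6.000
EOF

# {j++} has no pattern, so it runs for every record: j is a line counter.
trace=$(awk '{j++} {print j ": " $0}' sample.txt | head -n 3)
printf '%s\n' "$trace"

# First line printed by the command from the question: flag was already
# set on line 1, so line 2 (the header) is printed before any data.
first=$(awk '/Sub-header1/ {getline;getline}{j++}j==1{flag=1;next} /Sub-header2/ {i++}i==1{flag=0} flag {print}' sample.txt | head -n 1)
printf 'first printed line: %s\n' "$first"
```

Attaching the counter to the pattern, as in /Sub-header1/ && ++n==1, makes it count matches instead of lines.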

Thanks for your reply. What I want to print is the data that follows the first 'Sub-header1', up to just before 'Sub-header2'; that's why I added the counters {j++}j==1. (I will then modify it to print the data between the second 'Sub-header1' and 'Sub-header2' into a second file, by changing j==1 to j==2.) I tried the line you gave me, and I see what it does: it prints all the sets of data between these patterns together. I will now try adding my counters to see if I get what I wanted.
Thanks,

Below is your example; my awk script prints the lines marked with <this:

##########
'Header1'
'Sub-header1'
X                    Y
xxxx.xx       yyyy.yyy   <this
xxxx.xx       yyyy.yyy   <this
....                 ... <this

'Sub-header2'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

'Sub-header3'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

#######
'Header2' 
'Sub-header1'
X                    Y
xxxx.xx       yyyy.yyy   <this
xxxx.xx       yyyy.yyy   <this
....                 ... <this

'Sub-header2'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

'Sub-header3'
X                    Y
xxxx.xx       yyyy.yyy
xxxx.xx       yyyy.yyy
....                 ...

Thanks for the explanation. Now, what can I do if I want to print only the first set of lines marked with <this, or only the second set?
Thanks again!

This prints the 2nd occurrence:

awk '/Sub-header1/ && ++n==2 {flag=1; c=3} /^$/ {flag=0} flag && !(c && --c) {print}' filename.txt

You also can give the search criteria as additional arguments:

awk '$0~search && ++n==num {flag=1; c=3} /^$/ {flag=0} flag && !(c && --c) {print}' search="Sub-header1" num=2 filename.txt
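
Since the goal is one output file per data set, a possible extension is to let n number the sets and redirect each one to its own file in a single pass. This is only a sketch under assumptions: the names sample.txt, set1.txt and set2.txt and the numbers in the sample are made up for the illustration.

```shell
# Hypothetical sample file with two headers, modelled on the question.
cat > sample.txt <<'EOF'
##########
'Header1'
'Sub-header1'
X Y
1.00 2.000
3.00 4.000

'Sub-header2'
X Y
5.00 6.000

#######
'Header2'
'Sub-header1'
X Y
7.00 8.000
9.00 10.000

'Sub-header2'
X Y
11.00 12.000
EOF

awk '
# Start of the n-th matching set: arm the flag and the countdown.
$0 ~ search { n++; flag = 1; c = 3 }
# An empty line ends the current set; close its output file.
/^$/ { if (flag) close("set" n ".txt"); flag = 0 }
# Skip the sub-header and X Y lines, then write the data to set<n>.txt.
flag && !(c && --c) { print > ("set" n ".txt") }
' search="Sub-header1" sample.txt

cat set2.txt
```

The parentheses around the file name expression keep the redirection portable across awk implementations.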

Thank you so much, I spent hours yesterday trying to figure this out myself! Using awk is fun and simplifies a lot of work when you know how to use it, but in the meantime it can be painful after some hours of trial and error.
Thanks!

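Instead of counting the occurrences, you could restrict the match to the section under a given header.
This time I broke the awk code into multiple lines, which IMHO is more readable.
I have also introduced next, which skips the remaining rules and starts a new cycle with the next input line.
That means c=2, not 3: the sub-header line itself no longer reaches the countdown, so only the X Y line has to be skipped.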

awk '$0~header {n=1; next}
n && $0~subheader {n=0; flag=1; c=2; next}
/^$/ {flag=0}
flag && !(c && --c) {print}
' header="Header2" subheader="Sub-header1" filename.txt
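
A side note on the countdown used in these commands, since the c&&!--c form mentioned earlier in the thread behaves differently from !(c && --c) and that may explain the unexpected output. A minimal sketch, assuming made-up five-line input:

```shell
# Five numbered lines as stand-in input (hypothetical data).
input=$(printf 'line1\nline2\nline3\nline4\nline5\n')

# c && !--c is true exactly once: on the line where the countdown hits 0.
only_third=$(printf '%s\n' "$input" | awk 'NR==1 {c=3} c && !--c')

# !(c && --c) is false while the countdown runs and true on every line
# from the one where it reaches 0 onwards.
from_third=$(printf '%s\n' "$input" | awk 'NR==1 {c=3} !(c && --c)')

printf 'c && !--c prints:\n%s\n' "$only_third"
printf '!(c && --c) prints:\n%s\n' "$from_third"
```

So c&&!--c picks out a single line after the match, while !(c && --c) combined with flag prints everything from that line until flag is cleared.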