Using awk to extract text between two constant strings

Hi,

I have a file from which I need to extract data between two constant strings.
The data looks like this:

Line 1 SUN> read db @cmpd unit 60
Line 2 Parameter: CMPD -> "C00071"
Line 3
Line 4 SUN> generate
Line 5 tabint>ERROR: (Variable data)

The data I need to extract is Line 2, but only when Line 1 is "SUN> read db @cmpd unit 60"
and Line 5 starts with "tabint>ERROR:".

The following can't print Line 2; moreover, I specifically need the start and end strings to be on lines i and i+4.

awk '/SUN> read db @cmpd unit 60/,/tabint>ERROR/ {print}' database_energy.out > tf4

Any help will be greatly appreciated.

-Manali

In other words, what I want to do is the following:

if row number "i" = /SUN> read db @cmpd unit 60/
and row number "i+4" = /tabint>ERROR:/
print row number "i+1"

IF the file REALLY has "Line 1", etc. in it:

awk '{
       arr[$1 $2] = $0                      # key: "Line 1" becomes "Line1"
       if ($1 $2 == "Line5" &&
           arr["Line1"] ~ /SUN> read db @cmpd unit 60/ &&
           arr["Line5"] ~ /tabint>ERROR:/)
           {
               print arr["Line2"]
           }
     }' inputfilename

arr[$1 $2] uses the concatenation of the first two fields as the array key, so "Line 1" becomes "Line1".
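
A quick way to see the concatenation is to print the key on its own. This is only a minimal sketch: the input is the first sample line from the question, and the shell variable name "key" is just for illustration.

```shell
# Adjacent expressions in awk are concatenated, so $1 $2 joins "Line" and "1".
key=$(printf 'Line 1 SUN> read db @cmpd unit 60\n' | awk '{ print $1 $2 }')
printf '%s\n' "$key"
```

Note that there is no separator between the two fields; writing $1, $2 in a print statement would insert OFS (a space by default) instead.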

I'm sorry, I only wrote those line numbers to explain what I wanted to do.
What I would like to do is:

  if row number "i" = /SUN> read db @cmpd unit 60/
  and row number "i+4" = /tabint>ERROR:/
  print row number "i+1"

Try this:

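awk '
/^SUN> read db @cmpd unit 60/ { c = 1; next }   # row i: start counting
c {
  c++
  if (c == 2) var = $0                          # row i+1: remember it
  if (c == 5) {
    if (/^tabint>ERROR:/) print var             # row i+4: check the end string
    c = 0                                       # stop counting either way
  }
}' file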

Regards

awk '/SUN> read db @cmpd unit 60/ {c = 3; getline; s = $0 }
!c-- && /tabint>ERROR:/ { print s }' input

Use nawk or /usr/xpg4/bin/awk on Solaris.
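
If the same script has to run on more than one system, the choice of awk can be made at run time. This is only a sketch: the variable names AWK and result are my own, and the fallback order is an assumption, not something Solaris requires.

```shell
# Prefer a POSIX-capable awk; fall back to whatever "awk" is on PATH.
if [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
elif command -v nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

# Sanity check that the selected awk runs.
result=$("$AWK" 'BEGIN { print 1 + 1 }')
printf '%s gives %s\n' "$AWK" "$result"
```

The scripts in this thread can then be called as "$AWK" '...' input instead of awk '...' input.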

[n]awk '/SUN> read db @cmpd unit 60/,/tabint>ERROR:/ {
     if (last_line == "SUN> read db @cmpd unit 60")
        print 
     last_line = $0
}' file

Thanks to all who replied; however, only Radoulov's script worked.
Radoulov, can you please help me understand your script?

Yes.

/SUN> read db @cmpd unit 60/ {c = 3; getline; s = $0 }

If the current record matches the pattern "SUN> read db @cmpd unit 60",
set the variable c to 3, read the next record with getline (which replaces $0), and save it in the variable s.

!c-- && /tabint>ERROR:/ { print s }

If c is zero (the test uses the value of c before the decrement) and the current record matches the pattern "tabint>ERROR:", print s.

If the expression !c-- is not clear,
consider this:

% cat file
Line 1 SUN> read db @cmpd unit 60
Line 2 Parameter: CMPD -> "C00071"
Line 3
Line 4 SUN> generate
Line 5 tabint>ERROR: (Variable data)

% awk '/SUN> read db @cmpd unit 60/ { 
print "c is:", c, "$0 is:", $0; c = 3; getline; s = $0 }
{ print "c is:", c, "$0 is:", $0; c-- }' file
c is:  $0 is: Line 1 SUN> read db @cmpd unit 60
c is: 3 $0 is: Line 2 Parameter: CMPD -> "C00071"
c is: 2 $0 is: Line 3
c is: 1 $0 is: Line 4 SUN> generate
c is: 0 $0 is: Line 5 tabint>ERROR: (Variable data)

In this case:

!c-- && /tabint>ERROR:/

The first expression (left side of the logical AND) is evaluated for every record (so c is decremented).
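
The post-decrement can also be looked at in isolation, without any input file. A minimal sketch (the variable r and the labels in the output are only for illustration):

```shell
# r holds the value the pattern would test; c is shown after the decrement.
out=$(awk 'BEGIN {
    c = 1
    r = !c--; print "test:", r, "c afterwards:", c
    r = !c--; print "test:", r, "c afterwards:", c
}')
printf '%s\n' "$out"
```

The first test is 0 because c was still 1 when it was read; the second is 1 because c was 0 at that point, and c is -1 afterwards. That is why the test succeeds exactly once, on the fourth record after the start line.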

shamrock's solution works well on my box, and my solution also works fine.
Maybe it is a different awk version.

Regards
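
For what it's worth, the i / i+4 / i+1 rule can also be written without getline by keeping the last five records in a small circular buffer. This is only a sketch, not a replacement for the scripts above: it assumes the real file has no "Line N" labels, and the names buf, sample and out are my own.

```shell
# Sample data from the question, without the explanatory "Line N" labels.
sample=$(mktemp)
cat > "$sample" <<'EOF'
SUN> read db @cmpd unit 60
Parameter: CMPD -> "C00071"

SUN> generate
tabint>ERROR: (Variable data)
EOF

out=$(awk '
    { buf[NR % 5] = $0 }                       # remember the last five records
    NR >= 5 && /^tabint>ERROR:/ &&
    buf[(NR - 4) % 5] ~ /^SUN> read db @cmpd unit 60/ {
        print buf[(NR - 3) % 5]                # row i+1
    }
' "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

When the current record is row i+4, buf[(NR - 4) % 5] still holds row i and buf[(NR - 3) % 5] holds row i+1, so both conditions are checked on the same record.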

I have a variant of the same question. I have searched a number of threads, but it does not seem to have been covered.

Any input, with an explanation, would be helpful.
Sample Input
-------------
Record 1
Field 1: Data for field 1
Field 2: Data for field 2
Field 3: Data for field 3
Field 4: Data for field 4

Record 2
Field 1: Data for field 1
Field 2: Data for field 2
Field 4: Data for field 4

Record 3
Field 1: Data for field 1
Field 3: Data for field 3
Field 4: Data for field 4

---------------------------
Expected Output (The new inserts are in CAPS)
---------------------------
Record 1
Field 1: Data for field 1
Field 2: Data for field 2
Field 3: Data for field 3
Field 4: Data for field 4

Record 2
Field 1: Data for field 1
Field 2: Data for field 2
FIELD 3: <NULL>
Field 4: Data for field 4

Record 3
Field 1: Data for field 1
FIELD 2: <NULL>
Field 3: Data for field 3
Field 4: Data for field 4

Logic:
If a line starts with "Field N" and the next "Field" line is not "Field N+1"
    insert "FIELD N+1: <NULL>" between them
endif

Is the maximum number of fields per record (4 in your example) fixed?
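
If it is, one possible approach is to track the next expected field number and print a placeholder whenever a number is skipped. This is only a sketch under several assumptions: exactly 4 fields per record, fields always in ascending order, and records separated by blank lines. The names max, expect, inrec and fill are my own, and user-defined functions need nawk, /usr/xpg4/bin/awk, gawk or similar.

```shell
# Sample input from the post above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Record 1
Field 1: Data for field 1
Field 2: Data for field 2
Field 3: Data for field 3
Field 4: Data for field 4

Record 2
Field 1: Data for field 1
Field 2: Data for field 2
Field 4: Data for field 4

Record 3
Field 1: Data for field 1
Field 3: Data for field 3
Field 4: Data for field 4
EOF

out=$(awk -v max=4 '
# Print a placeholder for each missing field from "expect" up to "upto".
function fill(upto) {
    while (inrec && expect <= upto)
        printf "FIELD %d: <NULL>\n", expect++
}
/^Record/        { fill(max); inrec = 1; expect = 1; print; next }
/^Field [0-9]+:/ { n = $2 + 0; fill(n - 1); print; expect = n + 1; next }
/^$/             { fill(max); inrec = 0; print; next }
                 { print }
END              { fill(max) }
' "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

$2 + 0 turns "3:" into the number 3, and fill(max) at a blank line, at the next "Record" line, or at end of input covers fields missing from the end of a record.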