Complicated SED search required

Hi All,

I'm trying to extract all the description fields from a MIB file which contains multiple instances of the following text:

        ENTERPRISE compaq
        VARIABLES  { sysName, cpqHoTrapFlags, cpqSsBoxCntlrHwLocation,
                     cpqSsBoxCntlrIndex, cpqSsBoxBusIndex, cpqSsBoxVendor,
                     cpqSsBoxModel, cpqSsBoxSerialNumber, cpqSsBoxFanStatus }
        DESCRIPTION
           "Storage System fan status change.

            The agent has detected a change in the Fan Status of a storage
            system.  The variable cpqSsBoxFanStatus indicates the current
            fan status.

            User Action: If the fan status is degraded or failed, replace
            any failed fans."

              --#TYPE "Fan Status Change (8026)"

Ideally I need to search through the document, and for each instance of 'DESCRIPTION' found, I need to extract the text found immediately after it in quotes, and join them onto a single line. So for example the above would produce something like:

Storage System fan status change. The agent has detected a change in the Fan Status of a storage system.  The variable cpqSsBoxFanStatus indicates the current fan status. User Action: If the fan status is degraded or failed, replace any failed fans.

Each instance should be on a separate line. I'm getting nowhere with this, as my grasp of sed/awk is basic to say the least.

If someone could help me out here, it would literally save me hours or even days in monotonous work!

tia.

Perhaps, as a start...

awk '
 { $1 = $1 }
 /DESCRIPTION/ { ORS = " "; P = 1; next }
 /"$/ && P { ORS = RS; sub( /"$/, "" ); print; P = 0 }
 { sub( /^"/, "" ) }
 P
' infile

Storage System fan status change.  The agent has detected a change in the Fan Status of a storage system. The variable cpqSsBoxFanStatus indicates the current fan status.  User Action: If the fan status is degraded or failed, replace any failed fans.

tr and sed can do the trick...

$ tr '\n' ' ' < t1 | sed -rn 's/.*(DESCRIPTION[^"]*")([^"]*)".*/\2/p;' | sed -rn 's/[ \t]+/ /gp'
Storage System fan status change. The agent has detected a change in the Fan Status of a storage system. The variable cpqSsBoxFanStatus indicates the current fan status. User Action: If the fan status is degraded or failed, replace any failed fans.

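Note: t1 is the file holding the MIB text shown above. Because tr turns the whole file into one line and the leading .* is greedy, this prints only the last DESCRIPTION when the file contains several.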
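For a file with several traps, a sed-only variant might look like the sketch below. It is only a rough idea, not a tested MIB parser: the function name extract_with_sed is made up for illustration, and it assumes that DESCRIPTION sits on its own line, that the quoted text starts on the following line, and that the closing quote is the last non-blank character of its line. Lines are collected in the hold space and joined when the closing quote shows up.

```shell
# Hypothetical helper: print each quoted DESCRIPTION text on one line.
# Assumes DESCRIPTION is alone on its line and the text ends with a quote
# at the end of a line.
extract_with_sed() {
    sed -n '
/DESCRIPTION/,/"[[:space:]]*$/ {
    # skip the keyword line itself
    /DESCRIPTION/d
    # append the current line to the hold space
    H
    /"[[:space:]]*$/ {
        # closing quote seen: fetch everything collected so far
        x
        s/\n/ /g
        s/"//g
        s/[[:space:]][[:space:]]*/ /g
        s/^ //
        s/ $//
        p
        # empty the buffer for the next DESCRIPTION
        s/.*//
        x
    }
}' "$@"
}

# Usage: extract_with_sed infile
```

Unlike the tr pipeline, this never loads the whole file as a single line, so each DESCRIPTION gets its own output line.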

If you have gawk (a multi-character RS is not portable to every awk):

awk 'BEGIN{RS="--#TYPE"}
/DESCRIPTION/{
 gsub(/.*DESCRIPTION/,"")
 gsub("\n"," ")
 gsub(/ +/," ")
 gsub(/^ *"|" *$/,"")
 print
}' file

Or line-by-line processing, without slurping the whole file:

awk '/--#TYPE/{if (f) print ""; f=0}
/DESCRIPTION/{f=1; next}
f{
    gsub(/ +/," ")
    printf "%s", $0
}' file

gawk '/DESCRIPTION/{lp=1;next}
/TYPE/{lp=0;next}
lp {gsub(/[ ][ ]*[\t]*/," ",$0); print}
' ORS=" "  file.txt

:D:D:D:D

A small modification so that the code prints each paragraph on a separate line:

gawk '/DESCRIPTION/{lp=1;next}
/--#TYPE/{if (lp) printf "\n"; lp=0; next}
lp {gsub(/[ ][ ]*[\t]*/," ",$0); print}
' ORS=" "  file.txt

What do you mean, a start? :D It seems to work really well, also for multiple MIB descriptions, and it leaves out the double quotes. I think I managed to shorten it a little:

awk '
  { $1 = $1 }
  /DESCRIPTION/ { p=" "; next }
  p { p=p" "$0; if (/"$/) {gsub(/ *"/,"",p); print p;p=""}}
' infile

This code prints the first DESCRIPTION paragraph, but if there are two or more paragraphs it still prints only the first.

Example input:

ENTERPRISE compaq
        VARIABLES  { sysName, cpqHoTrapFlags, cpqSsBoxCntlrHwLocation,
                     cpqSsBoxCntlrIndex, cpqSsBoxBusIndex, cpqSsBoxVendor,
                     cpqSsBoxModel, cpqSsBoxSerialNumber, cpqSsBoxFanStatus }
        DESCRIPTION
           "Storage System fan status change.

            The agent has detected a change in the Fan Status of a storage
            system.  The variable cpqSsBoxFanStatus indicates the current
            fan status.

            User Action: If the fan status is degraded or failed, replace
            any failed fans."

              --#TYPE "Fan Status Change (8026)"

DESCRIPTION
000000000000000000000000
3333333333333333333333
211111111111111111

--#TYPE i7i7i7i

The output is:

Storage System fan status change.  The agent has detected a change in the Fan Status of a storage system. The variable cpqSsBoxFanStatus indicates the current fan status.  User Action: If the fan status is degraded or failed, replace any failed fans.

So the code needs modification... it is a start, as scottn said.
BR

Hi Ahmad, scottn's code works well. In your example you left out the double quotes after the second DESCRIPTION, which are part of the format.
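Since the double quotes are part of the format, another option is to let awk split the input on them. This is only a sketch under that assumption (the function name extract_descriptions is invented here, and it would misbehave if a description contained an embedded double quote): with RS set to a double quote, the record that follows a record ending in DESCRIPTION is exactly the quoted text.

```shell
# Hypothetical helper: split the input on double quotes and print the
# record that comes right after a record ending in DESCRIPTION.
# Assumes descriptions contain no embedded double quotes.
extract_descriptions() {
    awk -v RS='"' '
        grab {
            # this record is the quoted text: join lines, squeeze whitespace
            gsub(/[ \t\n]+/, " ")
            sub(/^ /, "")
            sub(/ $/, "")
            print
        }
        # remember whether the NEXT record is a description
        { grab = ($0 ~ /DESCRIPTION[ \t\n]*$/) }
    ' "$@"
}

# Usage: extract_descriptions infile
```

A single-character RS is plain POSIX awk, so this should not need gawk. The --#TYPE strings are also quoted, but they are skipped because the record before them does not end in DESCRIPTION.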

You are right, man, my bad :p

Many, many thanks to everyone who replied. The first bit of code I tried worked a treat, so I've stuck with that for the moment:

awk '
  { $1 = $1 }
  /DESCRIPTION/ { p=" "; next }
  p { p=p" "$0; if (/"$/) {gsub(/ *"/,"",p); print p;p=""}}
' infile

I'll be looking through the rest of the given examples too - it's definitely a useful skill to have. You guys are real lifesavers!

Wow, I didn't realize hospitals were using SNMP in their cardio defibrillators!