awk - remove block of text, multiple actions for 'if', inline edit

mglenney · October 26, 2010, 5:52pm

I'm having a couple of issues. I'm trying to edit a Nagios config and remove a host definition if a certain "host_name" is found. My plan is to find the host definition block containing the host_name I'm looking for and output the line numbers of its first and last lines. Using set, I will assign those to variables in my bash script, then use another awk to write all lines before the start of the block and all lines after the end of the block to a new file, and replace the current conf with that output file.

A snippet of the Nagios conf file looks like this:

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-1-1-1
    alias        XMPP_ec2-184-1-1-1
    address        ip-10-10-10-10.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-2-2-2
    alias        XMPP_ec2-184-2-2-2
    address        ip-10-20-20-20.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-204-1-1-1
    alias        XMPP_ec2-204-1-1-1
    address        ip-10-30-30-30.us-west-1.compute.internal
    }

So, I want to remove the block containing "host_name" of "XMPP_ec2-184-2-2-2" and the blank line after it. I've come up with this one-liner which will output the line number of "define host{" and the line number of the line after the closing curly brace for that block:

awk -v hostname="XMPP_ec2-184-2-2-2" '/define host/ {startblock=NR}; {if ($1 == "host_name" && $2 ~ hostname) foundblock=1}; \
{if (/^\t\}/ && foundblock == 1) printf "%s %s\n", startblock, NR+1} {if (/^\t\}/ && foundblock == 1) foundblock=0}' nagiosconfigfile.cfg

which outputs:

8 14

This is the correct output. My questions are:

I have to put in "if (/^\t\}/ && foundblock == 1)" twice because I want it to output the line numbers and also set "foundblock=0". How can I perform multiple actions if the condition is true?
or, since I'm confident the host will only be defined once, how do I output and stop parsing the file so I don't have to reset "foundblock" to 0?
I plan on using "set --" to output that 8 and 14 to 2 variables within a bash script. Then I will go back through the file and output all lines but those with:
awk -v startrow="$start" -v endrow="$end" '{if (NR<startrow || NR>endrow) print}' nagiosconfigfile.cfg > newconfigfile.cfg
Do I have to go through this or is there a way to edit the file inline like you can with sed?
Lastly, is there a better way to do this?

ctsgnb · October 26, 2010, 6:32pm

echo `tr '\n' '|' <tst` | xargs -n10 echo | grep -v 'host_name XMPP_ec2-184-2-2-2' | tr '|' '\n'

[ctsgnb@shell ~]$ cat tst
define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-1-1-1
    alias        XMPP_ec2-184-1-1-1
    address        ip-10-10-10-10.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-2-2-2
    alias        XMPP_ec2-184-2-2-2
    address        ip-10-20-20-20.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-204-1-1-1
    alias        XMPP_ec2-204-1-1-1
    address        ip-10-30-30-30.us-west-1.compute.internal
    }
[ctsgnb@shell ~]$ echo `tr '\n' '|' <tst` | xargs -n10 echo | grep -v 'host_name XMPP_ec2-184-2-2-2' | tr '|' '\n'
define host{
 use linux-server,host-pnp
 host_name XMPP_ec2-184-1-1-1
 alias XMPP_ec2-184-1-1-1
 address ip-10-10-10-10.us-west-1.compute.internal

}

define host{
 use linux-server,host-pnp
 host_name XMPP_ec2-204-1-1-1
 alias XMPP_ec2-204-1-1-1
 address ip-10-30-30-30.us-west-1.compute.internal

}

[ctsgnb@shell ~]$

With a bit more formatting

[ctsgnb@shell ~]$ echo `tr '\n' '|' <tst` | xargs -n10 echo | grep -v 'host_name XMPP_ec2-184-2-2-2' | tr '|' '\n' | grep -vE '^$' | sed -e 's/^ /      /'
define host{
        use linux-server,host-pnp
        host_name XMPP_ec2-184-1-1-1
        alias XMPP_ec2-184-1-1-1
        address ip-10-10-10-10.us-west-1.compute.internal
}
define host{
        use linux-server,host-pnp
        host_name XMPP_ec2-204-1-1-1
        alias XMPP_ec2-204-1-1-1
        address ip-10-30-30-30.us-west-1.compute.internal
}
[ctsgnb@shell ~]$

agama · October 26, 2010, 6:32pm

Use curly braces to group a set of statements to execute when the expression evaluates to true:

if( hostname == "foo" )
{
    print hostname;
    found = 1;
}

You can use the exit statement to stop reading input early (any END block still runs).

if( hostname == "foo" )
{
    print hostname;
    found=1;
    exit( 0 );              # exit good; use non-zero to exit bad
}
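
Putting both points together, here is a minimal sketch of how the original line-number one-liner could look with a single grouped action and an early exit. It assumes the closing brace sits on its own line, possibly indented; the whitespace-tolerant pattern and the exact-match test on $2 are changes from the original, not something the poster wrote.

```shell
# Sample data: the three host blocks from the question, in a scratch file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-1-1-1
    alias        XMPP_ec2-184-1-1-1
    address        ip-10-10-10-10.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-2-2-2
    alias        XMPP_ec2-184-2-2-2
    address        ip-10-20-20-20.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-204-1-1-1
    alias        XMPP_ec2-204-1-1-1
    address        ip-10-30-30-30.us-west-1.compute.internal
    }
EOF

range=$(awk -v hostname="XMPP_ec2-184-2-2-2" '
    /define host/ { startblock = NR }                     # remember where each block starts
    $1 == "host_name" && $2 == hostname { foundblock = 1 }
    /^[ \t]*[}]/ && foundblock {                          # closing brace of the wanted block
        print startblock, NR + 1                          # two statements, one condition
        exit                                              # stop reading; no need to reset foundblock
    }
' "$cfg")
echo "$range"
rm -f "$cfg"
```

A pattern in front of a braced action does the same job as the if, so the condition only has to be written once.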

Finally, I'd have written the programme to do it all in one pass:

 awk -v remove=XMPP_ec2-184-2-2-2 '
        /define host/ {                 # start of new block; print last section
                if( snarfed )
                        print snarfed;

                snarfed = $0;           # start capture
                next;
        }

        /host_name/ {                   # check to see if unwanted target
                if( $2 == remove )      # if it is then
                        snarfed = "";   # unset to trash the whole block
                next;
        }
        snarfed {                       # if we are snarfing from current section
                snarfed = snarfed "\n" $0;      # add next line
                next;
        }
        END {                           # must do one last for the last buffered section
                if( snarfed )
                        print snarfed;
        }
' config-file >new-file
mv config-file config-file.bak
mv new-file config-file

You can pass the host name as a variable to make it more flexible. This also preserves lines from the original file that aren't 'host definition' sections -- don't know if that's a requirement or not.
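
On the in-place question: POSIX awk has no equivalent of sed -i, so the usual route is a temporary file. A minimal sketch of that pattern follows; the two-line sample file and the '!/drop/' filter are placeholders for the real config and the real awk program. GNU awk 4.1 and later also ships an inplace extension (gawk -i inplace '...' file), but that is gawk-only.

```shell
cfg=$(mktemp)      # stand-in for the real config file
tmp=$(mktemp)      # scratch output
printf 'keep me\ndrop me\n' > "$cfg"

# Touch the original only if awk finished successfully.
if awk '!/drop/' "$cfg" > "$tmp"; then
    cp -p "$cfg" "$cfg.bak"     # keep a backup copy
    cat "$tmp" > "$cfg"         # overwrite the contents, keeping owner and mode
fi
rm -f "$tmp"

result=$(cat "$cfg")
backup=$(cat "$cfg.bak")
echo "$result"
rm -f "$cfg" "$cfg.bak"
```

Writing back with cat rather than mv keeps the original file's permissions and ownership, which can matter for a config file read by a daemon.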

Hope this helps

Scrutinizer · October 26, 2010, 7:24pm

awk '!/XMPP_ec2-184-1-1-1/' RS= ORS="\n\n" infile

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-2-2-2
    alias        XMPP_ec2-184-2-2-2
    address        ip-10-20-20-20.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-204-1-1-1
    alias        XMPP_ec2-204-1-1-1
    address        ip-10-30-30-30.us-west-1.compute.internal
    }
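
For anyone wondering why the one-liner above works: an empty RS puts awk in "paragraph mode", where blank lines separate records, so each host block arrives as a single record and the pattern is tested against the whole block. ORS="\n\n" puts the blank line back on output. A small sketch to see the record splitting, using made-up host names A and B:

```shell
blocks=$(printf 'define host{\n  host_name  A\n  }\n\ndefine host{\n  host_name  B\n  }\n' |
    awk -v RS= '{ printf "record %d has %d lines\n", NR, split($0, lines, "\n") }')
echo "$blocks"
```

Each of the two blocks is reported as one record of three lines.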

mglenney · October 26, 2010, 7:30pm

Ok, thanks. That was making me nuts.

[quote]

 awk -v remove=XMPP_ec2-184-2-2-2 '
        /define host/ {                 # start of new block; print last section
                if( snarfed )
                        print snarfed;

                snarfed = $0;           # start capture
                next;
        }

        /host_name/ {                   # check to see if unwanted target
                if( $2 == remove )      # if it is then
                        snarfed = "";   # unset to trash the whole block
                next;
        }
        snarfed {                       # if we are snarfing from current section
                snarfed = snarfed "\n" $0;      # add next line
                next;
        }
        END {                           # must do one last for the last buffered section
                if( snarfed )
                        print snarfed;
        }
' config-file
[/quote]

Cool. I knew there was a way to do this without 2 awk statements. What you wrote here is almost what I need. It was removing the "host_name" record from all the host definitions so I removed the "next;" line from the /host_name/ section.

Your comment about other text in the file made me realize I need to support that.  I added in some text and any text before the first "define host" is not output.  Here's the test file:

$ cat testdata.cfg 
This is some text
#this is some commented text

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-1-1-1
    alias        XMPP_ec2-184-1-1-1
    address        ip-10-10-10-10.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-2-2-2
    alias        XMPP_ec2-184-2-2-2
    address        ip-10-20-20-20.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-204-1-1-1
    alias        XMPP_ec2-204-1-1-1
    address        ip-10-30-30-30.us-west-1.compute.internal
    }

This is some after text

And this is the awk:

 awk -v remove=XMPP_ec2-184-2-2-2 '
        /define host/ {
                if( snarfed )
                        print snarfed;

                snarfed = $0;
                next;
        }

        /host_name/ {
                if( $2 == remove )
                        snarfed = "";
        }
        snarfed {
                snarfed = snarfed "\n" $0;
                next;
        }
        END {
                if( snarfed )
                        print snarfed;
        }
' testdata.cfg

and this is the result:

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-184-1-1-1
    alias        XMPP_ec2-184-1-1-1
    address        ip-10-10-10-10.us-west-1.compute.internal
    }

define host{
    use        linux-server,host-pnp
    host_name    XMPP_ec2-204-1-1-1
    alias        XMPP_ec2-204-1-1-1
    address        ip-10-30-30-30.us-west-1.compute.internal
    }

This is some after text

Any idea why the text before the first host definition gets snipped?
How does this work? Is 'snarfed' a function?

ctsgnb · October 26, 2010, 7:31pm

@Scruti

Nice one ! Lol

mglenney · October 26, 2010, 7:45pm

Ok. Now that's awesome. Multi-line record handling. Very smart. I also had no idea that setting RS to null would separate records at blank lines. Very cool. I put those blank lines in so it would look pleasing if I had to manually modify it. Guess I got lucky.

I modified it to make sure it only matches that string if it's a defined "host_name":

awk '!/host_name\tXMPP_ec2-184-1-1-1/' RS= ORS="\n\n" infile

and it works perfectly.
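
One caveat with a regex match is that it is a substring test, so a pattern for one name also matches any longer name that starts with it, and it depends on exactly one tab after host_name. A possible variant, sketched here with hypothetical host names web-1 and web-10, compares fields instead. It relies on the fact that in paragraph mode newlines also separate fields, so every word in the block is a field.

```shell
out=$(printf '%s\n' \
    'define host{' '    host_name    web-1' '    }' '' \
    'define host{' '    host_name    web-10' '    }' |
  awk -v remove="web-1" '
    BEGIN { RS = ""; ORS = "\n\n" }
    {
        drop = 0
        # Look for the word host_name followed by exactly the unwanted name.
        for (i = 1; i < NF; i++)
            if ($i == "host_name" && $(i + 1) == remove)
                drop = 1
    }
    !drop       # print every block that was not flagged
  ')
echo "$out"
```

Only the web-10 block survives; a plain /web-1/ regex would have removed both.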

Thanks!!

ctsgnb · October 26, 2010, 7:47pm

---------

mglenney · October 26, 2010, 7:51pm

And thanks to everyone who replied. I learned something new from each post which only makes me better. Thanks for that!

agama · October 27, 2010, 12:26am

Oh bother!! Nice catch. I hadn't noticed that it was dropping the host name when I tested it. The next should have been in curly braces -- irony here given your initial question -- with setting snarfed to null.

The leading lines get trashed because snarfed is null until the first hosts section, so the rule that appends to it never fires for them. A slight modification to my original code captures the leading lines too:

 awk -v remove=XMPP_ec2-184-2-2-2 '
        /define host/ {                 # start of new block; print last section
                if( snarfed )
                        printf( "%s", snarfed );   # buffer already has trailing newline so printf preferred

                snarfed = $0 "\n";              # start capture
                drop = 0;
                next;
        }

        /host_name/ {                   # check to see if unwanted target
                if( $2 == remove )      # if it is then
                {
                        snarfed = "";      # ditch what we had
                        drop= 1;   # prevent picking up more til next section
                        next;
                }
        }

        !drop {                  # if we are snarfing from current section
                snarfed = snarfed $0 "\n";      # add next line
                next;
        }
        END {                           # must do one last for the last buffered section
                if( snarfed )
                        printf( "%s", snarfed );   # buffer already ends in a newline
        }
' input-file >output-file

Snarfed is not a function, just a variable holding a string of concatenated records. The idea is to buffer text until it is known whether the buffered block is wanted, and then either write it out or throw it away. The test if( snarfed ) is shorthand for if( snarfed != "" ) . The trigger is the next host section; if there is something in the buffer when the next section is encountered it is written.

It's a bit more obvious with the change to use an additional variable to not ditch the leading lines.
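
One thing to watch with the version above: once a block is dropped, drop stays set until the next "define host" line, so any text that follows the removed block (trailing lines, or other kinds of definitions) is discarded as well. A possible variation, sketched below, ends the buffering at the closing brace instead. It assumes "define host{" and the closing brace each sit on their own line; host-a and host-b are made-up names.

```shell
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# leading comment

define host{
    host_name    host-a
    }

define host{
    host_name    host-b
    }

trailing text
EOF

result=$(awk -v remove="host-b" '
    /^[ \t]*define[ \t]+host[ \t]*[{]/ {    # start buffering a host block
        inblock = 1; buf = ""; drop = 0; skipblank = 0
    }
    inblock {
        buf = buf $0 "\n"
        if ($1 == "host_name" && $2 == remove)
            drop = 1
        if ($0 ~ /^[ \t]*[}]/) {            # closing brace ends the block
            if (drop) skipblank = 1         # also swallow one following blank line
            else      printf "%s", buf
            inblock = 0
        }
        next
    }
    skipblank { skipblank = 0; if ($0 ~ /^[ \t]*$/) next }
    { print }                               # everything outside host blocks passes through
' "$cfg")
echo "$result"
rm -f "$cfg"
```

Lines outside host blocks are printed immediately, so only one block is ever held in memory.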