Script to extract lines from a file at a specified offset

Hi All,

I need to extract only the XML content from a large log file that also contains unrelated output.

For example, our XML starts with <OUTBOUND_MESSAGE .....> and ends with </OUTBOUND_MESSAGE>. I want to extract only the lines between these start and end tags (including the tags themselves) from a big file (about 50000 lines) and write them to another file or to the console. Can you please help with this?

Thinakar

Try this bit of perl code:

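#!/opt/perl/bin/perl
use strict;
use warnings;

# Print each block from <OUTBOUND_MESSAGE> through </OUTBOUND_MESSAGE>,
# tag lines included. Note: the opening pattern expects a bare tag,
# with '>' directly after the tag name.
die "usage: ./test.pl <filename>\n" unless @ARGV == 1;
open(my $fh, '<', $ARGV[0]) or die "cannot open $ARGV[0]: $!\n";
my $flag = 0;                           # true while inside a message block
while (my $line = <$fh>) {
        $flag = 1 if $line =~ /<OUTBOUND_MESSAGE>/;
        print $line if $flag;
        $flag = 0 if $line =~ /<\/OUTBOUND_MESSAGE>/;
}
close($fh);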

Thanks for your reply. This code may not work, because the text after "<OUTBOUND_MESSAGE" varies in my XML, as the output below shows. When I check with egrep I get the following result. What I need is to read only lines 21047 - 21089, 22162 - 22201 and 22889 - 22926. I would prefer a Unix shell solution to Perl, as I do not have even basic knowledge of Perl.

cnbas-clintgw-1a> egrep -in "<OUTBOUND_MESSAGE|</OUTBOUND_MESSAGE>" cramer2router_20060422131.log
21047:<OUTBOUND_MESSAGE xmlns="http://www.logica.com/eai/adapter/outbound/data/dbmc/CRA-SECTOR">
21089:</OUTBOUND_MESSAGE>
22162:<OUTBOUND_MESSAGE xmlns="http://www.logica.com/eai/adapter/outbound/data/dbmc/CRA-SWITC
22201:</OUTBOUND_MESSAGE>
22889:<OUTBOUND_MESSAGE xmlns="http://www.logica.com/eai/adapter/outbound/data/dbmc/CRA-CELL"
22926:</OUTBOUND_MESSAGE>

Thanks
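
Since the line numbers are already known from the egrep output, one way to pull out exactly those lines is sed with a line-number address range. The following is only a minimal sketch: the helper name extract_range and the small sample file are made up for illustration and are not part of the original log.

```shell
#!/bin/sh
# Hypothetical helper: print lines START..END (inclusive) of FILE.
# sed -n suppresses the default output; "START,ENDp" prints only that range.
extract_range() {
    sed -n "${2},${3}p" "$1"
}

# Small stand-in for the real log (contents are invented for the demo).
sample=$(mktemp)
printf '%s\n' 'junk' '<OUTBOUND_MESSAGE xmlns="x">' '<a/>' \
    '</OUTBOUND_MESSAGE>' 'junk' > "$sample"

# Lines 2 through 4 hold the message block in this sample.
range_out=$(extract_range "$sample" 2 4)
printf '%s\n' "$range_out"
rm -f "$sample"
```

For the log shown above, the same idea with the three reported ranges would look like: sed -n -e '21047,21089p' -e '22162,22201p' -e '22889,22926p' cramer2router_20060422131.log. The drawback is that the line numbers have to be looked up again for every new log file.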

I tried the following code using a varying <OUTBOUND_MESSAGE ...> input and it worked. Can you try it?

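#!/usr/bin/perl
use strict;
use warnings;

die "usage: ./test.pl <filename>\n" unless @ARGV == 1;
open(my $fh, '<', $ARGV[0]) or die "cannot open $ARGV[0]: $!\n";
my $flag = 0;                           # true while inside a message block
while (my $line = <$fh>) {
        # The opening tag may carry attributes, so '>' is not required
        # directly after the tag name.
        $flag = 1 if $line =~ /<OUTBOUND_MESSAGE\b/;
        print $line if $flag;
        # The closing tag is printed before the flag is cleared.
        $flag = 0 if $line =~ /<\/OUTBOUND_MESSAGE>/;
}
close($fh);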

I ran a search on the site for similar posts and found that you have created another post for the same problem. This is a violation of the rules. Please follow the rules and, most importantly, have patience.

I remember using awk to select the lines between a pair of words.
Try man awk for this; I found it described there.

See if this helps you.
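
One awk feature that fits this description is the range pattern, written as two patterns separated by a comma. Whether this is the exact passage the poster found in the man page is an assumption. The sketch below uses a made-up helper name (extract_messages) and an invented sample file; it prints every block from a line containing "<OUTBOUND_MESSAGE" through the next line containing "</OUTBOUND_MESSAGE>", tags included, and needs no line numbers.

```shell
#!/bin/sh
# Hypothetical helper: print each OUTBOUND_MESSAGE block found in FILE.
# An awk range pattern "/start/,/end/" selects every line from a match of
# the first pattern through the next match of the second, inclusive.
# With no action given, awk prints the selected lines.
extract_messages() {
    awk '/<OUTBOUND_MESSAGE/,/<\/OUTBOUND_MESSAGE>/' "$1"
}

# Small stand-in for the real log (contents are invented for the demo).
sample=$(mktemp)
printf '%s\n' 'junk1' '<OUTBOUND_MESSAGE xmlns="a">' '<x/>' \
    '</OUTBOUND_MESSAGE>' 'junk2' '<OUTBOUND_MESSAGE xmlns="b">' '<y/>' \
    '</OUTBOUND_MESSAGE>' 'junk3' > "$sample"

awk_out=$(extract_messages "$sample")
printf '%s\n' "$awk_out"
rm -f "$sample"
```

The opening pattern deliberately stops after the tag name, so it still matches when attributes such as xmlns follow. sed accepts the same kind of range: sed -n '/<OUTBOUND_MESSAGE/,/<\/OUTBOUND_MESSAGE>/p' filename. Either command can be redirected to a file with > to save the extracted XML.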