Ikon
December 9, 2008, 5:12pm
1
<SUMMARY filecount_excluded="0" dirbytes_sent="3367893" dirbytes_hashcache="13275664" ..and so on..>
<session numthreads="1" type="avtarbackup" ndispatchers="1" ..and so on..><host numprocs="4"
speed="900" osuser="root" name="ashsux01" memory="24545" /><build time="11:04:53" msgversion="13-10"
appname="avtar" ..and so on../><dirstats numfiles="193129" numbytes="1461473417216" numdirs="0" />
</session><errorsummary exitcode="0" errors="0" warnings="0" fatals="0" />
</SUMMARY>
I would like to output the following:
SUMMARY_filecount="0"
SUMMARY_dirbytes="3367893"
...
SUMMARY_session_numthreads="1"
SUMMARY_session_type="avtarbackup"
...
SUMMARY_session_build_time="11:04:53"
SUMMARY_session_build_msgversion="13-10"
...
SUMMARY_session_dirstats_numfiles="193129"
SUMMARY_session_dirstats_numbytes="1461473417216"
...
SUMMARY_errorsummary_exitcode="0"
SUMMARY_errorsummary_errors="0"
...
There could be any number of tags within tags and any number of variables.
I tried doing some searching but wasn't sure exactly what to search for.
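A minimal sketch of one way to produce this kind of output, written in Python rather than shell. It assumes the log is well-formed XML apart from possibly having several top-level tags; the `<root>` wrapper and the names `flatten` and `flatten_log` are hypothetical and not part of the log or of any tool mentioned here. The sample is a shortened copy of the snippet above, and full attribute names are kept (`filecount_excluded`, not `filecount`).

```python
import xml.etree.ElementTree as ET


def flatten(element, prefix=""):
    """Yield NAME="value" strings for an element and all of its descendants."""
    name = f"{prefix}_{element.tag}" if prefix else element.tag
    for attr, value in element.attrib.items():
        yield f'{name}_{attr}="{value}"'
    for child in element:
        # Children inherit the parent's full prefix, however deep the nesting.
        yield from flatten(child, name)


def flatten_log(text):
    # Hypothetical <root> wrapper so that several top-level tags are accepted.
    root = ET.fromstring(f"<root>{text}</root>")
    lines = []
    for top in root:
        lines.extend(flatten(top))
    return lines


sample = ('<SUMMARY filecount_excluded="0" dirbytes_sent="3367893">'
          '<session numthreads="1" type="avtarbackup">'
          '<build time="11:04:53" msgversion="13-10" />'
          '<dirstats numfiles="193129" /></session>'
          '<errorsummary exitcode="0" errors="0" /></SUMMARY>')

if __name__ == "__main__":
    print("\n".join(flatten_log(sample)))
```

Because the prefix is passed down the recursion, tag names never need to be known in advance. A log that is not well-formed would make `ET.fromstring` raise an error, so this only fits if the real file parses cleanly.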
joeyg
December 9, 2008, 5:26pm
2
> cat file103 | sed "s/<\/session/~&/g" | tr "~" "\n"
That would put each piece of data on its own line;
then you can grep for the lines you want to see.
[Sorry I couldn't test, but your sample data did not have the appropriate tags or session markers for me to truly analyze.]
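The same "one tag per line" idea can be sketched in Python. This is only an illustration: the function name `split_tags` and the attributes `a`, `b`, `c` in the sample string are made up, and the pattern assumes no attribute value contains `<` or `>`.

```python
import re


def split_tags(text):
    """Return every <...> tag in text as its own string, in document order."""
    return re.findall(r"<[^<>]+>", text)


# Made-up sample with the same nesting shape as the log.
log = '<SUMMARY a="1"><session b="2"><build c="3" /></session></SUMMARY>'

if __name__ == "__main__":
    for tag in split_tags(log):
        print(tag)
```

With each tag on its own line, opening, closing, and self-closing tags can be told apart by their first and last characters.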
Ikon
December 9, 2008, 5:32pm
3
Here is an actual snip from the log:
<SUMMARY filecount_excluded="0" dirbytes_sent="3367893" dirbytes_hashcache="13275664"><session numthreads="1"
type="avtarbackup" ndispatchers="1"><host numprocs="4" speed="900" osuser="root" name="ashsux01" memory="24545" />
<build time="11:04:53" msgversion="13-10" appname="avtar"/><dirstats numfiles="193129" numbytes="1461473417216"
numdirs="0" /></session><errorsummary exitcode="0" errors="0" warnings="0" fatals="0" /></SUMMARY>
The problem I'm running into is that SUMMARY has its own vars, then within SUMMARY there is SESSION, and SESSION has its own vars plus DIRSTATS and BUILD.
Like this:
<SUMMARY VARS...>
<SESSION VARS...>
<BUILD VARS.../>
<DIRSTATS VARS.../>
</SESSION>
<ERRORSUMMARY VARS.../>
</SUMMARY>
The actual log file has many other tags within SUMMARY, and some top-level tags are not called SUMMARY at all; they might be called something completely different from SUMMARY or SESSION.
Then I think you have to do it with two or more awk commands.
I tried something but didn't get it quite right; I hope you can work on it some more.
awk '/^<SUMMARY/{print $0}' file|awk -F"[=_]" 'BEGIN{RS=" "}$0!="<SUMMARY"{print "SUMMARY_"$1"="$3}'
awk '/^<session/{print $0}' file|awk -F"[= ]" 'BEGIN{RS=" "}$0!="<session"{print "SUMMARY_session_"$1"="$2}'
Ikon
December 9, 2008, 6:03pm
5
Those are pretty good starting points. They are pretty close.
I will be working on it all day tomorrow. When I get a working script I will be sure to share it.
Looking forward to the answer.
Regards,
vidya
Ikon
December 11, 2008, 4:35pm
7
Well, I decided to write it in Perl...
I'm having some problems:
If you compare the log with the output, it's skipping the first item of each tag, i.e. CUSTOMER_join, ADDRESS_street, TODAY_date...
This is not the actual log file it will be reading, just a quick one I put together. The actual log is fairly large.
# cat testfile.log
<CUSTOMER join="1/1/2008" last="12/20/2008">
<NAME name="John" lname="Smith">
<ADDRESS street="main" number="123" city="Orlando" state="Florida" zip="12345" />
</DATA>
</CUSTOMER>
<TODAY date="12/11/2008" time="12:12:12" />
# perl readlog.pl testfile.log
CUSTOMER_last: 12/20/2008
NAME_lname: Smith
ADDRESS_number: 123
ADDRESS_city: Orlando
ADDRESS_state: Florida
ADDRESS_zip: 12345
TODAY_time: 12:12:12
# cat readlog.pl
$filename = $ARGV[0];
open FILE, $filename or die $!;
while (<FILE>) {
    push(@fields, defined($1) ? $1:$3)
        while m/([^<>]+)/g;
}
close(FILE);
$head="";
foreach (@fields) {
    if ($_ =~ /^([A-Za-z0-9]+) /) {
        $line = $_;
        ($header,$data) = split(/ /, $line, 2);
        if ( $head == "" ) {
            $head = $header;
        } else {
            $head = $head."_".$header;
        }
        @subs = split(/" /,$data);
        for($i = 0; $i < @subs; $i++) {
            ($str, $strdata) = split (/=/,$subs[$i]);
            $strdata =~ s/^"//;
            $strdata =~ s/"$//;
            $head =~ s/_{2,}/_/;
            if ($str !~ /\//) {
                print $head."_".$str.": ".$strdata."\n";
            }
        }
        if ($data =~ /\/$/) {
            @h = split(/_/,$head);
            $max = @h - 1;
            $head =~ s/$h[$max]// ;
        }
    } else {
        if ($_ =~ /^\//) {
            @h = split(/_/,$head);
            $max = @h - 1;
            $head =~ s/$h[$max]// ;
        }
    }
}
Ikon
December 11, 2008, 4:46pm
8
I had started $i at 1 instead of 0.
It's fixed now.
Ikon
December 11, 2008, 5:52pm
9
Found one more error..
Here is the final version:
# cat test.log
<CUSTOMER join="1/1/2008" last="12/20/2008"><NAME name="John" lname="Smith">
<ADDRESS street="main" number="123" city="Orlando" state="Florida" zip="12345" />
</DATA></CUSTOMER><TODAY date="12/11/2008" time="12:12:12" />
# cat readlog.pl
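# Usage: perl readlog.pl <logfile>
$filename = $ARGV[0];
open FILE, $filename or die $!;
# Collect every run of text between '<' and '>' (tag bodies and any text between tags).
while (<FILE>) {
    push(@fields, defined($1) ? $1:$3)
        #while m/"([^"\\]*(\\.[^"\\]*)*)"|([^ ]+)/g;
        while m/([^<>]+)/g;
}
close(FILE);
$head="";    # current prefix, e.g. CUSTOMER_NAME
foreach (@fields) {
    if ($_ =~ /^([A-Za-z0-9]+) /) {
        # Opening or self-closing tag: split the tag name from its attributes.
        $line = $_;
        ($header,$data) = split(/ /, $line, 2);
        if ( length ($head) < 3 ) {
            # Prefix is empty or only leftover underscores: start a new one.
            $head = $header;
        } else {
            $head = $head."_".$header;
        }
        # Attributes are separated by a closing quote followed by a space.
        @subs = split(/" /,$data);
        #print "==$data==";
        for($i = 0; $i < @subs; $i++) {
            ($str, $strdata) = split (/=/,$subs[$i]);
            $strdata =~ s/^"//;
            $strdata =~ s/"$//;
            $head =~ s/_{2,}/_/;    # collapse underscores left behind by removed tags
            if ($str !~ /\//) {     # skip the trailing "/" of a self-closing tag
                print $head."_".$str."='".$strdata."'\n";
            }
        }
        if ($data =~ /\/$/) {
            # Self-closing tag: take its name back out of the prefix.
            @h = split(/_/,$head);
            $max = @h - 1;
            $head =~ s/$h[$max]// ;
        }
    } else {
        if ($_ =~ /^\//) {
            # Closing tag: take the last name out of the prefix.
            @h = split(/_/,$head);
            $max = @h - 1;
            $head =~ s/$h[$max]// ;
        }
    }
}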
# perl readlog.pl test.log
CUSTOMER_join='1/1/2008'
CUSTOMER_last='12/20/2008'
CUSTOMER_NAME_name='John'
CUSTOMER_NAME_lname='Smith'
CUSTOMER_NAME_ADDRESS_street='main'
CUSTOMER_NAME_ADDRESS_number='123'
CUSTOMER_NAME_ADDRESS_city='Orlando'
CUSTOMER_NAME_ADDRESS_state='Florida'
CUSTOMER_NAME_ADDRESS_zip='12345'
TODAY_date='12/11/2008'
TODAY_time='12:12:12'
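As an alternative to editing the prefix string in place, the open tag names can be kept on a stack and joined when printing. The following is a sketch in Python, not a port of the script above: the names `TAG`, `ATTR`, and `flatten` are hypothetical, attribute values are assumed to be double-quoted and free of `<`, `>`, and `"`, and a closing tag simply pops the most recent name without checking that it matches (so the `</DATA>` in test.log is tolerated).

```python
import re

# <NAME attrs>, <NAME attrs /> or </NAME>; groups: closing slash, name, attrs, self-closing slash
TAG = re.compile(r"<(/?)([A-Za-z0-9_]+)([^<>]*?)(/?)>")
ATTR = re.compile(r'([A-Za-z0-9_]+)="([^"]*)"')


def flatten(text):
    stack, out = [], []
    for closing, name, body, selfclosing in TAG.findall(text):
        if closing:
            if stack:
                stack.pop()      # leave the innermost open tag
            continue
        stack.append(name)       # enter this tag
        prefix = "_".join(stack)
        for attr, value in ATTR.findall(body):
            out.append(f"{prefix}_{attr}='{value}'")
        if selfclosing:
            stack.pop()          # <NAME ... /> closes itself
    return out


test_log = '''<CUSTOMER join="1/1/2008" last="12/20/2008"><NAME name="John" lname="Smith">
<ADDRESS street="main" number="123" city="Orlando" state="Florida" zip="12345" />
</DATA></CUSTOMER><TODAY date="12/11/2008" time="12:12:12" />'''

if __name__ == "__main__":
    print("\n".join(flatten(test_log)))
```

Since the prefix is rebuilt from the stack each time, a tag name that repeats at two levels, or that is a substring of another name, cannot be removed from the wrong place.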
Hi, the Perl below may help you some:
#! /usr/bin/perl
undef $/;
open FH,"<a.txt";
$str=<FH>;
$str=~tr/\n//d;
while($str=~m/<(.*?)>/){
    my @arr=split(" ",$1);
    if($#arr==0){
        $pre=substr($pre,0,rindex($pre,"_"));
        $str=$';
        next;
    }
    $pre.=($pre)?"_".$arr[0]:$arr[0];
    #print "\n",$pre,"----->\n\n";
    for($i=1;$i<=$#arr;$i++){
        if(index($arr[$i],"/")!=-1){
            $arr[$i]=substr($arr[$i],0,index($arr[$i],"/"));
        }
        print $pre."_".$arr[$i]."\n";
    }
    if (index($1,"/")!=-1){
        $pre=substr($pre,0,rindex($pre,"_"));
    }
    $str=$';
    print "\n";
}
close FH;
Ikon
December 12, 2008, 9:37am
12
Some good ideas in there, but it doesn't work correctly:
# cat test.log
<CUSTOMER join="1/1/2008" last="12/20/2008"><NAME name="John" lname="Smith">
<ADDRESS street="main" number="123" city="Orlando" state="Florida" zip="12345" />
</DATA></CUSTOMER><TODAY date="12/11/2008" time="12:12:12" />
CUSTOMER_join="1
CUSTOMER_last="12
CUSTOME_NAME_name="John"
CUSTOME_NAME_lname="Smith"
CUSTOME_NAME_ADDRESS_street="main"
CUSTOME_NAME_ADDRESS_number="123"
CUSTOME_NAME_ADDRESS_city="Orlando"
CUSTOME_NAME_ADDRESS_state="Florida"
CUSTOME_NAME_ADDRESS_zip="12345"
CUSTOME_NAME_ADDRESS_
CUSTOM_TODAY_date="12
CUSTOM_TODAY_time="12:12:12"
CUSTOM_TODAY_
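The shrinking prefix (CUSTOMER, CUSTOME, CUSTOM) appears to come from `substr($pre,0,rindex($pre,"_"))`: when the prefix contains no underscore, `rindex` returns -1, and a length of -1 in `substr` drops the last character. That script also treats any tag containing a `/` as self-closing and cuts values at the first `/`, which would explain `join="1`. Python slicing behaves the same way with `rfind`, so the pitfall can be shown in a short sketch; `pop_naive` and `pop_safe` are hypothetical names used only for illustration.

```python
def pop_naive(prefix):
    # Mirrors substr($pre, 0, rindex($pre, "_")): with no "_" in the prefix,
    # the index is -1 and the slice drops the final character instead.
    return prefix[:prefix.rfind("_")]


def pop_safe(prefix):
    # Drop the last "_"-separated component; a single component becomes "".
    head, _sep, _last = prefix.rpartition("_")
    return head


if __name__ == "__main__":
    print(pop_naive("CUSTOMER_NAME"))  # CUSTOMER
    print(pop_naive("CUSTOMER"))       # CUSTOME
    print(repr(pop_safe("CUSTOMER")))  # ''
```

Checking for the "not found" result before slicing, or keeping the names in a list and joining them, avoids the problem.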