It's been a few years since college when I did stuff like this all the time. Can someone help me figure out how to best tackle this problem? I need to parse a file full of entries that look like this:
I only want the data in each of these fields, so I want PGR, not symbol="PGR".
I can use sed to strip away everything but the data I need -- which I've done -- but the data remains in its original order, not the one I'm looking for (note: the issuerName field is in brackets for visual purposes):
PGR CA VEF [PROAGROI-7 B] PGR CA VEF
What's the best way to re-order the above line according to my CSV needs? Or is there a different approach I should be taking entirely?
Instead of doing it in shell (sed/awk), you would be better off using an XML parser. Maybe you can write a simple script in Perl or any other scripting language that supports XML parsing.
Thanks! I'll need a bit of time to work with this, but I prefer using the right tool for the job and this looks like it will help me with a few next steps I was planning anyways.
Anyway, the rather boring code below can also address the requirement.
You may try it:
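use strict;
use warnings;

open my $fh, '<', 'a.txt' or die "Cannot open a.txt: $!";
while (my $line = <$fh>) {
    # Collect every name="value" pair on this line into a fresh hash.
    my %hash;
    while ($line =~ /(\w+)="([^"]*)"/g) {
        $hash{$1} = $2;
    }
    next unless %hash;    # skip lines that carry no attributes
    # Print the fields in the order the output needs, '|'-separated.
    print join('|', map { defined $hash{$_} ? $hash{$_} : '' }
        qw(issuerName symbol exch curr Csymbol Cexch Ccurr)), "\n";
}
close $fh;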
I tried the above code on the following XML:
<Account id='xxxxxxxxxxxxxx' name='xxxx' creator='abcd' createDate='110908'
lastModifier='abcd' resource='DataMart' accountId='F100206'
userid='F100206' situation='active' discoveredSituation='CONFIRMED' accountExists='true'>
<MemberObjectGroups>
<ObjectRef type='ObjectGroup' id='#ID#Top' name='Top'/>
</MemberObjectGroups>
</Account>
open FH,"<a.txt";
my @arr=<FH>;
close FH;
foreach(@arr){
while(m/ (.*?=".*?")/){
my $str=$1;
$_=$';
$hash{$1}=$2 if ($str=~m/(.*)="(.*)"/);
}
print $hash{accountId},"|",$hash{createDate},"|",$hash{userid},"|",$hash{creator},"|",$hash{accountExists},"|",$hash{resource},"|",$hash{lastModifier},"\n";
}
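A note on the attempt above: the Account XML quotes its attribute values with single quotes, whereas the pattern only looks for double quotes, and the opening tag is spread over several lines while the loop prints once per line. Below is a minimal sketch that allows for both. It assumes the input is small enough to hold in memory, and the parse_attrs helper and the inline copy of the sample are made up for illustration; in practice you would read the file into $xml instead.

```perl
use strict;
use warnings;

# Sample data copied from the post above; normally this would be read from the file.
my $xml = <<'XML';
<Account id='xxxxxxxxxxxxxx' name='xxxx' creator='abcd' createDate='110908'
lastModifier='abcd' resource='DataMart' accountId='F100206'
userid='F100206' situation='active' discoveredSituation='CONFIRMED' accountExists='true'>
<MemberObjectGroups>
<ObjectRef type='ObjectGroup' id='#ID#Top' name='Top'/>
</MemberObjectGroups>
</Account>
XML

# Hypothetical helper: collect name/value pairs from attributes
# quoted with either ' or ".
sub parse_attrs {
    my ($text) = @_;
    my %attr;
    # \2 forces the closing quote to be the same character as the opening one.
    while ($text =~ /([\w:.-]+)\s*=\s*(["'])(.*?)\2/gs) {
        $attr{$1} = $3;
    }
    return %attr;
}

# Only look inside the opening <Account ...> tag, which spans several lines,
# so that the attributes of nested elements such as ObjectRef are ignored.
my ($tag) = $xml =~ /<Account\b([^>]*)>/;
my %hash = parse_attrs(defined $tag ? $tag : '');

my @fields = qw(accountId createDate userid creator accountExists resource lastModifier);
my $row = join '|', map { defined $hash{$_} ? $hash{$_} : '' } @fields;
print "$row\n";
```

This is still pattern matching rather than real XML parsing, so it will trip over a '>' inside an attribute value or over entity references; a proper XML parser remains the safer route for anything beyond simple files.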
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

# XMLin returns a hash reference holding the parsed attributes.
my $config = XMLin("file");
# print Dumper($config);   # uncomment to inspect the parsed structure
my $issuername = $config->{issuerName};
my $symbol = $config->{symbol};
my $exch = $config->{exch};
my $curr = $config->{curr};
my $csymbol = $config->{Csymbol};
my $cexch = $config->{Cexch};
my $ccurr = $config->{Ccurr};
# Put the fields in the order the CSV needs.
my @line = ($issuername,$symbol,$exch,$curr,$csymbol,$cexch,$ccurr);
print join(",",@line), "\n";
output
# ./test.pl
PROAGROI-7 B,PGR,CA,VEF,PGR,CA,VEF
or the "hard way"
while (<>){
if ( /<eq/ .. /\/>/ ){
@list = split /\"\s/ ,$_;
foreach my $k (@list){
print "$k\n";
# get your values;
}
}
}
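To round off the "hard way": once the attributes are picked out, the values can go into a hash and be printed in whatever column order the CSV needs. Here is a sketch, assuming each entry is an eq element on a single line that carries the seven attributes named in this thread. The sample line is reconstructed from the values quoted in the question, so its attribute order is a guess, and @order is just one possible column order.

```perl
use strict;
use warnings;

# Assumed sample entry: element name taken from the /<eq/ match above,
# values taken from the question. Real input would come from the file.
my @entries = (
    '<eq symbol="PGR" exch="CA" curr="VEF" issuerName="PROAGROI-7 B" Csymbol="PGR" Cexch="CA" Ccurr="VEF"/>',
);

# Column order for the CSV; rearrange to suit.
my @order = qw(issuerName symbol exch curr Csymbol Cexch Ccurr);

my @rows;
for my $entry (@entries) {
    next unless $entry =~ /<eq\b/;
    # In list context, /g returns every captured name and value in turn,
    # which is exactly the flat list a hash assignment expects.
    my %attr = $entry =~ /(\w+)="([^"]*)"/g;
    push @rows, join ',', map { defined $attr{$_} ? $attr{$_} : '' } @order;
}
print "$_\n" for @rows;
```

Because the hash is keyed by attribute name, the order in which the attributes appear in the file no longer matters. Values that themselves contain commas or double quotes would need proper CSV quoting, which this sketch does not do.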