I am struggling with many huge XML files that hold lots of Account details. Each Account contains at least one Membership tag, and in some of those Membership tags one XML tag, MembershipIdentifier, is missing:
(Each file contains many Account tags, each with at least one Membership tag.)
How can I find which AccountIdentifier is missing its MembershipIdentifier? If possible, I also need to insert a default MembershipIdentifier such as PB00000000123456.
So far I have tried this to find the missing MembershipIdentifiers, but it didn't work:
I am dealing with large files (each file contains more than 10K Accounts, and there are more than 1000 files). This script just appends the MI tag and displays the output on the screen; it does not change the actual files, and I am not sure how to write the result back to them.
I also need to know which file and which AccountIdentifier (or at least the file name) is missing the MembershipIdentifier tag.
Could you please help me to get this done?
Thanks in advance...
Making the wild assumption that the exec family of functions on your system can handle more than 1000 filenames in an argument list, the following should do what you want:
Call this script with a list of files to be processed as operands. If your system can't handle an arg list that long, use xargs to invoke this script multiple times with subsets of the argument list.
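As a rough sketch of the xargs approach, the helper below feeds every *.xml file under a directory to a command in batches. The function name run_batched is made up for illustration, and it assumes your find and xargs support -print0 and -0 (GNU and BSD versions do; older systems may not):

```shell
# run_batched DIR CMD [ARGS...]
# Run CMD on every *.xml file under DIR. xargs splits the file list into as
# many invocations of CMD as needed to stay under the argument-length limit.
# NUL-separated names keep file names with spaces intact.
run_batched() {
    dir=$1
    shift
    find "$dir" -type f -name '*.xml' -print0 | xargs -0 "$@"
}
```

You would call it as, for example, run_batched . sh ./fix_membership.sh, where fix_membership.sh stands for whatever name you saved the script under.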
This was written and tested using the Korn shell, but it will work with any shell that understands basic POSIX shell parameter expansions (including ash, bash, dash, ksh, and zsh). It will not work with a legacy Bourne shell or with shells based on csh syntax.
And, as always, if you want to try this on a Solaris system, change awk to /usr/xpg4/bin/awk or nawk.
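If the same script has to run on both Solaris and other systems, one way to avoid editing it by hand is to pick the awk at run time. This is only a sketch; the function name pick_awk and the variable AWK are my own, and the script would then have to call "$AWK" instead of awk:

```shell
# Prefer the XPG4 awk where it exists (Solaris), then nawk, then plain awk.
pick_awk() {
    if [ -x /usr/xpg4/bin/awk ]; then
        echo /usr/xpg4/bin/awk
    elif command -v nawk >/dev/null 2>&1; then
        echo nawk
    else
        echo awk
    fi
}

AWK=$(pick_awk)
```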
This snippet writes a copy of each input file with .rebuilt appended to its name (file.xml becomes file.xml.rebuilt) and adds a default MembershipIdentifier where one is missing. It logs the details in a file named rebuilt.log in the current directory, reporting the file name, the line number, and the account missing the tag. Progress messages are printed to your screen.
Save as VasuKukkapalli.pl and run as perl VasuKukkapalli.pl file1.xml file2.xml file3.xml ...
or perl VasuKukkapalli.pl *.xml
#!/usr/bin/perl
use strict;
use warnings;
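# Extract the numeric account ID from an <AccountIdentifier> line.
sub account_id
{
    my $account_line = shift;
    my ($id) = $account_line =~ /<AccountIdentifier>(\d+)</;
    return $id;
}
# Open a file for writing and return its handle.
sub writefile
{
    my $filename = shift || die;
    print "Creating $filename\n";
    # "or" rather than "||": "||" would bind to $filename and never trigger die.
    open my $fh, '>', $filename or die "Could not create $filename: $!\n";
    return $fh;
}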
my @account = ();
my $membership =
"<MembershipIdentifier>PB00000000123456789</MembershipIdentifier>\n";
my $current_file = $ARGV[0];
my $log = writefile("rebuilt.log");
my $tmp = writefile("$current_file.rebuilt");
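while (<>) {
    # <> sets $ARGV to the name of the file currently being read.
    if ($current_file ne $ARGV) {
        close $tmp;
        $current_file = $ARGV;
        $tmp = writefile("$current_file.rebuilt");
        $. = 1;    # restart line numbering for the new file
    }
    # Start buffering when an Account block opens.
    push @account, [] if /<Account>/;
    if (exists $account[0]) {
        push @{$account[0]}, $_;    # buffered lines of the block
        push @{$account[1]}, $.;    # their line numbers
    }
    else {
        print $tmp $_;
    }
    if (/<\/Account>/) {
        # This relies on a fixed layout: AccountIdentifier on the second line
        # of the block and MembershipIdentifier on the ninth.
        if ($account[0][8] !~ /<MembershipIdentifier>/) {
            my ($spaces) = $account[0][7] =~ /(^\s+)/;
            splice @{$account[0]}, 8, 0, "$spaces$membership";
            my $id = account_id($account[0][1]);
            print $log "File $ARGV: ",
                "Line $account[1][8]: ",
                "Account $id missing MembershipIdentifier\n";
        }
        # Print the list itself; interpolating it inside a string would
        # insert a space between the buffered lines.
        print $tmp @{$account[0]};
        @account = ();
    }
}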
print "Your files have been saved with the extension .rebuilt\n";
print "For details of missing MembershipIdentifier, please ",
    "look into rebuilt.log\n";
close $log;
close $tmp;
The file rebuilt.log will have something similar to:
File v2.xml: Line 26: Account 23123 missing MembershipIdentifier
File v2.xml: Line 41: Account 23125 missing MembershipIdentifier
File v3.xml: Line 26: Account 23123 missing MembershipIdentifier
File v3.xml: Line 41: Account 23125 missing MembershipIdentifier
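Once you have checked the .rebuilt copies and are happy with them, something along these lines could move them over the original files. It is a sketch only: the function name promote_rebuilt and the .orig backup suffix are my own choices, not something the Perl script produces.

```shell
# promote_rebuilt DIR
# For every FILE.rebuilt in DIR, keep the current FILE as FILE.orig and put
# the rebuilt copy in its place.
promote_rebuilt() {
    for rebuilt in "$1"/*.rebuilt; do
        [ -e "$rebuilt" ] || continue    # nothing matched: the glob stays literal
        original=${rebuilt%.rebuilt}
        # Only replace the original if the backup succeeded.
        mv -- "$original" "$original.orig" && mv -- "$rebuilt" "$original"
    done
}
```

Run it as promote_rebuilt . from the directory holding the XML files, and delete the .orig backups when you no longer need them.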