Perl script to parse output and print it comma separated

Juha · March 17, 2011, 11:04pm

I need to arrange output of SQL query into a comma separated format and I'm struggling with processing the output...

The output is something like this:

<Attribute1 name><x amount of white spaces><Atribute value>
<Attribute2 name><x amount of white spaces><Atribute value>
<Attribute3 name><x amount of white spaces><Atribute value>
<Attribute4 name><x amount of white spaces><Atribute value>
<Attribute5 name><x amount of white spaces><Atribute value>

There are always 5 Attribute/Value pairs that makes one record.
The Attribute names are not always in same order.

So it could be(example where I would have only two records 5 lines(A,B,C,D,E) each):

I would need to parse this output into the following format:
A,B,C,D,E
A,B,C,D,E
....
but using only the value, so the above 2 records would be printed as:
1,2,3,4,5
5,4,3,2,1

What I've tried so far is as below:

sub query {
        open (OUTPUT, "su - oracle -cf 'source .profile; sqlplus $user/$passwd \@$sql_file'|") || die "Can open SQL\n";
        while (<OUTPUT>) {
                chop;
                ($rubbish,$value) = split(/\s+/);
                if (/^A/){
                        $list[0] = $value;
                } elsif (/B/){
                        $list[1] = $value;
                } elsif ( /C/ ){
                        $list[2] = "$value";
                } elsif ( /D/ ){
                        $list[3] = "$value";
                } elsif ( /E/ ){
                        $list[4] = "$value";
                } else { next;}
                push ( @array, \@list );
        }
        close(OUTPUT);
}
sub printSubs {
        $" = ",";
        for $a (@array){
                print "@$a\n";
        }
}

But that does not work.. it just prints out the last record many times.. :wall:
So would someone have any ideas how to fill the array?

pravin27 · March 18, 2011, 2:31am

#!/usr/bin/perl

open (FH,"Outputfile") or die "FAIL- $!\n";
while (<FH>) {
chomp;
@fields=split(/\s+/);
push(@{$fields[0]},$fields[1]);
$hash{$fields[0]}=\@{$fields[0]};
}

for($i=0;$i<($./5);$i++) {
foreach (sort(keys(%hash))) {
printf "%s ",$hash{$_}[$i];
}
print "\n";
}

durden_tyler · March 18, 2011, 11:22am

juha:

...
What I've tried so far is as below:

sub query {
   open (OUTPUT, "su - oracle -cf 'source .profile; sqlplus $user/$passwd \@$sql_file'|") || die "Can open SQL\n";
   while (<OUTPUT>) {
   chop;
   ($rubbish,$value) = split(/\s+/);
   if (/^A/){
   $list[0] = $value;
   } elsif (/B/){
   $list[1] = $value;
   } elsif ( /C/ ){
   $list[2] = "$value";
   } elsif ( /D/ ){
   $list[3] = "$value";
   } elsif ( /E/ ){
   $list[4] = "$value";
   } else { next;}
   push ( @array, \@list );
   }
   close(OUTPUT);
}
sub printSubs {
   $" = ",";
   for $a (@array){
   print "@$a\n";
   }
}

But that does not work.. it just prints out the last record many times..
So would someone have any ideas how to fill the array?

Your code is unnecessarily complicated.
The part in blue color fills up the indexes 0..4 of the array @list.
All you have to do is - print the array @list as soon as you've processed 5 records.

The part in red color is just overkill.

Here's the modified portion of the code -

while (<OUTPUT>) {
 chomp;
 ($rubbish,$value) = split(/\s+/);
 if (/^A/){
   $list[0] = $value;
 } elsif (/B/){
   $list[1] = $value;
 } elsif (/C/){
   $list[2] = $value;
 } elsif (/D/){
   $list[3] = $value;
 } elsif (/E/){
   $list[4] = $value;
 } else { next }
 print join(",", @list),"\n" if ($.%5 == 0); 
}

Here's the modified code in action -

$
$
$ # display the content of the data file "f2"
$
$ cat f2
A   1
B   2
C  3
D   4
E  5
B  4
C    3
A  5
E  1
D  2
$
$
$ # display the content of the Perl program "f2.pl"
$
$ cat -n f2.pl
    1  #!perl -w
    2  my $rubbish;
    3  $file="f2";
    4  open (OUTPUT, "<", $file) || die "Can't open $file\n";
    5  while (<OUTPUT>) {
    6    chomp;
    7    ($rubbish,$value) = split(/\s+/);
    8    if (/^A/){
    9      $list[0] = $value;
   10    } elsif (/B/){
   11      $list[1] = $value;
   12    } elsif (/C/){
   13      $list[2] = $value;
   14    } elsif (/D/){
   15      $list[3] = $value;
   16    } elsif (/E/){
   17      $list[4] = $value;
   18    } else { next }
   19    print join(",", @list),"\n" if ($.%5 == 0);
   20  }
   21  close(OUTPUT);
   22
$
$
$ # Run the Perl program
$
$ perl f2.pl
1,2,3,4,5
5,4,3,2,1
$
$

As for why your code doesn't work - you keep pushing the reference to the array @list in the array @array. When the array @list changes, the reference to @list now points to the "current" array.

After 1st record is processed, @list = (1)             and \@list = pointer to @list i.e. it points to (1)
After 2nd record is processed, @list = (1, 2)          and \@list = pointer to @list i.e. it points to (1, 2)
After 3rd record is processed, @list = (1, 2, 3)       and \@list = pointer to @list i.e. it points to (1, 2, 3)
After 4th record is processed, @list = (1, 2, 3, 4)    and \@list = pointer to @list i.e. it points to (1, 2, 3, 4)
After 5th record is processed, @list = (1, 2, 3, 4, 5) and \@list = pointer to @list i.e. it points to (1, 2, 3, 4, 5)

So by the time you've processed the first five records, @list = (1,2,3,4,5) and \@list = pointer to (1,2,3,4,5) and since first 5 elements of @array are all \@list, they all point to (1,2,3,4,5) as well.

From record 6 onwards, you overwrite the array @list. (You should print it before overwriting.) Records 6 through 10 do the same thing again.

So at the end, you have 10 elements in @array, all of which are references to the latest array i.e. (5,4,3,2,1).

HTH,
tyler_durden

Further thoughts: you could change your Oracle SQL query to return the comma-delimited string of values directly. Thereby you can avoid all this processing in Perl.