Perl script to merge cells in column1 which has same strings, for all sheets in a excel workbook

Perl script to merge cells

---------- Post updated at 12:59 AM ---------- Previous update was at 12:54 AM ----------

I am using below code to read files from a dir and print to excel.

open(my $in, '<', $file) or die "Could not open file: $!";
        my $rowCount = 0;
		my $colCount = 0;
        while(<$in>)
        {
			my @elements = split(',',$_);
			foreach my $el(@elements)
            {
				$wrksheet->write($rowCount,$colCount,$el);
				$colCount++;
            }
            $colCount = 0;
            $rowCount++;
        }

Show the input you have and the output you want.

We also would need to see the use statements as well as what you used to create the $wrksheet object.

Below is the complete code which works great converting all csv files(sorted based on column1) in dest directory to a single xls with multiple tabs.

use warnings;
use Spreadsheet::WriteExcel;
my $dest = '/home/dest';
my $workbook = Spreadsheet::WriteExcel->new("test.csv");
chdir $dest or die "no such directory: $!";
if ( -d $dest ) {
    opendir my $dh, $dest or die "can't open directory: $!";
	my @files = sort { $a cmp $b } readdir($dh);
	while ( my $file = shift @files ) {
        chomp $file;
        next if $file eq '.' or $file eq '..';
        my $sheetname = `basename $file | cut -d. -f1`;
        my $wrksheet = $workbook->add_worksheet($sheetname);
        open(my $in, '<', $file) or die "Could not open file: $!";
        my $rowCount = 0;
		my $colCount = 0;
		$colCount = 0;
        while(<$in>)
        {
			my @elements = split(',',$_);
			foreach my $el(@elements)
            {
				$wrksheet->write($rowCount,$colCount,$el);
				$colCount++;
            }
            $colCount = 0;
            $rowCount++;
        }
    }
}

---------- Post updated at 07:24 PM ---------- Previous update was at 07:23 PM ----------

The xls file(with multiple rows on each sheet) which i am generating with above code will look like below:

abs-pq        tfr23 	 12345
abs-pq	tfr24	         12843
abs-pq       tfr24           12435
abs-pqrst	rts09           19923|23141
abs-pqrst	rts10	         23456
tbs-pqrst	tfr25	          21938|22143
tbs-pqrst	zzz0z	          2414|5213|4306

column1 duplicates needs to be removed and merged as one cell.
column2 duplicates needs to be removed based on merged column1 only.( first three lines of below )
column3 should be unchanged.

need output like below(ignore ___ just filled gaps to show difference) :

abs-pq  tfr23 12345
______tfr24 12843
___________12435
abs-pqrst  rts09 19923|23141
_________rts10	23456
tbs-pqrst	tfr25	21938|22143
_________zzz0z	2414|5213|4306

any help in extending my original code to meet the requirement is greatly appreciated.

I cannot tell what the output you want is supposed to look like. Try posting your output again, this time with code tags,

```text
 stuff 
```
INPUT:
abs-pq	        tfr23	12345
abs-pq	        tfr24	12843
abs-pq	        tfr24   12435
abs-pqrst	tfr24   19923|23141
abs-pqrst	rts10	23456
tbs-pqrst	tfr25	21938|22143
tbs-pqrst	zzz0z	2414|5213|4306

OUTPUT REQUIRED:
abs-pq	        tfr23	  12345
	        tfr24     12843
			  12435
abs-pqrst	tfr24     19923|23141
        	rts10	  23456
tbs-pqrst	tfr25	  21938|22143
	        zzz0z	  2414|5213|4306

NOTE: cell B4 also have tfr24 but it should not be merged since column1 value is different.

Something like:

#! /usr/bin/perl

use strict;
use warnings;
use Spreadsheet::WriteExcel;

my $file = shift @ARGV;
my $workbook = Spreadsheet::WriteExcel->new($file) or die "can't create worksheet: $!";

my $dest = shift @ARGV;
chdir $dest or die "no such directory: $!";
opendir my $dh, '.' or die "can't open directory: $!";
my @files = sort grep { m{^[^.]} } readdir($dh);
close $dh; 

foreach my $file (@files) {
    open my $in, '<', $file or die "Could not open file: $!";

    $file =~ s{\..*$}{};
    my $worksheet   = $workbook->add_worksheet($file);

    my @prevRow = ();
    my $row = 0;

    while(<$in>) {
        my @currRow  = split(',', $_);
        my $col = 0;

        while ($col < @currRow) {
            last if @prevRow < $col || $currRow[$col] ne $prevRow[$col];
            $worksheet->write($row, $col, '');
            $col++;
        }

        while ($col < @currRow) {
            $worksheet->write($row, $col, $currRow[$col]);
            $col++;
        }

        @prevRow = @currRow;
        $row++;
    }
}

Invoked with:

scriptname workbookfile sourcedir...

Some notes:

  • if ( -d $dest ) { was not needed, the previous chdir would have failed if $dest was not a directory.
  • chomp $file is not needed, a trailing newline is an acceptable (if not appreciated) character in a directory entry
  • Be reluctant to use man system (linux) calls in a perlscript. In this case, my $sheetname = `basename $file | cut -d. -f1`; was replaced by $file =~ s{\..*$}{}; as $file is already a "basename" and you were just trimming off everything after the first ".".

I also want the empty cells to be merged with the respective string.

---------- Post updated at 09:57 PM ---------- Previous update was at 09:23 PM ----------

As of now it is removing duplicates as expected but I also want the empty cells to be merged to their respective string.

JB,

Can you provide an example of what you mean by "empty cells to be merged to their respective string".

  • DL
As of now the excel first column looks like this:
----------|
abs-pq      |
----------|
----------|
----------|
abs-pqrst |
----------|
----------|
tbs-pqrst |
----------|
----------|

I want output like this on both col1 and 2, what should i add to the code:
----------|
abs-pq      |
          |
          |
----------|
abs-pqrst |
          |
----------|
tbs-pqrst |
          |
----------|

---------- Post updated at 06:35 PM ---------- Previous update was at 08:12 AM ----------

have used below in the code, looks like it is working but when i open excel i am getting some data may have lost error. Am i doing it in right way?

my $format = $workbook->add_format(
                                    border  => 6,
                                    #bold    => 1,
                                    color   => 'black',
                                    bg_color => 0x32,
                                    align   => 'left',
                                    valign => 'vcenter',
                                    #center_across => 1,
                                    );
my $format1_merged = $workbook->add_format(
                                    border  => 6,
                                    );
while ($col < @currRow) {
            last if @prevRow < $col || $currRow[$col] ne $prevRow[$col];
           #### $worksheet->write($row, $col, '',$format1);
            $worksheet->merge_range($row, $col, $row-1, $col, '', $format1_merged);
            $col++;
        }
        while ($col < @currRow) {
            $worksheet->write($row, $col, $currRow[$col],$format);
            $col++;
        }

Anyone have anyother better solution to merge cells?

#! /usr/bin/perl

use strict;
use warnings;
use Spreadsheet::WriteExcel;

my $file = shift @ARGV;
my $workbook = Spreadsheet::WriteExcel->new($file) or die "can't create worksheet: $!";

my $dest = shift @ARGV;
chdir $dest or die "no such directory: $!";
opendir my $dh, '.' or die "can't open directory: $!";
my @files = sort grep { m{^[^.]} } readdir($dh);
close $dh; 

foreach my $file (@files) {
    open my $in, '<', $file or die "Could not open file: $!";

    $file =~ s{\..*$}{};

    my @prevRow = ();
    my $row     = 0;
    my @sheet   = undef;

    while(<$in>) {
        chomp;
        my @currRow  = split(',', $_);
        my $col = 0;

        while ($col < @currRow) {
            last if @prevRow <= $col || $currRow[$col] ne $prevRow[$col];
        $sheet[$col][$row] = undef;
            $col++;
        }

        while ($col < @currRow) {
        $sheet[$col][$row] = $currRow[$col];
            $col++;
        }

        @prevRow = @currRow;
        $row++;
    }

    close $in;

    my $worksheet = $workbook->add_worksheet($file);

    foreach my $col (0 .. $#sheet) {
        my $R     = $sheet[$col];
        my $nRows = scalar @$R;
        my $span  = 1;

        $row = 0;

        while ($row < $nRows) {
            unless (defined $R->[$row]) {
                $span++;
                next;
            }

            if (1 < $span) {
                $worksheet->merge_range($row - $span, $col, $row - 1, $col);
                $span = 1;
            }

            $worksheet->write($row, $col, $R->[$row]);
        } continue {
            $row++;
        }

        if (1 < $span) {
            $worksheet->merge_range($row - $span, $col, $row, $col);
        }
    }
}

Use the merge_range method to merge rows.

@sheet is populated with the worksheets data so that undefined cells can be counted for the merge_range method.

Note that $worksheet->write($row, $col, undef) writes a blank cell while $worksheet->write($row, $col, '') writes a cell with an zero length string - ISBLANK() would return FALSE.

1 Like