Splitting a file and creating new files using a Perl script

Hi All,

I am new to scripting languages.
I want to split a file into several subfiles using a Perl script.

Example :
File format :

Sourcename		ID			Date	               Nbr	
SU				IMYFDJ		9/17/2012	   5552159976555	
SU				BWZMIG		9/14/2012	   1952257857887	
AR				PEHQDF		11/26/2012	   0442045903874	
AR				ELIOAA		12/31/2012	   0442121341024	

I want to split this file by Sourcename.
Since it contains two source names, it will be split into two files.

File 1 (after splitting) :

Sourcename		ID				Date	       Nbr	
SU				IMYFDJ			9/17/2012	5552159976555	
SU				BWZMIG			9/14/2012	1952257857887	

File 2 (after splitting) :

Sourcename		ID				Date	       Nbr	
AR				PEHQDF			11/26/2012	0442045903874	
AR				ELIOAA			12/31/2012	0442121341024

Thanks a lot in advance,
Deepak

Hi Deepak,

 
perl -F"\s+" -anle 'system("echo \"$_\" >> $F[0]'.txt'");' filename

Hi Pravin,

I ran the code you gave; it produced four files. Two of them are missing the heading below.

Sourcename ID Date Nbr

Another file contains only this heading, and one more is a blank text file.

But I need this heading in both split files.

Can you help me with this?

-Deepak

A Perl program that does the job.
file025 (tabs have been replaced by spaces):

Sourcename ID Date Nbr
SU IMYFDJ 9/17/2012  5552159976555
SU BWZMIG 9/14/2012   1952257857887
AR PEHQDF 11/26/2012   0442045903874
AR ELIOAA 12/31/2012   0442121341024

Program :

#!/usr/bin/perl -w
use strict;

my $cur_dir = $ENV{PWD};
my $filename = "$cur_dir/$ARGV[0]";
my ($record,$header,$i,@fields,%files);

open(FILEIN,"<$filename") or die"open: $!";
while( defined( $record = <FILEIN> ) ) {
  chomp $record;
  $header=$record if (!defined $header);

  $i++;
  if($i > 1) {
    @fields=split(/ /,$record);
    # if the file does not exist yet, create it and write the header
    if(! exists( $files{$fields[0]}) ) {
      $files{$fields[0]} = "$fields[0].file";
      open (FILEOUT, "> /tmp/$files{$fields[0]}") ||
        die "FATAL: cannot open \"$files{$fields[0]}\" for writing: $!\n";
      print FILEOUT "$header\n";
      close(FILEOUT);
    }
    # append record to file
    open (FILEOUT, ">> /tmp/$files{$fields[0]}") ||
      die "FATAL: cannot open \"$files{$fields[0]}\" for writing: $!\n";
    print FILEOUT "$record\n";
    close(FILEOUT);
  }
}
close(FILEIN);

Outputs :

/tmp %cat SU.file
Sourcename ID Date Nbr
SU IMYFDJ 9/17/2012  5552159976555
SU BWZMIG 9/14/2012   1952257857887
/tmp %cat AR.file
Sourcename ID Date Nbr
AR PEHQDF 11/26/2012   0442045903874
AR ELIOAA 12/31/2012   0442121341024
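
The same idea can also be sketched with lexical filehandles kept open in a hash, so each output file is opened only once instead of being reopened for every record. This is only a sketch under assumptions: the sub name split_by_source and its parameters are hypothetical, and the output directory and ".file" suffix simply mirror the program above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split $infile into one file per value of the first field, repeating the
# header line at the top of each output file. $sep is the field separator
# regex. (split_by_source is a hypothetical name, not from the thread.)
sub split_by_source {
    my ($infile, $outdir, $sep) = @_;
    my (%out, @created);

    open(my $in, '<', $infile) or die "cannot open $infile: $!\n";
    my $header = <$in>;
    return () unless defined $header;      # empty input, nothing to do
    chomp $header;

    while (my $record = <$in>) {
        chomp $record;
        next if $record =~ /^\s*$/;        # skip blank lines
        my ($source) = split($sep, $record);
        if (!exists $out{$source}) {
            # first record for this source: create the file, write the header
            my $path = "$outdir/$source.file";
            open($out{$source}, '>', $path) or die "cannot open $path: $!\n";
            print { $out{$source} } "$header\n";
            push @created, $path;
        }
        print { $out{$source} } "$record\n";
    }
    close($in);
    close($_) or die "close failed: $!\n" for values %out;
    return @created;
}

# Run only when executed directly with an input file argument
if (!caller() && @ARGV) {
    print "$_\n" for split_by_source($ARGV[0], '/tmp', qr/ /);
}
```

Keeping the handles open avoids one open/close pair per record, at the cost of one open file descriptor per distinct Sourcename.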

Hi Fundix,

Thanks a lot for your help, it's working fine :).

But if the file uses '|' as the delimiter instead of a space,
the output file name comes out as a single letter.

Example :

  $i++;
  if($i > 1) {
    @fields=split(/|/,$record);
    # if the file does not exist yet, create it and write the header
    if(! exists( $files{$fields[0]}) ) {
      $files{$fields[0]} = "$fields[0].file";
      open (FILEOUT, "> /tmp/$files{$fields[0]}") ||
        die "FATAL: cannot open \"$files{$fields[0]}\" for writing: $!\n";
      print FILEOUT "$header\n";
      close(FILEOUT);

Result :

Filename :

S.file

Sometimes the Sourcename is longer than two letters, too.

-Deepak

Hi,

I am guessing your file looks like this:

Sourcename ID Date Nbr
SU|IMYFDJ|9/17/2012|5552159976555
SU|BWZMIG|9/14/2012|1952257857887
AR|PEHQDF|11/26/2012|0442045903874
AR|ELIOAA|12/31/2012|0442121341024

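An unescaped "|" is the regex alternation operator, so /|/ matches the empty string and split cuts the record into single characters. You must escape the pipe in the split line like this: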

@fields=split(/\|/,$record);
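
A small sketch comparing the unescaped and escaped patterns on one of the sample records. The variable names are only for illustration; the \Q...\E form is an alternative that may be handy when the delimiter is held in a variable.

```perl
use strict;
use warnings;

my $record = 'SU|IMYFDJ|9/17/2012|5552159976555';

# Unescaped: "|" is alternation between two empty patterns, so the regex
# matches the empty string and split breaks the record into characters.
my @wrong = split(/|/, $record);

# Escaped: the regex matches a literal pipe character.
my @right = split(/\|/, $record);

# Same effect via \Q...\E, which quotes any regex metacharacters.
my $delim  = '|';
my @quoted = split(/\Q$delim\E/, $record);

print "unescaped: $wrong[0]\n";    # prints "unescaped: S"
print "escaped:   $right[0]\n";    # prints "escaped:   SU"
print "quoted:    $quoted[0]\n";   # prints "quoted:    SU"
```

The first result explains the S.file name: the first "field" is just the first character of the record.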

Hi Fundix,

Thanks a lot, you are awesome :). It's working fine.

I have another requirement: to zip the split files after creating each of them.

Can that be done in the Perl script too?

-Deepak

Yes, you can use the system function:

system("your zip command here");
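
A minimal sketch of calling zip through system in list form and checking the result, assuming the Info-ZIP zip tool is on the PATH. Note that zip takes the archive name first, followed by the files to add. The helper names run_command and zip_file are hypothetical.

```perl
use strict;
use warnings;

# Run an external command in list form (no shell involved, so file names
# with spaces or metacharacters are passed through unchanged) and return
# its exit code; die if the command could not be started at all.
sub run_command {
    my @cmd = @_;
    my $status = system(@cmd);
    die "cannot run $cmd[0]: $!\n" if $status == -1;
    die "$cmd[0] killed by signal " . ($status & 127) . "\n" if $status & 127;
    return $status >> 8;
}

# Hypothetical helper: add $file to $archive using the external zip tool.
sub zip_file {
    my ($archive, $file) = @_;
    my $rc = run_command('zip', $archive, $file);
    die "zip exited with code $rc\n" if $rc != 0;
    return 1;
}
```

A call would look like zip_file('/tmp/SU.zip', '/tmp/SU.file'). Checking the return value matters because system does not die on its own when the command fails.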

Hi Fundix,

I tried the command below, but it is not working.

system( 'zip', $file ) 

I need to zip all the split files after creating each of them.
Can you please tell me where I can add the zip part in the main code?

-Deepak

Hi there, here is the code with the zip step added; it works well on an AIX system.
Each created file name is stored in an array (@fileLst), and at the end of the program all files in the array are zipped into a single archive:

#!/usr/bin/perl -w
use strict;

my $cur_dir = $ENV{PWD};
my $filename = "$cur_dir/$ARGV[0]";
my ($record,$header,$i,@fields,%files,$key,@fileLst);

open(FILEIN,"<$filename") or die"open: $!";
while( defined( $record = <FILEIN> ) ) {
  chomp $record;
  $header=$record if (!defined $header);

  $i++;
  if($i > 1) {
    @fields=split(/\|/,$record);
    if(! exists( $files{$fields[0]}) ) {
      $files{$fields[0]} = "$fields[0].file";
      open (FILEOUT, "> /tmp/$files{$fields[0]}") ||
        die "FATAL: cannot open \"$files{$fields[0]}\" for writing: $!\n";
      push(@fileLst,"/tmp/$files{$fields[0]}");
      print FILEOUT "$header\n";
      close(FILEOUT);
    }
    open (FILEOUT, ">> /tmp/$files{$fields[0]}") ||
      die "FATAL: cannot open \"$files{$fields[0]}\" for writing: $!\n";
    print FILEOUT "$record\n";
    close(FILEOUT);
  }
}
close(FILEIN);

foreach (@fileLst) {
  system("zip /tmp/zipfile.zip $_");
}

Outputs :

%./file025.pl file026
  adding: tmp/SU.file (deflated 10%)
  adding: tmp/AR.file (deflated 9%)
dpi@%unzip -l zipfile.zip
Archive:  zipfile.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
       91  06-25-13 08:51   tmp/SU.file
       93  06-25-13 08:51   tmp/AR.file
 --------                   -------
      184                   2 files

Hope this helps

It is better to use the Archive::Zip module than to call the system command.

That module is not installed on my AIX system (and I am not root ;))

Maybe you can give us an updated version of the program using that module.
It would be very interesting.

Thank You

Hi Fundix,

Thanks a lot for your help.

Actually, I need each split file to be zipped separately.

Example :

  SU.zip
  AR.zip

I am getting this error:

Global symbol "@fileLst" requires explicit package name

-Deepak

The array must be declared with my when running under use strict:

my ($record,$header,$i,@fields,%files,$key,@fileLst);
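
For the separate archives requested above (SU.zip, AR.zip), here is a rough sketch that avoids both the external zip command and Archive::Zip by using IO::Compress::Zip instead. It assumes that module is available in the installed Perl (it ships with recent Perl releases, but that may not hold on an older system). The sub name zip_each is hypothetical, and the ".file" suffix follows the program above.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use IO::Compress::Zip qw(zip $ZipError);

# Compress each file into its own archive next to it, for example
# /tmp/SU.file -> /tmp/SU.zip. Returns the list of archives written.
sub zip_each {
    my @files = @_;
    my @archives;
    for my $file (@files) {
        (my $archive = $file) =~ s/\.file$/.zip/;
        die "unexpected file name: $file\n" if $archive eq $file;
        # Name => stores the member without its directory part
        zip($file => $archive, Name => basename($file))
            or die "zip of $file failed: $ZipError\n";
        push @archives, $archive;
    }
    return @archives;
}
```

In the program above this could be called as zip_each(@fileLst) after close(FILEIN), in place of the foreach loop that adds everything to a single zipfile.zip.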