Splitting the file

I have a file in the format below. I need to split the differently colored sections of it into separate files such as file1, file2 and file3. I also need to replace '*' with '|' and, if there is any '~', replace it with '\n', using a Unix script.

ISA*00**00**02*CN*ZZ*RECEIVER ID*060628*0035*U*00201*000000612*0*P*> 
GS*IM*CN*APPLICATION RECEIVER ID*20060628*0035*612*X*004010 
ST*210*612001 
B3*B*28061234*102141*PP**20060628*208360****CNRU 
N1*PR*PAYER NAME*25*772305B 
N3*123 NEWBRIDGE ROAD . 
N4*ETOBICOKE*ON 
N1*CN*CONSIGNEE NAME*25*772305 
N3*1800 INKSTER BLVD 
N4*WINNIPEG SYMING YAR*MB*R2X2Z5 
N1*SF*SHIP FROM NAME*25*772305 
N3*123 NEWBRIDGE RD 
N4*ETOBICOKE*ON 
SE*33*612001 
ST*210*612001 
B3*B*28061234*102141*PP**20060628*208360****CNRU 
N1*PR*PAYER NAME*25*772305B 
N3*123 NEWBRIDGE ROAD . 
N4*ETOBICOKE*ON 
N3*123 NEWBRIDGE ROAD . 
N4*ETOBICOKE*ON 
N3*123 NEWBRIDGE ROAD . 
N4*ETOBICOKE*ON 
N1*CN*CONSIGNEE NAME*25*772305 
N3*1800 INKSTER BLVD 
N4*WINNIPEG SYMING YAR*MB*R2X2Z5 
N1*SF*SHIP FROM NAME*25*772305 
N3*123 NEWBRIDGE RD 
N4*ETOBICOKE*ON 
SE*33*612001 
ST*210*612001 
B3*B*28061234*102141*PP**20060628*208360****CNRU 
N1*PR*PAYER NAME*25*772305B 
N3*123 NEWBRIDGE ROAD . 
N4*ETOBICOKE*ON 
N1*CN*CONSIGNEE NAME*25*772305 
N3*1800 INKSTER BLVD 
N4*WINNIPEG SYMING YAR*MB*R2X2Z5 
N1*SF*SHIP FROM NAME*25*772305 
N3*1800 INKSTER BLVD 
N4*WINNIPEG SYMING YAR*MB*R2X2Z5 
N1*SF*SHIP FROM NAME*25*772305 
N3*123 NEWBRIDGE RD 
N4*ETOBICOKE*ON 
SE*33*612001 
GE*1*612 
IEA*1*000000612

Hi, what have you tried so far and where are you stuck?

Hi, I am new to UNIX scripting. So far I have written a script that reads the file and replaces '*' with '|', but I am not sure how to write the content into different files as described in my post.

What scripting language are you using? Can you post the script?

So you want to extract the records that start with ST and end with SE:

perl -e '
$file_counter=0;
$in_record=0;
while(<>){
  s/\*/|/g;
  s/~/\n/g;
  if(/^ST/){
    $in_record=1;
    open (FILE, ">" , "file_$file_counter") or die "Cannot create file_$file_counter: $!";
  }
  print FILE $_ if $in_record;
  if (/^SE/){
    close FILE; 
    $in_record=0;
    $file_counter++;
  }
}' tmp.dat

awk version:

awk -F\* '$1=="ST",$1=="SE"{if($1=="ST"){close(f);f="file" ++i} gsub(/~/,RS); $1=$1; print>f}' OFS=\| infile

--
On Solaris use /usr/xpg4/bin/awk rather than awk
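
In case the one-liner is hard to follow, here is roughly the same idea spelled out as a shell function with comments. It is only a sketch: the function name split_x12 and the optional prefix argument are made up for illustration, and it assumes every segment sits on its own line, as in the sample posted above.

```shell
#!/bin/sh
# split_x12 INPUT [PREFIX]   (hypothetical helper, not part of any tool)
# Writes each ST..SE block of INPUT to PREFIX1, PREFIX2, ... (PREFIX
# defaults to "file"), turning '*' into '|' and '~' into a newline.
split_x12() {
    awk -v prefix="${2:-file}" '
        /^ST\*/ {                        # a new block starts here
            if (out != "") close(out)    # finish the previous output file
            out = prefix (++n)
            writing = 1
        }
        writing {
            last = ($0 ~ /^SE\*/)        # remember this before rewriting the line
            gsub(/\*/, "|")              # element separator
            gsub(/~/, "\n")              # segment terminator, if present
            print > out
            if (last) writing = 0        # SE closes the block
        }
    ' "$1"
}
```

Calling split_x12 tmp.dat should leave file1, file2 and file3 in the current directory for the sample input; lines outside an ST..SE block (ISA, GS, GE, IEA) are skipped.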


Hi Skrynesaver,
I tried your script, and in the while condition I gave the file name with its complete path, like this:

perl -e '
$file_counter=0;
$in_record=0;
while("$HOME/Bharath/SampleX12"){
s/\*/|/g;
s/~/\n/g;
if(/^ST/){
$in_record=1;
open (FILE, ">" , "file_$file_counter");
}
print FILE $_ if $in_record;
if (/^SE/){
close FILE;
$in_record=0;
$file_counter++;
}
}' tmp.dat

When I execute it there are no errors and no result, and it does not come back to the command prompt. Please advise.

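Leave the script as it was and replace tmp.dat with the path to the file you wish to extract the records from. In your version, while("...") tests a non-empty string, which is always true, so the loop never ends; the <> in the original is what reads the input file line by line.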

Skrynesaver,

Wow, wonderful!!! It worked :) Thanks.

Here you hardcoded the file name. Suppose we have to modify the script so that it picks up the files in a particular directory that match a particular file name pattern, one by one. What changes need to be made?

Thanks,
Bharath.

Change it into a Perl script that iterates over the supplied arguments, e.g.:

#!/usr/bin/perl

use strict;
use warnings;
for my $filename (@ARGV){
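  my $file_counter=0;
  my $in_record=0;
  my $outfile;   # declared here so it stays in scope for the whole loop
  open (my $file, "<", $filename) or die "Cannot open $filename: $!";
  while(<$file>){   # read the input one line at a time
    s/\*/|/g;
    s/~/\n/g;
    if(/^ST/){
      $in_record=1;
      open ($outfile, ">" , "$filename.$file_counter") or die "Cannot create $filename.$file_counter: $!";
    }
    print $outfile $_ if $in_record;
    if ($in_record && /^SE/){   # only close a file that was actually opened
      close $outfile;
      $in_record=0;
      $file_counter++;
    }
  }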
  close $file;
}

Then call it with the list of files you wish to split as arguments.
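
If you would rather keep the "directory plus file name pattern" part in a shell script, something along these lines might do. It is a sketch under assumptions: the function name split_matching, the directory and the pattern are placeholders, and the splitting is done with awk instead of the Perl script so that the example stands on its own. The output files are named like the Perl version (input name plus .0, .1, ...).

```shell
#!/bin/sh
# split_matching DIR PATTERN   (hypothetical helper)
# Runs the ST..SE split on every regular file in DIR whose name matches
# PATTERN, a shell glob such as 'X12_*.dat'.
split_matching() {
    dir=$1
    pattern=$2
    for f in "$dir"/$pattern; do          # $pattern is unquoted so the glob expands
        [ -f "$f" ] || continue           # skip when nothing matches
        echo "Splitting $f"
        awk -v base="$f" '
            /^ST\*/ { out = base "." (n++); writing = 1 }
            writing {
                last = ($0 ~ /^SE\*/)     # check before the separators change
                gsub(/\*/, "|")
                gsub(/~/, "\n")
                print > out
                if (last) { close(out); writing = 0 }
            }
        ' "$f"
    done
}
```

For example, split_matching "$HOME/Bharath" 'Sample*' would process each matching file in turn. Pick a pattern that does not also match the generated .0, .1, ... files, otherwise a second run would split those as well.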


Skrynesaver,
The script that you gave throws an error when I add any Unix commands to it. If I need to change it into a Unix script, what should I do so that I can also include basic Unix commands like echo?

For echo, use print. To run commands, use the quoted-execution syntax qx($command). If you wish to capture the output of a command, you can assign the result of qx to a variable, e.g.:

perl -e '$output=qx(ls $ENV{HOME}/tmp);print $output'
tmp.pl
tmp.dat
tmp.sh
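
For comparison, in a plain shell script the rough counterparts are echo and command substitution. A small sketch, with list_dir being a made-up name for illustration:

```shell
#!/bin/sh
# list_dir DIR   (hypothetical helper)
# Captures the output of ls in a variable, then prints it.
list_dir() {
    output=$(ls "$1")     # $(...) is roughly what qx(...) is in Perl
    echo "$output"        # echo is roughly what print is in Perl
}
```

Calling list_dir "$HOME/tmp" would print the same kind of listing as the Perl one-liner above. One difference to keep in mind: $(...) drops trailing newlines from the captured output, while qx keeps them.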