I have a file in the format below. I need to split the differently colored data into separate files such as file1, file2 and file3. I also need to replace every '*' with '|' and every '~' with '\n', using a Unix script.
ISA*00**00**02*CN*ZZ*RECEIVER ID*060628*0035*U*00201*000000612*0*P*>
GS*IM*CN*APPLICATION RECEIVER ID*20060628*0035*612*X*004010
ST*210*612001
B3*B*28061234*102141*PP**20060628*208360****CNRU
N1*PR*PAYER NAME*25*772305B
N3*123 NEWBRIDGE ROAD .
N4*ETOBICOKE*ON
N1*CN*CONSIGNEE NAME*25*772305
N3*1800 INKSTER BLVD
N4*WINNIPEG SYMING YAR*MB*R2X2Z5
N1*SF*SHIP FROM NAME*25*772305
N3*123 NEWBRIDGE RD
N4*ETOBICOKE*ON
SE*33*612001
ST*210*612001
B3*B*28061234*102141*PP**20060628*208360****CNRU
N1*PR*PAYER NAME*25*772305B
N3*123 NEWBRIDGE ROAD .
N4*ETOBICOKE*ON
N3*123 NEWBRIDGE ROAD .
N4*ETOBICOKE*ON
N3*123 NEWBRIDGE ROAD .
N4*ETOBICOKE*ON
N1*CN*CONSIGNEE NAME*25*772305
N3*1800 INKSTER BLVD
N4*WINNIPEG SYMING YAR*MB*R2X2Z5
N1*SF*SHIP FROM NAME*25*772305
N3*123 NEWBRIDGE RD
N4*ETOBICOKE*ON
SE*33*612001
ST*210*612001
B3*B*28061234*102141*PP**20060628*208360****CNRU
N1*PR*PAYER NAME*25*772305B
N3*123 NEWBRIDGE ROAD .
N4*ETOBICOKE*ON
N1*CN*CONSIGNEE NAME*25*772305
N3*1800 INKSTER BLVD
N4*WINNIPEG SYMING YAR*MB*R2X2Z5
N1*SF*SHIP FROM NAME*25*772305
N3*1800 INKSTER BLVD
N4*WINNIPEG SYMING YAR*MB*R2X2Z5
N1*SF*SHIP FROM NAME*25*772305
N3*123 NEWBRIDGE RD
N4*ETOBICOKE*ON
SE*33*612001
GE*1*612
IEA*1*000000612
Hi, what have you tried so far and where are you stuck?
Hi, I am new to UNIX scripting. So far I have written a script that reads the file and replaces '*' with '|', but I am not sure how to write the content into different files as described in my post.
What scripting language are you using, can you post the script?
So you want to extract the records that start with an ST segment and end with an SE segment:
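perl -e '
$file_counter=0;
$in_record=0;
while(<>){
    s/\*/|/g;      # field separator "*" becomes "|"
    s/~/\n/g;      # "~" becomes a newline
    if(/^ST/){     # an ST line starts a new output file
        $in_record=1;
        open (FILE, ">" , "file_$file_counter") or die "Cannot create file_$file_counter: $!";
    }
    print FILE $_ if $in_record;
    if (/^SE/){    # an SE line ends the current record
        close FILE;
        $in_record=0;
        $file_counter++;
    }
}' tmp.dat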
awk version:
awk -F\* '$1=="ST",$1=="SE"{if($1=="ST"){close(f);f="file" ++i} gsub(/~/,RS); $1=$1; print>f}' OFS=\| infile
--
On Solaris use /usr/xpg4/bin/awk rather than awk
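For readers new to awk, here is the same range-pattern idea spread over several lines with comments. It is a sketch rather than a drop-in replacement: sample.x12 and its contents are invented test data, and the variable names out and n are arbitrary.

```shell
#!/bin/sh
# Hypothetical sample data: two ST..SE transactions inside an envelope.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > sample.x12 <<'EOF'
GS*IM*CN
ST*210*0001
N1*PR*PAYER NAME
SE*3*0001
ST*210*0002
N1*CN*CONSIGNEE NAME
SE*3*0002
GE*2*612
EOF

awk -F'*' -v OFS='|' '
    # Range pattern: every line from one whose first field is ST
    # through the next line whose first field is SE.
    $1 == "ST", $1 == "SE" {
        if ($1 == "ST") {
            if (out != "") close(out)   # finish the previous output file
            out = "file" (++n)          # file1, file2, ...
        }
        gsub(/~/, "\n")                 # "~" becomes a newline
        $1 = $1                         # rebuild the line so "*" becomes OFS
        print > out
    }
' sample.x12
```

Assigning $1 to itself is what forces awk to rejoin the fields with OFS; without it the line would be printed with its original '*' separators.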
Hi Skrynesaver,
I tried your script, and in the while condition I gave the file name with its complete path:
perl -e '
$file_counter=0;
$in_record=0;
while("$HOME/Bharath/SampleX12"){
s/\*/|/g;
s/~/\n/g;
if(/^ST/){
$in_record=1;
open (FILE, ">" , "file_$file_counter");
}
print FILE $_ if $in_record;
if (/^SE/){
close FILE;
$in_record=0;
$file_counter++;
}
}' tmp.dat
When I execute it there are no errors and no results, and it does not return to the command prompt. Please advise.
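Leave the script as it was and replace tmp.dat with the path to the file you wish to extract the records from. while(<>) reads the files named on the command line one line at a time, whereas while("...") only tests a non-empty string, which is always true, so the loop never ends.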
Skrynesaver,
Wow, wonderful! It worked, thanks.
Here the file name is hardcoded. Suppose the script has to pick up the files in a particular directory that match a particular file name pattern, one by one. What changes need to be made?
Thanks,
Bharath.
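Change it into a Perl script that iterates over the supplied arguments, e.g.:
#!/usr/bin/perl
use strict;
use warnings;

for my $filename (@ARGV){
    my $file_counter=0;
    my $in_record=0;
    my $outfile;    # declared outside the loop so it stays in scope between lines
    open (my $file, "<", $filename) or die "Cannot open $filename: $!";
    while(<$file>){
        s/\*/|/g;
        s/~/\n/g;
        if(/^ST/){
            $in_record=1;
            open ($outfile, ">" , "$filename.$file_counter") or die "Cannot create $filename.$file_counter: $!";
        }
        print $outfile $_ if $in_record;
        if (/^SE/ && $in_record){
            close $outfile;
            $in_record=0;
            $file_counter++;
        }
    }
    close $file;
}
Call it with the list of files you wish to split as arguments. The shell expands a file name pattern into that list before the script starts.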
Skrynesaver,
The script you gave throws an error when I add any Unix commands to it. If I need to change it into a Unix shell script, what should I do so that I can also include basic Unix commands such as echo?
In Perl, use print in place of echo. To run a command, use the quoted-execution syntax qx($command). If you wish to capture the output of a command, assign the result of qx to a variable, e.g.:
perl -e '$output=qx(ls $ENV{HOME}/tmp);print $output'
tmp.pl
tmp.dat
tmp.sh
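If a plain shell script is preferred, so that echo and other commands can be used directly, one possible shape is a loop over the matching files that calls awk for the splitting. This is only a sketch: the SampleX12_*.dat pattern and the two small input files are invented for illustration, and the output naming (input name plus .0, .1, ...) simply mirrors the Perl script above.

```shell
#!/bin/sh
# Hypothetical input files standing in for the real directory contents.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'ST*210*1\nSE*2*1\n' > SampleX12_a.dat
printf 'ST*210*2\nN1*PR*X\nSE*3*2\nST*210*3\nSE*2*3\n' > SampleX12_b.dat

for f in SampleX12_*.dat; do
    [ -f "$f" ] || continue            # skip if the pattern matched nothing
    echo "Splitting $f"
    awk -F'*' -v OFS='|' -v base="$f" '
        $1 == "ST", $1 == "SE" {
            if ($1 == "ST") {
                if (out != "") close(out)
                out = base "." (n++)   # e.g. SampleX12_b.dat.0
            }
            gsub(/~/, "\n")            # "~" becomes a newline
            $1 = $1                    # rebuild the line so "*" becomes "|"
            print > out
        }
    ' "$f"
    echo "Finished $f"
done
```

Because awk is started once per input file, the counter n restarts at 0 for each file, just as $file_counter does in the Perl version.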