Hi, I am converting a comma-separated file to fixed field length and I am using this:
COLUMNS="25 24 67 26 39 63 20 34 35 14 397"
(
cat $indir/input_file.dat | \
$AWK -v columns="$COLUMNS" '
BEGIN {
FS=",";
OFS="";
split(columns, arr, " ");
}
{
for(i=1; i<=NF; i++)
printf("%-*s%c", arr[i], $i, (i==NF) ? RS : OFS)
}
') >> $outdir/output_file.dat
It is working fine, but most of the files are big and the performance is very slow. Any ideas how I can make it faster?
Thanks!
Step 1: Cut the I/O in half by eliminating the unnecessary cat.
Step 2: Eliminate the for-loop and replace it with a single call to printf.
Step 3: Don't use gawk unless you must. It's the slowest awk implementation.
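Steps 1 and 2 applied to the script above might look like this: awk reads the file directly (no cat), and the format string is built once in BEGIN so each record needs only a single printf. This is a sketch, not the poster's actual rewrite; the widths and file name are shortened for illustration, and the explicit field list in the printf would have to match the real column count (11 fields in the original).

```shell
COLUMNS="5 3 4"                     # shortened widths for illustration
printf 'aa,b,cc\n' > /tmp/in.dat    # hypothetical sample input

# Step 1: pass the file name to awk instead of piping through cat.
# Step 2: precompute one format string ("%-5s%-3s%-4s\n") in BEGIN,
# then emit each record with a single printf instead of a per-field loop.
awk -v columns="$COLUMNS" '
BEGIN {
    FS = ","
    n = split(columns, w, " ")
    for (i = 1; i <= n; i++)
        fmt = fmt "%-" w[i] "s"
    fmt = fmt "\n"
}
{ printf(fmt, $1, $2, $3) }         # field list must match the column count
' /tmp/in.dat
```

For the real 11-column file the record action would list $1 through $11; the per-record cost drops because the format string is no longer rebuilt and no loop runs per field.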
Regards,
Alister
Actually it is:
AWK="/usr/xpg4/bin/awk" #extended awk for solaris
I wasn't asserting that you are using gawk. Your post did not specify the implementation, so I mentioned gawk's slow execution speed in case it was relevant.
Regards,
Alister
Assuming ASCII data, try
perl -F, -lape 'BEGIN { ($tmpl = shift) =~ s/(\d+)/A$1/g }
$_ = pack($tmpl, @F)' "$COLUMNS" "$indir/input_file.dat" > "$outdir/output_file.dat"
Hi, thanks. I tried it, but unfortunately the performance is the same.