I'm new to UNIX scripting. Please help.
I have about 10,000 files in the $ROOTDIR/scp/inbox/string1 directory to compare with the 50 files in the /$ROOTDIR/output/tma/pnt/bad/string1/ directory, and the for loop takes over 2 hours to complete. Is there a better way to rewrite the script below so it runs faster?
Thank you in advance.
pntcnt1=`ls -l /$ROOTDIR/scp/inbox/string1 | grep 'PNT.*' | awk '/^-/ {print $9}' | wc -l`
if [[ $pntcnt1 -gt 0 ]]; then
    for gfile in `ls -1 /$ROOTDIR/scp/inbox/string1/PNT.2*`
    do
        # first line of the inbox file, then columns 38-46 (the ID to compare)
        gline=`sed '1q' "$gfile"`
        x=`echo "$gline" | awk '{ print substr( $0, 38, 9 ) }'`
        for bfile in `ls -1 /$ROOTDIR/output/tma/pnt/bad/string1/PNT.2*`
        do
            bline=`sed '1q' "$bfile"`
            y=`echo "$bline" | awk '{ print substr( $0, 38, 9 ) }'`
            if [ "$x" -eq "$y" ]
            then
                echo "file moved $gfile"
                mv "$gfile" /$ROOTDIR/output/tma/pnt/bad/string1
                break
            fi
        done
    done
fi
right?
Also, gline and bline should start at position 38 instead of 37, am I correct?
Thanks
---------- Post updated at 04:05 PM ---------- Previous update was at 01:27 PM ----------
I got an error when executing the script below: x=${gline:38:9}: 0403-011 The specified substitution is not valid for this command.
I want to compare the value at position 38, 9 characters long, in each file from the $ROOTDIR/scp/inbox/string1 directory with the value at the same position and length in the files in the /$ROOTDIR/output/tma/pnt/bad/string1/ directory. It looks like the code below is written to compare just the file names.
#!/usr/bin/ksh
pntcnt1=$( ls -l /$ROOTDIR/scp/inbox/string1 | grep 'PNT.*' | awk '/^-/ {print $9}' | wc -l )
[ "$pntcnt1" -lt 1 ] && exit 0
for gfile in /$ROOTDIR/scp/inbox/string1/PNT.2*
do
    # read only the first line of the file
    IFS= read -r gline < "$gfile"
    # ${var:offset:length} is ksh93 syntax; offset 37 is 0-based, i.e. column 38
    x=${gline:37:9}
    # get filename other method:
    #x=${gline##*/}
    for bfile in /$ROOTDIR/output/tma/pnt/bad/string1/PNT.2*
    do
        IFS= read -r bline < "$bfile"
        y=${bline:37:9}
        #y=${bline##*/}
        if [ "$x" -eq "$y" ]
        then
            echo "file moved $gfile"
            mv -f "$gfile" /$ROOTDIR/output/tma/pnt/bad/string1
            break
        fi
    done
done
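On the 0403-011 error and the 37-versus-38 question: `${var:offset:length}` is ksh93 (and bash) syntax and counts the offset from 0, so `${gline:37:9}` names the same nine characters as awk's `substr($0, 38, 9)`, which counts from 1. An older ksh88, which is what /usr/bin/ksh commonly is on AIX, rejects that substitution; this is a likely cause of the error, not a confirmed one. A small sketch of alternatives that do not depend on the shell version, using the sample record quoted in this thread:

```shell
# Sample record as posted in this thread; the ID sits in columns 38-46 (1-based).
line='PNT0220060503081122003700100000091049000005629001005146417001407712SFirstname Lastname'

# awk's substr() and cut -c both count characters starting at 1.
id_awk=$(printf '%s\n' "$line" | awk '{ print substr($0, 38, 9) }')
id_cut=$(printf '%s\n' "$line" | cut -c38-46)

# In ksh93 or bash the same field would be ${line:37:9} (offset counted from 0).
echo "$id_awk"
echo "$id_cut"
```

Both commands select the same nine characters, so either can stand in for the substitution if switching the interpreter to ksh93 is not an option.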
I wish I could do that myself, but my UNIX admin won't allow it. Anyway, I'd like to check and compare the 10,000 PNT files, each containing a single record, in the /$ROOTDIR/scp/inbox/string1 directory against the 39 bad PNT files in the /$ROOTDIR/output/tma/pnt/bad/string1 directory, based on the fam_id column value at positions 38 to 46 (9 characters) of the record below. Here is an example of the record from the files in both directories:
PNT0220060503081122003700100000091049000005629001005146417001407712SFirstname Lastname
If the fam_id matches, move the current file from the /$ROOTDIR/scp/inbox/string1 directory into the /$ROOTDIR/output/tma/pnt/bad/string1 directory.
If not, continue with normal processing.
The code below works, but it takes over 2 hours to complete the comparison. Please advise if there is a better way to rewrite or improve the comparison so it runs faster. Thanks
pntcnt1=`ls -l /$ROOTDIR/scp/inbox/string1 | grep 'PNT.*' | awk '/^-/ {print $9}' | wc -l`
if [[ $pntcnt1 -gt 0 ]]; then
    for gfile in `ls -1 /$ROOTDIR/scp/inbox/string1/PNT.2*`
    do
        # first line of the inbox file, then columns 38-46 (the fam_id)
        gline=`sed '1q' "$gfile"`
        x=`echo "$gline" | awk '{ print substr( $0, 38, 9 ) }'`
        for bfile in `ls -1 /$ROOTDIR/output/tma/pnt/bad/string1/PNT.2*`
        do
            bline=`sed '1q' "$bfile"`
            y=`echo "$bline" | awk '{ print substr( $0, 38, 9 ) }'`
            if [ "$x" -eq "$y" ]
            then
                echo "file moved $gfile"
                mv "$gfile" /$ROOTDIR/output/tma/pnt/bad/string1
                break
            fi
        done
    done
fi
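One possible way to cut the run time, offered as a sketch rather than a tested drop-in: the nested loop starts sed, awk, and a subshell for every pair of files (roughly 10,000 x 50 comparisons), so most of the two hours is probably process start-up. The version below reads each first line once with the shell's built-in read, extracts columns 38-46 in a single awk pass per directory, and looks the inbox IDs up in an awk array. The function names first_ids and move_matching and the scratch file /tmp/bad_ids.$$ are made up for illustration. It assumes that records and paths contain no tab characters and that the IDs can be compared as strings (fixed width, zero padded) instead of with -eq.

```shell
#!/bin/sh
# Sketch only: first_ids and move_matching are illustrative names.

# Print "<fam_id><TAB><path>" for every PNT.2* file in directory $1.
# Only the first line of each file is read, using the shell's built-in read.
first_ids() {
    for f in "$1"/PNT.2*; do
        [ -f "$f" ] || continue                      # glob matched nothing
        line=
        IFS= read -r line < "$f" || [ -n "$line" ] || continue
        printf '%s\t%s\n' "$line" "$f"
    done |
    awk -F '\t' '{ id = substr($1, 38, 9); if (id != "") print id "\t" $2 }'
}

# Move every file in inbox $1 whose fam_id also occurs in bad directory $2.
move_matching() {
    inbox=$1
    bad=$2
    ids=/tmp/bad_ids.$$                              # assumed scratch file

    # Collect the bad IDs once, before anything is moved.
    first_ids "$bad" > "$ids"
    if [ -s "$ids" ]; then
        first_ids "$inbox" |
        awk -F '\t' 'NR == FNR { seen[$1] = 1; next }
                     $1 in seen { print $2 }' "$ids" - |
        while IFS= read -r gfile; do
            echo "file moved $gfile"
            mv "$gfile" "$bad"
        done
    fi
    rm -f "$ids"
}

# Example call (paths taken from the post):
# move_matching "/$ROOTDIR/scp/inbox/string1" "/$ROOTDIR/output/tma/pnt/bad/string1"
```

Files moved into the bad directory carry an ID that was already in the list, so collecting the bad IDs once up front should give the same result as re-scanning that directory for every inbox file. The only external command started per file is mv, and only for matches.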