parallel processing

Hi, I am building a set of batches from a set of files, sequentially.
All the files reside in the folder /xyz.
All the files whose names start with
01 are appended one below the other to form the batch batch01,
then all the files whose names start with
02 are appended one below the other to form the batch batch02,
then
03 are appended one below the other to form the batch batch03,
then
04 are appended one below the other to form the batch batch04,
..

and so on. This takes a lot of time to process.
How can I improve the performance, for example with some kind of parallel processing,
to minimize the run time?
Please advise.

Post your script so we can see what you are doing wrong.

Please find the script below

name00=ABC00`date +"%y%m%d%H"`
for i in 0[0]*.[tT][xX][tT]
do
cat ${i} >> ${name00}            
done

name01=ABC01`date +"%y%m%d%H"`
for i in 0[1]*.[tT][xX][tT]
do
cat ${i} >> ${name01}            
done
 

name02=ABC02`date +"%y%m%d%H"`
for i in 0[2]*.[tT][xX][tT]
do
cat ${i} >> ${name02}            
done

and so on

Your code would probably benefit from the usage of "find" and the elimination of repetitive tasks like expanding the date over and over again:

chDate="$(date +"%y%m%d%H")"
typeset -Z2 iCounter=0

while [ $iCounter -le 99 ] ; do
     # -exec cat appends the file contents; -print would only list the file names
     find /your/directory -name "${iCounter}*.[tT][xX][tT]" -exec cat {} + > "ABC${iCounter}${chDate}"
     (( iCounter += 1 ))
done

I put an arbitrary end at 99 for demonstration purposes, adapt the script to what you really need. If this is not fast enough try backgrounding the "find"s by adding a " &" at the end of the line starting with "find".
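
The backgrounding idea can be sketched roughly as follows. This is only an illustration under some assumptions: it uses plain "cat" with a glob instead of "find", the function name build_batches_parallel and its arguments are made up for this example, and the batch files are written into the same directory as the inputs. Note the "wait" at the end, without which the script would return before the background jobs have finished.

```shell
#!/bin/sh
# Sketch only: build_batches_parallel is a hypothetical helper, not from the thread.
# Usage: build_batches_parallel DIR PREFIX [PREFIX ...]
build_batches_parallel() {
    src_dir=$1                   # directory holding the input files
    shift                        # remaining arguments are the two-digit prefixes
    stamp=$(date +"%y%m%d%H")    # expand the date once, not once per batch

    for prefix in "$@"; do
        # Each batch is concatenated in its own background subshell.
        ( cd "$src_dir" && cat "$prefix"*.[tT][xX][tT] > "ABC${prefix}${stamp}" ) &
    done

    wait                         # block until every background job has finished
}
```

It could be called as: build_batches_parallel /xyz 00 01 02 03. If no file matches a prefix, "cat" reports an error and that batch file is left empty. This starts one job per prefix with no upper limit, so it only pays off if the disk can keep up.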

I hope this helps.

bakunin

First, what type of file system is your /xyz directory? What is the underlying hardware? How busy is it when you're running your script?

If your hardware is already maxed out, parallel processing won't help. In fact, it could slow things down further, because you'll likely get more disk contention.

name00=ABC00`date +"%Y%m%d%H"`
cat 00*.[tT][xX][tT] > "$name00"

name01=ABC01`date +"%Y%m%d%H"`
cat 01*.[tT][xX][tT] > "$name01"

name02=ABC02`date +"%Y%m%d%H"`
cat 02*.[tT][xX][tT] > "$name02"

Or:

for n in 01 02 03 04 ...
do
  name=ABC$n`date +"%Y%m%d%H"`
  cat "$n"*.[tT][xX][tT] > "$name"
done
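
Given the disk contention warning above, one way to try parallelism without flooding the disk is to cap the number of simultaneous jobs. The following is a sketch under assumptions, not a tested recommendation for your system: the function name build_batches_limited and its arguments are invented for this example, and it relies on the -P option of xargs, which GNU and BSD xargs provide but POSIX does not require.

```shell
#!/bin/sh
# Sketch only: build_batches_limited is a hypothetical helper, not from the thread.
# Usage: build_batches_limited DIR MAX_JOBS PREFIX [PREFIX ...]
build_batches_limited() {
    src_dir=$1                   # directory holding the input files
    max_jobs=$2                  # upper bound on concurrent cat processes
    shift 2                      # remaining arguments are the two-digit prefixes
    stamp=$(date +"%y%m%d%H")    # expand the date once for all batches

    # xargs hands one prefix at a time to a small sh script and keeps
    # at most $max_jobs of those scripts running at once.
    # Inside the script: $1 = directory, $2 = date stamp, $3 = prefix.
    printf '%s\n' "$@" |
        xargs -n 1 -P "$max_jobs" sh -c '
            cd "$1" && cat "$3"*.[tT][xX][tT] > "ABC$3$2"
        ' sh "$src_dir" "$stamp"
}
```

For example, build_batches_limited /xyz 2 00 01 02 03 would build four batches with at most two running at a time. Start with a small limit and time each run; if a higher limit does not make it faster, the disk is the bottleneck.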

Hi bakunin and Johnson

Trying the above approaches resulted in faster processing.

Thanks a lot.