Hi all,
I managed to write a script that gathers a few files and sorts them.
Here it is:
#!/usr/bin/ksh
ls *mainFile* |cut -c20-21 | sort > temp
set -A line_array
i=0
file_name='temp'
while read file_line
do
line_array[$i]=${file_line}
let i=${i}+1
# mainFile
gzcat *mainFile-dsa${file_line}* | awk '
BEGIN { FS = "," } ;
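# "=" assigns and blanks $1; compare instead: keep the key when present, else reuse the previous one
{if($1 != "") {mykey=$1} else {mykey=prev}}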
{if(mykey != prev)
{print mykey",1,"NR","$0; prev=mykey}
else
{print prev",1,"NR","$0; prev=mykey}}
' > final
# line
gzcat *line-dsa${file_line}* | awk '
BEGIN { FS = "," } ;
# compare rather than assign: keep the key when present, else reuse the previous one
{if($1 != "") {mykey=$1} else {mykey=prev}}
{if(mykey != prev)
{print mykey",2,"NR","$0; prev=mykey}
else
{print prev",2,"NR","$0; prev=mykey}}
' >> final
# ss
gzcat *ss-dsa${file_line}* | awk '
BEGIN { FS = "," } ;
{print $1",3,"NR","$0;}
' >> final
#bsginfo
gzcat *bsginfo-dsa${file_line}* | awk '
BEGIN { FS = "," } ;
{print $1",4,"NR","$0;}
' >> final
#gprs
gzcat *gprs-dsa${file_line}* | awk '
BEGIN { FS = "," } ;
# compare rather than assign: keep the key when present, else reuse the previous one
{if($1 != "") {mykey=$1} else {mykey=prev}}
{if(mykey != prev)
{print mykey",5,"NR","$0; prev=mykey}
else
{print prev",5,"NR","$0; prev=mykey}}
function isnum(n) { return n ~ /^[0-9]+$/ }
' >> final
#odbdata
gzcat *odbdata-dsa${file_line}* | awk '
BEGIN { FS = "," } ;
{print $1",6,"NR","$0;}
' >> final
ls *mainFile* |cut -c1-8 | sort | read data
#sort -t "," +0 -2 -n final > final2
sort -t ',' -k1,1n -k2,2n -k3,3n final > final2
#sort final > final2
rm final
rm temp
gzip final2
mv final2.gz ${data}-final-dsa${file_line}.csv.gz
done < ${file_name}
My problems:
- when the number of lines in a file exceeds a few million, "NR" is printed in scientific notation instead of as a plain number, so I cannot sort on it and cannot guarantee the line order;
- the server has a very heavy I/O load, so I would like to do the whole process in memory only (there are idle processors and free memory);
- can I feed the output of several gzcat commands into a single awk script, or is that not possible?
- can I use a pipe to send the previous result to the next command without writing to the "final" file?
- when it reaches the sort command, I/O usage goes from 30% to 100% while memory usage stays the same. Why?
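For the first problem, my guess is that awk converts NR to text with its default number format when it is concatenated into the print string. Below is a minimal sketch of what I could try instead, assuming an explicit printf "%d" avoids that conversion. The helper name number_lines and the inline test data are made up for illustration and are not part of my script:

```shell
# Hypothetical helper: prefix each CSV line with its key, a file-type code
# and the line number. printf "%d" is used so that NR is always written as
# a plain integer, never as something like 1e+06.
number_lines() {
    awk -F, -v tag="$1" '
        $1 != "" { key = $1 }    # carry the last non-empty key forward
        { printf "%s,%d,%d,%s\n", key, tag, NR, $0 }
    '
}

# Small demonstration with inline data instead of the gzcat output.
printf 'A1,x\n,y\nB2,z\n' | number_lines 1
```

In the real script the call would look like "gzcat *mainFile-dsa${file_line}* | number_lines 1", with the codes 1 to 6 for the six file types.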
Can someone help me with any of these questions?
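For the two pipe questions, this is the kind of thing I have in mind, as a sketch only. It assumes that grouping the commands in { } sends their combined output down a single pipe, and that "gzip -dc" behaves like gzcat. The helper tag_lines, the temporary directory (created with mktemp -d, which may not exist on every system) and the tiny demo files are hypothetical stand-ins for the real data:

```shell
# Hypothetical helper: key, file-type code, line number, original line.
tag_lines() {
    awk -F, -v tag="$1" '
        $1 != "" { key = $1 }    # carry the last non-empty key forward
        { printf "%s,%d,%d,%s\n", key, tag, NR, $0 }
    '
}

# Demo data: two small gzipped files standing in for the real ones.
tmpdir=$(mktemp -d)
printf '2,b\n1,a\n' | gzip > "$tmpdir/mainFile-dsa01.gz"
printf '1,c\n2,d\n' | gzip > "$tmpdir/line-dsa01.gz"

# Group the per-file pipelines so that their combined output goes straight
# into sort and gzip; the script itself writes no intermediate "final" file.
{
    gzip -dc "$tmpdir"/*mainFile-dsa01* | tag_lines 1
    gzip -dc "$tmpdir"/*line-dsa01*     | tag_lines 2
} | sort -t, -k1,1n -k2,2n -k3,3n | gzip > "$tmpdir/final-dsa01.csv.gz"

# Show the sorted result, then clean up the demo files.
result=$(gzip -dc "$tmpdir/final-dsa01.csv.gz")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

As far as I understand, sort may still spill to its own temporary files when the input does not fit in its memory buffer, so this would only remove the writes to "final" and "final2", not necessarily all of the disk activity.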
It is getting really hard for a newbie like me to solve these problems: a process that should take one day is taking five days, and I am looking for solutions in areas that I don't really understand yet.
Best regards,
Ricardo Tomás