How to improve a script?

Gents.

I have 2 different scripts for the same purpose:

raw2csv_1

Script raw2csv_1 finishes the process in less than 1 minute.

raw2csv_2

Script raw2csv_2 takes more than 6 minutes to finish the process.

Can you please check whether raw2csv_2 can be improved so that it finishes the job more quickly?

Attached.

Scripts
Input file
Output file generated.

Thanks for your support.

Most users will not download and extract your attached files:
1) Takes time and effort
2) Potentially damaging viruses

You may want to post the scripts, or sections that concern you, as pasted text instead.

script raw2csv_1

#!/bin/csh

cd /user/

#!/bin/csh -f
date
set filename = `echo $1 | sed -e 's/\./ /' | awk '{ print $1 }'`
awk ' RS = "# ===== " { if ( NR == 6) {print $0} }' $1 |\
awk ' RS = "]"  { if (NR >0) { print $0} }' | sed -e 's/\r//g' |\
sed -e 's/:/ /g' | grep -v Report | sed -e '/^ *$/d' |\
sed -e 's/\t/ /g' | sed -e 's/"//g' | grep -v "====" |\
grep -v "                       [0-9]" |\
grep -v "^    Live_Seis_Channels" |\
 awk '{ for (i=1;i<=1;i++) { printf ("%s ",  $i)}; printf ","}' > $filename"_1".tmp
  sed -e 's/ ===== (.*) =====/#-#-#-#/g' $1 | awk ' RS = "##-#-#-#" { if ( NR > 1 ) {print $0} }' |\
  awk ' RS = "]"  { print $0 }' |\
  sed -e 's/:/ /g' | grep -v Report |\
  grep -v Report |\
  grep -v "^    Live_Seis_Channels" |\
  grep -v "^ *          " |\
  sed -e '/^ *$/d' | sed -e 's/\t/ /g' |\
  sed -e 's/"//g' |\
  awk '{\
         for (i=2;i<=NF;i++) {\
                                                  printf ("%s ", $i)\
                             };\
                             printf ("%s",",");\
                             } END { printf "\n" } ' | sed -e 's/# (ms)//g' | sed -e 's/# (msec)//g' | sed -e 's/#//g' | sed -e 's/\r//g' > $filename"_2".tmp
echo ""

cat  $filename"_1".tmp  $filename"_2".tmp | sed -e  's/,===============  ,/\n/g'   > $filename.csv

date

rm -f "$filename"_1.tmp "$filename"_2.tmp

raw2csv_2

#!/bin/bash

cd /user/
date

filename=${1%.*}
sed 's/# ===== (.*) =====/#-#-#-#/g' $1 | \
awk '
NR>1 {
    gsub(/[\r\"\]]/,"")
    gsub(/[:\t]/," ")
    gsub(" *\n", "\n")
    gsub("\n *", "\n")
    gsub("\n#^[\n]*\n", "\n")
    gsub("\n\n+", "\n")
    printf "%s", $0
}' RS="#-#-#-#" | egrep -v '(====|Report|^[0-9]|^Live_Seis_Channels)' | \
awk -F"\n" -vRS="" 'NR>0{
  for(i=1;i<=NF;i++) {
     H=$i
     gsub(" .*","",H)
     gsub(H" *","",$i)
     gsub("# [(](ms|msec)[)]","",$i)
     gsub("#","",$i)
     V[NR]=V[NR]"  ,"$i
     if(NR==2)HD=HD" ,"H
  }
}
END{ print substr(HD,3)
  for(i=1;i<=NR;i++) print substr(V[i], 4)
}' > "${filename}.csv"
date

Try this; adapting/extending the header line will include the respective fields in your .csv file:

awk -F: 'BEGIN                  {HD="Version,Exploitation_Mode,Filter_Type,Aux_Nb_Trace,Seis_Nb_Trace,Total_Nb_Trace,Nb_Of_Dead_Seis_Channels,Nb_Of_Live_Seis_Channels,Dead_Seis_Channels"
                                 print HD
                                 HDCnt=split(HD,HDArr,",")
                                 NXTREC="Observer_Report" 
                                 HDCM=","HD","
                                }

                                {gsub (/[\t ]*|\*/, "", $1); sub (/^[\t ]*/, "", $2); sub (/[\t ]*$/,"", $2)}

         $1 == NXTREC && PR     {for (i=1; i<=HDCnt; i++) printf "%s,", RES[HDArr[i]]
                                 printf "%d\n", NR
                                 delete RES
                                }
         $1 == NXTREC           {PR=1}
         HDCM ~ "," $1 ","      {RES[$1]=$2}


         END                    {for (i=1; i<=HDCnt; i++) printf "%s,", RES[HDArr[i]]
                                 printf "\n"
                                }
        ' /tmp/342.raw

There's no error checking included nor the conversions like "msec" -> "" etc. that you have in your posted scripts.
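
If you want those conversions back, a small filter in front of (or behind) the awk script could do it. This is only a sketch modelled on the gsub calls in your raw2csv_2; the function name strip_units is made up for illustration, and I am assuming the markers always look like "# (ms)" or "# (msec)":

```shell
# Hypothetical helper (name is an assumption, not part of the posted scripts):
# removes the "# (ms)" / "# (msec)" unit markers and any leftover "#".
strip_units() {
    awk '{
        gsub(/# \((ms|msec)\)/, "")   # drop "# (ms)" and "# (msec)" markers
        gsub(/#/, "")                 # drop any remaining "#" characters
        sub(/[ \t]+$/, "")            # trim trailing blanks left behind
        print
    }'
}
```

It reads stdin and writes stdout, so it can sit anywhere in a pipeline, e.g. `strip_units < input > output`.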

---------- Post updated at 18:40 ---------- Previous update was at 15:48 ----------

I had to correct the script for data fields that contain ":" like the Date entry:

awk -F: 'BEGIN                  {HD="Version,Exploitation_Mode,Filter_Type,Date,Aux_Nb_Trace,Seis_Nb_Trace,Total_Nb_Trace,Nb_Of_Dead_Seis_Chan
                                 print HD
                                 HDCnt=split(HD,HDArr,",")
                                 NXTREC="Observer_Report"
                                 HDCM=","HD","
                                }

                                {gsub (/[\t ]*|\*/, "", $1)}

         $1 == NXTREC && PR     {for (i=1; i<=HDCnt; i++) printf "%s,", RES[HDArr[i]]
                                 printf "%d\n", NR
                                 delete RES
                                }
         $1 == NXTREC           {PR=1}
         HDCM ~ "," $1 ","      {T=$1; sub ($1 "[^:]*:[\t ]*", "", $0); sub (/[\t ]*$/, "", $0); RES[T]=$0}


         END                    {for (i=1; i<=HDCnt; i++) printf "%s,", RES[HDArr[i]]
                                 printf "\n"
                                }
        ' OFS=":" /tmp/342
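
The underlying idea is to split each line at the first ":" only, so that later colons stay in the value. Here is a stand-alone sketch of that idea using index() and substr() instead of the sub() call in the script; split_first_colon and the "|" output separator are made up for the illustration:

```shell
# Hypothetical helper: prints "key|value" for every "key: value" line on stdin,
# splitting at the FIRST colon only so values such as times keep their colons.
split_first_colon() {
    awk '{
        p = index($0, ":")                 # position of the first colon, 0 if none
        if (p == 0) next                   # skip lines without a colon
        key = substr($0, 1, p - 1)
        val = substr($0, p + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim blanks around the key
        gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim blanks around the value
        print key "|" val
    }'
}
```

Piping a few lines of the input file through it is a quick way to check that multi-colon fields come out whole.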

And, make sure you remove the DOS <CR> line separators!
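
One way to do that, as a minimal sketch: tr -d deletes every carriage return in the stream. The function name strip_cr and the output file name in the comment are assumptions for the example:

```shell
# Hypothetical helper: writes the named file to stdout with all DOS
# carriage returns (\r) removed.
strip_cr() {
    tr -d '\r' < "$1"
}

# Possible use before running the awk script (output name is an assumption):
# strip_cr /tmp/342.raw > /tmp/342.unix
```

Run it once on the input and point the awk script at the cleaned copy, so the <CR> does not end up inside the last field of each line.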

Hello RudiC,

Thanks, I will try it.