shell script takes long time to complete

Hi all,

I wrote this shell script to validate the number of fields in an input file, but it takes forever to validate a file. The average speed is about 9 minutes per MB.
  
 Can anyone tell me how to improve the performance of a shell script?

Thanks

It would be great if you could post sample input / output data and, if possible, your script.

matrixmadhan,

I don't have the script with me right now, but I can outline what it looks like.

    #starts with couple of constants for the file
    function1 ...
    function2 ...
    function3
    {
       function4
    }
    
    function4...
    
    while time < 00:00:00
    do 
       function1
       if [ $? -eq 0 ]
       then
           for loop 
           do 
             function2...
             function4...
             ./call_another_script
           done
       fi
     done

Will this help to determine the cause?

There could be something in those functions that is taking too much time, for example a cut, a grep, or an invocation of some other external tool. The script call_another_script could also be the culprit.

Unless you can show what those functions are, it is hard to pinpoint the exact cause.

Put some "date" commands in your script and you might be able to find out where the delay is and focus on that part.
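
A minimal sketch of that approach, assuming a `date` that supports `+%s` (seconds since the epoch, available in GNU and BSD `date`). `timed` is a hypothetical helper, not part of the original script; it runs a command, reports the elapsed seconds on stderr and passes the command's exit status through.

```shell
# Hypothetical helper: run a command and report how long it took.
timed() {
    label=$1
    shift
    start=$(date +%s)            # seconds since the epoch, before the command
    "$@"
    status=$?                    # remember the command's own exit status
    end=$(date +%s)
    echo "$label took $((end - start))s" >&2
    return $status
}

# Usage: wrap the suspect steps one at a time, e.g.
#   timed function1 function1
#   timed call_another_script ./call_another_script
```

With one-second resolution this only helps for coarse steps; wrapping whole functions or the inner loop is usually enough to see where the time goes.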

I would start by setting the debug flag in the shell. It might be obvious just from that which operation is taking the time. Without knowing exactly what you are doing in the functions, it is not really possible for anyone to answer.
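
As a sketch of what that could look like: `set -x` makes the shell print each command to stderr before running it, and in bash or ksh the `PS4` prefix can carry a timestamp so that slow steps stand out. `trace_run` is a hypothetical helper used only for illustration.

```shell
# Prefix every traced command with the wall-clock time
# (bash and ksh expand PS4 each time a command is traced).
PS4='+ $(date +%T) '

# Hypothetical helper: trace a single command, then switch tracing off again.
trace_run() {
    set -x
    "$@"
    { status=$?; set +x; } 2>/dev/null   # hide the trace of these two commands
    return $status
}

# Usage: trace_run validate_file "$FILE" 2> trace.log
```

Running the whole script with `sh -x script_name` gives the same trace without editing the script.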

The part that takes most of the time is the following code.

  
    function line_count
    {
        # one echo | awk pipeline per input line
        COUNT=`echo "$1" | awk -F\| '{print NF}'`
        if [ "$COUNT" != "$2" ]
        then
            error_log "File $FN: Validation failed at line $LINENUM. Expected $2, got $COUNT"
            return 5
        fi
    }

    function validate_line
    {
        if [ "$1" = "$FIRST_LEVEL_HEAD" ]
        then
            line_count "$2" $FIRST_LEVEL_COUNT
            return $?
        elif [ "$1" = "$SECOND_LEVEL_HEAD" ]
        then
            line_count "$2" $SECOND_LEVEL_COUNT
            return $?
        else
            error_log "File $FN: Line $LINENUM head is not recognised"
            return 5
        fi
    }

    function validate_file
    {
        trace_log "Start to validate $FN..."
        LINENUM=0
        ERROR=0
        while read LINE
        do
            # expr and a second echo | awk pipeline, again once per line
            LINENUM=`expr $LINENUM + 1`
            LINE_HEAD=`echo "$LINE" | awk -F\| '{print $1}'`
            validate_line "$LINE_HEAD" "$LINE"
            if [ ! $? -eq 0 ]
            then
                ERROR=1
            fi
        done < "$1"
        if [ ! $ERROR -eq 0 ]
        then
            return 7
        fi
    }

    validate_file "$FILE"

Any suggestions?
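
One likely cost in the code above is that every input line starts several extra processes (`expr`, plus `echo | awk` twice). A sketch of builtin replacements for two of them, assuming a POSIX-style shell such as ksh or bash; the variable names follow the script above and the two input lines are just sample data.

```shell
LINENUM=0
while IFS= read -r LINE
do
    LINENUM=$((LINENUM + 1))     # arithmetic expansion instead of expr
    LINE_HEAD=${LINE%%|*}        # text before the first "|", instead of echo | awk
    echo "$LINENUM: $LINE_HEAD"
done <<'EOF'
00|AA|BB|CC|DD|
01|EE|FF|GG|
EOF
```

Both expansions are handled inside the shell, so nothing is forked for them per line.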

What does the input file format look like?

The input file format will look like this:

    00|AA|BB|CC|DD|
    01|EE|FF|GG|
    02|HH|KK|LL|
    00|AA|BB|CC|DD|
    01|...
    02|...

All I need to do is validate that the number of fields on each line matches the expected number. So there won't be any output, just true / false.

Forgot to mention that all the fields in each line are separated by "|".

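    function validate_file
    {
        ERROR=0
        OIFS="$IFS"
        IFS="|"                          # split on "|" instead of whitespace
        while read -r LINE
        do
            set -- $LINE                 # fields become $1, $2, ... without any external process
            LINE_HEAD="$1"
            case $LINE_HEAD in
                "$FIRST_LEVEL_HEAD")  EXPECTED=$FIRST_LEVEL_COUNT ;;
                "$SECOND_LEVEL_HEAD") EXPECTED=$SECOND_LEVEL_COUNT ;;
                *)                       # line head not recognised
                    ERROR=1
                    break
                    ;;
            esac
            # $# counts the fields, head included. Unlike awk's NF,
            # a trailing "|" does not add an empty field here.
            if [ $# -ne "$EXPECTED" ]
            then
                ERROR=1
                break
            fi
        done < "$1"
        IFS="$OIFS"                      # restore the original separator
        if [ $ERROR -ne 0 ]
        then
            return 7
        fi
    }

    validate_file "$FILE"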

reborg,

Thanks for the reply, but it's a bit hard for me to follow the code. Could you give me a brief explanation?

thanks
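
The idea behind the function above, as a small sketch: with `IFS` set to `|` the shell itself splits a line into fields, and `set --` stores those fields in `$1`, `$2`, and so on, so `$#` is the field count and no `awk` or `expr` has to be started for each line. `count_fields` and `FIELD_COUNT` are hypothetical names used only for this illustration.

```shell
# Hypothetical helper: count the "|"-separated fields of a line using only
# shell builtins. The result is left in FIELD_COUNT.
count_fields() {
    old_ifs=$IFS
    IFS='|'
    set -f              # no filename expansion on fields such as "*"
    set -- $1           # unquoted on purpose: split the line on "|"
    set +f              # note: re-enables globbing even if the caller had it off
    IFS=$old_ifs
    FIELD_COUNT=$#
}

count_fields '00|AA|BB|CC|DD|'
echo "$FIELD_COUNT"     # prints 5: the trailing "|" ends the last field
count_fields '01|EE|FF|GG|'
echo "$FIELD_COUNT"     # prints 4
```

The result goes into a variable rather than being printed so that the caller does not need a command substitution, which would start another process per line.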

Another approach is to do it all in awk, e.g...

    awk -F\| '($1=="00" && NF!=5) || ($1=="01" && NF!=4) {exit 7}' file1

Which could be read as:
IF the first field is equal to "00" and the number of fields is not equal to 5
OR the first field is equal to "01" and the number of fields is not equal to 4
THEN stop scanning the file and exit with error level 7.
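
A sketch of how that one-liner might be wrapped for the script in this thread, assuming the line heads and expected counts are passed in as arguments. Note that `awk` counts an empty field after a trailing `|`, so a line such as `00|AA|BB|CC|DD|` has `NF` equal to 6. The function name `validate_fields` and the counts in the usage line are illustrative assumptions.

```shell
# Hypothetical wrapper: exit status 0 if every line has the expected number of
# fields for its head, 7 otherwise. Unrecognised heads also count as failures.
# Arguments: file head1 count1 head2 count2
validate_fields() {
    awk -F'|' -v h1="$2" -v n1="$3" -v h2="$4" -v n2="$5" '
        # appending "" forces a string comparison, so "00" and "0" stay distinct
        ($1 "") == (h1 "") { if (NF != n1) exit 7; next }
        ($1 "") == (h2 "") { if (NF != n2) exit 7; next }
        { exit 7 }                       # line head not recognised
    ' "$1"
}

# Usage (counts include the empty field after the trailing "|"):
#   validate_fields input.txt 00 6 01 5 && echo valid
```

The whole file is read by a single `awk` process, which is where most of the speed-up over a per-line shell loop comes from.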