Performance Tuning

Hi All,

Over the last week I have posted many questions on this portal, and at last I have succeeded in writing my first Unix script.

The following are the two points where my script is taking too long.

  1. Print the total number of records, excluding the header and footer. I found that awk 'END{print NR - 2}' <filename> gives the exact result, but my file has 520 columns separated by ^~^, and with that command awk fails with a 'record too long' error. So I used this instead:

a=`cut -d~ -f1 NAM2008101601.OUT | awk 'END{print NR - 2}'`

But this is taking too long.
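One thing worth trying: it is awk's field splitting that hits the record limit, and the cut pre-pass reads the whole wide file anyway. wc -l counts newlines without ever splitting fields, so it should be both safe and much faster. A minimal sketch (using a small stand-in file in place of NAM2008101601.OUT):

```shell
# Stand-in sample file: header, three data rows, footer. In the real
# case this is NAM2008101601.OUT with 520 ^~^-separated columns.
printf 'HDR\nrow1\nrow2\nrow3\nTRL\n' > sample.OUT

# wc -l counts newlines and never splits fields, so wide records
# cannot trigger awk's "record too long" error.
total=$(wc -l < sample.OUT)
count=$((total - 2))   # drop header and footer
echo "$count"
```

Reading from stdin (`wc -l < file`) avoids the filename appearing in wc's output, so no extra parsing is needed.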

  2. I need the sum of an amount column. So first I cut that column out into a file, then calculated the sum over it. But this is also taking a long time.

cut -d~ -f27 $FILE_NAME | cut -c2-23 > amount
tot_val=`awk '{a+=$0}END{printf "%.5f\n",a}' amount`
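Both the record count and the sum can come from a single awk pass, with no temporary file. gawk and mawk treat a multi-character FS as a regular expression, so '\^~\^' splits on the literal ^~^ delimiter directly (very old awks may not support this). A sketch, with a small fabricated file standing in for the real one, the amount in field 27, and substr() replacing the cut -c2-23 step:

```shell
# Stand-in data: the real file has 520 ^~^-separated columns; here we
# fake just 30, with the amount in field 27.
mkline() {
  # hypothetical helper: 26 filler fields, then one pad character plus
  # the amount (mirroring the cut -c2-23 step), then 3 more fields
  local amt=$1 line="x" i
  for i in $(seq 2 26); do line="$line^~^x"; done
  line="$line^~^ $amt"
  for i in $(seq 28 30); do line="$line^~^x"; done
  printf '%s\n' "$line"
}
{ echo HEADER; mkline 10.5; mkline 20.25; mkline 30.0; echo FOOTER; } > sample.OUT

# One pass gives both answers. A non-numeric field 27 on the header and
# footer lines adds 0 to the sum, matching the original cut | awk pipeline.
out=$(awk -F'\\^~\\^' '{ sum += substr($27, 2, 22) }
                       END { printf "%d %.5f\n", NR - 2, sum }' sample.OUT)
echo "$out"
```

This reads the file once instead of three times (cut, cut, awk) and skips the intermediate amount file entirely.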

I would appreciate it if someone can suggest alternatives that speed up this process.

Thanks
Amit

For point 1, since you are assuming there will always be a header and a footer, I am not sure why you need to read the whole file NAM2008101601.OUT to get the record count.

Can you try using the file size and record length to get the total record count, then subtract 2 to get the data record count?

1) Take the file size from ls -l NAM2008101601.OUT
2) Get the record length; if it is not known, use wc -c on the first line (note this method assumes every record has the same length)
3) Total records = file size / record length
4) Subtract 2 from step 3 to exclude the header/trailer records
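If the records really are fixed-length, the arithmetic above might look like this; the sketch assumes the header and footer share the data records' length, and uses wc -c in place of parsing ls -l output:

```shell
# Stand-in fixed-length file: 5 records of 5 bytes each, newline
# included. Assumes header and footer have the same record length
# as the data rows.
printf 'AAAA\nBBBB\nCCCC\nDDDD\nEEEE\n' > fixed.OUT

size=$(wc -c < fixed.OUT)               # total bytes in the file
reclen=$(head -n 1 fixed.OUT | wc -c)   # bytes per record, newline included
data=$(( size / reclen - 2 ))           # drop header and footer
echo "$data"
```

This touches only the first line of the file, so it runs in near-constant time regardless of file size.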

Hope this might work.

What version of awk are you using? GNU awk (gawk) has no predefined field count or record length limit.
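A quick way to check whether your awk copes with such wide records is to build a 520-field line with the same ^~^ delimiter and ask for NF; gawk and mawk should report 520 without complaint:

```shell
# Build one line with 520 fields joined by the literal ^~^ delimiter,
# then ask awk how many fields it sees.
line=$(seq 520 | sed 's/$/^~^/' | tr -d '\n' | sed 's/\^~\^$//')
nf=$(printf '%s\n' "$line" | awk -F'\\^~\\^' '{ print NF }')
echo "$nf"
```

If this prints 520, the field limit is not the problem on your system and the original awk 'END{print NR - 2}' should work directly on the file.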