Script performance problem. Urgent, friends

Hi friends,

I have a flat file with data that I am loading into an Oracle table. While loading, rejected records are captured in a log file. Now I want to read the log file and get all the rejected records along with the reason for each rejection.
I developed a script that finds 5000 rejected records and writes them to an output flat file, but it takes 2 hours to read the log and populate the output file.

I have attached my script and a sample input file with 2 rejected records. Please help.

Does your outer while loop ever finish? You never actually increment $i (or decrement $nr_of_occ).

Yes, I am incrementing it with let i=$i+1. I forgot to copy that line.

Please tell me what the problem is and what will resolve it.

This kind of subject line is not useful at all.

The two commands are not needed; this can be done with just awk.
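
The commands this reply pointed at did not survive in the thread. Assuming it means the wc -l piped into awk pattern that the script uses to count lines, a minimal sketch of the single-awk equivalent could look like this (sample.log is a hypothetical file created only for the demonstration):

```shell
# Create a small hypothetical sample file for the demonstration.
printf 'line one\nline two\nline three\n' > sample.log

# Two processes: wc counts the lines, awk strips the file name from the wc output.
total_two_cmds=$(wc -l sample.log | awk '{print $1}')

# One process: awk counts the records itself and prints the total at the end.
total_awk_only=$(awk 'END { print NR }' sample.log)

echo "$total_two_cmds $total_awk_only"
```

Both variables should hold the same number; the second form simply starts one process fewer.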

Redirect the output to the file once, at the end of the while block.
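
As a rough illustration of that advice (the loop body and the file names out_slow.txt and out_fast.txt are made up for the example, not taken from the attached script), compare appending inside the loop with redirecting the whole loop:

```shell
# Slow pattern: out_slow.txt is opened and closed on every iteration.
rm -f out_slow.txt
i=1
while [ "$i" -le 5 ]
do
    echo "record $i" >> out_slow.txt
    i=$((i + 1))
done

# Faster pattern: the redirection is attached to the loop itself,
# so out_fast.txt is opened only once for all iterations.
i=1
while [ "$i" -le 5 ]
do
    echo "record $i"
    i=$((i + 1))
done > out_fast.txt
```

Both files end up with the same content; only the number of times the output file is opened differs.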

 
Hi matrixmadha,
Actually, that is also part of the requirement, but I haven't included all of those things. I am facing a problem only with writing out_file.txt; it takes a long time. :(
Please, I hope that as an experienced user you can resolve my problem.

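A solution with awk:

# ora.sh

awk -v Outfile=ora.txt '

# Error line: remember the message; the quoted parts hold schema and table.
/^ORA-/ {
   split($0, f, "\"");
   Target_schema = f[2];
   Target_table  = f[4];
   Ora_error     = $0;
}

# Start of a rejected record: count it and report the last error seen.
/Targ Rowid=/ {
   Rejected_count++;
   Get_detail = 1;
   printf "#%d => %s\n", Rejected_count, Ora_error;
   next;
}

# Closing parenthesis: the record is complete, write it to the output file.
/^\)/ && Get_detail {
   print Rejected_count, Values_detail > Outfile;
   Values_detail = "";
   Get_detail = 0;
   next;
}

# Column line inside a record: replace the "(type):" part with "=".
Get_detail {
   sub(/\(.*\):/, "=");
   Values_detail = Values_detail $0 "|"
}

END {
   print "Total Lines in Session Log", FILENAME, "is :", NR;
   print "Total Nr of Rejected Records Captured in Sesslog is :", Rejected_count + 0;
}
' ora.log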

The input file is ora.log and the result file is ora.txt.
The script takes about 36 seconds for 16384 rejected records (1884160 log lines):

$ grep -c 'Targ Rowid' ora.log
16384
$ time ./ora.sh | tail -10

real   35.86
user   34.63
sys    0.71
#16377 => ORA-2222: value too large for column "STN"."STN_SAP_MATERIAL_ATTR"."SMA_STD_DESCR" (actual: 19, maximum: 18)
#16378 => ORA-333: value too large for column "STN"."STN_SAP_MATERIAL_ATTR"."SMA_STD_DESCR" (actual: 19, maximum: 18)
#16379 => ORA-2222: value too large for column "STN"."STN_SAP_MATERIAL_ATTR"."SMA_STD_DESCR" (actual: 19, maximum: 18)
#16380 => ORA-333: value too large for column "STN"."STN_SAP_MATERIAL_ATTR"."SMA_STD_DESCR" (actual: 19, maximum: 18)
#16381 => ORA-2222: value too large for column "STN"."STN_SAP_MATERIAL_ATTR"."SMA_STD_DESCR" (actual: 19, maximum: 18)
#16382 => ORA-333: value too large for column "STN"."STN_SAP_MATERIAL_ATTR"."SMA_STD_DESCR" (actual: 19, maximum: 18)
#16383 => ORA-2222: value too large for column "STN"."STN_SAP_MATERIAL_ATTR"."SMA_STD_DESCR" (actual: 19, maximum: 18)
#16384 => ORA-333: value too large for column "STN"."STN_SAP_MATERIAL_ATTR"."SMA_STD_DESCR" (actual: 19, maximum: 18)
Total Lines in Session Log ora.log is : 1884160
Total Nr of Rejected Records Captured in Sesslog is : 16384
$ head -10 ora.txt | cut -c1-100
1 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE0222202"|SMA
2 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE1111"|SMA_MA
3 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE0222202"|SMA
4 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE1111"|SMA_MA
5 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE0222202"|SMA
6 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE1111"|SMA_MA
7 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE0222202"|SMA
8 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE1111"|SMA_MA
9 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE0222202"|SMA
10 SMA_FEED_RECEIVED_ID = "MASTER_MATERIAL_ATTR_222222"|SMA_SOURCE_SYSTEM_INSTANCE = "BRE1111"|SMA_M
$ 

jean-Pierre.
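
To see the flag technique from the script in isolation, here is a small sketch run on made-up data. The layout of demo.log is only an assumption inferred from the patterns in the script (the real sample log was an attachment and is not shown in the thread), and the names demo.log, demo.txt, in_block, and buf are hypothetical:

```shell
# Hypothetical sample; the real session log layout is an assumption.
cat > demo.log <<'EOF'
header line
Targ Rowid=1
COL_A (String): "x"
COL_B (String): "y"
)
noise between records
Targ Rowid=2
COL_A (String): "z"
)
EOF

awk '
# Start marker: raise the flag and count the record.
/Targ Rowid=/ { in_block = 1; n++; next }

# End marker: print the collected columns and lower the flag.
/^\)/ && in_block { print n, buf; buf = ""; in_block = 0; next }

# Inside a record: replace "(type):" with "=" and append the line.
in_block { sub(/\(.*\):/, "="); buf = buf $0 "|" }
' demo.log > demo.txt

cat demo.txt
```

Lines outside a record are skipped because no rule matches them while the flag is down, which is why a single pass over the log is enough.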

 
Dear Jean-Pierre,
 
Thanks for your reply. I will try the solution today.
Before that, can you please tell me what the problem is with the code that I developed?
What causes the performance problem, and how could it be fixed?
What is special about the awk utility?
 

You open files (for reading or writing) too many times:

total_lines=`wc -l $1 | awk '{print $1}'`

grep -n "ORA-" $1 |cut -d: -f1 >> ora_occ.txt
grep -n "Targ[[:space:]]Rowid" $1 | cut -d: -f1 >> targ_occ.txt
grep -n '^)$' $1 | cut -d: -f1 >> bracket_occ.txt

nr_of_occ=`wc -l targ_occ.txt | awk '{print $1}'`

   while [ "$i" -le "$nr_of_occ" ]
   do

      ora_line_nr=`sed -n \`echo $i\`p ora_occ.txt`
      targ_line_nr=`sed -n \`echo $i\`p  targ_occ.txt`
      bracket_line_nr=`sed -n \`echo $i\`p bracket_occ.txt`

      sed -n "$ora_line_nr p" $1 > ora_error_detail.txt

      while read ora_line
      do

      done <ora_error_detail.txt


      sed -n "$rec_start,$rec_end p " $1 > data_block.txt

      sed -e 's/(.*):/=/' data_block.txt |awk '{ line =line $0"|"} END { print line}' >>out_file.txt

   done

For an input file with 5000 rejected records (46 columns each), the files are opened this many times:

                                Open for  Open for
File                  Records       Read     Write
--------------------  -------   --------  --------
Input file                  ?       5004
ora_occ.txt              5000       5000
targ_occ.txt             5000       5000
bracket_occ.txt          5000       5000
ora_error_detail.txt        1       5000      5000
data_block.txt             46       5000      5000
--------------------  -------   --------  --------
Total                              30004     10000

The awk solution reads the input file only once and doesn't use any intermediate file.
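
Even if you stay with a shell loop, the three line-number files do not have to be reopened with sed on every iteration. A minimal sketch, assuming the three files hold one number per line and have the same number of lines (the numbers below are made up, and ranges.txt is a hypothetical output file):

```shell
# Hypothetical line-number files, standing in for the ones the script builds with grep -n.
printf '10\n60\n'  > ora_occ.txt
printf '12\n62\n'  > targ_occ.txt
printf '59\n109\n' > bracket_occ.txt

# paste joins the i-th lines of the three files; the loop reads them in one pass,
# so each file is opened once instead of once per rejected record.
paste ora_occ.txt targ_occ.txt bracket_occ.txt |
while read -r ora_line_nr targ_line_nr bracket_line_nr
do
    echo "error at line $ora_line_nr, data from line $targ_line_nr to $bracket_line_nr"
done > ranges.txt

cat ranges.txt
```

This removes only the 15000 opens of the three line-number files from the table above. The per-record sed runs over the input file would remain, so the single-pass awk script is still the better fix.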

Jean-Pierre.