How to count a word in a file

Dear all,

I have a requirement to count the errors in a file and display the counts.
E.g. file1.txt:

sjdgfjdgfgd ora-0001 sdjgfydh sdukgh7 23
sjdgfjdgfgd ora-0002 sdjgfydhsf34 ew 34v
sjdgfjdgfgd ora-0008 sdjgfydh asdf asdfas 
sjdgfjdgfgd ora-0001 sdjgfydhjkbs ui873
sjdgfjdgfgd ora-0004 sdjgfydh 2876gfen 
sjdgfjdgfgd ora-0002 sdjgfydhj uewiuriue 324987 

The output should be:

Error Code : ORA-0001  Count : 2
Error Code : ORA-0002  Count : 2
Error Code : ORA-0004  Count : 1
Error Code : ORA-0008  Count : 1
 

I wrote the program below and it is working fine. I would like to know whether there is a simpler way to write it. I am new to Unix, so I am not sure of other ways.
Thanks in advance.

#!/bin/sh
echo "Enter filename..."
read name
cd /test/unix
cat $name | while read line
do
echo "$line" > tmpj
cat "tmpj" | egrep -c ora- > tmpk
if [ `cat tmpk` -gt 0 ]
then 
cat tmpj | sed 's/.*\(ora-.....\).*/\1/' >> tmpl
fi
done
rm tmpj
rm tmpk
for var1 in `cat tmpl`
do
echo "$var1" > tmpj
cat tmpl | egrep -c `cat tmpj` > tmpk
if [ `cat tmpk` -gt 0 ]
then
echo "Error Code : "$var1"  Count : `cat tmpk`"
sed "/$var1/d" tmpl > tmpm
mv tmpm tmpl
fi
done
rm tmpj
rm tmpk
rm tmpl

nawk '
{
   a[$2]++    # count each distinct value of the 2nd field
}
END {
   # print every error code together with its count
   for (i in a)
     print "Error Code : " i " Count : " a[i]
}
' file1.txt

---------- Post updated at 11:12 AM ---------- Previous update was at 11:10 AM ----------

To keep the forums high quality for all users, please take the time to format your posts correctly.

First of all, use Code Tags when you post any code or data samples so others can easily read your code. You can easily do this by highlighting your code and then clicking on the # in the editing menu. (You can also type the code tags [code] and [/code] by hand.)

Second, avoid adding color or different fonts and font size to your posts. Selective use of color to highlight a single word or phrase can be useful at times, but using color, in general, makes the forums harder to read, especially bright colors like red.

Third, be careful when you cut and paste: edit out any odd characters and make sure all links are working properly.

Thank You.

The UNIX and Linux Forums

Hi gerh99,

Thanks, excellent code.
The file I have is an error message file, and the ora errors are not always in the 2nd column. An ora error may appear anywhere in the line, or may not appear at all.
Apologies for the inconvenience.
E.g.:

sjdgfjdgfgd sdjgfydh sdukgh7 23 ora-0001 
sjdgfj dgf g  d ora-0002 sdjgfydhsf34 ew 34v
sjdg fjdgf gd ora-0008 sdjgfydh asdf asdfas 
sjdgfj dgf gd ora-0001 sdjgfydhjkbs ui873
sjdgfjdg fgd sdjgfydh 2876gfen 
sj dgfjd gfgd ora-0002 sdjgfydhj uewiuriue 324987

nawk '
/ora-[0-9]/ {
    # scan the fields and count the first one that contains an ora code
    for(i=1;i<=NF;i++)
       if ( $i ~ /ora-[0-9]/ ) {a[$i]++;break}
}
END {
   for (i in a)
     print "Error Code : " i " Count : " a[i]
}
' file1.txt

Hi Gersh99,

Thank you again for the prompt reply. It is working fine and is much faster than my code (which I posted in the 1st post).
Nawk is new to me; I searched the beginners' Unix book (by Wrox) but found little on it.

Please correct me if I'm wrong:
The code is taking whitespace as the field separator.
If the file looks like this:

sjhgfjhgdfs ora-0001 kjhsf 098j 97h
suiy23vb jhf8 ora-0001
jkhsdkj 98798 error:ora-0001 uif987

then I am getting the output not as ora-0001 : 3 but as

ora-0001 : 2
error:ora-0001 : 1

Is there any way I can extract only the words starting with ora-?
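
To see why the third line is counted under a separate key: by default awk splits each line on runs of blanks and tabs, so `error:ora-0001` is a single field and is used as the array key unchanged. A small sketch, assuming plain `awk` (which should behave like `nawk` for this); the variable names are only for illustration:

```shell
# Print every field that contains an ora code, exactly as awk sees it.
line='jkhsdkj 98798 error:ora-0001 uif987'
field=$(printf '%s\n' "$line" |
  awk '{ for (i = 1; i <= NF; i++) if ($i ~ /ora-[0-9]/) print $i }')
echo "$field"    # prints: error:ora-0001
```

The prefix `error:` is part of the field, so the key differs from a bare `ora-0001`.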

Please use CODE tags; I am having difficulty reading your sample input.

nawk '
/ora-[0-9]/ {
    for(i=1;i<=NF;i++)
       # use the part of the field from "ora-" onwards as the key
       if ( $i ~ /ora-[0-9]/ ) { a[substr($i,index($i,"ora-"))]++; break}
}
END {
   for (i in a)
     print "Error Code : " i " Count : " a[i]
}
' file1.txt

Hi vergs99,

Sorry for coming back to you, but can you please check with the file below?
The ':' sign at the end of the ora error is getting picked up.

Input File Name: 17419_1763520347_20060718042116.sms
Info - OrderId:1763520347 - timestamp:1153189280
Error 000 - Messages from oracle:ora-00001: unique constraint (MOBILE.PK_MAILORDER) violated
Error 000 - Error Internal :9001 (Oracle error): ora-00001
Info - File /usr/mobileway/router/error/default/20060718042120-1763520347-0-0-17419-0-9001.err written
Input File Name: 20060718040944-1693906198-0-0-14502-0-9098.sms
Info - OrderId:1693906198 - timestamp:1153189744
Error 000 - Messages from oracle:ora-01438: value larger than specified precision allows for this colu
Error 000 - Error Internal :9098 (Oracle error): 1438
Info - File /usr/mobileway/router/error/resendable/20060718042904-1693906198-0-0-14502-0-9098.err written
iInput File Name: 17419_1763520347_20060718042116.sms
Info - OrderId:1763520347 - timestamp:1153189280
Error 000 - Messages from oracle:ora-00001: unique constraint (MOBILE.PK_MAILORDER) violated
Error 000 - Error Internal :9001 (Oracle error): ora-00001
Info - File /usr/mobileway/router/error/default/20060718042120-1763520347-0-0-17419-0-9001.err written
Input File Name: 20060718040944-1693906198-0-0-14502-0-9098.sms
Info - OrderId:1693906198 - timestamp:1153189744
Error 000 - Messages from oracle:ora-02935: value larger than specified precision allows for this colu
Error 000 - Error Internal :9098 (Oracle error): 1438
Info - File /usr/mobileway/router/error/resendable/20060718042904-1693906198-0-0-14502-0-9098.err written
Input File Name: 17419_1763520347_20060718042116.sms
Info - OrderId:1763520347 - timestamp:1153189280
Error 000 - Messages from oracle:ora-00002: unique constraint (MOBILE.PK_MAILORDER) violated
Error 000 - Error Internal :9001 (Oracle error): ora-00001
Info - File /usr/mobileway/router/error/default/20060718042120-1763520347-0-0-17419-0-9001.err written
Input File Name: 20060718040944-1693906198-0-0-14502-0-9098.sms
Info - OrderId:1693906198 - timestamp:1153189744
Error 000 - Messages from oracle:ora-02935: value larger than specified precision allows for this colu
Error 000 - Error Internal :9098 (Oracle error): 1438
Info - File /usr/mobileway/router/error/resendable/20060718042904-1693906198-0-0-14502-0-9098.err written
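
The trailing ':' can be reproduced in isolation. With two arguments, `substr()` returns everything from the given position to the end of the string, so taking a field from `index($i,"ora-")` onwards keeps whatever follows the digits. A minimal sketch using plain `awk` and one field taken from the sample above (the variable names are only for illustration):

```shell
# One whitespace-delimited field as it appears in the log lines above.
field='oracle:ora-00001:'
# substr(s, start) with no length keeps the rest of the string, colon included.
key=$(printf '%s\n' "$field" | awk '{ print substr($1, index($1, "ora-")) }')
echo "$key"    # prints: ora-00001:
```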

---------- Post updated at 07:51 AM ---------- Previous update was at 07:47 AM ----------

nawk '
/ora-[0-9]/ {
    for(i=1;i<=NF;i++)
       # match() sets RSTART and RLENGTH, so only "ora-" plus its digits is used as the key
       if ( match($i,/ora-[0-9]+/)) { a[substr($i,RSTART,RLENGTH)]++; break}
}
END {
   for (i in a)
     print "Error Code : " i " Count : " a[i]
}
' file1.txt
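
The expected output in the first post shows the codes in upper case and in sorted order, whereas the order in which `for (i in a)` visits the keys is unspecified. A possible variation, assuming plain `awk` and `sort` are available: it matches against the whole line (`$0`) instead of looping over the fields, which likewise counts only the first code on each line, upper-cases the key with `toupper()`, and sorts the report. The three input lines are made up for the example.

```shell
report=$(printf '%s\n' \
  'x oracle:ora-00002: a' \
  'x (Oracle error): ora-00001' \
  'x oracle:ora-00001: b' |
awk '
# match() is used as the pattern: the action runs only on lines with a code
match($0, /ora-[0-9]+/) {
    a[toupper(substr($0, RSTART, RLENGTH))]++
}
END {
    for (i in a)
        print "Error Code : " i "  Count : " a[i]
}' | sort)
echo "$report"
# prints:
# Error Code : ORA-00001  Count : 2
# Error Code : ORA-00002  Count : 1
```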

Excellent gersh99, you are brilliant. It's working fine and working fast as well.
Can you please explain your code for the benefit of users like me, or please suggest some books where I can gain some theoretical knowledge about shell programming using nawk?

---------- Post updated at 05:48 PM ---------- Previous update was at 05:33 PM ----------

Sorry guys for not using code tags; I will make sure to use them next time.
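
On the request for an explanation: below is an annotated sketch of the same approach as the last nawk script, not the original poster's exact code. It assumes plain `awk` (use `nawk` where that is the POSIX-style awk), a hypothetical file name `sample.log` holding a few lines from the sample above, and more descriptive names (`count`, `code`) than the original `a` and `i`. The output is piped through `sort` only to make the order predictable.

```shell
# Hypothetical sample file; any log containing ora-NNNNN codes will do.
cat > sample.log <<'EOF'
Error 000 - Messages from oracle:ora-00001: unique constraint (MOBILE.PK_MAILORDER) violated
Error 000 - Error Internal :9001 (Oracle error): ora-00001
Info - OrderId:1693906198 - timestamp:1153189744
Error 000 - Messages from oracle:ora-01438: value larger than specified precision allows for this colu
EOF

result=$(awk '
# The pattern before the brace selects only lines containing "ora-" and a digit.
/ora-[0-9]/ {
    # NF is the number of whitespace-separated fields on the current line.
    for (i = 1; i <= NF; i++)
        # match() returns non-zero on success and sets RSTART (start position)
        # and RLENGTH (length) of the matched text inside the field.
        if (match($i, /ora-[0-9]+/)) {
            # The matched text alone is the array key; add one to its counter.
            count[substr($i, RSTART, RLENGTH)]++
            break    # count each line once, at its first error code
        }
}
# END runs once, after the last line has been read.
END {
    for (code in count)
        print "Error Code : " code " Count : " count[code]
}
' sample.log | sort)
echo "$result"
rm -f sample.log
# prints:
# Error Code : ora-00001 Count : 2
# Error Code : ora-01438 Count : 1
```

The array `count` is an associative array: its keys are the error-code strings, and incrementing a key that does not exist yet creates it starting from zero.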