UNIX shell script to search a string in a file

Hi folks,

I am new to shell scripting and hope somebody can help me write a script.
My requirements are in the steps below.

  1. I have an Apache access.log located in /var/log/httpd/
    Ex.
127.0.0.1 - - [03/Jun/2014:11:50:15 +0530] "GET /common/support/resources/faqs.jsp?category=All&vfurl=%2FknowledgeSearch&t=All&c=All&k=.Audio\x99+310 HTTP/1.0" 404 1048
127.0.0.1 - - [03/Jun/2014:11:50:41 +0530] "GET /configurator/ HTTP/1.0" 404 988
127.0.0.1 - - [03/Jun/2014:11:51:06 +0530] "GET /common/support/resources/faqs.jsp?category=All&vfurl=%2FknowledgeSearch&t=All&c=All&k=.Audio+655+DSP HTTP/1.0" 404 1048
127.0.0.1 - - [03/Jun/2014:11:51:11 +0530] "GET /uk/solutions/home-based-worker/work-shifting.pdf HTTP/1.0" 404 1093
127.0.0.1 - - [03/Jun/2014:11:51:25 +0530] "GET /common/support/resources/faqs.jsp?category=All&vfurl=%2FknowledgeSearch&t=All&c=All&k=ML18 HTTP/1.0" 404 1048
127.0.0.1 - - [03/Jun/2014:11:52:06 +0530] "GET /pl/solutions/successful-collaboration/ HTTP/1.0" 404 1063
  2. The first path component after GET is what we treat as the name, e.g. /common, /pl, etc.
    If that name matches, the HTTP status code (e.g. 404 or 200) also needs to be checked.

For example, lines that match common and 404 need to be pushed to one file, which we call error.log.
Lines that do not meet that condition all go to another file.

Similarly, /pl/, /uk/ and /configurator/ should each go to different files according to their respective cname folders.

I would really appreciate any help.

Regards

I couldn't understand the requirement clearly.
Can you please be specific about what needs to go where, and what 'cname' is?

Hi Srini
Thanks for your prompt response

We generally call the first path component after GET in the access log the cname.
Ex: GET /common/

Simply put, when both the cname (e.g. /common/) and the status code (e.g. 404) match, the line needs to be sent to one file; the remaining lines go to another file.

Let me know if you need any more details.

Regards,
Reddy.

So you are going to end up with numerous error files in the form of:
{cname}_{statuscode}.err

Such as:
common_404.err

Is that what you mean?

---------- Post updated at 05:56 AM ---------- Previous update was at 05:44 AM ----------

If that is what you want, you can try the below code. It will read each line from your log file and insert it into its own error log file accordingly. It will build these error log files in the same directory as where you run the script.

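#!/bin/ksh
# Append each log line to {cname}_{statuscode}.err in the current directory
while read -r line
do
    outfile=$(printf '%s\n' "${line}" | awk -F"[/ ]" '{print $10"_"$(NF-1)".err"}')
    printf '%s\n' "${line}" >> "${outfile}"
done < logfile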

Hi ,

Thanks, it seems to be working, but:
1) I need all lines with status codes 200 to 399 sent to the access log
2) I need all lines with status codes 400 to 599 sent to the error log

Please help me with this.

Regards,
Reddy

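So if the input file is considered as space-separated, do you want each message sent to a file based on the status code? That is column 9 once the line is split on spaces, because "GET, the path and HTTP/1.0" each count as a column (the codes are all 404 in your example).

You could use less processing with:-

#!/bin/ksh
# p is the protocol (column 8), h is the status code (column 9)
while read -r a b c d e f g p h rest
do
   printf '%s\n' "$a $b $c $d $e $f $g $p $h $rest" >> "/path/to/file-$h"
done < logfile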

If you want to base it on the first part of column 7 (the pl, uk or common bit), you need another tweak to split up the column:-

#!/bin/ksh
while read -r a b c d e f g h rest
do
   ref="${g#/}"          # Trim off leading /
   ref="${ref%%/*}"      # Trim off everything after the first /
   printf '%s\n' "$a $b $c $d $e $f $g $h $rest" >> "/path/to/file-$ref"
done < logfile

Of course, you could use both if you wish, so your output file will become /path/to/file-$ref-$h or whatever.

If the status code (column 9 when split on spaces) holds some values that are errors for one file and others that are just for logging, you could:-

#!/bin/ksh
# p is the protocol (column 8), h is the status code (column 9)
while read -r a b c d e f g p h rest
do
   case $h in
      2??|3??) type=valid ;;
      4??|5??) type=error ;;
      *)       type=other ;;
   esac
   printf '%s\n' "$a $b $c $d $e $f $g $p $h $rest" >> "/path/to/file-$type"
done < logfile

.... or again some combination with other suggestions to build your output name.
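
One such combination, as a minimal sketch rather than a tested solution: it uses only shell built-ins, so no extra process is spawned per record. It assumes the status code is the ninth space-separated field (as in the sample lines) and that requests are path-style, like /common/... It is written in POSIX sh, so it should also run under ksh. The function name split_log, its two arguments and the fallback name "root" are made up for illustration.

```shell
#!/bin/sh
# Sketch only: split an access log by first path component and status class.
# Usage (hypothetical names): split_log LOGFILE OUTDIR
split_log() {
    logfile=$1
    outdir=$2
    # g = request path (column 7), p = protocol (column 8), h = status (column 9)
    while read -r a b c d e f g p h rest
    do
        ref="${g#/}"          # Trim off leading /
        ref="${ref%%/*}"      # Trim off everything after the first /
        ref="${ref:-root}"    # Assumed fallback name for requests to "/"
        case $h in
            2??|3??) type=access ;;
            4??|5??) type=error ;;
            *)       type=other ;;
        esac
        printf '%s\n' "$a $b $c $d $e $f $g $p $h $rest" >> "$outdir/${ref}_${type}.log"
    done < "$logfile"
}
```

Calling split_log logfile . would then write files such as ./common_error.log and ./pl_access.log into the current directory.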

I'm not sure I've understood the question, but these are a few options for what I think you are after.

If I've got it wrong, can you post some input and the expected output with the relevant file names you want to generate and I will have another go.

Robin

#!/bin/ksh
while read -r line
do
        cname=$(printf '%s\n' "${line}" | awk -F"[/ ]" '{print $10}')
        scode=$(printf '%s\n' "${line}" | awk -F"[/ ]" '{print $(NF-1)}')
        [[ ( ${scode} -ge 200 ) && ( ${scode} -le 399 ) ]] && {
                printf '%s\n' "${line}" >> "${cname}_access.log"
                }
        [[ ( ${scode} -ge 400 ) && ( ${scode} -le 599 ) ]] && {
                printf '%s\n' "${line}" >> "${cname}_error.log"
                }
done < logfile

Like that?

PLEASE BE AWARE:
As rbatte1 has stated, this code may become quite slow on large files, because it spawns several processes for every line.

Dear pilnet101,

I'd worry that your code will be extremely heavy on processing when confronted with a large input file. You will spawn processes for every record. My way will not, and should run faster. Of course, a single awk to process everything would be better still, but I do not have that knowledge.

Robin

Thanks for the feedback rbatte1.

I do agree, parameter expansion is much more efficient. I have updated my last comment to advise OP accordingly.

Thank you very much, pilnet.
The script is working as expected,
but I want to keep all access and error logs in their respective cname folders.
Could you please suggest how to achieve this?

Regards,
Reddy.

What determines the cname? Can you supply your expected output and the required file names from this or a slightly larger sample?

Robin
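
If "cname folders" means one directory per first path component, a minimal sketch could look like the one below. This is a guess at the requirement, not a confirmed answer: the layout OUTROOT/cname/access.log and OUTROOT/cname/error.log, the function name split_log_dirs and the fallback name "root" are all assumptions. It again takes the status code from the ninth space-separated field and only calls mkdir when a directory is missing, so it does not spawn a process for every record.

```shell
#!/bin/sh
# Sketch only: one directory per cname, each holding access.log and error.log.
# Usage (hypothetical names): split_log_dirs LOGFILE OUTROOT
split_log_dirs() {
    logfile=$1
    outroot=$2
    # g = request path (column 7), p = protocol (column 8), h = status (column 9)
    while read -r a b c d e f g p h rest
    do
        ref="${g#/}"          # Trim off leading /
        ref="${ref%%/*}"      # Trim off everything after the first /
        ref="${ref:-root}"    # Assumed fallback name for requests to "/"
        case $h in
            2??|3??) type=access ;;
            4??|5??) type=error ;;
            *)       continue ;;     # Skip anything that is not 2xx-5xx
        esac
        dir="$outroot/$ref"
        [ -d "$dir" ] || mkdir -p "$dir"   # Create the folder on first use only
        printf '%s\n' "$a $b $c $d $e $f $g $p $h $rest" >> "$dir/$type.log"
    done < "$logfile"
}
```

For the first sample, split_log_dirs logfile /path/to/out would create folders such as /path/to/out/common and /path/to/out/pl, each with an error.log inside.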

Hi,
Thanks for the script, but it is not working with the log below.

66.249.75.49 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/us/support/software-downloads/download.jsp HTTP/1.1" 200 3956 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"
127.0.0.1 - - [15/May/2014:00:12:02 +0000] "GET http://abc.def.com/80DF9D/plantronics/images/icons/forms/icon-field-required.gif HTTP/1.1" 200 486 "-" "Serf/1.1.0 mod_pagespeed/1.6.29.7-6576" "-"
127.0.0.1 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/css/global/forms.css HTTP/1.1" 200 8183 "-" "Serf/1.1.0 mod_pagespeed/1.6.29.7-6576" "-"
127.0.0.1 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/js/validation/validationFormDownloads.js HTTP/1.1" 200 1126 "-" "Serf/1.1.0 mod_pagespeed/1.6.29.7-6576" "-"
66.249.75.49 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/us/support/software-downloads/download.jsp HTTP/1.1" 200 3561 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"

Please help me with this.

The script is below:

#!/bin/ksh
while read -r line
do
cname=$(echo ${line} | awk -F"[ ]" '{print $10}')
scode=$(echo ${line} | awk -F"[ ]" '{print $(NF-1)}')
[[ ( ${scode} -ge 200 ) && ( ${scode} -le 399 ) ]] && {
echo ${line} >> ${cname}_access.log
}
[[ ( ${scode} -ge 400 ) && ( ${scode} -le 599 ) ]] && {
echo ${line} >> ${cname}_error.log
}
done < /home/vizion/Desktop/adn_DF9D_20140515_0001.log

I am getting this error:

line 6: :: invalid character in expression - +http://www.google.com/bot.html)
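
A likely cause, sketched on the first sample line: this log has extra quoted fields after the byte count, so counting from the end with $(NF-1) no longer lands on the status code, and the [[ ... -ge ... ]] test then tries to evaluate a piece of the user agent as a number. Counting from the front, the status code is still field 9. The variable names below are just for illustration.

```shell
#!/bin/sh
# Sketch only: compare counting fields from the end with counting from the front.
line='66.249.75.49 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/us/support/software-downloads/download.jsp HTTP/1.1" 200 3956 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"'

# Second-to-last space-separated field: a fragment of the user agent
from_end=$(printf '%s\n' "$line" | awk '{print $(NF-1)}')

# Ninth space-separated field: the HTTP status code
status=$(printf '%s\n' "$line" | awk '{print $9}')

printf 'NF-1: %s\n' "$from_end"
printf '$9:   %s\n' "$status"
```

With this line, the first value is the +http://www.google.com/bot.html fragment seen in the error message, while the second is 200.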

Is this your expected output?

$ cat log.awk
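{
    # $7 is the request path or URL; split on "/" or "//" so that X[2] is
    # the first path component (or the host name for a full URL)
    split($7, X, /\/|\/\//)
    cname = X[2]
    # $9 is the HTTP status code
    scode = $9
    file = scode >= 200 && scode <= 399 ? cname "_access.log" : \
           scode >= 400 && scode <= 599 ? cname "_error.log"  : \
           cname "_other.log"
    # Truncate each output file the first time it is seen, append afterwards
    if (!(file in f)) {
        print > file
        f[file]
    } else {
        print >> file
    }
    close(file)
}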
$ cat file1
127.0.0.1 - - [03/Jun/2014:11:50:15 +0530] "GET /common/support/resources/faqs.jsp?category=All&vfurl=%2FknowledgeSearch&t=All&c=All&k=.Audio\x99+310 HTTP/1.0" 404 1048
127.0.0.1 - - [03/Jun/2014:11:50:41 +0530] "GET /configurator/ HTTP/1.0" 404 988
127.0.0.1 - - [03/Jun/2014:11:51:06 +0530] "GET /common/support/resources/faqs.jsp?category=All&vfurl=%2FknowledgeSearch&t=All&c=All&k=.Audio+655+DSP HTTP/1.0" 404 1048
127.0.0.1 - - [03/Jun/2014:11:51:11 +0530] "GET /uk/solutions/home-based-worker/work-shifting.pdf HTTP/1.0" 404 1093
127.0.0.1 - - [03/Jun/2014:11:51:25 +0530] "GET /common/support/resources/faqs.jsp?category=All&vfurl=%2FknowledgeSearch&t=All&c=All&k=ML18 HTTP/1.0" 404 1048
127.0.0.1 - - [03/Jun/2014:11:52:06 +0530] "GET /pl/solutions/successful-collaboration/ HTTP/1.0" 404 1063
$ awk -f log.awk file1
$ ls *.log -1
common_error.log
configurator_error.log
pl_error.log
uk_error.log
$ for i in *.log; do printf "%s\n\n" $i; cat $i; printf "\n";  done
common_error.log

127.0.0.1 - - [03/Jun/2014:11:50:15 +0530] "GET /common/support/resources/faqs.jsp?category=All&vfurl=%2FknowledgeSearch&t=All&c=All&k=.Audio\x99+310 HTTP/1.0" 404 1048
127.0.0.1 - - [03/Jun/2014:11:51:06 +0530] "GET /common/support/resources/faqs.jsp?category=All&vfurl=%2FknowledgeSearch&t=All&c=All&k=.Audio+655+DSP HTTP/1.0" 404 1048
127.0.0.1 - - [03/Jun/2014:11:51:25 +0530] "GET /common/support/resources/faqs.jsp?category=All&vfurl=%2FknowledgeSearch&t=All&c=All&k=ML18 HTTP/1.0" 404 1048

configurator_error.log

127.0.0.1 - - [03/Jun/2014:11:50:41 +0530] "GET /configurator/ HTTP/1.0" 404 988

pl_error.log

127.0.0.1 - - [03/Jun/2014:11:52:06 +0530] "GET /pl/solutions/successful-collaboration/ HTTP/1.0" 404 1063

uk_error.log

127.0.0.1 - - [03/Jun/2014:11:51:11 +0530] "GET /uk/solutions/home-based-worker/work-shifting.pdf HTTP/1.0" 404 1093

$ cat file2
66.249.75.49 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/us/support/software-downloads/download.jsp HTTP/1.1" 200 3956 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"
127.0.0.1 - - [15/May/2014:00:12:02 +0000] "GET http://abc.def.com/80DF9D/plantronics/images/icons/forms/icon-field-required.gif HTTP/1.1" 200 486 "-" "Serf/1.1.0 mod_pagespeed/1.6.29.7-6576" "-"
127.0.0.1 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/css/global/forms.css HTTP/1.1" 200 8183 "-" "Serf/1.1.0 mod_pagespeed/1.6.29.7-6576" "-"
127.0.0.1 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/js/validation/validationFormDownloads.js HTTP/1.1" 200 1126 "-" "Serf/1.1.0 mod_pagespeed/1.6.29.7-6576" "-"
66.249.75.49 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/us/support/software-downloads/download.jsp HTTP/1.1" 200 3561 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"
$ awk -f log.awk file2
$ ls *.log -1
abc.def.com_access.log
$ for i in *.log; do printf "%s\n\n" $i; cat $i; printf "\n";  done
abc.def.com_access.log

66.249.75.49 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/us/support/software-downloads/download.jsp HTTP/1.1" 200 3956 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"
127.0.0.1 - - [15/May/2014:00:12:02 +0000] "GET http://abc.def.com/80DF9D/plantronics/images/icons/forms/icon-field-required.gif HTTP/1.1" 200 486 "-" "Serf/1.1.0 mod_pagespeed/1.6.29.7-6576" "-"
127.0.0.1 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/css/global/forms.css HTTP/1.1" 200 8183 "-" "Serf/1.1.0 mod_pagespeed/1.6.29.7-6576" "-"
127.0.0.1 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/js/validation/validationFormDownloads.js HTTP/1.1" 200 1126 "-" "Serf/1.1.0 mod_pagespeed/1.6.29.7-6576" "-"
66.249.75.49 - - [15/May/2014:00:12:01 +0000] "GET http://abc.def.com/80DF9D/plantronics/us/support/software-downloads/download.jsp HTTP/1.1" 200 3561 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"

Now I am getting what I expected using the script below:

 #!/bin/ksh
while read -r line
do
        cname=$(echo ${line} | awk '{split($7,c,"/"); print c[3]}')
        scode=$(echo ${line} | awk -F"[ ]" '{print $9}')
        [[ ( ${scode} -ge 200 ) && ( ${scode} -le 399 ) ]] && {
                echo ${line} >> ${cname}_access.log
                }
        [[ ( ${scode} -ge 400 ) && ( ${scode} -le 599 ) ]] && {
                echo ${line} >> ${cname}_error.log
                }
done < /home/vizion/Desktop/adn_DF9D_20140515_0001.log

Thanks to all who helped with this.

Note that when there is a leading space before the #! on the first line in a script, the entire line is just treated as a comment; it will NOT be used to determine the command interpreter to be used to process that script.

Also note that without CODE tags, we can't spot things like this.
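
As a quick check for this kind of slip, here is a minimal sketch that tests whether the very first two characters of a file are #!. The function name has_shebang is made up for illustration; it uses only shell built-ins.

```shell
#!/bin/sh
# Sketch only: succeed if the first line of the given file starts with "#!".
# A leading space (or any other character) before "#!" makes it fail.
has_shebang() {
    first_line=
    # Read the first line verbatim, keeping any leading whitespace
    IFS= read -r first_line < "$1" || :
    case $first_line in
        '#!'*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, has_shebang myscript.ksh && echo "interpreter line found" prints the message only when #! sits at the very start of the file.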