To find the IP address in the log file

Hi,

I need to find the repeated IP addresses in the Apache log file on my box. I tried to come up with a script, but I could not grep out the repeated IP addresses from the error_logs and redirect them to a file. Can you guys please help me out with this problem?

Thanks in Advance.

show us what you've tried and where your problems are...

The Apache log file should have the IP address in column one, so use awk or cut to get column 1, then run the results through sort and uniq. The command below will generate a list of IP addresses along with the number of times each was encountered, sorted with the greatest number at the bottom.

cat access_log | awk '{print $1}' | sort | uniq -c | sort -n
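
If you prefer cut, grabbing column 1 works the same way (assuming the fields are space-separated, as in the standard Apache format):

cut -d' ' -f1 access_log | sort | uniq -c | sort -n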

Hi ldapswandog,

Thank you for your help... The command helped a lot. It lists all the IP addresses in the log file. But to be more specific, how can I pull out the IP addresses whose entries occur repeatedly in the log file? That is, if an IP like 192.168.1.20 occurs repeatedly in my log file, the command should grep out those IPs. I hope that makes sense.

I use such a system to ban IPs that have made too many unsuccessful login attempts in a certain period of time. Imagine you have an access file like this one (extract):

Apr 26 15:56:53 monserveur sshd[30750]: Invalid user zoe from 89.110.150.203
Apr 26 16:00:10 monserveur sshd[30986]: Invalid user zachary from 89.110.150.203
Apr 26 20:18:15 monserveur sshd[5159]: Invalid user johnbe from 210.243.170.181
Apr 26 20:18:15 monserveur sshd[5159]: Invalid user allanz from 210.243.170.181
Apr 26 20:22:06 monserveur sshd[5341]: Invalid user frederik78 from 210.243.170.181
Apr 26 20:22:06 monserveur sshd[5341]: Invalid user xgridagent from 210.243.170.181
Apr 26 20:22:16 monserveur sshd[5349]: Invalid user xgridcontroller from 210.243.170.181
Apr 26 20:23:43 monserveur sshd[5419]: Invalid user zzz from 210.243.170.181
Apr 26 20:23:43 monserveur sshd[5419]: Invalid user zzz from 210.243.170.181
Apr 28 02:58:04 monserveur sshd[20403]: Invalid user xfs from 72.93.200.84
Apr 28 02:58:04 monserveur sshd[20403]: Invalid user xfs from 72.93.200.84
Apr 28 02:58:10 monserveur sshd[20409]: Invalid user zephyr from 72.93.200.84
Apr 28 03:02:18 monserveur sshd[20669]: Invalid user yellow from 72.93.200.84
Apr 28 03:02:39 monserveur sshd[20691]: Invalid user xxx from 72.93.200.84
Apr 28 03:03:22 monserveur sshd[20735]: Invalid user year from 72.93.200.84
Apr 28 14:16:32 monserveur sshd[6556]: Invalid user Zmeu from 88.191.46.60
Apr 28 14:17:14 monserveur sshd[6611]: Invalid user za from 88.191.46.60

The following code will extract all IPs that have made more than 2 unsuccessful attempts in one minute. It first builds an awk array indexed on [date time ip], e.g. Apr 28 20:18 123.123.123.123:

awk -F'[ :]' '{_[$1 $2 $3 $4 $13]++} _[$1 $2 $3 $4 $13]>2 {print $13}' access.log
210.243.170.181
72.93.200.84
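
In case the one-liner is hard to read, here is the same logic spelled out with comments. One small variant: it prints each IP once, on its third hit in a given minute, whereas the one-liner above would print it again on every hit after that.

awk -F'[ :]' '
{
    # -F splits on single spaces and colons, so in the log above
    # $1=month $2=day $3=hour $4=minute and $13=the offending IP
    key = $1 $2 $3 $4 $13
    count[key]++
    # report the IP on its third attempt within the same minute
    if (count[key] == 3)
        print $13
}' access.log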

Hope this will put you on track.

Thank you for your reply. I have applied the command, but I am getting an error:

# awk -F'[ :]' '{_[$1 $2 $3 $4 $13]++} _[$1 $2 $3 $4 $13]>2 {print $10}' access.log

awk: cmd. line:2: fatal: cannot open file `access.log' for reading (No such file or directory)
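
That error just means awk cannot find a file named access.log in the current directory. Give it the actual path to your log, wherever that is on your system, for example something like:

awk -F'[ :]' '{_[$1 $2 $3 $4 $13]++} _[$1 $2 $3 $4 $13]>2 {print $13}' /var/log/httpd/access_log

where /var/log/httpd/access_log is just a placeholder for your real log location.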

Based on the file content that @ripat provided, and using the command-line script I provided, the command below will list how many times an IP address was found, which is what you requested.

cat test11.txt | awk '{print $NF}' | sort | uniq -c | sort -n
2 88.191.46.60
2 89.110.150.203
6 72.93.200.84
7 210.243.170.181

If you have a large access_log where most IP addresses appear only once, you can remove the single entries from the output using grep. The command below removes any IP that appeared 2 times or fewer:

cat test11.txt | awk '{print $NF}' | sort | uniq -c | sort -n | grep "[3-9] "

6 72.93.200.84
7 210.243.170.181
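
One caveat: that grep matches on the last digit of the count, so any count ending in 0, 1 or 2 (10, 11, 12, 20, ...) would be thrown away as well. A numeric test on the count field is safer; a sketch, swapping the grep stage for awk:

cat test11.txt | awk '{print $NF}' | sort | uniq -c | sort -n | awk '$1 > 2'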

I count 5 pipes! You could enter the Useless Use of Cat Contest! (no offense!) :wink:
Useless Use of Cat Award

All that is possible in an awk one-liner with no piping. Can the OP provide a sample file so that we can demonstrate the terseness of awk?
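
Something along these lines, for instance, run against ripat's sample above (where the IP happens to be the last field; adjust the threshold to taste):

awk '{count[$NF]++} END {for (ip in count) if (count[ip] > 1) print count[ip], ip}' test11.txt

The output is unsorted; sorting it would, admittedly, cost me a pipe. :wink: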

I offered the non-awk solution because it is self-explanatory and uses commands that most Unix users use daily. The awk solution, with no explanation of what it is doing, is just confusing for those like myself who are not familiar with awk. If parsing out IP addresses is an action that needs to be repeated often, I suggest using Perl, for two reasons: one, it is more efficient than shell solutions, and two, it is portable even to Windows systems, so it has a higher reuse value.

I agree that there is a tendency for awk coders to be as terse as possible, but you can write awk code that is as readable as Perl. Awk inspired Larry Wall to write Perl, so there is a little bit of awk philosophy and coding style in Perl. And you can also obfuscate your code in Perl:

#:: ::-| ::-| .-. :||-:: 0-| .-| ::||-| .:|-. :||
open(Q,$0);while(<Q>){if(/^#(.*)$/){for(split('-',$1)){$q=0;for(split){s/\|
/:.:/xg;s/:/../g;$Q=$_?length:$_;$q+=$q?$Q:$Q*20;}print chr($q);}}}print"\n";
#.: ::||-| .||-| :|||-| ::||-| ||-:: :|||-| .:|

Perl activity :wink:

I still believe that for this type of application (scanning a log file) awk is the best tool for the job. Portable, light and fast, very fast (especially the mawk version, which is based on a byte-code interpreter). You can also use awk on Linux, *BSD, Windows and OS X platforms. Of course you can't do everything with awk. Although I have seen an HTTP server written entirely in awk, when it comes to doing complicated things you have to switch to a language like Perl, PHP or Python. But, again, for this type of problem, you can't beat awk.

So, if the OP can post a sample file, I promise to come up with a readable solution with comments. No more obfuscated code. Promised.