Creating searches?

Hello. I could do with some help on where to get started. If anyone could help me, it would be greatly appreciated.

I have been working on this for a while now and I don't really know where to start. I am looking into creating a script that will process website hit files and output statistical information to the screen.

I already have all of the hit files, and they are populated like this:

137.44.2.8 Mon Feb 4 22:02:35 GMT 2008
149.192.2.81 Mon Feb 4 23:22:12 GMT 2008
132.53.17.171 Tue Feb 5 01:56:16 GMT 2008

What I want to do is create script(s) that will:

  1. Request from the user a particular time period of interest, to search all of this information

  2. Find

(i) the number of hits occurring during the given time period, and

(ii) the number of hits from unique IP addresses (counting repeat hits from the same IP address as one)

  3. Present this information in a table with the headings "page", "hits" and "unique hits"

Also, all the hit files are stored in a hits directory, and I want the script to be executed from the parent directory of hits.

Again if anyone can point me in a good direction this would be greatly appreciated.

All I can think of so far is using "if" statements, but this is the first time I have used Unix (I'm using SUSE 10.3, if that helps). I just really wanted some pointers/solutions.

Thanks for your time.

I'd suggest working with Perl for this, as it's rather good at parsing files and creating nicely formatted reports.

Here's some pseudo-code to help get you going:

#!/usr/local/bin/do-what-i-want-not-what-i-write
print "Start date to search from: "
read from STDIN to $datestart
print "End date to search up until: "
read from STDIN to $dateend
print "File to search on: "
read from STDIN to $file

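# convert both dates to timestamps so they can be compared numerically
$datestart=to_timestamp($datestart)
$dateend=to_timestamp($dateend)

$hitcount=0
new hash(%iplist)    # keyed by IP address, so each IP is stored only once

while (read $line from $file) {
  split $line at the first space into ($ip,$date)
  $date=to_timestamp($date)
  if ($date >= $datestart && $date <= $dateend) {
    $iplist{$ip}++
    $hitcount++
  }
}
$hitcount_unique=number_of_keys(%iplist)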
print_pretty_report($hitcount,$hitcount_unique)
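
For what it's worth, here is a minimal sketch of the same idea in Python (not Perl, purely for illustration). It assumes every line looks like the sample in the question (IP, weekday, month, day, time, zone, year), that all timestamps are in the same zone, and that each file in the hits directory corresponds to one page. The names parse_hit, count_hits, report and ask_period, and the YYYY-MM-DD HH:MM:SS prompt format, are my own choices and not part of the original setup.

```python
from datetime import datetime
from pathlib import Path

# Month abbreviations as they appear in the sample log lines.
MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_hit(line):
    """Split a line such as '137.44.2.8 Mon Feb 4 22:02:35 GMT 2008'
    into (ip, datetime). The zone field is ignored (assumption: all
    lines use the same zone, as in the sample)."""
    ip, _weekday, month, day, clock, _zone, year = line.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    return ip, datetime(int(year), MONTHS[month], int(day), hour, minute, second)


def count_hits(lines, start, end):
    """Return (hits, unique_hits) for lines stamped within [start, end]."""
    hits = 0
    seen_ips = set()  # a set keeps each IP address only once
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        ip, when = parse_hit(line)
        if start <= when <= end:
            hits += 1
            seen_ips.add(ip)
    return hits, len(seen_ips)


def report(hits_dir, start, end):
    """Build a plain-text table with one row per file in hits_dir."""
    rows = [f"{'page':<20}{'hits':>8}{'unique hits':>14}"]
    for path in sorted(Path(hits_dir).iterdir()):
        if path.is_file():
            with path.open() as handle:
                hits, unique = count_hits(handle, start, end)
            rows.append(f"{path.name:<20}{hits:>8}{unique:>14}")
    return "\n".join(rows)


def ask_period():
    """Prompt for the period of interest (the input format is an assumption)."""
    fmt = "%Y-%m-%d %H:%M:%S"
    start = datetime.strptime(input("Start (YYYY-MM-DD HH:MM:SS): "), fmt)
    end = datetime.strptime(input("End (YYYY-MM-DD HH:MM:SS): "), fmt)
    return start, end
```

Run from the parent directory of hits, calling start, end = ask_period() and then print(report("hits", start, end)) should print one row per hit file. Using a set for the IP addresses is what makes repeat visitors count only once.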
