Script to extract lines from a logfile

Hi,
Can someone help me? I am not well versed with scripting, and this is very urgent.

I need a script in Perl or shell for the following task.

The logfile contains several entries. One particular string has to be searched for, and every complete line containing it has to be removed from the logfile and copied to another file to maintain a record.

The path of the logfile is "C:\webapps\data\servername\logs\logfile_systemdat.log".

The string to be searched for is "GuestUser".

The complete line is as follows.

10.100.102.5 - - [10-Sep-2009:00:01:59 -0700] 0 "GET /bsca/ APPS/1.1" 302 2 "-" "TMC Transaction Management Component Response Time Central/7.2.0+Action+GuestUser"

All the lines which contain the string GuestUser have to be removed from the logfile and copied to another file to maintain a record.

The log files are generated on Windows and Unix boxes; they are then copied to a Windows box, where the script has to run.

Thanks In Advance,
garry

Try this,

sed -i -e '/GuestUser/w output_file' -e '/GuestUser/d'  input_file
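Note that -i is a GNU sed extension; if the sed on the Windows box does not support it, a rough equivalent using a temporary file might look like this (input_file and output_file are placeholder names):

sed -n '/GuestUser/p' input_file >> output_file
sed '/GuestUser/d' input_file > input_file.tmp && mv input_file.tmp input_file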

Regards,

Ranjith

Hi Ranjith,

Appreciate your quick response.
Can I get the complete script, from opening the file at that path through reading the log, cutting the matching lines from the logfile, and saving the file?

I also want to make a note here.
There would be several entries in the logfile.
Some start with a date and some start with an IP address, and the remaining part of each line differs with the login.
Will the sed you mentioned remove the complete line?

Thanks a lot,
Garry

This is how you do it using awk.

This will create files based on the input file names. The file with the matching lines stripped out will be <filename>.strip. The file containing the removed lines will be <filename>.sav. <filename> is the path to the input file.

First make a file containing the awk code. I used log.awk for the example.

# strip lines out of file and make log of lines removed
# clause to make file name for output
(FNR == 1) {                    # check if first line of new file
        if ( NR != 1)           # see if this is not first file
        {
                close(FILEOUT)  # if true, then we close previous files
                close(FILELOG)
        }
        FILEOUT = FILENAME ".strip"     # create stripped file name using input file name
        FILELOG = FILENAME ".sav"       # create file name for removed lines
}

# clause for every line
{
        if ( $0 !~ /ntp/ )      # replace ntp with your pattern
                print > FILEOUT # save to stripped file if pattern not found
        else
                print > FILELOG # save to removed-lines log if pattern matches
}

Change the pattern to whatever you want to use. (GuestUser?)

Then execute it using:

awk -f log.awk <list of files>

Where <list of files> is a bunch of files you want to process. You get a set of .sav and .strip files in the same directory as the source file.

Note: I don't believe in modifying source files when not necessary. Tools such as this should not modify input files.

If for some reason you want to do the same thing in a single line, use this format:

awk '(FNR == 1) { if ( NR != 1) { close(FILEOUT); close(FILELOG) } FILEOUT = FILENAME ".strip"; FILELOG = FILENAME ".sav" } { if ( $0 !~ /YOUR-PATTERN-HERE/ ) print > FILEOUT; else print > FILELOG }' <list-of-files>
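With the pattern from this thread filled in, and assuming you run it from the directory that holds the log (just a sketch, not tested against your data), it would be something like:

awk '(FNR == 1) { if ( NR != 1) { close(FILEOUT); close(FILELOG) } FILEOUT = FILENAME ".strip"; FILELOG = FILENAME ".sav" } { if ( $0 !~ /GuestUser/ ) print > FILEOUT; else print > FILELOG }' logfile_systemdat.log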

Dear jp2542a,

Can you complete the script from line one for me? I understand what you wrote, but I won't be able to complete the script as I do not have scripting knowledge. You will be a great help to me.

1. The source logfile has to be edited and all stripped lines should be copied to one file.
2. The path of the logfiles will be C:\webapps\data\servername\logs\logfile_systemdat.log
3. The script has to run every day, based on the system date, read through the file for each server, strip the matching lines, copy them to test.txt, and save the log file.

Thanks Alot,
Garry

I normally charge when asked to produce a production-ready product :).

What is your execution environment? (Linux, Solaris, Windows?)

Here's one way to do it with Perl:

@ARGV = ("f1");          # file to edit in place
$^I = ".bak";            # turn on in-place editing, keeping a .bak backup
open(OP, ">f1.out");     # file that will hold the removed lines
while (<>) {
  if (/xyz/) {print OP}  # matching lines go to f1.out
  else {print}           # everything else is written back to f1
}
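
Adapted to this thread's file names, an untested sketch of the same idea as a one-liner (the log name and test.txt location are assumptions) would be:

perl -i.bak -ne 'BEGIN { open(OP, ">>", "test.txt") } if (/GuestUser/) { print OP } else { print }' logfile_systemdat.log

This edits logfile_systemdat.log in place (keeping a .bak backup) and appends the removed GuestUser lines to test.txt.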

tyler_durden

Hi jp2542a,

Thank you so much dear, anything you ask.

The execution environment is windows.

Thanks again,
Garry

I will do it... a couple of caveats/questions:

  1. Are you using the cygwin/bash environment?
  2. Are you willing to test the script and provide detailed feedback?
  3. What is the maximum airspeed of an unladen swallow? <joking>

Hi jp2542a,

It is a Bash environment.
And why not, I will provide you with the feedback.

Thanks,
Garry

I've restructured the awk script based on your request. We need to do a few things first.

The first thing is to set up a test environment. Make a directory structure that starts with C:\gtest instead of C:\webapps. It should contain copies of a few of the webapps directories.
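
If you end up in a Cygwin-style bash shell (see later in this thread), building that test tree could look something like this sketch, with server1 as a placeholder for a real server name:

mkdir -p /cygdrive/c/gtest/data/server1/logs
cp /cygdrive/c/webapps/data/server1/logs/logfile_systemdat.log /cygdrive/c/gtest/data/server1/logs/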

Then copy the following awk code into a file named log.awk. I've tried to add comments to help you understand what it is doing.

# log.awk - strip lines out of file and make log of lines removed

# This clause is executed on the opening of each file on the command line
# It first checks to see if this is not the first file and cleans up the
#  previous files and copies the new stripped file to the old file
#  NOTE:  I would prefer if it didn't overwrite the input file...
# It then creates the names and commands it will need later
(FNR == 1) {                    # check if first line of new file
        if ( NR != 1)           # see if this is not first file
        {
                close(FILEOUT)  # if true, then we close previous files
                close(FILELOG)
                system(FILECP)  # do the copy
                system(FILERM)  # remove the work file
        }

        FILEOUT = FILENAME ".strip"     # create stripped file name using input file name
        FILELOG = FILENAME              # create test.txt path
        sub(/logfile_systemdat.log/, "test.txt", FILELOG)

        FILECP = "cp " FILEOUT " " FILENAME     # copy command
        FILERM = "rm " FILEOUT          # remove strip file
}

# This clause executes for every line
# It copies lines from the input file to the appropriate output file
{
        if ( $0 !~ /GuestUser/ )      # test for pattern in line
                print > FILEOUT # save to stripped file if pattern not found
        else
                print > FILELOG # save to removed-lines log if pattern matches
}

# This clause is executed when the last line of the last file is reached
END {
        system(FILECP)  # clean up last file
        system(FILERM)
}

Next, execute the following command in the directory where you put log.awk:

awk -f log.awk C:\gtest\data\*\logs\logfile_systemdat.log

The * part should be a pattern that matches just the server names if there is something other than server names at that level.

Tell me if there are any error messages. Show me what you did and what you saw by copying the output screen. Check and see if you got what you expected in the test directories.

Once this works we will create a bash script that can be used by cron.
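
As a rough preview of that wrapper (just a sketch, untested; the script locations, log root, and schedule are all assumptions at this point):

#!/bin/bash
# strip_guestuser.sh - run the awk filter over every server's current log file
# paths below are assumptions - adjust to where log.awk and the logs actually live
LOGROOT=/cygdrive/c/webapps/data
awk -f /cygdrive/c/scripts/log.awk "$LOGROOT"/*/logs/logfile_systemdat.log

# example crontab entry to run it daily at 01:00:
# 0 1 * * * /cygdrive/c/scripts/strip_guestuser.sh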

Hi jp2542a,

Sorry for the late reply. I was travelling.

I copied the script into a Notepad file and saved it as log.awk.
And then I copied some logfiles to this path - c\gtest\servername\logfiles
But I did not understand how to execute - awk -f log.awk C:\gtest\data\*\logs\logfile_systemdat.log

I saved this line in a .bat file and tried to execute it.
Let me reiterate that the script has to be executed in a Windows environment.
And I think I made a mistake by replying "bash environment", sorry for that.
Please guide me.

Thanks,
Garry

Garry,

You're gonna have to meet me halfway here. While my main battle computer and fly-with-me laptop are Windows based, I (and I suspect most here) am a Solaris/Linux/Unix/C/shell/awk/kernel kinda guy. Yeah, I do Windows... Just don't like to admit it... Most of the solutions here are targeted toward Unix-type environments....

So, you need to install cygwin to get a proper Unix/Linux environment on Windows. Google it :). There are instructions on how to download and install it. If you need me to help you install it, I'm gonna charge you :slight_smile: .

Once you get it installed, open a bash shell (it looks sorta like a Windows cmd prompt window). It will be in your Start menu under cygwin. You will have a Unix/Linux/Solaris-type shell environment that includes the awk command.

Simply type the awk command at the bash prompt.
:slight_smile:
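
Under Cygwin the Windows drives show up under /cygdrive, so the earlier test run would look something like this (just the C:\gtest layout from before, translated to a Cygwin path):

awk -f log.awk /cygdrive/c/gtest/data/*/logs/logfile_systemdat.log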

Or you can try Gawk for Windows :rolleyes:

Yeah, but then I'm always shooting for the lowest common denominator. And then there is that whole at/cron thing that is part of Garry's requirement :smiley:

There's always more than one way to get'er done :slight_smile:

Hi JP,

The script removed the GuestUser lines from all the logfiles in the path.
I tried the following -

awk -f log.awk gtest/svtaree/*
awk -f log.awk gtest/*/*

And the output on the screen was blank

A few questions -

  1. Will the script work if the logfiles are located in a different path?
  2. Some of the logfiles are in tar and zip format.
  3. The script should remove the GuestUser lines from all the original logfiles and copy them to test.txt.

And sorry for troubling you, dear. I was going through a very hectic schedule; I only saw the word cygwin and did not try to understand it.

Thanks again,
Garry

A few things:

  1. The script is expecting a list of paths to the log file, as specified in your original problem definition. Your wildcard path name will match things other than the log file name you first defined. It will process those files too. That may not be something you want.
  2. Only error messages are output to the screen. No message means no errors.
  3. There should be a file named test.txt left in the same directory as the log file, containing the removed lines.
  4. The script is not designed to process compressed or archived files (see the sketch after this list).
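
If you do need to handle those, a rough pre-processing sketch (assuming gzip and tar archives sitting next to the logs, with Cygwin-style paths) would be to unpack them before running the awk script:

# unpack any .gz / .tar archives first (paths are assumptions)
for f in /cygdrive/c/gtest/data/*/logs/*.gz; do gunzip "$f"; done
for f in /cygdrive/c/gtest/data/*/logs/*.tar; do tar -xf "$f" -C "$(dirname "$f")"; done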

Is the test.txt file not being created? If there were other files that weren't log files of the type you specified originally, there may be no test.txt file, and the files that you did give it via the wildcarding might be modified.

Hi JP,

The logfile name is access_09.06.09.log
There are no other files in the server path except for logfiles.
And I did search for the test.txt file; it has not been created.

Thanks,
Garry

The script does not work because you changed the rules. The script expects the log file to be logfile_systemdat.log, as you originally specified. And it only expects to have a list of paths to these log files on the command line. Again, this was part of your original spec. The file test.txt is not created because the log file name is not logfile_systemdat.log.

Given your stated urgency, and that these forums are meant to give guidance rather than be a source of production-quality code, I suggest you hire a consultant to complete your project.

The script I wrote for you will work for your original spec and can be used as a basis for your changing requirements.

Hi JP,

You are right, I did mention logfile_systemdat.log (meaning a name string which is then appended with the current system date and .log).

Apologies if I have confused you.