Optimized way of doing the task in shell programming

Hi

I have a file (about 10 MB) consisting of lines like the following:

2008-05-15 02:15:38,268 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.

2008-05-15 02:15:38,277 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.

My task is to find every first line (Setting session state) that has no corresponding second line (Returning WS connection).

I have written a script using bash, perl and awk commands, but it takes a lot of time to complete. Is there a better way to do this task?

Here is my current script:

#FILE contains a list of connection and return statements
FILE=summary_db_conn.txt
#DUP_FILE : Copying as another file so that i can do a loop to find out the missing return statements
DUP_FILE=summary_db_conn_dup.txt
#FINAL_FILE = file contains setting session records which don't have corresponding return statements
FINAL_FILE=final_not_matching_records.txt

# To see whether set connection statements have corresponding return connection statements

while read line; do

temp_output=`echo $line | grep "Setting session state for connection" | awk '{ print $1, $2, $4, $5, $6, $7 }' | perl -wn -e 'print "$1\n" if m{[\[](.*?)[\]]};'`
if [ "$temp_output" != "" ]; then
echo "$line :Processing line "
get_time=`echo $line | grep "Setting session state for connection" | awk '{ print $1,$2 }'`
#echo "get_time $get_time"
flag="0"
while read file_line; do
#echo "Match line : $file_line"

            if [ "$flag" = "1" ]; then
                        temp_file_output1=`echo $file_line | grep "Returning WS connection" | awk '{ print $1, $2, $4, $5, $6, $7 }'  | perl -wn -e 'print "$1\n" if m{[\[](.*?)[\]]};'`
                        if [ "$temp_file_output1" != "" ]; then
                                    if [ "$temp_output" = "$temp_file_output1" ]; then
                                        # echo "Found the matching return statement"
                                         echo "$file_line"
                                         echo "Break: Success"
                                         break
                                     fi
                         else
                                temp_file_output1=`echo $file_line | grep "Setting session state for connection" | awk '{ print $1, $2, $4, $5, $6, $7 }'  | perl -wn -e 'print "$1\n" if m{[\[](.*?)[\]]};'`
                                if [ "$temp_file_output1" != "" ]; then
                                         if [ "$temp_output" = "$temp_file_output1" ]; then
                                               echo $line >> $FINAL_FILE
                                               #echo "Searching for:$line"
                                               echo "Sorry found another session start statement with the same thread $line"
                                               echo "The new Line is:$file_line"
                                               break
                                          fi
                                 fi
                          fi
            fi
            temp_file_output=`echo $file_line | grep "$get_time"`

            if [ "$temp_file_output" != "" ]; then
                    #echo "Found line"
                    #echo "$file_line"
                    flag="1"
            fi

 done < $DUP_FILE

fi

done < $FILE

echo "Program Ended`date`" >> $FINAL_FILE

Please let me know if there is a faster and better way of doing this task.

Thanks

Using awk or Perl, read the thread name of each "Setting session state for connection" line into an associative array as a key, and remove the key from the array when you see the corresponding Returning line. At end of file, any keys still left in the array are without a pair. I don't suppose you expect to find Returning lines without an opening Setting line.
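As a rough illustration of the array idea in awk, here is a minimal sketch wrapped in a shell function. The names find_unpaired, thread_of and pending are made up for this example, and it assumes every line has the thread name in the second pair of square brackets. Like the description above, it only remembers the latest Setting line per thread.

```shell
#!/bin/sh
# Sketch: list threads whose last "Setting session" line was never followed
# by a "Returning WS" line. find_unpaired, thread_of and pending are
# illustrative names, not part of the original script.
find_unpaired() {
    awk '
    # Text inside the second [...] pair, e.g. "AQUEDUCT-WebContainer : 3".
    # Assumes the line looks like: DATE TIME [LEVEL] [THREAD] message
    function thread_of(line,    rest) {
        rest = substr(line, index(line, "] [") + 3)
        return substr(rest, 1, index(rest, "]") - 1)
    }
    /Setting session state for connection/ {
        pending[thread_of($0)] = $1 " " $2   # remember the timestamp
        next
    }
    /Returning WS connection/ {
        delete pending[thread_of($0)]        # pair found, forget the key
    }
    END {
        # Whatever is left never saw a Returning line (order is unspecified).
        for (t in pending) print pending[t] ": " t
    }
    ' "$@"
}
```

It could be called as find_unpaired summary_db_conn.txt, or fed from a pipe. The whole file is read once, so the run time grows with the number of lines rather than with its square.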

As a general observation, anything with grep | awk is a waste because awk can usually do anything grep can, and similarly, awk | perl looks like one or the other is redundant.
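For instance, the grep | awk pair that extracts the time stamp in the script above could probably be collapsed into a single awk call over the whole file, which also avoids starting several processes for every input line. A small sketch (setting_times is a made-up name):

```shell
#!/bin/sh
# Sketch: one awk process replaces "grep PATTERN | awk ...".
# setting_times is an illustrative name only.
setting_times() {
    # The /.../ pattern does the filtering that grep did;
    # the action prints the date and time fields of each matching line.
    awk '/Setting session state for connection/ { print $1, $2 }' "$@"
}
```

Calling setting_times summary_db_conn.txt would print one time stamp per Setting line.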

Era,

Thanks for your reply. Initially I thought of using arrays, but I don't know how to proceed with the actual programming. I tried arrays but failed, as I am not a strong shell programmer (only an occasional user).

I know it's not good to ask for a program, but could you please show me the approach you mentioned?

Thanks in advance for your help.

Can you post a sample with three or four of the buggers to test with?

Here I should use the text between [ ] for comparison. For example, AQUEDUCT-WebContainer : 0 has both a set connection and a return connection.

Here is the sample data file. This sample contains all kinds of missing lines.

2008-05-15 02:15:04,057 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,067 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,230 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,239 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,359 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,367 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,513 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,582 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,582 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,590 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,675 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,681 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,731 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,740 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,148 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,155 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,293 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,587 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,596 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,726 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,732 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,839 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,849 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,896 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,905 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,989 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:06,057 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:06,127 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:06,178 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:06,186 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:06,323 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:06,332 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Returning WS connection.

Save this to a file and chmod +x.

#!/usr/bin/perl -n

if (m/(.*) \[DEBUG\] \[([^][]+)\] RMSConnectionFactory - Setting session/) {
    $open{$2} = $1;
    next;
}

if (m/(.*) \[DEBUG\] \[([^][]+)\] RMSConnectionFactory - Returning WS /) {
    if ($open{$2}) {
        delete $open{$2};
    }
    else {
        warn "Return without opening session: $_";
    }
}

END {
    for my $k (keys %open)
    {
        print "$open{$k}: $k\n";
    }
}

I tested by removing one of the Returning lines and it printed the corresponding Setting line (or rather, the time stamp and identifier), but that's obviously not very massive testing.

Era,

I am getting the following error

check_test.sh: line 3: syntax error near unexpected token `.'
check_test.sh: line 3: `if (m/(.*) \[DEBUG\] \[([^][]+)\] RMSConnectionFactory - Setting session/) {'

while executing your script as follows
sh check_test.sh summary_db_conn.txt

thanks

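It's a Perl script, not a shell script. Run it with perl instead of sh (perl check_test.sh summary_db_conn.txt), or chmod +x it and run it directly as ./check_test.sh summary_db_conn.txt.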

Era,

Sorry for not stating my requirements correctly in the first place.

My requirement is to find the "Setting session" lines which don't have corresponding "Returning WS" lines.

A "Setting session" line and its "Returning WS" line do not need to be consecutive. For example, the following sample is correct.

2008-05-15 02:15:04,675 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,731 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,681 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,740 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.

So I need to find the "Setting session" lines which don't have corresponding "Returning WS" lines. The link between the two is the thread name, for example [AQUEDUCT-WebContainer : 1].

If the following pattern appears in a file, then the first "Setting session" line should be counted as missing its return.

2008-05-15 02:15:04,675 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,731 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,681 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.

I hope this clarifies my requirements. I really appreciate your help on this.

Thanks in advance

Finding out what is missing is harder then, but maybe the following would work for you. It simply counts the numbers, plus for Setting and minus for Return.

#!/usr/bin/perl -n

if (m/(.*) \[DEBUG\] \[([^][]+)\] RMSConnectionFactory - Setting session/) {
    ++$open{$2};
    next;
}

if (m/(.*) \[DEBUG\] \[([^][]+)\] RMSConnectionFactory - Returning WS /) {
    if ($open{$2}) {
        --$open{$2};
    }
    else {
        warn "Return without opening session: $_";
    }
}

END {
    for my $k (keys %open)
    {
        print "$open{$k}: $k\n";
    }
}
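The same counting idea can be sketched in awk as well. This is only an illustration under the same assumptions about the line format; count_unreturned and thread_of are made-up names, and unlike the Perl version it skips threads whose count has gone back to zero.

```shell
#!/bin/sh
# Sketch: +1 for each Setting line, -1 for each Returning line, per thread.
# count_unreturned and thread_of are illustrative names.
count_unreturned() {
    awk '
    # Text inside the second [...] pair (the thread name).
    function thread_of(line,    rest) {
        rest = substr(line, index(line, "] [") + 3)
        return substr(rest, 1, index(rest, "]") - 1)
    }
    /Setting session state for connection/ { count[thread_of($0)]++; next }
    /Returning WS connection/ {
        t = thread_of($0)
        if (count[t] > 0) {
            count[t]--
        } else {
            # Send the complaint to standard error, like warn in Perl.
            print "Return without opening session: " $0 | "cat 1>&2"
        }
    }
    END {
        # Only threads with a surplus of Setting lines are listed.
        for (t in count)
            if (count[t] > 0) print count[t] ": " t
    }
    ' "$@"
}
```

A count only says how many Setting lines went unanswered for a thread, not which ones.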

It looks good, but what I would like to achieve is this: for the given sample data file, the program should print the following lines of output.

2008-05-15 02:15:05,293 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection

2008-05-15 02:15:06,127 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection

Here is the sample data file

2008-05-15 02:15:04,057 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,067 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,230 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,239 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,359 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,367 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,513 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,582 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,582 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,590 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,675 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,681 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:04,731 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:04,740 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,148 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,155 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,293 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,587 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,596 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,726 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,732 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,839 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,849 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,896 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:05,905 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:05,989 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:06,057 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:06,127 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:06,178 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:06,186 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Returning WS connection.
2008-05-15 02:15:06,323 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection.
2008-05-15 02:15:06,332 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Returning WS connection

Your inputs are great and thank you very much for that ..

This one simply prints a warning if there are two consecutive Setting lines without a corresponding Returning in between.

#!/usr/bin/perl -n

if (m/(.*) \[DEBUG\] \[([^][]+)\] RMSConnectionFactory - Setting session/) {
    warn $open{$2} if $open{$2};
    $open{$2} = $_;
    next;
}

if (m/(.*) \[DEBUG\] \[([^][]+)\] RMSConnectionFactory - Returning WS /) {
    if ($open{$2}) {
        delete $open{$2};
    }
    else {
        warn "Return without opening session: $_";
    }
}

END {
    for my $k (keys %open)
    {
        print $open{$k};    # the stored line already ends with a newline
    }
}
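For comparison, roughly the same logic can be sketched in awk. This is an illustration, not a tested replacement: find_missing_returns, thread_of and last are made-up names, and it prints the superseded Setting lines to standard output instead of warning on standard error.

```shell
#!/bin/sh
# Sketch: print every Setting line that never got its own Returning line.
# find_missing_returns, thread_of and last are illustrative names.
find_missing_returns() {
    awk '
    # Text inside the second [...] pair (the thread name).
    function thread_of(line,    rest) {
        rest = substr(line, index(line, "] [") + 3)
        return substr(rest, 1, index(rest, "]") - 1)
    }
    /Setting session state for connection/ {
        t = thread_of($0)
        # A second Setting for the same thread means the earlier one
        # was never returned, so report the earlier line now.
        if (t in last) print last[t]
        last[t] = $0
        next
    }
    /Returning WS connection/ { delete last[thread_of($0)] }
    END {
        # Setting lines still open at end of input (order is unspecified).
        for (t in last) print last[t]
    }
    ' "$@"
}
```

Run as find_missing_returns summary_db_conn.txt, it reads the file once and keeps at most one line per thread in memory.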

Thank you so much, sir, and sorry for the late reply.

hi

I thought the previous program worked for my requirement, but delete seems to be deleting all matching keys in the array, and that could be the reason I am not getting the missing Setting session statements from the array.

I found a very good utility 'StackedHash' perl module at

Data::StackedHash - Stack of PERL Hashes - search.cpan.org

I almost got it working, but somehow it's not displaying any output. I thought I would share it with you so that I can get some advice on this.

Please find the complete details again here so that you don't need to read my previous posts.

By the way, I was out of town because of a medical emergency in my family, and that's the reason for the delay.

Thank you in advance for your help..

Requirement:

Find the "Setting session state for connection" statements which don't have corresponding "Returning WS connection" statements in a given input file.

check_test_3_1.pl

#!/usr/bin/perl -n

use Data::StackedHash;
tie %array1, Data::StackedHash;

if (m/(.*) \[DEBUG\] \[([^][]+)\] RMSConnectionFactory - Setting session/) {
tied(%array1)->push({$2=>$_});
next;
}
if (m/(.*) \[DEBUG\] \[([^][]+)\] RMSConnectionFactory - Returning WS /) {
if ("$2" ne "" ) {
delete $array1{$2};
}
next;
}

END {
print values %array1;
}

Input file:

2008-06-05 03:29:37,752 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:37,761 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-06-05 03:29:38,061 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:38,069 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-06-05 03:29:38,292 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:38,301 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-06-05 03:29:38,442 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:38,506 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-06-05 03:29:38,572 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:38,579 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-06-05 03:29:38,762 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:38,768 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Returning WS connection.
2008-06-05 03:29:38,961 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:38,969 [DEBUG] [AQUEDUCT-WebContainer : 0] RMSConnectionFactory - Returning WS connection.

2008-06-05 03:29:39,292 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection.

2008-06-05 03:29:39,772 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:39,839 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.
2008-06-05 03:29:40,261 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:40,272 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.

2008-06-05 03:29:40,701 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.

2008-06-05 03:29:41,291 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:41,300 [DEBUG] [AQUEDUCT-WebContainer : 3] RMSConnectionFactory - Returning WS connection.
2008-06-05 03:29:41,542 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:41,551 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Returning WS connection.

Expected Output:

2008-06-05 03:29:40,701 [DEBUG] [AQUEDUCT-WebContainer : 1] RMSConnectionFactory - Setting session state for connection.
2008-06-05 03:29:39,292 [DEBUG] [AQUEDUCT-WebContainer : 2] RMSConnectionFactory - Setting session state for connection

What i am getting right now:

Nothing. The program runs successfully but produces no output.

Run command:

perl -w check_test_3_1.pl input.txt

Dependency Perl Modules:

Data::StackedHash - Stack of PERL Hashes - search.cpan.org

Based on a quick reading of the manual page, I don't think you are using it correctly. You are pushing but never pulling. The push method does not expect or support the passing of an argument. I also don't see how this is supposed to bring you closer to a solution.

In the meantime, the script I posted above does seem to produce the expected output for the sample input you supplied. And no, the delete command only deletes the specified element from the hash, not the whole hash (that would be undef -- see the delete documentation for more).