Read values from second file in awk

EAGL · May 15, 2014, 9:37am

Hello Friends,

I have written the following script, which finds some logs, fetches the desired rows, and then calculates the average processing time of a specific application.

if [ "$#" -ne 3 ]
then
        echo 
        echo "-----  There are three arguments which is expected to run this script!  -----"
		echo "-----  You have written : $0 $@  -----"
		echo "-----  USAGE: "$0" [Full path of desired directory] [State] [ Date : since how many days ago the script should check the logs? ]  -----"
        echo "-----  i.e.  "$0" /opt/data/logs 1000 7  -----"
        echo
elif [ "$#" -eq 3 ]
then
        log_dir=$1
        first_char=${log_dir:0:1}
        echo "$first_char"

        if [ "$first_char" = "/" ] && [ -d "$log_dir" ]
        then  
			
			for j in `find $log_dir -type f \( -name "profile[1-4].log" -o -name "profile[1-4].log.2014-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]" \) -mtime -$3 2>/dev/null | sort -t- -k 4,4`;
			do
			echo "$j" && nawk -v stat=$2 '/State=\[stat\]  Action.*in/gsub(/^\[|\]/,"",$(NF-1)){sum+=$(NF-1)} END { print "Average Process Time = ",sum/NR}' $j;
			done;

		else
			echo 
			echo "-----  You have written  : $0 $@  -----"
			echo "--- Enter the full path of logs which is first argument  \""$1"\"  correctly ---"
			echo "-----  USAGE: "$0" [Full path of desired directory] [State] [ Date : since how many days ago the script should check the logs? ]  -----"
			echo "-----  i.e.  "$0" /opt/data/logs 1000 7  -----"
			echo
		fi
fi

Three arguments can be given to this script:

log directory, state, date.

I want the "state" value to be looked up in another file like the following example, so that the equivalent name from that file can be printed by nawk.

file2:

1000 ChargingRenewal State
2000 Notification State
7000 Fulfilment State
....

So what I want is to fill in the XXXX part below from the second file. Is it possible? How could I achieve this?

echo "$j" && nawk -v stat=$2 '/State=\[stat\]  Action.*in/gsub(/^\[|\]/,"",$(NF-1)){sum+=$(NF-1)} END { print "Average Process Time for XXXX State = ",sum/NR}' $j file2

I'd appreciate your ideas; I'm stuck on this. I know I can define another variable for nawk using "-v", but how can I retrieve the XXXX value from the $2 column of the second file after a comparison?

SriniShoo · May 15, 2014, 10:44am

I couldn't understand your requirement correctly, but to perform an action based on the second file in awk, use the following:

FILENAME == ARGV[2]
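
As a minimal sketch of how that condition behaves, the following counts the lines of each input file separately. The file names and contents are made up for illustration only; ARGV[1] and ARGV[2] are the first and second file operands given to awk.

```shell
# Hypothetical sample files, created only for this illustration.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first.txt"
printf 'x\ny\nz\n' > "$tmp/second.txt"

out=$(awk '
    FILENAME == ARGV[1] { n1++ }    # runs only for lines of the first file
    FILENAME == ARGV[2] { n2++ }    # runs only for lines of the second file
    END { print n1, n2 }
' "$tmp/first.txt" "$tmp/second.txt")

echo "$out"
rm -rf "$tmp"
```

Options such as -v are not part of ARGV, so the indexes refer to the file operands regardless of how many variables are passed in.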

EAGL · May 15, 2014, 12:12pm

Hello Srinishoo,

My script sums up $(NF-1) and calculates the average processing time from the logs, so the output looks like the following (I got this from a production server):

Average Process Time =  211.063
/data/log/rfe/profile4.log
Average Process Time =  209.939
/data/log/rfe/profile3.log.2014-05-15-00
Average Process Time =  227.659

The second argument of the script is "State"; there are several states, and processing times vary by state in the logs.

So what I intend to do is add to the output which state the processing time was calculated for. To achieve this, I need to put all the states into another file (a lookup file), match the given state number (arg $2) against its description in that file, and print the state name in the output.
The script would be run for state 1000 like this:

./script.sh /data/log/rfe 1000 1

and state 1000 is equal to "ChargingRenewal" value in second file

so desired output is:

Average Process Time for ChargingRenewal State =  211.063
/data/log/rfe/profile3.log
Average Process Time for ChargingRenewal State =  209.939
....
....

I hope that makes it a bit clearer.

KR,

cnamejj · May 15, 2014, 2:03pm

I have two ideas that might help.

First, if you're up for re-thinking the way the script works you can take advantage of the fact that "awk" can read multiple files sequentially. So instead of running separate "awk" commands for each file, and having to use a temporary file to relay the "state" info forward, you can do it all at once.

The "awk" variable "FILENAME" will be set to the name of the file being read so you can code the script to treat the set of data from each file differently. And since it's all one "awk" command, you can store the "state" in an array for use as you run through the files. For instance, here's a really, really simple example that shows the number of lines in each file it reads.

gawk '{ lines[FILENAME]++ } END { for(fn in lines) printf "%s has %d lines\n", fn, lines[fn] }' /tmp/foo*

If you'd rather keep the logic the way you have it, meaning running "awk" once per file, then you can pass in a variable with info from another file. Doing something like this,

awk -v stlist="$(awk 'NF { printf "%s ", $1; }' secondfile)" '..awk program here...'

will pass in a variable called "stlist" that holds a blank delimited string of all the first words from "secondfile". You can split it apart and code your script to use the data as you like.
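
A rough sketch of the splitting step, assuming a "secondfile" with the same layout as the file2 sample earlier in the thread (the temporary path is only for illustration):

```shell
# Hypothetical copy of the lookup file from the question.
tmp=$(mktemp -d)
printf '1000 ChargingRenewal State\n2000 Notification State\n7000 Fulfilment State\n' > "$tmp/secondfile"

# The inner awk builds "1000 2000 7000 "; the outer awk splits it into an array.
out=$(awk -v stlist="$(awk 'NF { printf "%s ", $1 }' "$tmp/secondfile")" '
    BEGIN {
        n = split(stlist, codes, " ")   # a single-space separator splits on runs of blanks
        for (i = 1; i <= n; i++) print i, codes[i]
    }')

echo "$out"
rm -rf "$tmp"
```

This only carries the first column across. To carry the state names as well, the inner awk would have to emit both columns in some agreed format.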

RudiC · May 16, 2014, 5:50am

I'm not sure I understood your request correctly, nor am I convinced your awk will run fine. But to output the relevant state's name, add this to your awk script and put file2 as the first argument:

awk -v stat="$2" 'NR==FNR && $1==stat {STATE=$2} ... END {print "... for " STATE " state =", ...} ' file2 "$j"
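
For illustration, here is one way the elided parts might be filled in. This is a sketch under assumptions, not the poster's exact script: the log line format is invented (the thread never shows a real log line), the average divides by the number of matching lines rather than by NR, and the pattern is built as a dynamic regex because a /.../ literal does not expand the awk variable stat.

```shell
tmp=$(mktemp -d)
printf '1000 ChargingRenewal State\n2000 Notification State\n' > "$tmp/file2"

# Hypothetical log lines; the real log format is an assumption.
cat > "$tmp/profile1.log" <<'EOF'
State=[1000]  Action=[renew] completed in [200] ms
State=[2000]  Action=[notify] completed in [900] ms
State=[1000]  Action=[renew] completed in [300] ms
EOF

out=$(awk -v stat=1000 '
    NR == FNR { if ($1 == stat) STATE = $2; next }   # first file: lookup only
    $0 ~ ("State=\\[" stat "\\]") {                  # regex built from the stat variable
        t = $(NF-1)
        gsub(/\[|\]/, "", t)                         # strip the square brackets
        sum += t; n++                                # count matching lines only
    }
    END { if (n) print "Average Process Time for " STATE " State = " (sum / n) }
' "$tmp/file2" "$tmp/profile1.log")

echo "$out"
rm -rf "$tmp"
```

The "next" in the first rule keeps lookup-file lines from being tested against the log pattern.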

EAGL · May 16, 2014, 8:14am


Thanks a lot RudiC, that was exactly what I wanted.

Now I can use a second file to read properties from.

If it's all right to ask another thing about printing the results, I would like to find out how I can mark specific results with a star "*" at the end, where the average processing time is larger than 300.

This is the relevant part of the code:

echo "$j" && nawk -v stat=$2 'NR==FNR && $1==stat{STATE=$2}/State=\[stat\]Action.*in/gsub(/^\[|\]/,"",$(NF-1)){sum+=$(NF-1);avg=(sum/NR)} END { print "Avg Proc Time for " STATE " =",avg}' state.cfg $j >> avg_proc_time_results.txt;

the output is now like the following:

-bash-3.00$ cat avg_proc_time_results.txt 
Avg Proc Time for Charging = 199.98
Avg Proc Time for Charging = 200.021
Avg Proc Time for Charging = 214.448
Avg Proc Time for Charging = 213.565
Avg Proc Time for Charging = 170.646
Avg Proc Time for Charging = 168.457
Avg Proc Time for Charging = 265.9
Avg Proc Time for Charging = 258.402

desired output:

-bash-3.00$ cat avg_proc_time_results.txt 
Avg Proc Time for Charging = 199.98
Avg Proc Time for Charging = 200.021
Avg Proc Time for Charging = 214.448
Avg Proc Time for Charging = 213.565
Avg Proc Time for Charging = 170.646
Avg Proc Time for Charging = 168.457
Avg Proc Time for Charging = 265.9*
Avg Proc Time for Charging = 258.402*

KR
Eagle
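
One possible way to add such a marker is a conditional expression in the print statement. The sketch below assumes the threshold of 300 mentioned in the post; mark_avg is a hypothetical helper written only to show the idea in isolation, and the value 312.5 is made up.

```shell
# Hypothetical helper: $1 = state name, $2 = average, $3 = threshold.
mark_avg() {
    awk -v name="$1" -v avg="$2" -v limit="$3" 'BEGIN {
        # Append "*" only when the average exceeds the threshold (numeric compare).
        printf "Avg Proc Time for %s = %s%s\n", name, avg, (avg + 0 > limit + 0 ? "*" : "")
    }'
}

out1=$(mark_avg Charging 258.402 300)
out2=$(mark_avg Charging 312.5 300)
echo "$out1"
echo "$out2"
```

In the script itself, the same ternary could be appended to the existing END print, for example: print "Avg Proc Time for " STATE " =", avg (avg > 300 ? "*" : "")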