Hi Ravinder,
Note that for(i in a) selects elements from array a in an unspecified order. So, the output from your script won't necessarily be displayed in increasing time order (even if the input is in sorted order).
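One way around this is to walk the hours numerically in the END block instead of using for(i in a). The following is only a sketch, not your script: the function name max_per_hour and the arrays max and line are hypothetical, and it assumes input lines of the form "HH:MM:SS count" with the count as the last field.

```shell
# Hypothetical helper: max_per_hour [FILE...]
# Assumes each input line looks like "HH:MM:SS count".
max_per_hour() {
    awk -F '[: ]' '
    {
        h = $1 + 0                  # hour as a number (0..23)
        v = $NF + 0                 # count as a number
        if (!(h in max) || v > max[h]) {
            max[h] = v              # highest count seen so far for this hour
            line[h] = $0            # the input line that carried it
        }
    }
    END {
        # Loop over the hours in numeric order rather than for (h in max),
        # so the output order does not depend on the awk implementation.
        for (h = 0; h < 24; h++)
            if (h in max)
                print line[h]
    }' "$@"
}
```

Called as max_per_hour test.txt, this should print one line per hour in increasing hour order whether or not the input is sorted. Since it keys on the hour alone, the same hour on different days is merged into one entry.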
Hi fajar_3t3,
If you're going to call sort twice, there is no need to also invoke cat and awk. The command:
sort -k2,2nr test.txt | sort -t: -k1,1n -u
should produce the same output as the code you showed in post #4 in this thread and run a little bit faster.
If your input file is in increasing time order (as shown in your sample in post #1), you could also try the single awk command:
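awk -F '[: ]' '
# Print the saved high line for the hour just finished, then start a new hour.
function PrintHigh() {
    if(NR > 0)      # NR > 0 so that a one-line input file is still printed
        print HighLine
    SaveHigh()
}
# Remember the current line as the highest seen so far for its hour.
function SaveHigh() {
    Hour = $1
    HighLine = $0
    HighValue = $NF
}
NR == 1 {
    SaveHigh()
    next
}
# The hour changed: report the previous hour and start over with this line.
$1 != Hour {
    PrintHigh()
    next
}
# Same hour, higher value: replace the saved line.
$NF > HighValue {
    SaveHigh()
}
END {
    PrintHigh()
}' test.txt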
which should be faster still, since only one process is invoked, the input is read only once, and the lower-valued lines for each hour are never written at all.
If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
If the file is sorted by hour and the data spans more than 24 hours, the results will show only one maximum for each hour of the day across all days, rather than the maximum for each hour on every day.
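# Reads "HH:MM:SS count" lines from standard input (bash or ksh93 for ${time:0:2}).
MAX=0
PREV_HR=25      # sentinel: no valid hour seen yet
while read time count
do
    hour=${time:0:2}
    if [ "$PREV_HR" -eq 25 ]
    then
        PREV_HR=$hour
    fi
    # Hour changed: print the maximum for the hour just finished.
    if [ "$hour" -ne "$PREV_HR" ]
    then
        echo "$PREV_HR $MAX"
        PREV_HR=$hour
        MAX=0
    fi
    if [ "$count" -gt "$MAX" ]
    then
        MAX=$count
    fi
done
echo "$PREV_HR $MAX"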