Maximum Value of similar lines

imarcs · May 15, 2011, 2:54am

Hi,
I'm pretty new to scripting with sed, awk, etc. I'm trying to speed up calculations of disk space allocation. I've extracted the data I want and cleaned it up, but I can't figure out the final step. Using awk, I need to find the maximum value of one field across all lines where another field has the same value.

My input looks like this:

orange 10
orange 20
orange 50
blue 10
blue 10
blue 5
green 30
green 40
green 50

The output I need is as follows:

orange 50
blue 10
green 50

I've found a few similar posts, but I can't make the final connection.

Thanks in advance for any help received.
imarcs

bartus11 · May 15, 2011, 3:48am

awk '$2>a[$1]{a[$1]=$2}END{for (i in a) print i" "a[i]}' file

imarcs · May 15, 2011, 4:26am

EPIC! Thank you very much for your help.

Peasant · May 15, 2011, 4:29am

Bartus11, could you offer a short explanation if you can find the time?

Thank you very much
Peasant.

ahamed101 · May 15, 2011, 4:38am

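"a" is an associative array of key:value pairs. The key is the first field ($1) and the value is the second field ($2), so as the file is read the pairs become orange:10, orange:20, orange:50, blue:10, green:30 etc.
After the first line a[orange] is 10, after the second line a[orange] is 20, and so on.
$2 holds the value on the current line, i.e. 10, 20, 50 etc. Initially a[$1] is unset, which compares as 0 in a numeric comparison.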
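
To make the updates visible, here is a small sketch that prints the array entry after every input line. It uses the same test as the one-liner; the trace format and the inlined sample lines (the orange and blue rows from the question) are just for illustration.

```shell
# Feed a few of the sample lines to awk and show a[$1] after each line.
trace=$(printf '%s\n' 'orange 10' 'orange 20' 'orange 50' 'blue 10' 'blue 10' 'blue 5' |
awk '{
    if ($2 > a[$1]) a[$1] = $2      # same test as the one-liner
    printf "%s %s -> a[%s] = %s\n", $1, $2, $1, a[$1]
}')
printf '%s\n' "$trace"
```

Each output line shows the input pair followed by the maximum stored so far for that key, so the last line reads "blue 5 -> a[blue] = 10" because 5 does not beat the stored 10.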

Algorithm used is

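1. Start: a[$1] is unset, which counts as 0
2. If $2 (the value on this line) > a[$1], then a[$1]=$2 (the new value)
3. Read the next line
4. Repeat steps 2 and 3 until EOF
5. At EOF, print each key together with its value from the array a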
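
The steps above can also be written out in a longer, commented form. This is only a sketch and differs from the one-liner in two ways that are my own assumptions about what might be wanted: it tests with "in" whether a key has been seen, so a group whose values are all zero or negative still gets a maximum, and it remembers the order in which keys first appear, because "for (i in a)" returns keys in no guaranteed order. The names max, order and n are made up for the example, and the sample data is inlined instead of being read from a file.

```shell
maxima=$(printf '%s\n' 'orange 10' 'orange 20' 'orange 50' \
                       'blue 10' 'blue 10' 'blue 5' \
                       'green 30' 'green 40' 'green 50' |
awk '
# Step 1: a key not seen before is stored with its first value.
!($1 in max) {
    max[$1] = $2 + 0        # + 0 forces a numeric value
    order[++n] = $1         # remember first-seen order for the output
    next
}
# Steps 2 to 4: keep the larger value; awk reads the next line by itself.
$2 + 0 > max[$1] { max[$1] = $2 + 0 }
# Step 5: after the last line, print one line per key in first-seen order.
END {
    for (i = 1; i <= n; i++) print order[i], max[order[i]]
}')
printf '%s\n' "$maxima"
```

With the sample input from the first post this prints orange 50, blue 10 and green 50, one per line, in that order.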

regards,
Ahamed