Maximum Value of similar lines

Hi,
I'm pretty new to scripting with sed, awk, etc. I'm trying to speed up calculations of disk space allocation. I've extracted the data I want and cleaned it up, but I can't figure out the final step. Using awk, I need to find the maximum value of one field across all lines where another field has the same value.

My input looks like this:

orange 10
orange 20
orange 50
blue 10
blue 10
blue 5
green 30
green 40
green 50

The output I need is as follows:

orange 50
blue 10
green 50

I've found a few similar posts, but I can't make the final association.

Thanks in advance for any help received.
imarcs

awk '$2>a[$1]{a[$1]=$2}END{for (i in a) print i" "a[i]}' file

EPIC! Thank you very much for your help.

Bartus11, could you offer a short explanation if you can find the time?

Thank you very much
Peasant.

"a" is an associative array of key/value pairs. The key is the first field ($1, e.g. orange, blue, green) and the value is the largest second field ($2) seen so far for that key.
After the first line a[orange] is 10, after the second line it is 20, and so on.
$2 is the value on the current line (10, 20, 50 etc.). Initially a[$1] is unset, which awk treats as 0 in a numeric comparison.
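
To watch the array change line by line, here is a small sketch that prints the stored value before and after each comparison. The function name trace_max is made up for this illustration, and the +0 is my addition to force a numeric comparison; the rest follows the one-liner above.

```shell
# trace_max is a hypothetical helper name used only for this illustration.
trace_max() {
  awk '{
    old = ($1 in a) ? a[$1] : "unset"      # value stored before this line
    if ($2 + 0 > a[$1] + 0) a[$1] = $2     # same test as the one-liner, forced numeric
    print $0 ": a[" $1 "] " old " -> " a[$1]
  }' "$@"
}

printf '%s\n' 'orange 10' 'orange 20' 'orange 50' 'blue 10' 'blue 5' | trace_max
```

Each output line shows the input line, then the value held for that key before and after it was processed, so you can see a[blue] stay at 10 when the 5 comes in.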

Algorithm used is

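  1. Start - a[$1] is unset, which compares as 0
  2. If $2 (the value on the current line) > a[$1] then a[$1]=$2 (the new maximum)
  3. Read next line
  4. Repeat steps 2 and 3 until EOF
  5. Print each key in the array a with its value (for (i in a) visits the keys in no guaranteed order)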
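
Two things follow from these steps: the output order of for (i in a) is not guaranteed to match the input, and because an unset a[$1] compares as 0, a key whose values are all zero or negative never gets a maximum stored. If either matters for your data, a variant along these lines might help. It is only a sketch: max_by_key and the arrays order and max are names I made up, and it assumes two whitespace-separated fields per line.

```shell
# max_by_key is a hypothetical helper name, not something from the thread.
max_by_key() {
  awk '
    NF < 2 { next }                          # skip blank or incomplete lines
    # First time a key is seen: remember its position and seed its maximum.
    !($1 in max) { order[++n] = $1; max[$1] = $2; next }
    # Later lines: keep the larger value (+0 forces a numeric comparison).
    $2 + 0 > max[$1] + 0 { max[$1] = $2 }
    # Print the keys in the order they first appeared in the input.
    END { for (k = 1; k <= n; k++) print order[k], max[order[k]] }
  ' "$@"
}

printf '%s\n' 'orange 10' 'orange 20' 'orange 50' 'blue 10' 'blue 10' 'blue 5' \
  'green 30' 'green 40' 'green 50' | max_by_key
```

With the sample input from the first post this prints orange 50, blue 10, green 50 in that order. It can also be called with a file name, e.g. max_by_key file.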

regards,
Ahamed
