How to print the name with the highest value within each set, but only if its group is the only unique group in that set
input
sets names value groups
j007 shot1 0.6 a
j007 shot2 0.5 b
j007 shot3 0.4 bb
j007 shot4 0.3 bc
j007 shot5 0.2 cd
j008 shot1 0.4 a
j008 shot2 0.3 ab
j009 shot1 0.14 a
j009 shot2 0.13 b
j009 shot3 0.12 bc
j010 shot1 22 a
j010 shot2 19 b
j010 shot3 5 bcd
j011 shot1 5 a
j011 shot2 2 b
j011 shot3 3 c
output
j007 shot1 0.6 a
j009 shot1 0.14 a
j010 shot1 22 a
Tried
sort -k 3,3 input | awk '$3*$3>A[$1]*A[$1]{A[$1]=$0} END{for(i in A) print i,A[i]}' | sort -k 1,1
Thanks. But the group always has to be unique. Sorry, maybe I didn't explain it well. First, the group with the highest score ('a') should be unique, meaning no 'ab', 'abc', etc. Therefore j008 is not in the output. Second, there should be no other unique group within the same set. For example, j011 has 3 unique groups, so it should not be in the output. I hope that's clear. With your script, j009 is missing, and j010 is selected with the wrong group.
For my understanding, let me paraphrase your request:
In any set, look for the maximum value (Col 3). These are listed below:
sets names value groups other_groups
j007 shot1 0.6 a b bb bc cd
j008 shot1 0.4 a ab
j009 shot1 0.14 a b bc
j010 shot1 22 a b bcd
j011 shot1 5 a b c
Now, if the group(s) of this entry show up in any of the other entries of the same set, suppress this record. If so, ALL entries EXCEPT j008 should be printed, no?
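For illustration, the first step of this paraphrase (pick the row with the largest column 3 value per set and list the remaining groups of that set after it) could be sketched in awk roughly as below. The function name top_per_set and the array names are made up for this example, and it assumes one header line and whitespace-separated fields as in the posted input.

```shell
# Hypothetical helper: per set, print the top-valued row followed by
# the groups of all other rows in that set.
top_per_set() {
  awk '
    NR == 1 { next }                          # skip the header line
    {
      if (!($1 in n)) order[++sets] = $1      # remember first-seen order of sets
      i = ++n[$1]                             # row counter within this set
      grp[$1, i] = $4
      if (i == 1 || $3 + 0 > val[$1]) {       # numeric comparison on column 3
        best[$1] = i; val[$1] = $3 + 0; row[$1] = $0
      }
    }
    END {
      for (s = 1; s <= sets; s++) {
        k = order[s]; line = row[k]
        for (i = 1; i <= n[k]; i++)
          if (i != best[k]) line = line " " grp[k, i]
        print line
      }
    }
  ' "$@"
}

# Demo on two of the posted sets.
top_per_set <<'EOF'
sets names value groups
j007 shot1 0.6 a
j007 shot2 0.5 b
j007 shot3 0.4 bb
j007 shot4 0.3 bc
j007 shot5 0.2 cd
j008 shot1 0.4 a
j008 shot2 0.3 ab
EOF
# prints:
# j007 shot1 0.6 a b bb bc cd
# j008 shot1 0.4 a ab
```

The comparison uses $3 + 0 so that values such as 22 and 5 are compared as numbers rather than as strings.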
Sorry, you crossposted while I was pondering. So, to be eliminated, the group has to be unique, i.e. one single letter, and this letter may not occur in any of the other, possibly multi-letter, groups in the same set, nor may any other single-letter group occur in that set?
@RudiC: The first part is correct. The second part is not exactly right. If you draw a Venn diagram with the letters in the groups of a specific set, you should always see 'a' as a separate group. For example, j007 has this pattern but j008 does not. Next, though j009, j010 and j011 have 'a' as a separate group, j011 also has 'b' and 'c' as separate groups. Therefore only j007, j009 and j010 are in the output.
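One possible reading of the Venn diagram rule, written as a rough awk sketch: a group counts as "separate" when it shares no letter with any other group in the same set, and a set is printed only if its top-valued row has a separate group and no other group in the set is separate. This is an interpretation of the description, not a confirmed specification; the names isolated_top and shares are invented for the example, and one header line is assumed.

```shell
# Hypothetical helper implementing the "separate group" reading.
isolated_top() {
  awk '
    # true if strings a and b have at least one letter in common
    function shares(a, b,    i) {
      for (i = 1; i <= length(a); i++)
        if (index(b, substr(a, i, 1))) return 1
      return 0
    }
    NR == 1 { next }                          # skip the header line
    {
      if (!($1 in n)) order[++sets] = $1      # keep sets in input order
      i = ++n[$1]
      grp[$1, i] = $4
      if (i == 1 || $3 + 0 > val[$1]) {       # track the top-valued row
        best[$1] = i; val[$1] = $3 + 0; row[$1] = $0
      }
    }
    END {
      for (s = 1; s <= sets; s++) {
        k = order[s]; isolated = 0; top_ok = 0
        for (i = 1; i <= n[k]; i++) {
          alone = 1
          for (j = 1; j <= n[k]; j++)
            if (i != j && shares(grp[k, i], grp[k, j])) { alone = 0; break }
          if (alone) { isolated++; if (i == best[k]) top_ok = 1 }
        }
        # print only if the top group is the one and only separate group
        if (top_ok && isolated == 1) print row[k]
      }
    }
  ' "$@"
}

isolated_top <<'EOF'
sets names value groups
j007 shot1 0.6 a
j007 shot2 0.5 b
j007 shot3 0.4 bb
j007 shot4 0.3 bc
j007 shot5 0.2 cd
j008 shot1 0.4 a
j008 shot2 0.3 ab
j009 shot1 0.14 a
j009 shot2 0.13 b
j009 shot3 0.12 bc
j010 shot1 22 a
j010 shot2 19 b
j010 shot3 5 bcd
j011 shot1 5 a
j011 shot2 2 b
j011 shot3 3 c
EOF
# prints:
# j007 shot1 0.6 a
# j009 shot1 0.14 a
# j010 shot1 22 a
```

Under this reading, a set with groups a, b, bc, c, cd and e would print nothing, because both 'a' and 'e' are separate while 'b' and 'c' each overlap another group.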
---------- Post updated at 04:18 PM ---------- Previous update was at 04:17 PM ----------
@RudiC: Update: yes your update is correct. Sorry for the confusion.
---------- Post updated 06-13-17 at 02:34 AM ---------- Previous update was 06-12-17 at 04:18 PM ----------
@rdrtx1: I think it is still not working. For example, when I ran the modified script on this input, it is not supposed to print 'j007', but it prints it with group 'a'. This should not be printed because there is another unique group ('e') in the data.
sets names value groups
j007 shot1 0.6 a
j007 shot2 0.5 b
j007 shot3 0.4 bc
j007 shot4 0.3 c
j007 shot5 0.2 cd
j007 shot6 0.1 e
Please restate the conditions verbosely and in great detail. Also, what do you mean by "filter"? Eliminate? Or print, and eliminate the others?
For me, the example in post #7 should NOT print, as more than one single-letter group exists. And why group "e" and not "c"?