Diya123
September 14, 2011, 3:11pm
1
Hi
I have a list of 2000 records with multiple entries and I want to get the max size for each entry
ABC 1
ABC 2
ABC 3
ABC 4
DEF 1
DEF 2
DEF 2
DEF 2
DEF 2
DEF 3
DEF 4
XYZ 1
XYZ 2
XYZ 3
XYZ 3
XYZ 3
XYZ 4
XYZ 4
XYZ 4
XYZ 5
and so on...
I have presented three different cases here.
In the first case, ABC, every entry occurs only once, so the max size is 1.
In the second case, DEF has "2" occurring four times, so the max size is 4.
In the third case, XYZ has both "3" and "4" occurring three times, so the max size is 3.
output:
ABC 1
DEF 4
XYZ 3
Thanks,
awk '{ if(A[$1] < $2) A[$1]=$2; }
END { for(k in A) { print k, A[k]; }' < file
Diya123
September 14, 2011, 5:27pm
3
Hi,
Thanks for the reply.
The code does not do what I need. It's outputting the last number in the entry. For example:
ABC 1
ABC 2
ABC 3
ABC 4
ABC 4
ABC 5
Instead of outputting 2, it's outputting 5.
What I need is the highest number of times a number repeats for a particular entry. In the above example, "4" repeats two times, more than any other number, so the output should be "2".
Thanks,
Whatever you were running, it wasn't what I posted: that had a syntax error and didn't run at all.
[edit] Ah, I see... Hmm... Working on it.
Diya123
September 14, 2011, 5:32pm
5
Hi,
I fixed the error in the code and then ran it. Only after that did I see the problem.
Thanks,
Diya
That's what I get for answering too fast... Here's a solution that does what you want:
$ awk '{ A[ $1 "#" $2 ]++; }   # count each (key, value) pair
END {
    for(K in A)
    {
        split(K, L, "#");
        STR=L[1]; VAL=L[2]
        if(C[STR] <= A[K])   # on a tie, the pair visited later wins
        {
            C[STR]=A[K];
            T[STR]=VAL
        }
    }
    for(K in T) print K, T[K];
}' < data
ABC 4
XYZ 3
DEF 2
$
There's an inconsistency in your example though. If we get a pattern like
A 1
A 1
A 2
A 2
which should be chosen, 1 or 2? Your example has ABC choosing the first max and DEF choosing the last max...
To choose the first instead of the last, change
if(C[STR] <= A[K])
to
if(C[STR] < A[K])
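One caveat on "first" versus "last": awk does not define the order in which for(K in A) visits keys, so which value wins a tie can differ between awk implementations. Below is a minimal sketch that settles ties by input order instead. It assumes two whitespace-separated columns, and the function name most_frequent_value is made up for illustration.

```shell
# Hypothetical helper: for each key in column 1, print the value in column 2
# that occurs most often. On a tie, the value that reached the top count
# first in the input wins (for grouped input, that is the earlier value).
most_frequent_value() {
    awk '
    NF < 2 { next }                              # skip blank or short lines
    {
        n = ++count[$1, $2]                      # running count for this pair
        if (!($1 in best)) order[++nkeys] = $1   # remember each key in input order
        if (n > best[$1]) {                      # strict ">" keeps the earlier winner
            best[$1] = n
            val[$1] = $2
        }
    }
    END {
        for (i = 1; i <= nkeys; i++) print order[i], val[order[i]]
    }' "$@"
}

# The tie from the post: 1 and 2 both occur twice, and 1 gets there first.
printf 'A 1\nA 1\nA 2\nA 2\n' | most_frequent_value
```

With that input the sketch prints "A 1". Changing the strict ">" to ">=" would make the value that reaches the count last win instead.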
Diya123
September 14, 2011, 6:03pm
8
Sorry if I was unclear about my question.
The code is outputting the number that repeats the most. What I want is the number of times it repeats.
For instance:
ABC 1
ABC 2
ABC 3
ABC 4
ABC 5
ABC 5
ABC 5
ABC 6
ABC 6
ABC 7
ABC 7
ABC 7
ABC 7
ABC 7
ABC 8
ABC 8
ABC 9
ABC 10
In this example, 7 repeats the maximum number of times. It repeats five times, so the output should be 5. The code you sent earlier outputs "7" instead of "5".
The other example which you mentioned
A 1
A 1
A 2
A 2
In this scenario, 2 is the maximum number of times a number (either 1 or 2) repeats, so the output is 2.
Thanks,
Okay.
awk '{ A[ $1 "#" $2 ]++; }   # count each (key, value) pair
END {
    for(K in A)
    {
        split(K, L, "#");
        STR=L[1]
        if(C[STR] <= A[K]) C[STR]=A[K];   # keep the highest count per key
    }
    for(K in C) print K, C[K];
}' < data
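The same count can also be tracked while reading, without splitting the keys again in END. This is only a sketch under the same assumptions (two whitespace-separated columns); the function name max_repeat_count is hypothetical, and the trailing sort is there only because awk's for-in order is unspecified.

```shell
# Hypothetical helper: for each key in column 1, print the highest number of
# times any single value in column 2 repeats.
max_repeat_count() {
    awk '
    NF < 2 { next }                       # skip blank or short lines
    {
        n = ++count[$1, $2]               # running count for this (key, value) pair
        if (n > max[$1]) max[$1] = n      # keep the largest count seen per key
    }
    END {
        for (k in max) print k, max[k]
    }' "$@" | sort                        # sort only to make the output order stable
}

# A cut-down version of the sample from the first post.
max_repeat_count <<'EOF'
ABC 1
ABC 2
DEF 1
DEF 2
DEF 2
DEF 2
DEF 2
DEF 3
XYZ 3
XYZ 3
XYZ 3
XYZ 4
XYZ 4
XYZ 4
EOF
```

On that input it prints "ABC 1", "DEF 4" and "XYZ 3", one per line, matching the output requested in the first post.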
Using a mix of commands:
$ awk '{print $1}' inputfile | sort -u | while read word; do sort inputfile| uniq -c | sort -r -n -k1 -k2 | grep "$word" | head -1; done
1 ABC 4
4 DEF 2
3 XYZ 4
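The loop above re-sorts the whole file once per key, and the grep matches the key anywhere on the line, so a key that is a substring of another key could pick up the wrong row. Here is a sketch of the same idea as a single pipeline, assuming keys contain no whitespace; the function name max_repeat_pipeline is made up for illustration.

```shell
# Hypothetical helper: count identical lines with uniq -c, then keep the
# largest count per key. uniq -c prints "count key value", so the key is $2.
max_repeat_pipeline() {
    sort "$@" | uniq -c |
        awk 'NF >= 3 && $1 > max[$2] { max[$2] = $1 }
             END { for (k in max) print k, max[k] }' |
        sort                              # stable output order
}

# "ABC 4" appears twice and "ABC 5" once.
printf 'ABC 4\nABC 4\nABC 5\n' | max_repeat_pipeline
```

This prints "ABC 2". Run as max_repeat_pipeline inputfile, it prints only the key and the count, without the value that the loop version also shows.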