Count help - newbie

I have a flat file, fileA.txt, sorted by Phonenumber and Begin call:
Phonnumber|Begin|Endca|D1|D2|Diff
4159061234|10:00|10:01|a1|a2|60
4159061234|10:00|10:06|b1|b2|360
4159061234|10:05|10:06|c1|c2|60
4159061234|10:12|10:15|d1|d2|180
3045678934|10:25|10:28|x1|x2|180
3045678934|10:25|10:30|y1|y2|300
3045678934|10:28|10:31|z1|z2|180
....................

How do I write code in ksh so that it checks the phone number, counts the
records that share the same phone number and begin call time, and
gives outout.txt as the result:

Phonnumber|Begin|Endca|D1|D2|Call Times
3045678934|10:25|10:28|x1|x2|2
3045678934|10:28|10:31|z1|z2|1
4159061234|10:00|10:01|a1|a2|2
4159061234|10:05|10:06|c1|c2|1
4159061234|10:12|10:15|d1|d2|1

......................................

Thanks

===

Here is my code

but my answer is
3045678934,10:25,10:28,x1,x2,1
3045678934,10:28,10:31,z1,z2,1
4159061234,10:00,10:01,a1,a2,1
4159061234,10:05,10:06,c1,c2,1
4159061234,10:12,10:15,d1,d2,1

I tried but my count is incorrect - how do you count it? Please help. Thanks

Suggest you first read the rules about bumping your post - this will not get you a quicker response.

Quick and dirty solution if the file isn't too big:

IFS='|'
while read a b
do
  print "$a|$b|$(grep -c "^$a" fileA.txt)"
done < fileA.txt

Sets the field separator to the pipe character, then for each line prints the original plus a count of matches on the first field. Pro: simple, and no need for variables to hold counts or to sort the input. Con: a grep for each line instead of one per unique value. You pays your money and takes your choice :slight_smile: .
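If the count should instead be keyed on the first TWO fields (phone number plus begin time), which is what the question actually asks for, a variant might look like this sketch (file name and sample data assumed for illustration; like the loop above, it still prints every input line, duplicates included):

```shell
# Quick-and-dirty count keyed on phone number AND begin time.
# Sample data created inline so the sketch is self-contained.
cat > fileA.txt <<'EOF'
4159061234|10:00|10:01|a1|a2|60
4159061234|10:00|10:06|b1|b2|360
4159061234|10:05|10:06|c1|c2|60
EOF

# For each line, append the number of lines sharing the same
# phone|begin prefix ('|' is literal in a basic regular expression).
while IFS='|' read -r phone begin rest
do
  printf '%s|%s|%s|%s\n' "$phone" "$begin" "$rest" \
      "$(grep -c "^$phone|$begin|" fileA.txt)"
done < fileA.txt > grepcount.txt
cat grepcount.txt
```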

cheers

The code you gave me produced this answer:

4159061234|10:00|10:01|a1|a2|60|4
4159061234|10:00|10:06|b1|b2|360|4
4159061234|10:05|10:06|c1|c2|60|4
4159061234|10:12|10:15|d1|d2|180|4
3045678934|10:25|10:28|x1|x2|180|3
3045678934|10:25|10:30|y1|y2|300|3
3045678934|10:28|10:31|z1|z2|180|3

It is not what I expected to get.
Thanks,

nawk -f brit.awk fileA.txt

brit.awk:

BEGIN {
  FS=OFS="|"
}

{
  idx= $1 SUBSEP $2
  NF--; $1 = $1
  if (!(idx in arr))
     arr[ idx ] = $0
  cnt [idx]++
}
END {
  for ( i in arr )
    print arr[i], cnt[i]
}
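For reference, the script above can be run inline like this (awk in place of nawk; sample data created here for illustration, and output piped through sort because the order of "for (i in arr)" is unspecified in awk):

```shell
# Sample input from the thread.
cat > fileA.txt <<'EOF'
4159061234|10:00|10:01|a1|a2|60
4159061234|10:00|10:06|b1|b2|360
4159061234|10:05|10:06|c1|c2|60
4159061234|10:12|10:15|d1|d2|180
3045678934|10:25|10:28|x1|x2|180
3045678934|10:25|10:30|y1|y2|300
3045678934|10:28|10:31|z1|z2|180
EOF

awk 'BEGIN { FS = OFS = "|" }
{
  idx = $1 SUBSEP $2        # composite key: phone number + begin time
  NF--; $1 = $1             # drop trailing Diff field, rebuild $0 with OFS
  if (!(idx in arr)) arr[idx] = $0   # remember first record of each group
  cnt[idx]++                         # count records sharing the key
}
END { for (i in arr) print arr[i], cnt[i] }' fileA.txt | sort > outout.txt
cat outout.txt
```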

vgersh99, your code is working, but can you show me why mine did not work?
I also added "if diff < 600 then keep the record", because I don't want to see any diff of 600 or more in there. Would you please show me how to fix my code to make it work like yours? Thanks

sorry, I don't have time to understand your code.
I've changed mine to add the 'diff' logic:

BEGIN {
  FS=OFS="|"
}

{
  if ( int($NF) >= 600 ) next

  idx= $1 SUBSEP $2
  NF--; $1 = $1
  if (!(idx in arr))
     arr[ idx ] = $0
  cnt [idx]++
}
END {
  for ( i in arr )
    print arr[i], cnt[i]
}

Sorry Britney - didn't read the question properly :o :o :o
I can't improve on the awk solutions you've already had. Your requirement is a bit too subtle for the usual sort/uniq solutions to this kind of problem; a straight ksh script (looping through, comparing keys, counting duplicates, and printing on change of key) works fine but is pretty inelegant.
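For what it's worth, a sketch of that control-break approach (file names assumed from the thread): since the input is already sorted on phone number and begin time, count consecutive records with the same (phone, begin) key and print the held first record of each group when the key changes.

```shell
# Sample input from the thread.
cat > fileA.txt <<'EOF'
4159061234|10:00|10:01|a1|a2|60
4159061234|10:00|10:06|b1|b2|360
4159061234|10:05|10:06|c1|c2|60
4159061234|10:12|10:15|d1|d2|180
3045678934|10:25|10:28|x1|x2|180
3045678934|10:25|10:30|y1|y2|300
3045678934|10:28|10:31|z1|z2|180
EOF

prev= held= n=0
{
  while IFS='|' read -r phone begin rest
  do
    key="$phone|$begin"
    if [ "$key" = "$prev" ]; then
      n=$((n + 1))                       # same key: bump the count
    else
      [ -n "$prev" ] && printf '%s|%d\n' "$held" "$n"   # key changed: flush
      prev=$key
      held="$key|${rest%|*}"             # hold first record, drop trailing Diff
      n=1
    fi
  done
  [ -n "$prev" ] && printf '%s|%d\n' "$held" "$n"       # flush last group
} < fileA.txt > outout.txt
cat outout.txt
```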

cheers

Sorry, newbie writing code :slight_smile:
vgersh99, your code works great, but it shows the array as

for ( i in arr )
print arr[i], cnt[i]

That means it prints all the fields, then a comma, then the count.
What if I need to rearrange it, like adding the count as the 3rd field and removing diff=$6 from your code - how do I do it? I am sure you would not save to another file and run awk again to rearrange it.
Thanks,

adding the 'count' in the THIRD field:

BEGIN {
  FS=OFS="|"
  FLDinsert="3"
}

{
  idx= $1 SUBSEP $2
  NF--; $1 = $1
  if (!(idx in arr))
     arr[ idx ] = $0
  cnt [idx]++
}
END {
  for ( i in arr ) {
    n=split(arr[i], tmpA, OFS)
    tmpA[FLDinsert] = cnt[i] OFS tmpA[FLDinsert]
    for(iter=1; iter <= n; iter++)
        printf("%s%s", tmpA[iter], (iter < n) ? OFS : "\n")
  }
}
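A quick run of this field-insert version, with sample data created inline for illustration (output sorted because the order of "for (i in arr)" is unspecified in awk):

```shell
# Sample input (first three records from the thread).
cat > fileA.txt <<'EOF'
4159061234|10:00|10:01|a1|a2|60
4159061234|10:00|10:06|b1|b2|360
4159061234|10:05|10:06|c1|c2|60
EOF

awk 'BEGIN { FS = OFS = "|"; FLDinsert = 3 }
{
  idx = $1 SUBSEP $2                 # composite key: phone + begin time
  NF--; $1 = $1                      # drop trailing Diff field
  if (!(idx in arr)) arr[idx] = $0   # remember first record of each group
  cnt[idx]++
}
END {
  for (i in arr) {
    n = split(arr[i], tmpA, OFS)
    tmpA[FLDinsert] = cnt[i] OFS tmpA[FLDinsert]   # insert count as field 3
    for (iter = 1; iter <= n; iter++)
      printf("%s%s", tmpA[iter], (iter < n) ? OFS : "\n")
  }
}' fileA.txt | sort > fld3.txt
cat fld3.txt
```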

wow .. thanks a lot, I need to study more and read more to understand it.
Thanks again