I have the following records from multiple files.
415 A G
415 A G
415 A T
415 A .
415 A .
421 G A
421 G A,C
421 G A
421 G A
421 G A,C
421 G .
427 A C
427 A C
427 A .
427 A .
1) I want to remove the rows that have "." in the third column
2) count the remaining rows and merge them based on the first column
I want output like this:
3 2,1 415 A G/T
5 3,2 421 G A/A,C
2 427 A C
Taking the first line, "3 2,1 415 A G/T", as an example:
3 - how many times 415 occurs in total (after removing the "." rows)
2,1 - counting unique rows gives "415 A G" twice and "415 A T" once. I want to merge these into a single line and get "3 2,1 415 A G/T".
I used this command to count the unique rows, but I am unable to merge them and combine the columns:
cat file | awk '$3 ~/A|T|G|C/{print $0}'| sort | uniq -c
With the above command I get the following output:
2 415 A G
1 415 A T
3 421 G A
2 421 G A,C
2 427 A C
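For reference, one possible way to go from the raw records to the merged format is to do both the counting and the merging in a single awk pass instead of sort | uniq -c. This is only a sketch under some assumptions: the function name merge_counts is made up for illustration, the input has three whitespace-separated columns, patterns are listed in the order they first appear (not sorted by count), and a position with only one pattern prints just the total, as in the "2 427 A C" line of the desired output.

```shell
# Hypothetical helper: merge_counts [file ...]  (reads stdin if no file is given)
merge_counts() {
    awk '
    # Skip short lines and rows whose third column is "."
    NF >= 3 && $3 != "." {
        key = $1 SUBSEP $2                      # position + second column
        pat = key SUBSEP $3                     # full pattern, e.g. 415 A G
        if (!(key in total)) order[++n] = key   # remember first-seen order of positions
        if (!(pat in cnt))   alts[key, ++nalt[key]] = $3   # first-seen order of patterns
        total[key]++
        cnt[pat]++
    }
    END {
        for (i = 1; i <= n; i++) {
            key = order[i]
            split(key, f, SUBSEP)
            counts = ""; alleles = ""
            for (j = 1; j <= nalt[key]; j++) {
                a = alts[key, j]
                counts  = counts  (j > 1 ? "," : "") cnt[key, a]
                alleles = alleles (j > 1 ? "/" : "") a
            }
            # Assumption: the per-pattern breakdown is only shown when there is more than one pattern
            if (nalt[key] > 1)
                print total[key], counts, f[1], f[2], alleles
            else
                print total[key], f[1], f[2], alleles
        }
    }' "$@"
}

# Usage (file name is a placeholder):
#   merge_counts file
#   cat file1 file2 | merge_counts
```

On the sample records above this should print the three lines "3 2,1 415 A G/T", "5 3,2 421 G A/A,C" and "2 427 A C". Keeping the first-seen order avoids a separate sort step, but if the breakdown has to be ordered by count, the patterns would need to be sorted before they are joined.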