Hi,
Input
7488 7389 chr1.fa chr1.fa
3546 9887 chr5.fa chr9.fa
7387 7898 chrX.fa chr3.fa
7488 7389 chr21.fa chr3.fa
7488 7389 chr1.fa chr1.fa
3546 9887 chr9.fa chr5.fa
7898 7387 chrX.fa chr3.fa
Desired Output
7488 7389 chr1.fa chr1.fa 2
3546 9887 chr5.fa chr9.fa 2
7387 7898 chrX.fa chr3.fa 2
7488 7389 chr21.fa chr3.fa 1
7488 7389 chr1.fa chr1.fa 2
3546 9887 chr9.fa chr5.fa 2
7898 7387 chrX.fa chr3.fa 2
I want to count how many times each line occurs and print that count in the fifth column.
Lines should still be counted as the same when the first and second columns are interchanged (third and seventh records) or when the third and fourth columns are interchanged (second and sixth records).
So far I have tried this, but it gives the undesired output below:
awk -F, 'NR==FNR{a[$0]++;next}{print $0 "\t" a[$0]}' input input
7488 7389 chr1.fa chr1.fa 2
3546 9887 chr5.fa chr9.fa 1
7387 7898 chrX.fa chr3.fa 1
7488 7389 chr21.fa chr3.fa 1
7488 7389 chr1.fa chr1.fa 2
3546 9887 chr9.fa chr5.fa 1
7898 7387 chrX.fa chr3.fa 1
---------- Post updated at 04:00 PM ---------- Previous update was at 03:34 PM ----------
Hi Corona,
I mean each line's occurrence.
For example:
hello world
world hello
should be considered the same while reading the input. Then the output will be
hello world 2
world hello 2
because we are treating "hello world" as being present two times in the file.
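
One possible way to get this behaviour is to count on a normalized key instead of on $0, which is why the attempt above treats swapped lines as different. The sketch below is only an illustration under some assumptions: fields are whitespace-separated (so no -F, is used), columns 1/2 form one pair and columns 3/4 form another, and each pair may be swapped independently. The names count_unordered, key and count are made up for this example.

```shell
# count_unordered FILE
# Print every line of FILE followed by the number of lines that are equal to it
# once columns 1/2 and columns 3/4 have each been put into a fixed order.
count_unordered() {
    awk '
    # Build an order-independent key: within each pair the smaller field goes first.
    # a and b are local variables (extra parameters, the usual awk idiom).
    function key(   a, b) {
        a = ($1 <= $2) ? $1 SUBSEP $2 : $2 SUBSEP $1
        b = ($3 <= $4) ? $3 SUBSEP $4 : $4 SUBSEP $3
        return a SUBSEP b
    }
    NR == FNR { count[key()]++; next }   # first pass: tally the normalized keys
    { print $0, count[key()] }           # second pass: original line plus its tally
    ' "$1" "$1"
}
```

It would be called as count_unordered input, reading the file twice in the same way as the original attempt. With two-column lines such as the hello/world example, the third and fourth fields are simply empty, so the same key still works. If any permutation of all four columns should count as the same line, the key would need to sort all fields instead; that is not what this sketch does.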