Hello,
I am trying to count unique rows in my file based on four columns (2-5) and to output each row's frequency in a sixth column. My file is tab-delimited.
My input file looks like this:
Column1 Column2 Column3 Column4 Column5
1.1 100 100 a b
1.1 100 100 a c
1.2 200 205 a d
1.3 300 301 a y
1.3 300 301 a y
1.4 400 410 a b
1.5 500 510 a c
1.5 500 500 a d
1.5 500 500 a y
1.5 500 500 a y
And the desired output is:
Column1 Column2 Column3 Column4 Column5 Column6
1.1 100 100 a b 1
1.1 100 100 a c 1
1.2 200 205 a d 1
1.3 300 301 a y 2
1.4 400 410 a b 1
1.5 500 510 a c 1
1.5 500 500 a d 1
1.5 500 500 a y 2
So far I have tried this:
sort inputfile.csv | uniq -ci | awk '{print $0}' > freq.txt
This gives a frequency of 1 for every row and also sorts the output file. I want the output to keep the original row order. Any suggestions? Thank you.
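
For reference, one possible approach is a two-pass awk script: read the file once to count each combination of columns 2-5, then read it again and print the first occurrence of each combination with its count appended. This is only a sketch under a few assumptions: the file is tab-delimited with a single header line, the comparison is case-sensitive (unlike `uniq -i`), and the file name `sample.tsv` and the header name `Column6` are placeholders I made up for the example.

```shell
# Build a small tab-delimited sample (sample.tsv is a hypothetical file name).
# printf reuses the format string for each group of five arguments.
printf '%s\t%s\t%s\t%s\t%s\n' \
  Column1 Column2 Column3 Column4 Column5 \
  1.1 100 100 a b \
  1.1 100 100 a c \
  1.2 200 205 a d \
  1.3 300 301 a y \
  1.3 300 301 a y \
  1.4 400 410 a b \
  1.5 500 510 a c \
  1.5 500 500 a d \
  1.5 500 500 a y \
  1.5 500 500 a y > sample.tsv

# The same file is passed twice so awk reads it in two passes.
awk 'BEGIN { FS = OFS = "\t" }
     # Pass 1 (NR == FNR): count each key built from columns 2-5, skipping the header.
     NR == FNR { if (FNR > 1) count[$2, $3, $4, $5]++; next }
     # Pass 2: extend the header with a name for the new column.
     FNR == 1 { print $0, "Column6"; next }
     # Pass 2: print only the first row seen for each key, followed by its count.
     !seen[$2, $3, $4, $5]++ { print $0, count[$2, $3, $4, $5] }
' sample.tsv sample.tsv > freq.txt
```

Because nothing is sorted, the rows in `freq.txt` stay in their original order. Rows that share columns 2-5 but differ in column 1 would be merged under the column 1 value of the first occurrence, which may or may not be what is wanted.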