How do I find duplicate values in a file containing columnar data, using a command or script?
Please explain more clearly.
You could use the uniq command to avoid duplicates...
Checking all columns against each other:
awk '{ for (i = 1; i <= NF; i++) arr[$i]++ }
     END { for (v in arr) if (arr[v] > 1) print v }' file
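A minimal sketch of the one-liner above in action, using a hypothetical two-line sample file (the file name and contents are made up for illustration). Every field in every line is counted in arr; the END block then prints only the values seen more than once:

```shell
# Hypothetical sample: the value 'b' occurs in two different lines.
printf 'a b c\nd b e\n' > /tmp/dup_demo.txt

# Count every field; after reading the file, print values with count > 1.
awk '{ for (i = 1; i <= NF; i++) arr[$i]++ }
     END { for (v in arr) if (arr[v] > 1) print v }' /tmp/dup_demo.txt
# prints: b
```

Note that this treats every column of every line as one pool of values, so a value repeated across different columns also counts as a duplicate.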
Finding duplicates in a given column -- here, column 4:
awk 'arr[$4]++' file | sort -u
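To see why this works: `arr[$4]++` is used as a pattern with no action, so awk prints the line whenever the expression is true, i.e. whenever the column-4 value has already been counted at least once. The first occurrence evaluates to 0 (false) and is skipped; second and later occurrences print. A small illustration with an invented sample file:

```shell
# Hypothetical sample: 'foo' appears in column 4 of lines 1 and 3.
printf '1 x y foo\n2 x y bar\n3 x y foo\n' > /tmp/col4_demo.txt

# Print each line whose 4th field was already seen on an earlier line.
awk 'arr[$4]++' /tmp/col4_demo.txt
# prints: 3 x y foo
```

The `sort -u` in the original pipeline just collapses repeated output lines when a value occurs three or more times.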
Thank you, Jim, for your reply.
A small doubt:
awk '{ for (i = 1; i <= NF; i++) arr[$i]++ }
     END { for (v in arr) if (arr[v] > 1) print v }' file
I am a newbie at using awk... I want to know what NF is in the above code.
It should be the number of columns in the file, if I am not wrong. :rolleyes:
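Close: NF is a built-in awk variable holding the number of fields (columns) in the *current record* (line), so it can differ from line to line. A quick way to see this, using an ad-hoc two-line input:

```shell
# NF is re-evaluated per line: the first line has 3 fields, the second has 2.
printf 'a b c\nd e\n' | awk '{ print NF }'
# prints:
# 3
# 2
```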