How to find duplicate lines in Linux?

ken6503 · July 30, 2013, 4:56pm

Hi, Gurus,

I need to find the duplicate records in a Unix file.

What command should I use for this?

Thanks in advance

rdrtx1 · July 30, 2013, 4:58pm

try:

awk 'a[$0]++' infile
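
A quick illustration on a small made-up file (the name sample.txt and its contents are assumptions for this example, not taken from the question). Note that the one-liner prints every repeat occurrence, so a line that appears three times is printed twice. If each duplicated line should be listed only once, comparing the counter with 1 is one possible variant:

```shell
# Hypothetical sample data, made up for illustration
printf 'apple\nbanana\napple\ncherry\nbanana\napple\n' > sample.txt

# Print every second and later occurrence of a line
awk 'a[$0]++' sample.txt

# Variant: print each duplicated line once, at its second occurrence
awk 'a[$0]++ == 1' sample.txt
```

The first command prints apple, banana, apple; the second prints apple, banana.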

ken6503 · July 30, 2013, 5:00pm

Thanks for your quick reply.

It works perfectly.

Thanks again

gacanepa · July 31, 2013, 1:01am

It works like a charm for me as well. Would you mind very much explaining what 'a[$0]++' means?

Yoda · July 31, 2013, 1:24am

Here a is an associative array indexed by the whole record ($0), and the value stored for that record is post-incremented.

For the first occurrence of a record, the expression evaluates to zero, because a post-increment returns the old value before adding one. For every later occurrence it evaluates to a non-zero value.

A non-zero value is treated as true, and since the pattern has no action attached, awk runs its default action, which is to print the record.
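
Spelled out in long form, the one-liner is roughly equivalent to the sketch below (the file name longform.txt, its contents, and the variable seen_before are made up for illustration):

```shell
# Hypothetical input, made up for this example
printf 'red\nblue\nred\n' > longform.txt

# Long form of: awk 'a[$0]++' longform.txt
awk '{
    seen_before = a[$0]   # count so far (empty/zero on first occurrence)
    a[$0]++               # record this occurrence
    if (seen_before)      # non-zero means the line appeared earlier
        print $0
}' longform.txt
```

On this input it prints only the second red.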

You can run the program below to see what is going on:

awk '{ print $0, a[$0]++ }' file
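
For instance, on a small made-up file (demo.txt and its contents are assumptions for illustration), each line is printed next to the counter value that the expression returned for it:

```shell
# Hypothetical input, made up for this example
printf 'x\ny\nx\nx\n' > demo.txt

# Show each record together with the value of a[$0]++ at that point
awk '{ print $0, a[$0]++ }' demo.txt
# x 0
# y 0
# x 1
# x 2
```

Only the records with a non-zero value in the second column (the last two x lines) would be printed by awk 'a[$0]++'.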