I have searched the internet for extracting duplicate rows. All I have found is how to extract unique rows or eliminate duplicates.
How do I extract the duplicate rows from a flat file in Unix?
I'm using the Korn shell on HP-UX.
For example:
FlatFile.txt
123:456:678
123:456:678
123:456:876
345:457:987
345:457:987
345:123:745
The output should be
OutPutFile.txt
123:456:678
345:457:987
I appreciate your help in advance. Thanks
awk '
{ s[$0]++ }
END {
  for (i in s) {
    if (s[i] > 1) {
      print i
    }
  }
}' file
Regards
Or, of course, if sorting is not a problem:
sort filename|uniq -d
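To illustrate (a minimal sketch using the sample data from the question; the file names are placeholders): `uniq -d` prints one copy of each line that appears more than once in its input, which is why the input must be sorted first so duplicates are adjacent.

```shell
# Recreate the sample input from the question.
printf '%s\n' \
  '123:456:678' '123:456:678' '123:456:876' \
  '345:457:987' '345:457:987' '345:123:745' > FlatFile.txt

# Sort so duplicate lines are adjacent, then keep only the duplicated ones.
sort FlatFile.txt | uniq -d > OutPutFile.txt
cat OutPutFile.txt
# 123:456:678
# 345:457:987
```

Note that, unlike the awk solution, this changes the order of the lines: the output comes out sorted rather than in original file order.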
Great, both scripts worked.
Thanks Franklin52 and radoulov.
Divya.M
Does this not work when there are spaces in the data?
For example:
1231080 5000104891 21592002082811037
1231080 5000104892 27492002082821037
1231080 5000104891 21592002082811037
1231080 5000104892 27492002082821037
934262 5000021182 27502002040110518
934262 5000021181 21552002040120518
934262 5000021182 27502002040110518
934262 5000021181 21552002040120518
What does not work when there are spaces? $0 in awk refers to the entire line, spaces and all.
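To demonstrate that spaces are not a problem (a quick sketch using a subset of the space-separated sample above): the awk array is keyed on the whole line, so the field separator never comes into play.

```shell
# Space-separated sample rows; the first and third lines are duplicates.
printf '%s\n' \
  '1231080 5000104891 21592002082811037' \
  '1231080 5000104892 27492002082821037' \
  '1231080 5000104891 21592002082811037' > spaced.txt

# Count occurrences of each whole line ($0), then print the repeated ones.
awk '{ s[$0]++ } END { for (i in s) if (s[i] > 1) print i }' spaced.txt
# 1231080 5000104891 21592002082811037
```

Only the duplicated line is printed, spaces included, because awk only splits $0 into fields ($1, $2, ...) when you ask for them.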