Find duplicates in file with line numbers

Hello All,

This is a noob question. I tried searching for the answer, but the answers I found did not help me.

I have a file that can have duplicates.

100
200
300
400
100
150

the number 100 appears twice. I want to find the duplicate along with its line number.

expected output

100 5

please help

Regards
David

Welcome to the forum.

Are you sure none of the links given at bottom left (under "More UNIX and Linux Forum Topics You Might Find Helpful") can help? e.g. this one?

I did look at those links.
I tried

awk -F: 'x[$1]++ { print $1 " is duplicated"}' fileName

The output I get is

is duplicated

I want the duplicate value along with the line number

That seems to indicate your file has non-*nix DOS line terminators (&lt;carriage return&gt;, &lt;CR&gt;, \r, ^M, 0x0D). Remove them before continuing, e.g. with dos2unix.
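If dos2unix isn't available on your system, tr can strip the carriage returns just as well (this writes a cleaned copy rather than editing in place; fileName is your input file):

```shell
# Delete every carriage return byte; leaves the newlines intact
tr -d '\r' < fileName > fileName.clean
```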

There's no colon in the file - no reason to set it as FS. Try

awk 'a[$1]++ {print $0, NR}' file
100 5
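For what it's worth, the reason the first 100 isn't printed is that a[$1]++ tests the counter *before* incrementing it, so the pattern is false (0) on the first occurrence and true from the second occurrence on. A quick demo with an extra duplicate:

```shell
# The rule fires on the 2nd, 3rd, ... occurrence of each value, never the 1st
printf '100\n200\n100\n100\n' | awk 'a[$1]++ {print $0, NR}'
```

This prints "100 3" and "100 4" but nothing for line 1.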

If you don't want to use dos2unix before processing your input files, you can also build that code into your awk script:

awk '
{	sub(/\r$/, "")
}
a[$1]++ {
	print $0, NR
}' file

Note that the above code prints complete lines when the 1st field on the given lines is duplicated. If you want to only print the 1st field when the 1st field is duplicated, use:

awk '
{	sub(/\r$/, "")
}
a[$1]++ {
	print $1, NR
}' file

If you want to print complete lines when complete lines are duplicated, use:

awk '
{	sub(/\r$/, "")
}
a[$0]++ {
	print $0, NR
}' file
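And if you also want to see where the first copy of each duplicate sits, one way (a sketch, assuming that extra detail is wanted) is to remember the line number of the first occurrence:

```shell
awk '
{	sub(/\r$/, "")			# strip a trailing CR, as above
}
$0 in first {				# seen before: report both line numbers
	print $0, NR, "(first seen on line " first[$0] ")"
	next
}
{	first[$0] = NR			# first occurrence: just remember where
}' file
```

On the sample input this prints: 100 5 (first seen on line 1)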