Identify duplicate values in the first column of a csv file

Input

1,ABCD,no 
2,system,yes 
3,ABCD,yes 
4,XYZ,no 
5,XYZ,yes
6,pc,no

Code used to find duplicates with regard to the 2nd column:

awk 'NR == 1 {p=$2; next} p == $2 { print "Line" NR "$2 is duplicated"} {p=$2}' FS="," ./input.csv
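
For what it's worth, on the sample input above this appears to print only:

Line5$2 is duplicated

since it compares $2 against the immediately preceding line only, so the ABCD repeat on line 3 is never flagged, and the literal string "$2" ends up in the message.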

Now, is there a clever way to de-duplicate (i.e. remove the duplicate lines) based on the criterion in this one-liner, either within it or with additional logic wrapped around it?

Depending on what your desired output should be:

awk -F, '!a[$2]++{next} {print "Line " NR " " $2 " is duplicated"}' myFile
OR
awk -F, '!a[$2]++' myFile 
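
If I'm reading those correctly, on the sample input the first variant should print something like:

Line 3 ABCD is duplicated
Line 5 XYZ is duplicated

and the second variant keeps only the first line seen for each $2 value:

1,ABCD,no
2,system,yes
4,XYZ,no
6,pc,no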

Your thread title says you are trying to find duplicates in the 1st field; your code prints lines in which the 2nd field on the line has been seen before. Note that it prints duplicates; it does not remove duplicates.

And, since there are no lines in your sample input where the 1st field is duplicated on any other line, I have no idea what you are trying to do. What additional logic are you talking about? What output are you hoping to produce from this sample input?

Does this remove the line when $2 is found to be a duplicate?


Thank you for the suggestion.
I tried to change the title but that doesn't seem to be an option once the submission is made...


I'd say it's for YOU to find out...