How can I remove partial duplicates and manipulate text?

tara123 · December 30, 2017, 6:00pm

Hello,

How can I remove partial duplicates and manipulate text in bash using either awk, grep or sed? Thanks.

Input:

ted,"foo,bar,zoo"
john-son,"foot,ben,zoo"
bob,"bar,foot"

Expected Output:

foo,ted
bar,ted
zoo,ted
foot,john-son
ben,john-son

Scrutinizer · December 30, 2017, 6:21pm

What have you tried so far?

tara123 · December 30, 2017, 7:07pm

I tried this Perl one-liner, but it did not work:

perl -lpe 's/\s\K\S+/join ",", grep {!$seen{$_}++} split ",", $&/e'

Don_Cragun · December 30, 2017, 8:45pm

It is interesting that you want code written in awk, grep, or sed but show us non-working perl code.

You might be able to use something like:

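awk -F, -v OFS=, '
{	gsub(/"/, "")			# strip the double quotes; $0 is re-split on commas
	for(i = 2; i <= NF; i++)	# field 1 is the name; the rest are values
		if(!($i in seen)) {	# first time this value appears
			seen[$i]	# referencing the element creates it
			print $i, $1
		}
}' file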

which, if file contains your sample input, produces the output you said you wanted.

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
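
For anyone who wants to try the approach without creating a file first, here is a minimal sketch of the same idea wrapped in a shell function. The function name dedupe_values is made up for illustration, and the sample lines are copied from the input in the question. It uses the common !seen[$i]++ idiom in place of the explicit "in" test; the two should behave the same here, since either way a value is printed only the first time it appears.

```shell
# dedupe_values is a hypothetical wrapper name, not part of any tool.
# With no file arguments, awk reads standard input.
dedupe_values() {
    awk -F, -v OFS=, '
    {
        gsub(/"/, "")                # strip quotes; the record is re-split on commas
        for (i = 2; i <= NF; i++)    # field 1 is the name, the rest are values
            if (!seen[$i]++)         # true only the first time a value is seen
                print $i, $1
    }' "$@"
}

# Sample input taken from the question.
printf '%s\n' \
    'ted,"foo,bar,zoo"' \
    'john-son,"foot,ben,zoo"' \
    'bob,"bar,foot"' | dedupe_values
```

With the sample input this should print the five lines listed under Expected Output. The line for bob produces nothing because bar and foot were already printed for ted and john-son.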

tara123 · December 30, 2017, 11:53pm

Thank you very much. It worked.