remove duplicate words in a line

sam_2921 · March 18, 2009, 6:03am

Hi,

Please help!
I have a file with duplicate words on some lines, and I want to remove the duplicates.
The order of the words in the output file doesn't matter.

INPUT_FILE
pink_kite red_pen ball pink_kite ball
yellow_flower white no white no
cloud nine_pen pink cloud pink nine_pen
brown_ball white
red_bear green red_bear
white no

OUTPUT_FILE
pink_kite red_pen ball
yellow_flower white no
cloud nine_pen pink
brown_ball white
red_bear green
white no

Your help is highly appreciated.
Thanks in advance

rubin · March 18, 2009, 11:00am

awk '{ while(++i<=NF) printf (!a[$i]++) ? $i FS : ""; i=split("",a); print "" }' file
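
A sketch of how the one-liner works, laid out over several lines. The array (renamed `seen` here) counts how often each word has appeared on the current line, a word is printed only while its count is still zero, and `split("", seen)` empties the array before the next line. The file names are made up for the demo, and `printf "%s%s"` is swapped in for the original bare `printf` so that a word containing `%` is not treated as a format string.

```shell
# Two sample lines from the first post, written to a hypothetical demo file.
printf '%s\n' 'pink_kite red_pen ball pink_kite ball' 'white no' > dup_demo.txt

awk '{
    for (i = 1; i <= NF; i++) {
        # seen[$i]++ returns the old count, so !0 is true only on first sight
        if (!seen[$i]++)
            printf "%s%s", $i, FS
    }
    print ""           # finish the output line
    split("", seen)    # portable way to clear the array for the next line
}' dup_demo.txt > dup_demo.out

cat dup_demo.out
```

Like the one-liner, this leaves a trailing separator after the last word of each line.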

ShawnMilo · March 18, 2009, 11:10am

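#!/usr/bin/env python

with open('temp.txt') as infile:
    for line in infile:
        seen = []
        for word in line.split():
            # keep only the first occurrence of each word on this line
            if word not in seen:
                seen.append(word)
        # join() avoids the Python 2-only "print word," statement
        print(' '.join(seen))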

Output:

# cat temp.txt
pink_kite red_pen ball pink_kite ball
yellow_flower white no white no
cloud nine_pen pink cloud pink nine_pen
brown_ball white
red_bear green red_bear
white no

# python temp.py
pink_kite red_pen ball
yellow_flower white no
cloud nine_pen pink
brown_ball white
red_bear green
white no

summer_cherry · March 19, 2009, 12:04am

Hi, this should be easy in Perl.

But you may try the awk below:

nawk '
function re_dup(arr,n,   i,j)
{
	for(i=1;i<n;i++){
		for(j=i+1;j<=n;j++){
			if (arr[i]==arr[j])
				arr[j]=""
		}
	}
}
{
	num=split($0,arr," ")
	re_dup(arr,num)
	for(i=1;i<=num;i++){
		if(arr[i]!="")
			printf("%s ",arr[i])
	}
	printf "\n"
}' filename

sam_2921 · March 19, 2009, 9:13am

Thanks summer_cherry, ShawnMilo and Rubin.

The nawk and Python code both run perfectly,

but Rubin, the awk one-liner gives the error "a[: Event not found." Can you please explain why this error occurs?

Thanks again.
Sam

rubin · March 19, 2009, 5:52pm

I cannot reproduce that error. On Solaris, make sure you use nawk or /usr/xpg4/bin/awk. The code runs on both Solaris and Linux without any error messages.
HTH.
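
The message "a[: Event not found." looks like csh/tcsh history expansion rather than an awk error. In those shells `!` starts a history reference even inside single quotes, so `!a[$i]++` is intercepted before awk ever runs. This is an inference from the wording of the message, since the thread does not say which shell was used. A minimal sketch that sidesteps the issue by not using `!` at all, with made-up demo file names:

```shell
# Two sample lines from the first post, written to a hypothetical demo file.
printf '%s\n' 'red_bear green red_bear' 'white no' > noevent_demo.txt

# Same logic as the one-liner, but the "first time seen" test is written
# as "== 0" so the command contains no "!" for the shell to expand.
awk '{ for (i = 1; i <= NF; i++) if (seen[$i]++ == 0) printf "%s ", $i; print ""; split("", seen) }' noevent_demo.txt > noevent_demo.out

cat noevent_demo.out
```

Running the original command under sh or ksh instead of csh should also avoid the history expansion.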