Get a non-duplicate list file

trynew · June 14, 2002, 10:54am

Dear sir

     I have a file in the following format, with duplicate lines:

       AAA 
       AAA 
       AAA 
       AAA 
       AAA 
       BBB
       BBB
       BBB
       BBB
       CCC
       CCC
       CCC
       CCC

   Can you help me shrink the file so that each record appears only once? Thanks in advance!!!

usfrog · June 14, 2002, 11:24am

sort will do (look at the man page for more options):

cat myFile | sort -u > myFileNoDup # create 2nd file
mv myFileNoDup myFile # replace first file
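
For anyone who wants to try this safely first, here is a minimal sketch of the same two steps run in a scratch directory. The sample data is modeled on the question, and the `tmpdir` variable and the `&&` guard (so the original is only replaced if sort succeeds) are my additions, not part of the suggestion above.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)

# Sample input modeled on the question.
printf '%s\n' AAA AAA AAA BBB BBB CCC CCC > "$tmpdir/myFile"

# Sort and drop duplicate lines into a second file, then replace the
# original only if sort succeeded.
sort -u "$tmpdir/myFile" > "$tmpdir/myFileNoDup" &&
    mv "$tmpdir/myFileNoDup" "$tmpdir/myFile"

# Show the result: one line each for AAA, BBB and CCC.
cat "$tmpdir/myFile"
```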

trynew · June 15, 2002, 12:23pm

Thank you very much!!!

peter.herlihy · June 20, 2002, 1:47am

You can just use the syntax

sort -u myfile -o myfile

cat is redundant in the command suggested above, and sort can write its output to the same file it reads from (unlike sed etc.).
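
A small sketch of the in-place form, using made-up sample data in a scratch directory. I have put `-o` before the input file here, which is the option-first ordering that POSIX-style option parsing expects; the behaviour is otherwise the same as the command above.

```shell
# Scratch file with unsorted, duplicated lines (sample data is hypothetical).
tmpdir=$(mktemp -d)
printf '%s\n' CCC AAA BBB AAA CCC > "$tmpdir/myfile"

# Sort, drop duplicates, and write the result back over the input file.
sort -u -o "$tmpdir/myfile" "$tmpdir/myfile"

# The file now holds AAA, BBB, CCC, one per line.
in_place_result=$(cat "$tmpdir/myfile")
printf '%s\n' "$in_place_result"
```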

Do you wish to retain the order in the file and just remove duplicates? Or do you want to sort and remove duplicates?

sort -u -m myfile -o myfile

Will remove only duplicates that sit directly next to each other. The -m (merge) option assumes the list is already sorted, so sort only compares neighbouring rows (on input that is not really sorted, the result may differ between sort implementations). So a list of

B
B
A
A
B
B
A
A

Will come back as

B
A
B
A
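
If the goal is to keep the original order and still drop every repeat, not just neighbouring ones, one common alternative is awk. Nobody in this thread suggested awk, so treat this as a sketch of a different technique; the `seen` array name and the sample file are only for illustration.

```shell
# Same sample list as above, written to a scratch file.
tmpdir=$(mktemp -d)
printf '%s\n' B B A A B B A A > "$tmpdir/myfile"

# seen[$0]++ evaluates to 0 the first time a line is met, so !seen[$0]++
# is true only for the first occurrence, and awk prints just that one.
deduped=$(awk '!seen[$0]++' "$tmpdir/myfile")

# Prints B then A: first-occurrence order is kept, all repeats are gone.
printf '%s\n' "$deduped"
```

Note that awk writes to standard output, so you would still need to redirect to a second file and mv it over the original.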

------------

yeheyaansari · June 20, 2002, 9:57am

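You can use the uniq command for the same purpose. It drops repeated lines that are adjacent, so it fits a file like yours where the duplicates are grouped together: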

uniq filename

Thanks
Yeheya
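
To illustrate how uniq treats adjacency, here is a short sketch on a made-up list where the duplicates are not all grouped together. The variable names are mine. uniq prints to standard output and leaves the file itself unchanged.

```shell
# Sample file where the same values appear in separate runs.
tmpdir=$(mktemp -d)
printf '%s\n' B B A A B B A A > "$tmpdir/filename"

# uniq alone collapses each run of identical neighbours: B A B A.
adjacent_only=$(uniq "$tmpdir/filename")

# Sorting first groups equal lines together, so uniq leaves: A B.
fully_unique=$(sort "$tmpdir/filename" | uniq)

printf 'uniq alone:\n%s\n' "$adjacent_only"
printf 'sort | uniq:\n%s\n' "$fully_unique"
```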

Nisha · June 25, 2002, 12:06am

Hi,

The -u option in sort also stands for unique. It suppresses all but one of each set of lines with equal keys. There is also a -c option, which checks whether the single input file is already sorted.

-Nisha
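
A quick sketch of how -c can be used from a script, on made-up sample files. The `check` helper is hypothetical and just turns sort's exit status into yes/no. Combining -c with -u additionally checks that no two lines have equal keys.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' AAA BBB CCC > "$tmpdir/sorted_unique"
printf '%s\n' AAA AAA BBB > "$tmpdir/sorted_dups"
printf '%s\n' BBB AAA CCC > "$tmpdir/unsorted"

# Hypothetical helper: run the given sort command quietly and report
# "yes" if it exits with status 0, "no" otherwise.
check() {
    if "$@" 2>/dev/null; then echo yes; else echo no; fi
}

is_sorted=$(check sort -c "$tmpdir/sorted_unique")     # sorted
dups_sorted=$(check sort -c "$tmpdir/sorted_dups")     # sorted, duplicates allowed
dups_unique=$(check sort -c -u "$tmpdir/sorted_dups")  # fails: duplicate lines
unsorted_ok=$(check sort -c "$tmpdir/unsorted")        # fails: out of order

printf '%s %s %s %s\n' "$is_sorted" "$dups_sorted" "$dups_unique" "$unsorted_ok"
```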