Command to remove duplicate lines with perl, sed, awk

Input:

hello hello
hello hello
monkey
donkey
hello hello
drink
dance
drink

Output should be:

hello hello
monkey
donkey
drink
dance

Hi.

I'm sure we must have gone over this :)

awk '!_[$0]++' file
$ ruby -00 -ne 'puts $_.split("\n").uniq' file
hello hello
monkey
donkey
drink
dance
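
For anyone wondering why the awk one-liner works, here is a minimal sketch of the same idea with a more descriptive array name. The function name dedupe_keep_first and the array name seen are made up for illustration; the one-liner above uses _ as the array name, which behaves the same way.

```shell
# dedupe_keep_first is a hypothetical helper name, not something from the thread.
dedupe_keep_first() {
    # seen[$0] counts how often the current line has appeared so far.
    # The post-increment yields 0 on the first sighting, so !0 is true and
    # awk runs its default action (print). Later sightings yield a non-zero
    # count, the condition is false, and the line is skipped.
    awk '!seen[$0]++' "$@"
}

# Feed the sample input from the question through the helper.
printf '%s\n' 'hello hello' 'hello hello' monkey donkey \
    'hello hello' drink dance drink | dedupe_keep_first
```

This keeps the first occurrence of each line in its original position, which is the order asked for in the question. Note that the array holds every distinct line in memory.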

Does anybody know a solution with sed or perl?

perl -ne 'print unless $a{$_}++' file

Nice.

There is no simple general solution for sed unless the file is sorted, so that duplicate lines are adjacent. If sorted, the following deletes the duplicate lines:

sed '$!N; /^\(.*\)\n\1$/!P; D'

Hello Murphy,

Could you recommend a page that explains how the N, P and D commands work in the sed command above?

Thanks in advance.

man page
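
As a rough illustration of what N, P and D do in that one-liner, here is the same sed script spread over several lines with comments. The function name drop_adjacent_dups is made up for this sketch, and the explanations describe the commands as documented for GNU sed.

```shell
# drop_adjacent_dups is a hypothetical helper name used for illustration.
drop_adjacent_dups() {
    sed '
        # N: append the next input line to the pattern space, separated by
        # a newline. The $! prefix skips this on the last line, where no
        # next line exists.
        $!N
        # P: print the pattern space up to the first newline, but only
        # when the two lines held in the pattern space are NOT identical.
        /^\(.*\)\n\1$/!P
        # D: delete up to the first newline and restart the script with
        # the remainder, without reading a new line first.
        D
    ' "$@"
}

# Only adjacent repeats are removed, so sort first if order does not matter.
printf '%s\n' 'hello hello' 'hello hello' monkey donkey \
    'hello hello' drink dance drink | sort | drop_adjacent_dups
```

Without the sort, the third "hello hello" and the second "drink" would survive, because each is compared only with the line right after it.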

Actually, sed has no ready-made function for this kind of problem.

But for me sed is still more powerful than the others.
I can try to write something specific in sed for sed lovers ;)

# cat file
hello hello
hello hello
monkey
donkey
hello hello
drink
vay
dance
drink
# ./fsed.sedv1.uniq file
hello hello
monkey
donkey
drink
vay
dance
# ## fsed-Sedv1-Uniq ##
 
#!/bin/bash
xsed="";sedarr=""
while read -r l
 do
  x=( $( echo $(sed '=' "$1" | sed -n 'N;s/\n/ /;p' | sed -n "s/^\(.\).*$l/\1/p") | sed 's/ .*//') );
  xsed=("$xsed $x" )
 done <"$1"
 
fsed=( $(echo ${xsed[@]}|sed 's/ /\n/g' | sed -n '/^1/p'|sed -n '1p') )
sedarr=("$fsed" )
 
for i in ${xsed[@]}
 do
  sedarr=( "$sedarr $( echo ${xsed[@]}|sed 's/ /\n/g' | sed -ne "/^$i/p"| sed -n '1p' | sed -e "/[${sedarr[@]}]/d" )" )
 done
 
for i in ${sedarr[@]}
 do
  sed -n "$i p" "$1"
 done

Correction for a little (or big) problem:
I discovered that this script cannot process a file that has 10 or more lines.
I tried to rewrite it to fix this problem.
Let's try this:

# cat newfile
hello hello
hello hello
monkey
donkey
hello hello4
drink
dance2
dance
drink4
hello hello1
donkey2
hello hello1
hello hello2
hello hello5
donkey3
donkey2
hello hello3
hello hello3
hello hello5
monkey3
dance3
dance3
monkey3
dance3
# ./fsed.sedv2.uniq newfile
hello hello
monkey
donkey
hello hello4
drink
dance2
dance
drink4
hello hello1
donkey2
hello hello2
hello hello5
donkey3
hello hello3
monkey3
dance3
# ## fsed-Sedv2-Uniq ##
 
#!/bin/bash
xsed="" ;uniq="" ;sedarr="" ;fsed=""
while read -r l
 do
  x=( $( echo $(sed '=' "$1" | sed -n 'N;s/\n/ /;p' | sed -n "s/\(.*\) \b$l\b/\1/p")  ) );
  xsed=("$xsed ${x}\b\|" )
 done <"$1"
 
fsed=( $(echo ${xsed[@]}|sed 's/ /\n/g' | sed -n '/^1/p'|sed -n '1p') )
sedar=("\b$fsed" )
 
for i in ${xsed[@]}
 do
  newi=$(echo $i | sed 's/..$//')
  sedar=( $(echo $sedar|sed 's/..$//') )
  sedax=$(echo "${xsed[@]}"|sed 's/ /\n/g' | sed -ne "/^${newi}/p"| sed -n '1p'|sed -e "/${sedar[@]}/d" )
  x=("$(echo ${sedar[@]}|sed 's/\\|/\\b&\\b/g')" )
  sedar=("${x}\|${sedax}" )
 done
 
for i in $(echo ${sedar[@]} | sed 's/[^0-9]/ /g')
 do
  sed -n "$i p" "$1"
 done
PS: there may be some bugs! I don't guarantee that it works very well (for example, it may be slow).

Regards
ygemici
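
For comparison with the scripts above, here is a sketch of a widely circulated hold-space approach that removes non-adjacent duplicates with a single sed process. It is not from this thread. It assumes that lines contain only printable ASCII (the [ -~] range) and that all unique lines fit in sed's hold space. The function name sed_dedupe_unsorted is made up for illustration.

```shell
# sed_dedupe_unsorted is a hypothetical helper name, not from the thread.
sed_dedupe_unsorted() {
    # LC_ALL=C keeps the [ -~] range meaning "printable ASCII".
    LC_ALL=C sed -n '
        # G: append the hold space (unique lines seen so far) to the
        # current line.
        G
        # Double the first newline so the current line is clearly
        # separated from the history that follows it.
        s/\n/&&/
        # If the current line occurs again as a whole line in the
        # history, it is a duplicate: drop it and start the next cycle.
        /^\([ -~]*\n\).*\n\1/d
        # Otherwise undo the doubling, save line plus history in the
        # hold space, and print only the current line.
        s/\n//
        h
        P
    ' "$@"
}

printf '%s\n' 'hello hello' 'hello hello' monkey donkey \
    'hello hello' drink dance drink | sed_dedupe_unsorted
```

The pattern space grows with every unique line, so this is likely to be slow on large files compared with the awk and perl one-liners.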

---------- Post updated at 11:57 AM ---------- Previous update was at 11:54 AM ----------

This resource is very useful and excellent for sed lovers.
Thank you, Bruce Barnett, for this:

Sed - An Introduction and Tutorial
