Trying to use sed to remove the value of one field from another field

bribri87 · December 22, 2010, 7:30pm

I'm trying to use sed to remove the value of one field from another field. For example:

cat inputfile
123|ABC|Generic_Textjoe@yahoo.com|joe@yahoo.com|DEF
456|GHI|Other_recordjohn@msn.com|john@msn.com|JKL
789|MNO|No_Email_On_This_One|smith@gmail.com|PQR

I would like to remove the email address that is in the 4th field, from the text that is in the 3rd field, if it exists there (in the 3rd record there is nothing to remove).

The goal is for the output file to look like this:

123|ABC|Generic_Text|joe@yahoo.com|DEF
456|GHI|Other_record|john@msn.com|JKL
789|MNO|No_Email_On_This_One|smith@gmail.com|PQR

I thought a sed command like the following would do the trick:

sed 's/\([^|]*|[^|]*|\)\([^|]*\\3|\)\([^|]*\)\(.*\) /\1\2\3\4/g' inputfile

but it's not removing anything from the 3rd field.

Any ideas how to accomplish this task?
Thanks!

michaelrozar17 · December 23, 2010, 12:57am

You could simply go for an awk solution. I guess this kind of text manipulation is quite complex with sed (anyway, let's try):

awk 'BEGIN { FS = OFS = "|" } { sub($4, "", $3) } 1' inputfile > outfile
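
One caveat with sub(): its first argument is treated as a regular expression, so the dots in an address like joe@yahoo.com match any character. A rough sketch of a literal-match variant using index() and substr() is below. The function name strip_email is made up for illustration, and it removes only the first occurrence of field 4 found in field 3.

```shell
# strip_email is a hypothetical helper name, not part of the original thread.
# It removes the first literal occurrence of field 4 from field 3.
strip_email() {
  awk 'BEGIN { FS = OFS = "|" }
  {
    pos = index($3, $4)              # literal substring search, no regex
    if ($4 != "" && pos > 0)
      $3 = substr($3, 1, pos - 1) substr($3, pos + length($4))
    print
  }' "$@"
}

# Demo on the sample records from the question
printf '%s\n' \
  '123|ABC|Generic_Textjoe@yahoo.com|joe@yahoo.com|DEF' \
  '456|GHI|Other_recordjohn@msn.com|john@msn.com|JKL' \
  '789|MNO|No_Email_On_This_One|smith@gmail.com|PQR' | strip_email
```

With a file it could be called as: strip_email inputfile > outfile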

anurag.singh · December 23, 2010, 5:14am

sed 's/\(.*\)\(.*\)|\(\2|.*\)/\1|\3/' inputfile

The awk solution above looks simpler, though.
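
As for why the command in the question removes nothing: \\3 is a literal backslash followed by 3 rather than a backreference, group 3 is not yet defined at that point anyway, and the space before the final / means the pattern only matches lines ending in a space. A sed sketch that pins the match to fields 3 and 4 might look like the one below. It assumes the email sits at the very end of field 3 and that a fifth field follows field 4, as in the sample data. The function name strip_email_sed is made up for illustration.

```shell
# strip_email_sed is a hypothetical helper name, not part of the original thread.
# \1 = fields 1-2 plus the part of field 3 to keep
# \2 = the tail of field 3, which must equal the whole of field 4
strip_email_sed() {
  sed 's/^\([^|]*|[^|]*|[^|]*\)\([^|]*\)|\2|/\1|\2|/' "$@"
}

# Demo on the sample records from the question
printf '%s\n' \
  '123|ABC|Generic_Textjoe@yahoo.com|joe@yahoo.com|DEF' \
  '456|GHI|Other_recordjohn@msn.com|john@msn.com|JKL' \
  '789|MNO|No_Email_On_This_One|smith@gmail.com|PQR' | strip_email_sed
```

Because every piece is [^|]*, the match cannot spill into other fields, which the looser .* version does not guarantee.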