rajv
September 20, 2011, 6:59am
1
I have a file in below format (pipe delimited):
1234__abc|John__abc|xyz
3345__abc|Kate__abc|xyz
55344|Linda__abc|xyz
33434|Murray|xyz
I want to remove any occurrence of "__abc" in the second field of this file.
I did some research and found a way to replace the entire second field with another string:
sed 's/^\([^|]*\)|[^|]*|/\1|9999|/'
But I am not able to remove only the "__abc" in the second field. Any help with this would be much appreciated.
$ cat fil
1234__abc|John__abc|xyz
3345__abc|Kate__abc|xyz
55344|Linda__abc|xyz
33434|Murray|xyz
Output:
$ awk -F"|" '{sub("__abc","",$2);}1' OFS="|" fil
1234__abc|John|xyz
3345__abc|Kate|xyz
55344|Linda|xyz
33434|Murray|xyz
Guru.
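A side note on the awk one-liner above: sub() replaces only the first match in the field. If the second field could contain "__abc" more than once, gsub() would remove every occurrence. Below is a minimal sketch under that assumption; the function name strip_abc_field2 and the sample lines are made up for illustration and are not from the original file.

```shell
# strip_abc_field2 is a hypothetical helper name: it reads pipe-delimited
# lines on stdin and deletes every "__abc" inside field 2 only.
strip_abc_field2() {
    awk -F'|' -v OFS='|' '{ gsub(/__abc/, "", $2) } 1'
}

# Made-up sample: the first line has "__abc" twice in its second field.
printf '%s\n' '1234__abc|Jo__abchn__abc|xyz' '55344|Linda__abc|xyz' | strip_abc_field2
```

With this sample the first field keeps its "__abc", while the second field comes out as "John" and "Linda".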
rajv
September 20, 2011, 7:29am
3
Thanks a lot. That worked.
sk1418
September 20, 2011, 7:41am
4
Well, if you have already tried sed, you were very close.
sed 's/__abc//2' file
will give you what you need. I guess the missing part was the "2", right?
ctsgnb
September 20, 2011, 8:43am
5
@sk1418 :
No, your command removes only the second "__abc" found on each line.
So it misses the "__abc" on the third line, because that is the first occurrence on that line.
$ cat tst
1234__abc|John__abc|xyz
3345__abc|Kate__abc|xyz
55344|Linda__abc|xyz
33434|Murray|xyz
$ sed 's/__abc//2' tst
1234__abc|John|xyz
3345__abc|Kate|xyz
55344|Linda__abc|xyz
33434|Murray|xyz
But you could use this instead (assuming the whole file has the same format as the given example):
$ cat tst
1234__abc|John__abc|xyz
3345__abc|Kate__abc|xyz
55344|Linda__abc|xyz
33434|Murray|xyz
$ sed 's/|\(.*\)__abc|/|\1|/' tst
1234__abc|John|xyz
3345__abc|Kate|xyz
55344|Linda|xyz
33434|Murray|xyz
sk1418
September 20, 2011, 9:15am
6
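Thanks for pointing this out. I didn't notice the special third line.
Your sed 's/|\(.*\)__abc|/|\1|/' tst works great for this example, but it is not fully generic either: the greedy .* extends to the last "__abc|" on the line, so an "__abc" at the end of a later field is removed instead.
E.g.: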
kent$ cat a
1234__abc|John__abc|xyz
3345__abc|Kate__abc|xyz
55344|Linda__abc|xyz__abc|xx
33434|Murray|xyz
yours:
kent$ sed 's/|\(.*\)__abc|/|\1|/' a
1234__abc|John|xyz
3345__abc|Kate|xyz
55344|Linda__abc|xyz|xx
33434|Murray|xyz
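I wrote one with sed that works, though I don't know whether it is the best sed solution. It swaps the second "|" for a temporary marker, removes an "__abc" sitting directly before that marker, then restores the "|" (the -r option and \x escape need GNU sed):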
kent$ sed -r 's/\|/\x034/2;s/__abc\x034/|/;s/\x034/|/' a
1234__abc|John|xyz
3345__abc|Kate|xyz
55344|Linda|xyz__abc|xx
33434|Murray|xyz
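For comparison, here is a sketch of another way to keep the match inside the second field without a temporary marker: anchor the pattern at the start of the line and use [^|]* so it cannot cross a pipe. This is an illustration only, not something posted in the thread; the function name strip_abc_sed is hypothetical. It uses only basic regular expressions, so it should not depend on GNU extensions. Note that it removes one "__abc" per line (the last one in field 2, since [^|]* is greedy), not every occurrence.

```shell
# strip_abc_sed is a hypothetical helper name.
# ^\([^|]*|[^|]*\) captures field 1, the first pipe, and the part of field 2
# that precedes "__abc"; [^|]* cannot cross a pipe, so fields 1 and 3+ are safe.
strip_abc_sed() {
    sed 's/^\([^|]*|[^|]*\)__abc/\1/'
}

printf '%s\n' \
    '1234__abc|John__abc|xyz' \
    '55344|Linda__abc|xyz__abc|xx' \
    '33434|Murray|xyz' | strip_abc_sed
```

On these three lines it prints "1234__abc|John|xyz", "55344|Linda|xyz__abc|xx" and "33434|Murray|xyz", leaving the "__abc" in the first and third fields alone.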