Hello,
I have some data that looks like the following:
> <SALTDATA> (OVS0199262)
HCl
> <IDNUMBER> (OVS0199262)
OVS0199262
> <SUPPLIER> (OVS0199262)
TimTec
> <EMAIL> (OVS0199262)
info@timtec.net
> <WEBSITE> (OVS0199262)
http://www.timtec.net
On the lines that start with >, I need to remove the value in parentheses along with the space that follows the final >. The value in parentheses is different in each record.
I tried sed,
sed 's/>\ \(.*\)/>/g' infile > modfile
To me, this reads: find ">" followed by one space, followed by an opening parenthesis, followed by any number of any character, followed by a closing parenthesis, and replace all of that with ">".
It seems like this should work, unless I don't have the syntax right. Instead, I am getting the output,
>
HCl
>
OVS0199262
>
TimTec
>
info@timtec.net
>
http://www.timtec.net
where I want the output,
> <SALTDATA>
HCl
> <IDNUMBER>
OVS0199262
> <SUPPLIER>
TimTec
> <EMAIL>
info@timtec.net
> <WEBSITE>
http://www.timtec.net
Here, sed seems to be matching from the first greater-than sign on the line instead of the second.
What am I missing here? I am guessing I need to escape the parentheses differently since they have their own meaning in sed.
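A quick way to test that guess, as a minimal sketch: it assumes sed is using its default basic regular expressions, where \( and \) delimit a group and a bare ( or ) is a literal character. The variable names are only for illustration, and the space is written without the backslash.

```shell
# One header line from the sample data above.
line='> <SALTDATA> (OVS0199262)'

# Escaped parentheses: in basic regular expressions \( \) delimit a group,
# so this matches "> " plus the rest of the line, starting at the first ">".
grouped=$(printf '%s\n' "$line" | sed 's/> \(.*\)/>/')

# Bare parentheses: in basic regular expressions these are literal characters,
# so this only matches "> (" through the closing ")".
literal=$(printf '%s\n' "$line" | sed 's/> (.*)/>/')

printf 'grouped: %s\n' "$grouped"
printf 'literal: %s\n' "$literal"
```

If that assumption holds, the escaped version never looks for a literal parenthesis at all, which would explain why the match starts at the first > and swallows the rest of the line.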
Thanks,
LMHmedchem
---------- Post updated at 06:19 PM ---------- Previous update was at 06:04 PM ----------
I found this,
sed 's/>[^>]*$/>/'
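which works by removing everything after the last > on the line (the second > in my data). Since [^>]* cannot match a >, the match has to start at the final >. This seems to give me what I want.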
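As a small check of that command, here is a sketch run against three lines taken from the sample data. The variable names sample and cleaned are just placeholders for the test.

```shell
# Three lines of the sample data: a header, a value, and another header.
sample='> <SALTDATA> (OVS0199262)
HCl
> <EMAIL> (OVS0199262)'

# Replace the last ">" on each line and everything after it with ">".
# [^>]*$ cannot cross a ">", so the match has to start at the final one.
# Lines without any ">" do not match and pass through unchanged.
cleaned=$(printf '%s\n' "$sample" | sed 's/>[^>]*$/>/')

printf '%s\n' "$cleaned"
```

The header lines come out as "> <SALTDATA>" and "> <EMAIL>", and HCl is left alone. One caveat: the substitution runs on every line, so a value line that happens to contain a > would be trimmed as well. If that turns out to matter, restricting it with an address such as sed '/^> </s/>[^>]*$/>/' might be safer, assuming every header line starts with "> <".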
I would still like to know what was wrong with my sed command above if anyone can comment.
LMHmedchem