Delete strings in a file

ayhanne · November 8, 2007, 11:44am

Hi,

I have a file named status.txt that looks like the file below. I want to delete the <status> and </status> tags, keep just the numbers, and print one number per line. How can I do this with sed or awk? I tried sed, but it didn't work; maybe I just don't know how to use it. Thanks!

expected output:
29
29
1
29
.... etc.

jim_mcnamara · November 8, 2007, 12:06pm

try:

sed 's#<status>##g' filename | sed 's#</status>##g' > newfilename
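As a rough sketch of what the command above does, the two substitutions can also be given to a single sed process with -e. The sample input here is hypothetical (the original status.txt is not shown in the thread) and assumes one element per line; if several elements share a line, the numbers would end up run together.

```shell
# Hypothetical sample input, one <status> element per line (an assumption).
printf '<status>29</status>\n<status>29</status>\n<status>1</status>\n' > status.txt

# '#' is used as the s-command delimiter so the '/' in </status>
# does not need escaping. Both expressions run in one sed process.
sed -e 's#<status>##g' -e 's#</status>##g' status.txt > status2.txt

cat status2.txt
```

With this input, status2.txt should hold one number per line.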

ayhanne · November 8, 2007, 12:16pm

Hi Jim,

I've tried the command that you've sent me but the output file status2.txt is empty. Please help me. Thanks a lot for taking time to answer my post.

sed 's#<status>##g' status.txt | sed 's#</status>##g' > status2.txt

Franklin52 · November 8, 2007, 12:37pm

Try:

sed 's/<status>//g' status.txt|awk 'BEGIN {RS="</status>"}{print $1}' > status2.txt

Regards

aigles · November 8, 2007, 1:00pm

Defining RS as a string doesn't work for all versions of awk.
Another awk solution:

awk -v ORS='' '
{
   gsub(/<status>/, "");
   gsub(/<\/status>/, "\n" );
   print $0;
}
' input_file

Jean-Pierre.
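To illustrate how the empty ORS and the two gsub calls work together, here is a minimal sketch on a made-up input where elements are spread unevenly across lines. The file names and contents are assumptions, and ORS is set with the standard -v option.

```shell
# Hypothetical input: two elements on the first line, one on the second.
printf '<status>29</status><status>29</status>\n<status>1</status>\n' > input_file

awk -v ORS='' '
{
    gsub(/<status>/, "")        # drop every opening tag
    gsub(/<\/status>/, "\n")    # turn every closing tag into a newline
    print                       # ORS is empty, so print adds no newline of its own
}
' input_file > output_file

cat output_file
```

Because the only newlines in the output come from the closing tags, the result does not depend on how the elements were laid out in the input.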

Lakris · November 8, 2007, 2:26pm

And an alternative would be

i.e., leave a character that you can replace with a field separator, since otherwise the output would be hard to interpret.

fpmurphy · November 8, 2007, 3:04pm

Here is another solution, based on the fact that sed's y command can take a newline on the right-hand side.

sed -e 's/<status>//g' -e 's/<\/status>/~/g' -e 'y/~/\n/' infile > outfile
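A small sketch of the placeholder idea on a made-up one-line input follows. It assumes that ~ never occurs in the data and that the tags are written as plain <status> and </status>. The extra s/~$// step is an addition for illustration: it removes the last placeholder so the output does not end with an empty line.

```shell
# Hypothetical input: all elements on a single line (an assumption).
printf '<status>29</status><status>29</status><status>1</status>\n' > infile

sed -e 's/<status>//g' \
    -e 's/<\/status>/~/g' \
    -e 's/~$//' \
    -e 'y/~/\n/' infile > outfile
# 1st expression: delete the opening tags
# 2nd expression: replace each closing tag with the placeholder ~
# 3rd expression: drop the placeholder at the end of the line
# 4th expression: transliterate every remaining ~ into a newline

cat outfile
```

The tags are matched unescaped here because in GNU sed \< and \> mean word boundaries rather than literal angle brackets.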

ayhanne · November 9, 2007, 7:58am

Hi,

Thanks for all the responses and help! I will try the solutions that you've recommended. Hope they will work.

radoulov · November 9, 2007, 8:48am

GNU grep:

grep -Eo '[0-9]+' filename

GNU Awk:

awk '$1=$1'  RS="[^0-9]" filename

awk (nawk or xpg awk on Solaris):

awk '{gsub(/[^0-9]/,"")}$1=$1'  RS=">" filename
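One caveat with the `$1=$1` pattern: it is used as a truth test, so a record whose value is 0 is treated as false and not printed. The sketch below uses a made-up input that includes a 0 (the thread does not say whether that value can occur) and tests the record length instead. The output file names are hypothetical.

```shell
# Hypothetical input containing a zero value (an assumption).
printf '<status>29</status><status>0</status><status>1</status>\n' > status.txt

# grep -o prints each match on its own line, so 0 is kept.
grep -Eo '[0-9]+' status.txt > numbers_grep.txt

# awk variant with a single-character RS: strip everything that is not
# a digit, then print any record that is still non-empty.
awk '{ gsub(/[^0-9]/, "") } length($0) > 0' RS='>' status.txt > numbers_awk.txt

cat numbers_grep.txt
cat numbers_awk.txt
```

Both files should contain the same three lines, including the 0.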