Sed or tr to remove non-alphanumeric and alpha characters?

Hi All,

I am new to Unix and am trying to do some scripting on a Linux box. I want to remove the non-alphanumeric characters and the alphabetic characters from the following line.

<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>

Desired output is:
883250 869.898 86432.4 809875.22 804609 60023 59715

I don't know much about sed, so I used the following code:
a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"
b=${a//[^0-9]/ }
set -- $b
echo $1 $2 $3 $4 $5 $6 $7.....

It returns a result, but it splits each number at the decimal point and turns the fractional part into a separate value:
883250 869 898 86432 4 809875 22 804609 60023 59715
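
The split happens because [^0-9] matches every character that is not a digit, and that includes the decimal point. A minimal sketch of the same approach, assuming bash (the ${var//pattern/replacement} expansion is not available in plain sh) and adding "." to the set of characters to keep; the variable name "result" is only for illustration:

```shell
#!/usr/bin/env bash
a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"

# Replace every character that is neither a digit nor a dot with a space.
b=${a//[^0-9.]/ }

# Left unquoted on purpose: word splitting drops the runs of spaces
# and assigns the seven numbers to $1 .. $7.
set -- $b

# Join the positional parameters back together with single spaces.
result="$*"
echo "$result"
```

After this, the individual values are still available as $1 through $7.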

Next, I tried using tr.
a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"
echo $a | tr -d '[:alpha:]'

It returns the following result, but how do I get rid of the <> and </>?
<>883250 869.898 86432.4 809875.22 804609 60023 59715 </>

My mate told me I can easily use sed to remove the words and get my desired output, but I have no clue about sed. I spent some time looking at the tutorial but couldn't get the syntax right. There are too many /\ \/ \/ /\ /\ in sed, which looks very confusing.

Any help would be appreciated.
Cheers
jack

# echo $x | perl -e '$y=<>; $y=~s/<\/?.*?>//g; print $y'
883250 869.898 86432.4 809875.22 804609 60023 59715

This may do the trick:

sed 's/<[^>]*>//g'  input-file >output

It deletes everything between < and >, including the angle brackets themselves.
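
To see how that pattern reads, here is a small sketch run against the sample line from the question instead of a file. The second expression, which trims the space left in front of the closing tag, is an addition for tidiness and not part of the command above; the variable name "result" is only for illustration.

```shell
a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"

# s/<[^>]*>//g reads as:
#   <      a literal opening bracket
#   [^>]*  any run of characters that are not ">"
#   >      the closing bracket
#   //g    replace each match with nothing, everywhere on the line
# s/ *$// then removes any trailing spaces.
result=$(printf '%s\n' "$a" | sed -e 's/<[^>]*>//g' -e 's/ *$//')
echo "$result"
```

Because the pattern keys on the brackets and not on the tag name, the same command works for any tag.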


@jackma: Also, you were pretty close with tr. Just a small extension and you would've got what you wanted:

echo $x | tr -d '[:alpha:]' | sed 's/[</>]//g'

Thanks all. Got it...

I will use the tr one and pipe it to sed, as it looks easier for me to understand. :-p

a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"
echo $a | tr -d '[:alpha:]' | sed 's/[</>]//g'

883250 869.898 86432.4 809875.22 804609 60023 59715

Cheers,
Thanks again, all.

---------- Post updated at 05:27 PM ---------- Previous update was at 05:23 PM ----------

Thanks agama. This works...with 1 line...lol thanks mate

Or with just "tr". List the characters you want to keep and use the "complement" option (-c) together with delete (-d).

tr -cd '0-9. \n' < filename

883250 869.898 86432.4 809875.22 804609 60023 59715
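
A short sketch of the complement approach applied to the sample line directly, assuming the input holds nothing but the tag and the numbers. Note that the space in front of the closing tag is kept, so the output ends with one trailing space; the variable name "result" is only for illustration.

```shell
a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"

# -c complements the set and -d deletes, so everything that is NOT
# a digit, a dot, a space or a newline is removed.
result=$(printf '%s\n' "$a" | tr -cd '0-9. \n')

# Brackets make the trailing space visible.
printf '[%s]\n' "$result"
```

One caveat: any digit or dot inside a tag name would survive as well, so this only suits tags made up purely of letters.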

you guys are awesome!