Use grep/awk to remove part of column

Hi all,
How can I use grep or awk to clean the following input data:

n<>the<>96427210 861521305 123257583 
n<>obj<>79634223 861521305 79634223 
n<>nmod<>68404733 861521305 68422718

where the desired result is to remove all non-numeric characters:

96427210 861521305 123257583 
79634223 861521305 79634223 
68404733 861521305 68422718

With some assumptions (namely that the numeric part always follows the last ">" on each line):

$ awk -F ">" '{ print $NF}' file

96427210 861521305 123257583
79634223 861521305 79634223
68404733 861521305 68422718
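
To make the assumption concrete, here is a minimal sketch that runs the same awk command against the sample lines held in a shell variable. The variable names `input` and `result` are made up for the illustration; the original post reads from a file instead.

```shell
# Sample lines from the question, held in a variable instead of a file.
input='n<>the<>96427210 861521305 123257583
n<>obj<>79634223 861521305 79634223
n<>nmod<>68404733 861521305 68422718'

# Split each line on ">" and print the last field,
# i.e. everything after the final ">".
result=$(printf '%s\n' "$input" | awk -F '>' '{ print $NF }')

printf '%s\n' "$result"
```

Note that a line containing no ">" at all has a single field, so it would pass through unchanged, letters included.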

If you want to remove all non-numeric data (excluding spaces), try this:

sed 's/[^0-9 ]//g' file
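
One thing worth checking is how the columns are separated. The bracket expression above keeps only digits and literal spaces, so a tab between numbers would be deleted as well. A small sketch, assuming a made-up variant of one sample line with a tab in it (the tab and the variable names are not from the original data):

```shell
# Made-up variant of a sample line with a tab between the first two numbers.
input=$(printf 'n<>the<>96427210\t861521305 123257583')

# Keeps digits and literal spaces only: the tab is removed,
# so the first two numbers run together.
spaces_only=$(printf '%s\n' "$input" | sed 's/[^0-9 ]//g')

# Keeps digits and any whitespace: the tab survives as a separator.
any_space=$(printf '%s\n' "$input" | sed 's/[^0-9[:space:]]//g')

printf '%s\n%s\n' "$spaces_only" "$any_space"
```

For the data shown in the question, which uses plain spaces, both forms should give the same result.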

Perl

perl -pe 's/[^0-9\s]//g' filename

awk, using gsub:

$ awk 'gsub("[^0-9 ]","")' a.txt
96427210 861521305 123257583 
79634223 861521305 79634223 
68404733 861521305 68422718
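
A caveat when gsub() is used as the awk pattern with no action: gsub() returns the number of substitutions made, and awk prints a line only when the pattern is non-zero. A line that is already clean would therefore be dropped silently. A minimal sketch, assuming a made-up third line that contains only digits and spaces:

```shell
# Two sample lines plus one made-up line that needs no cleaning.
input='n<>the<>96427210 861521305 123257583
n<>obj<>79634223 861521305 79634223
11111 22222 33333'

# gsub() as the pattern: a line is printed only if at least
# one substitution happened on it.
pattern_only=$(printf '%s\n' "$input" | awk 'gsub(/[^0-9 ]/, "")')

# gsub() inside an action, followed by the always-true pattern 1:
# every line is printed, whether or not it was changed.
always_print=$(printf '%s\n' "$input" | awk '{ gsub(/[^0-9 ]/, "") } 1')

printf 'pattern only:\n%s\n\nalways print:\n%s\n' "$pattern_only" "$always_print"
```

For the input in the question every line contains something to remove, so both forms print the same three lines.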


One more way to get the expected output:

$ cat <<eof | awk 'gsub(/[[:alpha:]]|[[:punct:]]/,x)'
n<>the<>96427210 861521305 123257583 
n<>obj<>79634223 861521305 79634223 
n<>nmod<>68404733 861521305 68422718
eof

96427210 861521305 123257583 
79634223 861521305 79634223 
68404733 861521305 68422718

For a file, use it like this:

$ awk 'gsub(/[[:alpha:]]|[[:punct:]]/,x)' file

OR

$ tr -d '[:alpha:][:punct:]' <file

OR

$ tr -cd '[:digit:] \n' <file
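
Keep in mind that tr works on sets of characters, not regular expressions, so character classes are simply listed one after another and there is no alternation operator. A short sketch comparing the delete form and the complement form on two of the sample lines (the variable names are made up for the illustration):

```shell
# Two sample lines from the question.
input='n<>the<>96427210 861521305 123257583
n<>obj<>79634223 861521305 79634223'

# Delete every letter and punctuation character;
# digits, spaces and newlines are left alone.
deleted=$(printf '%s\n' "$input" | tr -d '[:alpha:][:punct:]')

# Complement form: delete everything that is NOT a digit,
# a space or a newline.
kept=$(printf '%s\n' "$input" | tr -cd '[:digit:] \n')

printf '%s\n\n%s\n' "$deleted" "$kept"
```

The complement form is the stricter of the two, since it also drops anything outside the letter and punctuation classes, such as tabs or control characters.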