Use grep/awk to remove part of column

owwow14 · November 18, 2013, 3:40am

Hi all,
How can I use grep or awk to clean the following input data:

n<>the<>96427210 861521305 123257583 
n<>obj<>79634223 861521305 79634223 
n<>nmod<>68404733 861521305 68422718

The desired result is to remove all non-numeric characters (keeping the spaces):

96427210 861521305 123257583 
79634223 861521305 79634223 
68404733 861521305 68422718

pamu · November 18, 2013, 3:43am

Assuming the numbers always follow the last ">" on each line:

$ awk -F ">" '{ print $NF}' file

96427210 861521305 123257583
79634223 861521305 79634223
68404733 861521305 68422718

Subbeh · November 18, 2013, 3:50am

If you want to remove all non-numeric characters (except spaces), try this:

sed 's/[^0-9 ]//g' file

pravin27 · November 18, 2013, 7:53am

A Perl version:

perl -pe 's/[^0-9\s]//g' filename
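One difference between the sed and Perl commands above: the sed bracket expression keeps only literal spaces, while Perl's \s also keeps tabs. The sketch below uses a made-up tab-separated line (not from the original data) to show the effect, and assumes a sed that supports POSIX character classes such as [:space:].

```shell
# Hypothetical input: the two numbers are separated by a tab, not a space.
tabbed=$(printf 'n<>the<>964\t861')

# Keeps digits and literal spaces only, so the tab is deleted too.
space_only=$(printf '%s\n' "$tabbed" | sed 's/[^0-9 ]//g')

# Keeps digits and any whitespace, so the tab survives.
any_space=$(printf '%s\n' "$tabbed" | sed 's/[^0-9[:space:]]//g')

printf '%s\n' "$space_only"
printf '%s\n' "$any_space"
```

If the real file might contain tabs, the [:space:] form is probably the safer choice.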

itkamaraj · November 18, 2013, 8:10am

$ awk 'gsub("[^0-9 ]","")' a.txt
96427210 861521305 123257583 
79634223 861521305 79634223 
68404733 861521305 68422718
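
A caveat when gsub() is used as the pattern like this: awk prints a line only when gsub() returns non-zero, so a line that is already clean gets dropped. A minimal sketch, using a made-up two-line sample where the second line has nothing to remove:

```shell
# Hypothetical sample: the second line is already clean.
sample='n<>the<>96427210 861521305
79634223 861521305'

# gsub() as the pattern: only lines with at least one replacement are printed.
pattern_only=$(printf '%s\n' "$sample" | awk 'gsub(/[^0-9 ]/, "")')

# gsub() inside an action, followed by the always-true pattern 1:
# every line is printed, whether or not anything was replaced.
always_print=$(printf '%s\n' "$sample" | awk '{ gsub(/[^0-9 ]/, "") } 1')

printf 'pattern only:\n%s\n' "$pattern_only"
printf 'always print:\n%s\n' "$always_print"
```

For the data in the question every line contains something to strip, so both forms give the same result there.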

Akshay_Hegde · November 18, 2013, 8:10am

One more way to get the expected output:

$ cat <<eof | awk 'gsub(/[[:alpha:]]|[[:punct:]]/,x)'
n<>the<>96427210 861521305 123257583 
n<>obj<>79634223 861521305 79634223 
n<>nmod<>68404733 861521305 68422718
eof

96427210 861521305 123257583 
79634223 861521305 79634223 
68404733 861521305 68422718

For a file, use it like this:

$ awk 'gsub(/[[:alpha:]]|[[:punct:]]/,x)' file

OR

$ tr -d '[:alpha:][:punct:]' <file

OR

$ tr -cd '[:digit:] \n' <file
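
Since the question also asked about grep: a possible sketch, assuming a grep that supports -o (GNU and BSD grep do, but it is not in POSIX) and assuming the numbers always sit at the end of each line. It prints only the trailing run of digits and spaces.

```shell
# Sample lines from the question (without trailing spaces).
sample='n<>the<>96427210 861521305 123257583
n<>obj<>79634223 861521305 79634223
n<>nmod<>68404733 861521305 68422718'

# -o prints only the matched part; the match is anchored to the end of the line.
cleaned=$(printf '%s\n' "$sample" | grep -oE '[0-9 ]+$')

printf '%s\n' "$cleaned"
```

Unlike the sed and tr answers, this does not remove non-numeric characters that appear between or after the numbers; it simply skips a line that does not end in digits or spaces.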