extract numbers from a word

systemali · March 20, 2006, 5:21am

Hi all,

I am a bit lost on this; can someone assist? I know this can be done with awk or sed, but I can't get the exact syntax right.

I need to extract only the numbers from a single word (e.g. abcd.123.xyz).

How can I extract only 123?

Thanks

new2ss · March 20, 2006, 5:39am

Hi,
I am assuming you have a file containing the strings, and that each string has the same layout, as below:

ccc.nnn.ccc #where c is a letter and n is a digit

nawk -F. '{print $2}' filename
or
number=`echo "$line" | nawk -F. '{print $2}'` #assumes $line holds the word
echo $number
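
For reference, a minimal sketch of the same field-splitting idea wrapped in a shell function. The name second_field is made up for illustration, and plain awk is assumed to be available on systems that have no nawk.

```shell
# Hypothetical helper (the name is illustrative only): print the second
# dot-separated field of the word passed as the first argument.
second_field() {
    printf '%s\n' "$1" | awk -F. '{ print $2 }'
}

second_field "abcd.123.xyz"   # prints 123
```

This only works while every word really has the ccc.nnn.ccc layout; a word with a different number of dots gives a different field.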

Ygor · March 20, 2006, 5:40am

Try...

echo "abcd.123.xyz" | tr -dc '0-9'

systemali · March 20, 2006, 5:45am

Hi Ygor,

Thank you so much, that worked.

systemali · March 20, 2006, 5:54am

But there is a problem.

What if the file name is "ab42.1.2.3.tar.gz"?

I only need to extract "1.2.3", but as of now it is reading "42.123". Any help on that?

Thanks

matrixmadhan · March 20, 2006, 8:09am

Try this:

echo ab42.1.2.3.tar.gz | sed -e 's/^[a-z]*[0-9]*\.//;s/\.tar\.gz$//'

systemali · March 20, 2006, 9:09am

That worked like a gem. Thank you so much.
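
A rough sketch of the same extraction using only shell parameter expansion, with no external command. It assumes the name starts with a prefix that contains no dot and ends in .tar.gz; the function name extract_version is made up for illustration.

```shell
# Hypothetical helper (the name is illustrative only).
# Assumes input of the form <prefix>.<version>.tar.gz, where the
# prefix itself contains no dot.
extract_version() {
    rest=${1#*.}                     # drop everything up to the first dot
    printf '%s\n' "${rest%.tar.gz}"  # drop a trailing .tar.gz, if present
}

extract_version "ab42.1.2.3.tar.gz"   # prints 1.2.3
```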

cgkmal · March 26, 2009, 2:34am

Hi all,

I have many lines with several date and hour formats within each line, and I need to extract only the dates and hours from each line, in the same order.

Example:

Today 090326 she left the office at 20:00, and tomorrow 20090327 will come at 08:00 --> Result=090326 20:00 20090327 08:00
I studied yesterday 25/03/2009, 1524 and I'll continue tomorrow 27032009 at 17:30 --> Result=25/03/2009 1524 27032009 17:30

If I use grep with something like

grep '[0-9][0-9][0-9][0-9]2009' infile
or
grep '09[0-9][0-9][0-9][0-9][0-9]' infile

I don't get only the dates and hours; I get the whole lines that match.

Could somebody help me build a filter that extracts the dates and hours in the desired format?

Many thanks in advance

rikxik · March 26, 2009, 4:00am

$ perl -ne 's/[^\: \/\d\n]//g; print "Result=" . join(" ", split), "\n"' file
Result=090326 20:00 20090327 08:00
Result=25/03/2009 1524 27032009 17:30
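
A rough shell equivalent of the Perl one-liner above, for systems without Perl. It is only a sketch: the function name digits_only is made up, and it assumes that the only characters worth keeping are digits, colons, slashes and spaces.

```shell
# Hypothetical helper (the name is illustrative only).
# tr deletes every character except digits, ':', '/', space and newline;
# awk then rebuilds each line with single spaces ($1 = $1 forces the
# re-split) and prefixes it with "Result=".
digits_only() {
    tr -cd '0-9:/ \n' | awk '{ $1 = $1; print "Result=" $0 }'
}

printf '%s\n' "Today 090326 she left the office at 20:00, and tomorrow 20090327 will come at 08:00" | digits_only
# prints Result=090326 20:00 20090327 08:00
```

Like the Perl version, this keeps every digit on the line, so unrelated numbers would also show up in the result.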

Shahul · March 26, 2009, 7:20am

Hi,

$ echo "Today 090326 she left the office at 20:00, and tomorrow 20090327 will come at 08:00" |sed 's/[a-z]//g'|sed 's/[A-Z]//g'
 090326      20:00,   20090327    08:00

Thanks
SHa

cgkmal · March 26, 2009, 12:22pm

Hi Shahul and rikxik,

I've tried your solutions and they work very nicely. There is only one small thing to fix: I forgot that some lines contain other numbers that aren't of interest to me, for example:

If the text is like
Day_01 090326 she left the office at 20:00, and tomorrow 20090327 will come at 08:00

your code shows 01 090326 20:00, 20090327 08:00

How can I eliminate numbers of one or two digits, like the one in the word "Day_01", to obtain only dates and hours?

Day_01 090326 she left the office at 20:00, and tomorrow 20090327 will come at 08:00 -->Result 090326 20:00 20090327 08:00. (without 01)

I've tried a small change following the sed example:

cat file | sed 's/[0-9][0-9]//g'

but this eliminates all numbers, including dates and hours.

Thanks again.

joeyg · March 26, 2009, 12:49pm

> sample="Day_01 090326 she left the office at 20:00, and tomorrow 20090327 will come at 08:00 "
> sample2=`echo $sample | tr " " "\n" | egrep -e "[0-9][0-9][0-9:]" | tr "\n" " " | tr -d ","`
> echo $sample2
090326 20:00 20090327 08:00
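
The same token-by-token idea can be sketched in awk so that it handles a whole file line by line. This is only an illustration: the function name keep_dates_times is made up, and it reuses the rule from the egrep pattern above (a token is kept when it contains two digits followed by a digit, a colon or a slash).

```shell
# Hypothetical helper (the name is illustrative only).
# For each whitespace-separated token: strip everything except digits,
# ':' and '/', then keep the token only if it still contains two digits
# followed by a digit, ':' or '/'.
keep_dates_times() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            tok = $i
            gsub("[^0-9:/]", "", tok)
            if (tok ~ "[0-9][0-9][0-9:/]")
                out = out (out == "" ? "" : " ") tok
        }
        print out
    }'
}

printf '%s\n' "Day_01 090326 she left the office at 20:00, and tomorrow 20090327 will come at 08:00" | keep_dates_times
# prints 090326 20:00 20090327 08:00
```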

Shahul · March 27, 2009, 6:12am

Hope this can also do it:

echo "Day_01 090326 she left the office at 20:00, and tomorrow 20090327 will come at 08:00"|sed 's/[a-z]//g'|sed 's/[A-Z]_\([0-9]\{2\}\)//g'
 090326      20:00,   20090327    08:00

Thanks
Sha

cgkmal · March 28, 2009, 5:42pm

Thanks for your help. I'm still getting some one- or two-digit numbers that I'd like to delete, as in the lines below:

02 2 25032009 1739 25032009 1715 25032009 40
02 2 25032009 1715 25032009 1737
01 25032009 1635 25032009 1732
51 1 13 01 1 25032009 1700 25032009 1642 25032009 1709
51 1 13 01 1 25032009 1642 25032009 1706

I've tried with

cat Infile | sed 's/[A-Za-z:_,-.=()\/]//g'

Please advise on the correct regex to eliminate only the patterns of one or two digits, and to arrange the result in columns.

Best regards

Shahul · March 30, 2009, 3:47am

Hi,

Hope this works:

sed 's/_\([0-9]\{2\}\)//g;s/ \([0-9]\{1\}\)//g;s/[a-z]//g;s/[A-Z]//g;s/,//g' inputfile

Thanks
SHa
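
One thing to watch with a sed-only approach: a pattern such as a space followed by [0-9] also matches the first digit of a longer number, so dates can lose their leading digit. A possible alternative, sketched below, filters whole fields by length in awk and joins the survivors with tabs so they line up in columns. It assumes the input has already been reduced to whitespace-separated numbers, as in the sample lines above, and that anything shorter than three characters is unwanted; the function name drop_short_numbers is made up.

```shell
# Hypothetical helper (the name is illustrative only).
# Keeps only fields longer than two characters and joins them with tabs.
drop_short_numbers() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++)
            if (length($i) > 2)
                out = out (out == "" ? "" : "\t") $i
        print out
    }'
}

printf '%s\n' "02 2 25032009 1739 25032009 1715 25032009 40" | drop_short_numbers
# prints 25032009, 1739, 25032009, 1715, 25032009 separated by tabs
```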