grep or awk problem, unable to extract numbers

Hi, I'm having trouble getting some numbers out of an HTML file. I have several HTML logs that contain lines like this:

nerdnerd, how_old_r_u:45782<br>APPLY: <hour_second> Verification succeded

This is part of what I've extracted from an HTML file, but all I really want is the number in the middle. When using awk I get:

how_old_r_u:45782<br>APPLY:

since there is a space at each end, and awk treats spaces as field separators.

I also tried grep "[0-9]", but grep prints every line that contains a number, so I get the whole line again. Is there any command that can retrieve only the numbers?

The pattern is not very clear. But you can try

grep -oE "[[:digit:]]{1,}" input.txt

If that does not satisfy your requirement, perhaps this, which prints only the digits that follow the first colon on each line:

sed -n -e "s/^[^:]*:\([0-9][0-9]*\).*/\1/p" input.txt

But what if there are more numbers on that line, for example:

how_old_r_u:45782<br>APPLY:[30000,t3,t4]:Plummet

When I run the command

grep -oE "[[:digit:]]{1,}" input.txt

I also get the other numbers. Is there some way to get only 45782?

cut -f2 -d: inputfile | sed 's/[^0-9]//g'
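
If the number you want always follows the how_old_r_u: label, as it does in both sample lines in this thread, another option is to match the label together with its digits and then cut the label off. This is only a sketch: it assumes a grep that supports -o (GNU and BSD grep do), and the function name extract_how_old is made up for illustration.

```shell
# Hypothetical helper: print only the digits that directly follow "how_old_r_u:".
# Reads the files given as arguments, or stdin when there are none.
extract_how_old() {
    # -o prints only the matched text; -E enables "+" (one or more).
    # cut then drops the label and keeps what follows the colon.
    grep -oE 'how_old_r_u:[0-9]+' "$@" | cut -d: -f2
}

# The two sample lines from this thread; each should yield 45782.
printf '%s\n' \
    'nerdnerd, how_old_r_u:45782<br>APPLY: <hour_second> Verification succeded' \
    'how_old_r_u:45782<br>APPLY:[30000,t3,t4]:Plummet' | extract_how_old
```

Because the match is tied to the label, the 30000 and the digits in t3 and t4 are never picked up.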

Is that number always exactly 5 digits?
If yes, you can use awk and print just that substring:

code:
cat input.txt|awk 'BEGIN {FS=":"} {print substr($2,1,5)}'

This may help.

No need for cat:

awk 'BEGIN {FS=":"} {print substr($2,1,5)}' input.txt
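
If the number is not always five digits, a variant that drops everything from the first non-digit onward avoids hard-coding the width. A rough sketch, assuming the number sits at the very start of the second colon-separated field as in the sample lines; first_number_after_colon is just an illustrative name, and the three-digit input is a made-up test line.

```shell
# Hypothetical helper: print the leading digits of the second ":"-separated field.
first_number_after_colon() {
    awk -F: '{
        n = $2
        sub(/[^0-9].*$/, "", n)   # delete from the first non-digit to the end
        if (n != "") print n      # skip lines where field 2 has no leading digits
    }' "$@"
}

# Made-up line with a shorter number, to show the width is not fixed.
printf '%s\n' 'how_old_r_u:457<br>APPLY:[30000,t3,t4]:Plummet' | first_number_after_colon
```

With substr($2,1,5) the same line would have come back as 457<b, which is why the fixed-width version only works when the length is guaranteed.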

Which is why the sed alternative was provided. Did you try that? Does it give you what you are looking for?

Give this a shot:

sed 's/[^0-9]/\ /g;s/\  */\t/g;s/^[ \t]*//;s/[ \t]*$//;/^$/d' file.txt

I'm sure there's a more elegant way to do this, but this seems to work okay.

# Breakdown of what does what

#1. The "sed" command itself
sed

#2. Replace everything but numbers with a space globally
's/[^0-9]/\ /g;

#3. Replace each run of one or more spaces with a single tab, globally
s/\  */\t/g;

#4. Remove all leading and trailing white space
s/^[ \t]*//;s/[ \t]*$//;

#5. Delete all blank lines
/^$/d'

#6. The file to be processed.
file.txt
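
One caveat on the command above: it relies on sed understanding \t as a tab, which GNU sed does but POSIX does not guarantee. A roughly equivalent sketch that sticks to basic features and separates the numbers with single spaces instead of tabs (numbers_only is a hypothetical name, not an existing tool):

```shell
# Hypothetical helper: reduce each line to its numbers, separated by single spaces.
numbers_only() {
    sed -e 's/[^0-9][^0-9]*/ /g' \
        -e 's/^ //' \
        -e 's/ $//' \
        -e '/^$/d' "$@"
}

# Sample line from this thread: prints "45782 30000 3 4"
# (the 3 and 4 come from t3 and t4).
printf '%s\n' 'how_old_r_u:45782<br>APPLY:[30000,t3,t4]:Plummet' | numbers_only

# Keeping only the first number on each line: prints "45782"
printf '%s\n' 'how_old_r_u:45782<br>APPLY:[30000,t3,t4]:Plummet' | numbers_only | cut -d' ' -f1
```

The first expression collapses every run of non-digits into one space in a single step, so no separate squeeze pass is needed, and the last expression drops lines that contained no digits at all.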