Grep spacing issue

Hi,

I'm trying to match HTML strings using the grep command, but I'm not able to eliminate the extra spaces. It works when there is no space. Can anyone tell me what I'm missing here?
I'm able to match KayDee when there is no space.
grep commands tried:

grep -P -o -e '(?<=<span>[^ ]*<span>[^ ]*)(.*?)(?<=<\/span>)' htmlpage
grep -P -o -e '(?<=<span>[[:space:]]*<span>[[:space:]]*)(.*?)(?<=<\/span>)' htmlpage

HTML code :
With space :

<span>

<span>

KayDee

</span>

Without space :

<span><span>KayDee</span>

Can you paste in some sample input (in CODE tags) and tell us which command gives you which output, please?

It might be that what looks like a space to a human is actually something else, such as a newline or a tab. Also, [^ ] means "not a space", so [^ ]* right after the literal <span> text matches non-space characters rather than skipping spaces, and your search would not get past the whitespace.
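As a rough sketch of how to check this, assuming GNU grep built with PCRE support (-P) and a made-up sample file that mimics the "with space" case from your post: od -c shows every byte, so newlines appear as \n. Since PCRE lookbehinds generally need a bounded length, this sketch uses \K instead, and -z so that the whole file is treated as one record and \s can match across line breaks. The variable names sample and result are only for illustration.

```shell
# Hypothetical sample reproducing the "with space" case (the gaps are newlines)
sample=$(mktemp)
printf '<span>\n\n<span>\n\nKayDee\n\n</span>\n' > "$sample"

# Dump every byte so invisible characters such as \n and \t become visible
od -c "$sample"

# -z: read the file as a single record (no NUL bytes in it), so \s spans lines
# \K: drop everything matched so far from the reported match
# (?=\s*</span>): stop before the whitespace that precedes the closing tag
# tr: -z makes grep end each match with NUL, so turn that into a newline
result=$(grep -Pzo '<span>\s*<span>\s*\K.*?(?=\s*</span>)' "$sample" | tr '\0' '\n')
echo "$result"

rm -f "$sample"
```

The same pattern should also handle the "without space" line, because \s* is allowed to match nothing.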

I might have missed the point, but I'm sure someone here can help.

Kind regards,
Robin
