Hi everyone:
I want to extract the string that sits between a certain pair of HTML tags.
e.g.
I tried grep, cut, and awk but could not find the exact syntax for this one. :wall:
PS: Sorry about my bad English.
Have a go with:
sed -n 's/.*<tag>//; T; s/<\/tag>.*//; T; p' input-file >output-file
This assumes the opening and closing tags are on the same line, with no newline between them.
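A quick way to see what the same-line assumption means in practice. This is only a sketch: the sample text and the function name extract_same_line are made up for illustration, and it needs GNU sed because of the T command.

```shell
# Hypothetical sample input; the tag name "tag" and the text are made up.
sample='<html>
<tag>first value</tag>
no tags on this line
prefix <tag>second value</tag> suffix
<tag>split
across lines</tag>'

# T (GNU sed only) jumps to the end of the script when the preceding
# s command did not substitute anything, so only lines that contain
# both the opening and the closing tag ever reach p.
extract_same_line() {
    sed -n 's/.*<tag>//; T; s/<\/tag>.*//; T; p'
}

printf '%s\n' "$sample" | extract_same_line
```

The two lines that carry a complete pair are printed without their tags; the pair that is split across two lines is silently skipped, which is the limitation mentioned above.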
Or even with newlines:
awk -F\> '/^tag>/{print $2}' RS=\< infile
and if you also want to eliminate them:
awk -F\> '/^tag>/{gsub(ORS,x);print $2}' RS=\< infile
With varying tag:
awk -F\> '$0~"^"t">" {gsub(ORS,x);print $2}' RS=\< t="tag" infile
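For anyone wondering how the awk versions work, here is a commented sketch of the last one. The sample input and the wrapper function extract_tag are made up for illustration, and it assumes the tag is written without attributes (a record like tag class="x"> would not match).

```shell
# Hypothetical sample input with one value split over two lines.
sample='<root><tag>one
two</tag><other>skip</other><tag>three</tag></root>'

# extract_tag NAME: print the text that follows each <NAME> tag.
extract_tag() {
    awk -F'>' -v t="$1" '
        # RS="<" starts a new record at every "<", so a record looks like
        # "tag>one two", "/tag>", "other>skip" and so on.
        BEGIN { RS = "<" }
        # -F">" splits each record at ">" into $1 (tag name) and $2 (text).
        # Only records that begin with the wanted name and ">" are kept.
        # gsub removes embedded newlines from the record before printing.
        $0 ~ "^" t ">" { gsub(/\n/, ""); print $2 }
    '
}

printf '%s\n' "$sample" | extract_tag tag
```

Because the closing tag starts with "/", its record begins with "/tag>" and never matches the pattern, so only the text after the opening tag is printed.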
@agama, note: the T command is GNU sed only.
Hi I Got
I checked man sed but could not find anything.
Am I missing something?
sed 's/<[^>]*>/ /g'
or
grep -Po '(?<=>)\w+(?=<)'
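The two commands above behave quite differently, so here is a small comparison to try. The sample line and the function names are made up, and grep -P is assumed to be available (GNU grep built with PCRE support).

```shell
# Hypothetical sample: one single-word item and one two-word item.
sample='<li>alpha</li><li>two words</li>'

# Replaces every tag, whatever its name, with a single space.
strip_tags() {
    sed 's/<[^>]*>/ /g'
}

# Prints each run of word characters that sits directly between ">" and "<".
# Text containing spaces is not matched, because \w+ cannot cross a space.
words_between_tags() {
    grep -Po '(?<=>)\w+(?=<)'
}

printf '%s\n' "$sample" | words_between_tags
printf '%s\n' "$sample" | strip_tags
```

With this input the grep version only reports the single-word item, while the sed version keeps all of the text but does not care which tag it came from.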
First, thanks to huaihaizi3 and agama for the quick responses.
It worked!!! I had been trying to solve this for 2 hours, but you did it in 10 minutes.
By the way, would you care to explain the code? I have been going through man awk but could not find the answers I need.
Note: I edited the code in my post...
Yes, I seem to always forget that. A BSD sed version, just for completeness:
sed -n 's/.*<tag>//; t found
d
:found
s/<\/tag>.*//p'
Newlines required.
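If the labels and newlines get in the way, the same idea can probably be written with an address block instead of branches. This is only a sketch: the function name extract_portable is made up, and it assumes the tag name contains no regex special characters.

```shell
# extract_portable NAME: print the text between <NAME> and </NAME>
# when both are on the same line. Uses only POSIX sed features.
extract_portable() {
    tag=$1
    # The braces limit the commands to lines containing the opening tag.
    # The p flag on the second s prints only if the closing tag was found.
    sed -n -e "/<$tag>/{" -e "s/.*<$tag>//" -e "s/<\/$tag>.*//p" -e '}'
}

printf '%s\n' 'a <tag>x</tag> b' 'none' '<tag>open only' | extract_portable tag
```

Only the first of the three sample lines has a complete pair, so only its content is printed.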