Help with extracting a particular part of the data

Input file

<data>
<temporary>qe2qrqerq
qwewqeqwrqwrq
qrerwrewrwer
</temporary>
</data>
<sample>@!@##%#</sample>
<info>12345</info>
<content>2313214141454</content>
<data>
<temporary>qe2qrqerq
qrerwrewrwer
</temporary>
<content>123214214523</content>
</data>
<sample>@!@##%#</sample>
.
.

Desired output file

<data>
<temporary>qe2qrqerq
qwewqeqwrqwrq
qrerwrewrwer
</temporary>
</data>
<data>
<temporary>qe2qrqerq
qrerwrewrwer
</temporary>
<content>123214214523</content>
</data>
.
.

I would like to extract only the lines between "<data>" and "</data>" (including the tags themselves).
Thanks for any advice.

What have you tried so far?

Hi zaxxon,

Here is what I tried:

[home@perl]grep -v 'sample' input_file.txt | grep -v 'info' | grep -v 'content' > output_file.txt

But that does not seem like a smart way to do it :(

perl -ne '(/<data>/../<\/data>/)&&print $_' inputfile.txt

With awk:

awk '/^<data>/ {print; f++; next} /^<\/data>/ {print; f--} f' infile
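
For comparison, the same line-range idea can also be written with sed. This is only a minimal sketch, assuming that each <data> and </data> tag starts its own line as in the sample input; the file name sample_input.txt is made up for the demonstration.

```shell
# Build a small sample resembling the input in this thread
# (sample_input.txt is a hypothetical file name).
cat > sample_input.txt <<'EOF'
<data>
<temporary>qe2qrqerq
qwewqeqwrqwrq
qrerwrewrwer
</temporary>
</data>
<sample>@!@##%#</sample>
<info>12345</info>
<content>2313214141454</content>
<data>
<temporary>qe2qrqerq
qrerwrewrwer
</temporary>
<content>123214214523</content>
</data>
<sample>@!@##%#</sample>
EOF

# -n turns off automatic printing; the p command then prints only the
# lines inside each range that starts at a line beginning with <data>
# and ends at the next line beginning with </data>.
out=$(sed -n '/^<data>/,/^<\/data>/p' sample_input.txt)
printf '%s\n' "$out"
```

Unlike the chain of grep -v calls, a range like this keeps the <content> line that sits inside the second <data> block and drops only the one outside.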

Another one with Perl -

$
$ cat f43
<data>
<temporary>qe2qrqerq
qwewqeqwrqwrq
qrerwrewrwer
</temporary>
</data>
<sample>@!@##%#</sample>
<info>12345</info>
<content>2313214141454</content>
<data>
<temporary>qe2qrqerq
qrerwrewrwer
</temporary>
<content>123214214523</content>
</data>
<sample>@!@##%#</sample>
$
$
$ perl -0lnE 'say $1 while(/(<data>.*?<\/data>)/sg)' f43
<data>
<temporary>qe2qrqerq
qwewqeqwrqwrq
qrerwrewrwer
</temporary>
</data>
<data>
<temporary>qe2qrqerq
qrerwrewrwer
</temporary>
<content>123214214523</content>
</data>
$
$

tyler_durden
