MERGE FILES

Hi all!

How can I merge all the text files (in XML format) in a single folder, after deleting from each of them everything from the beginning of the file up to a specific string, "<body>"?

Thanks a lot!

mjomba

Do you mean this?

for i in *.xml
do
# print only the lines that come after the <body> line
awk '/<body>/{p=1;next}p' "$i" >> mergefile
done

Dear yinyuemi,
thank you very much for the code you wrote for me. It works well!
Could you explain what it means and how it works?

Also, would it be possible to add the filename as the first line of each file's content, after deleting that part of the text and before merging it into the mergefile?

Thank you again!

mjomba

Any reason to use a for loop?

# FNR==1 resets the flag p at the start of each file
awk 'FNR==1{p=0} /<body>/{p=1;next}p' *.xml

Hi mjomba,
Do you mean this?

for i in *.xml
do
awk '/<body>/{p=1;print FILENAME;next}p' "$i" >> mergefile
done
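awk '/<body>/{p=1;print FILENAME;next}p' ## means: when a line matches the pattern /<body>/, set the print flag p=1, print FILENAME, and skip to the next line (next) so the <body> line itself is not printed; the trailing p then prints every following line, because p is now 1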
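
To see the flag idea in action, here is a small sketch that can be run in a scratch directory. The files a.xml and b.xml and their contents are made-up sample data, not from this thread. It also uses one awk call for all the files with an added FNR==1{p=0} reset, which is my own assumption about what is wanted: without it the flag would stay set after the first file and the header of every later file would be kept.

```shell
# Hypothetical sample data: two tiny XML files in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '<html>' '<head>a</head>' '<body>' '<p>one</p>' '</body>' > a.xml
printf '%s\n' '<html>' '<body>' '<p>two</p>' '</body>' > b.xml

# FNR==1 {p=0}  : at the first line of each file, clear the print flag.
# /<body>/ {...} : on the <body> line, set the flag, print the file name,
#                  and skip the <body> line itself with next.
# p              : a bare pattern; prints the line whenever p is non-zero.
awk 'FNR==1{p=0} /<body>/{p=1; print FILENAME; next} p' *.xml > mergefile

cat mergefile
```

With this sample data, mergefile should start with a.xml followed by the lines after its <body>, then b.xml followed by its own lines. The output name mergefile does not end in .xml, so it is not picked up by *.xml as an input.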

Sorry for my bad English!

Best,

Y