I am looking for a sed (or other) command that will find this string and remove it.
Some code I have tried:
grep -rl 646f63756d6 * | xargs sed -i '/<script/,/<\/script>/d'
That one is really close, but it removes everything after <script. So if there is any data after the </script>, it gets removed too. I think it just needs some tweaking.
Thanks for any help!
David
---------- Post updated at 12:50 AM ---------- Previous update was at 12:00 AM ----------
Hello,
I see that my first command won't work: it is meant for deleting complete lines, but sometimes there is valid code at the beginning of the line (like </html>).
So, I think I need the properly formatted find/replace command (s///). I can't seem to come up with the proper find piece.
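For the single-line case, a substitution along these lines might do it. This is only a sketch: it assumes the injected block sits entirely on one line, contains the marker 646f63756d6 used in the grep above, and has no "<" character between the opening and closing tags. The function name remove_inline_script is made up for illustration.

```shell
# Hypothetical helper: delete only the <script>...</script> span that
# contains the marker, keeping the rest of the line (e.g. </html>).
# Assumes the whole block is on one line with no "<" inside it.
remove_inline_script() {
    sed 's/<script[^>]*>[^<]*646f63756d6[^<]*<\/script>//g' "$@"
}
```

Run it on a copy first and compare the result with the original. Once the output looks right, the same s/// expression can be passed to sed -i in place of the /<script/,/<\/script>/d range.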
I had another situation occur that was much the same as this one, except the JavaScript spans multiple lines (rather than sitting on a single line, as in my first issue). I did get those files cleaned up, thanks again rangarasan.
This one goes more like this:
<script>^M
function vdch () {^M
..
..
..
</script>
Each line ends with the ^M character, so that will probably affect the command a bit.
Is there a sed variant of the one presented that would support multiple lines of this nature?
First, if your file is on a Unix/Linux system, remove those "^M" characters by running the "dos2unix" command on it.
dos2unix your_file
od -bc your_file
The "od -bc" command prints an octal dump of the file contents; ensure that you do **not** see any "\r" characters after the dos2unix command has been run on your file.
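If dos2unix is not installed, or you want a yes/no check instead of reading the od output by eye, something like the following may help. It is a rough sketch using only tr, grep and mktemp; the function names has_cr and strip_cr are made up here, and strip_cr deletes every carriage return in the file, not only those at line ends.

```shell
# Hypothetical helper: succeed if the file contains any carriage return.
has_cr() {
    cr=$(printf '\r')          # a literal CR character
    grep -q "$cr" "$1"
}

# Hypothetical helper: delete all carriage returns from the file.
# The result is written to a temporary file first and then copied back,
# so the original keeps its permissions.
strip_cr() {
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example: has_cr your_file && strip_cr your_file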
Thereafter, for removing multi-line <script> tags plus their content, you could do something like so:
$
$
$ cat -n f36
1 this is line 1
2 this is line 2
3 <script> the script tags
4 and their contents span
5 multiple lines
6 in this case
7 </script>
8 this is line 8
9 <script> begin and close tags on same line </script>
10 this is line 10
11 and this is <script> begin and close tags somewhere
12 within the line, i.e. not at the
13 beginning and end of
14 the line </script> line 14
15 final line number 15
$
$
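$ perl -lne 'if (/^(.*?)<script>.*$/) {
    # opening tag found: keep any text before it
    print $1 if length $1;
    $in = 1;
}
if ($in and /^.*?<\/script>(.*?)$/) {
    # closing tag found: keep any text after it
    print $1 if length $1;
    $in = 0;
} elsif (not $in) {print}
' f36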
this is line 1
this is line 2
this is line 8
this is line 10
and this is
line 14
final line number 15
$
$
If you could post a sample of your data file, then that should be helpful.
If all your target files have the extension ".java" and are in a single directory "my_dir", then change to that directory and run dos2unix on all of them, like so:
cd my_dir
dos2unix *.java
If your target files are spread out in multiple directories and subdirectories thereof, then use the "find-and-exec" technique.
Change to the highest directory, i.e. a directory from where you could traverse the directory-tree and process all such java files, and then -
cd highest_dir
find . -name "*.java" -exec dos2unix {} \;
Alternatively, if you do not want to cd anywhere and would rather work from your current location, specify the full path, like so:
# fix "^M" in all java files in "/my/path/to/java/files" directory
dos2unix /my/path/to/java/files/*.java
# fix "^M" in all java files in the entire tree below "/my/highest/dir" directory
find /my/highest/dir -name "*.java" -exec dos2unix {} \;
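Before converting a whole tree, it can be useful to see which files actually contain "^M" characters. The following is a minimal sketch using find and grep; the function name list_crlf_files is made up, and the "*.java" pattern is simply carried over from the commands above.

```shell
# Hypothetical helper: list the .java files under a directory that
# contain at least one carriage return.
list_crlf_files() {
    cr=$(printf '\r')          # a literal CR character
    find "$1" -type f -name '*.java' -exec grep -l "$cr" {} +
}
```

For example, list_crlf_files /my/highest/dir prints one path per line. If none of the file names contain whitespace, that list could be piped to xargs dos2unix so that only the affected files are touched.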