Script to find & replace a multiple lines string across multiple php files and subdirectories

Hey guys. I know practically nothing about Linux, so could anyone please give me instructions on how to accomplish this?

The distro is RedHat 4.1.2 and I need to find and replace a multi-line string in several PHP files across subdirectories.

So let's say I'm at root/dir1/dir2/; when I execute the script it should search and replace a piece of code in the PHP files inside this directory and its subdirectories.

The closest thing I found, I think, was this:

Paste this into a file "renall" and make it executable (chmod u+x renall):

#!/bin/sh
if [ $# -lt 3 ] ; then
   echo -e "Wrong number of parameters."
   echo -e "Usage:"
   echo -e "  renall 'filepat' findstring replacestring\n"
   exit 1
fi
#echo $1 $2 $3
for i in `find . -name "$1" -exec grep -l "$2" {} \;`
do
mv "$i" "$i.sedsave"
sed "s/$2/$3/g" "$i.sedsave" > "$i"
echo $i
#rm "$i.sedsave"
done

But the string I need to find and replace spans multiple lines, so I don't know how to put it in this script.

Also I don't know how to save and execute this script :o

EDIT: Oh, and by the way, I only need to DELETE the piece of code from these PHP files; I don't need to replace it with something else...

Could anyone please give me detailed instructions on this?

Thanks in advance!

For starters, I'd rewrite the script you posted. The grep isn't needed, and since you're looking for a fixed set of files (*.php I assume), and fixed replace criteria, you don't need to worry about command line parameters. (You should if this is more than a one-off script, but I'll assume you don't need a script that you can run with different criteria.)

The basic script:

#!/usr/bin/env ksh

cd /directory/path/where/you/want/to/start
find . -name "*.php" | while read file
do
    echo "munging: $file"             # nice to see progress as it works
    mv "$file" "$file-"      # back it up

    ### insert sed or awk here ####  the sed in next line is for illustration only 
    sed 's/nosuchstringinthefile/nosuchreplacement/' "$file-" >"$file"
    if (( $? > 0 ))            # handle failure by putting the file back in place
    then
        echo "edit of $file failed" >&2
        mv "$file-" "$file"             # restore original
    else
        rm "$file-"               # worked, delete backup 
    fi
done
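
To try the loop safely before pointing it at real files, here is a minimal sketch that builds a throwaway directory and runs the same backup / edit / restore pattern on it. The scratch directory, the file names and the REMOVE-ME marker are made up for the illustration, and the sed command is only a stand-in for whatever edit is finally needed.

```shell
#!/bin/sh
# Scratch area with made-up files; nothing outside it is touched.
work=$(mktemp -d)
mkdir -p "$work/sub"
printf 'keep 1\nREMOVE-ME\nkeep 2\n' > "$work/a.php"
printf 'REMOVE-ME\nkeep 3\n'         > "$work/sub/b.php"
printf 'REMOVE-ME\n'                 > "$work/notes.txt"   # not *.php, must stay untouched

cd "$work" || exit 1
find . -name "*.php" | while IFS= read -r file
do
    mv "$file" "$file-"                          # back it up
    if sed '/REMOVE-ME/d' "$file-" > "$file"     # stand-in edit: drop marker lines
    then
        rm "$file-"                              # worked, delete backup
    else
        echo "edit of $file failed" >&2
        mv "$file-" "$file"                      # restore original
    fi
done

a_result=$(cat a.php)
b_result=$(cat sub/b.php)
txt_result=$(cat notes.txt)
```

Testing the exit status of sed directly in the if does the same job as checking $? afterwards, and it also runs under plain sh.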

The tricky question is what criteria you need to determine which lines of the files to delete. Depending on what it is, the 'insert sed here' line in the previous script will need to change to do the right thing.

If your block of code is bounded by a unique string or phrase in the first and last lines then it will be a simple sed. The more complicated the criteria, the more complicated the code will need to be. Bottom line: post your criteria for deletion and someone will give you some help with writing a sed/awk or something that will delete things.
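
As an illustration of that simple case, here is a sketch that assumes the block is wrapped in two marker comments. The markers "BEGIN comments-form" and "END comments-form" and the sample file are invented for this example; they are not taken from anyone's real code.

```shell
#!/bin/sh
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html>
<?php // BEGIN comments-form
  $x = 1;
// END comments-form ?>
<p>kept</p>
EOF

# Delete from the line matching the first pattern through the line
# matching the second pattern, inclusive.
trimmed=$(sed '/BEGIN comments-form/,/END comments-form/d' "$sample")
printf '%s\n' "$trimmed"
rm -f "$sample"
```

With this sample only the lines "<html>" and "<p>kept</p>" are printed.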

As for creating a script...
Start your favorite text editor, insert the code from above, or other of your choice, and save the file. Then at the command line, enter this command (assumes the file you saved was called delete_frm_php.ksh):

chmod 755 delete_frm_php.ksh

You then can invoke your script from the command line by typing delete_frm_php.ksh

Hi, first thanks a lot for the help agama, appreciate it!

The PHP code I want to remove is a sequence of consecutive lines; it is not spread across random lines. It's a piece of code of around 25 lines.

Example (that's not the actual code I want to remove, just something similar):

  <?php
  $sql = "SELECT * FROM articles WHERE id = '".$_GET['article']."'";
  $do->doQuery($sql);
  $article = $do->getRows();
  if(isset($_POST['add'])) {
    if(trim($_POST['nick']) != '') {
      $nick = trim($_POST['nick']);
    } else {
      $errorX['nick'] = 'Please enter your nickname.';
    }
    if(trim($_POST['comment']) != '') {
      $comment = trim($_POST['comment']);
    } else {
      $errorX['comment'] = 'Please enter a comment.';
    }
    if(empty($errorX)) {
      $sql = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".$_POST['website']."','".$_GET['article']."','".$nick."','".$comment."','".$email."')";
      $do->doQuery($sql);
      header('Location: '.$_SERVER['HTTP_REFERER']);
    }
  }
  ?>

Like I said, I just need to find and remove all this code from multiple PHP files; I don't need to replace it with anything...

So how do I put this PHP code in the above script?

Thanks!

Is the block of code the only block that starts <?php and finishes ?> ? I suspect that maybe there are other blocks that start and end this way, but on the off chance that this will be the only block like this, then this sed should work:

sed '/<?php/,/?>/d'  "$file-" >"$file"

It deletes all lines between the starting line with "<?php" and the ending "?>" line as it reads the file. The updated file is written to $file.

If you can use this sed, just replace it in the earlier example.
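
One caveat worth checking before relying on the range form: when a complete <?php ... ?> pair sits on a single line, sed opens the range on that line and only starts looking for the closing pattern on the following line. A small sketch with a made-up sample file shows the effect:

```shell
#!/bin/sh
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?php get_header(); ?>
<p>content</p>
<?php
  drop_me();
?>
<p>footer</p>
EOF

# The range opens on line 1; sed only starts looking for the closing
# pattern on the NEXT line, so the range runs through the lone "?>" line.
survivors=$(sed '/<?php/,/?>/d' "$sample")
printf '%s\n' "$survivors"
rm -f "$sample"
```

Only "<p>footer</p>" survives here, so the "<p>content</p>" line between the two blocks is lost as well.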

If there is more than one PHP block in the files, then you'll need to find a unique string inside the block that you want to delete. Change the one line in the script below that has "/enter your nickname/" to contain the unique string from the block of code, and it should find and delete the whole block that contains the string.

#!/usr/bin/env ksh

cd /directory/path/where/you/want/to/start
find . -name "*.php" | while read file
do
    echo "munging: $file"             # nice to see progress as it works
    mv "$file" "$file-"      # back it up
    awk '     # read the file and delete the block of php code
    /<?php/ { drop = idx = 0; snarf = 1; }  # start of a block; start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    ### change the string between the slants to be something unique to the block you wish to delete. 
    /enter your nickname/ { drop = 1; }    # magic string found, drop if we are in a php block

    snarf {                 # if buffering, hold the record until end of block reached.
        buffer[idx++] = $0;
        next;
    }

    { print; }              # not buffering, just print the record.
    '  "$file-" >"$file"
    if (( $? > 0 ))            # handle failure by putting the file back in place
    then
        echo "edit of $file failed" >&2
        mv "$file-" "$file"             # restore original
    else
        rm "$file-"               # worked, delete backup 
    fi
done
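
The buffer-then-decide idea can also be tried on its own, outside the find loop. The following is a condensed sketch of the same idea, not the script above: it handles the opening line, the marker and the closing line in one rule, uses [?] to match a literal question mark, and runs against a made-up sample file.

```shell
#!/bin/sh
sample=$(mktemp)
cat > "$sample" <<'EOF'
<p>top</p>
<?php
  keep_this();
?>
<?php
  echo 'Please enter your nickname.';
?>
<p>bottom</p>
EOF

filtered=$(awk '
    /<[?]php/ { drop = 0; n = 0; buffering = 1 }   # block opens: start holding lines
    buffering {
        held[n++] = $0                             # hold the line
        if ($0 ~ /enter your nickname/) drop = 1   # marker seen: discard this block
        if ($0 ~ /[?]>/) {                         # block closes: decide its fate
            if (!drop) for (i = 0; i < n; i++) print held[i]
            buffering = 0
        }
        next
    }
    { print }                                      # outside any block
' "$sample")
printf '%s\n' "$filtered"
rm -f "$sample"
```

The first block is printed unchanged and the block containing the marker disappears, leaving the two HTML lines around them intact.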

Hope this helps get you going.

Hi. When I execute the script it keeps saying:

No such file or directory cd: /directory/path/where/you/want/to/start (I did replace this with the full path of the directory I wanted it to start in)

BTW, when I execute it just by typing

delete_frm_php.ksh 

it says :

-bash Command not found: delete_frm_php.ksh: Command not found

when I execute it by typing

bash delete_frm_php.ksh

it says:

: command not found line 2:
: No such file or directory cd: /directory/path/where/you/want/to/start
delete_frm_php.ksh: line 32: unexpected EOF while looking for matching `''
delete_frm_php.ksh: line 41: syntax error: unexpected end of file

I also tried a few tweaks to the path, with no success.

Any clues? Thank you!

If you cut and paste the cd command with your path at the command line do you get the same error? Screams typo in the path to me, but it is hard to say from here.

You'll need to make the script executable. You can change the file's mode to turn on the executable bit with the chmod command. This should work:

chmod 755 delete_frm_php.ksh

If you've done that then I'm a bit confused. And if you want bash to execute the script, rather than ksh, then change the ksh reference in the first line to bash.

Without seeing the changes you've made it is impossible to give any suggestions. It sounds like you've got a missing quote (check out line 32 to start with). You can post the whole script if you cannot find the missing quote.

Here it is:

#!/usr/bin/env ksh

cd /home/username/public_html/tests/
find . -name "*.php" | while read file
do
    echo "munging: $file"             # nice to see progress as it works
    mv "$file" "$file-"      # back it up
    awk '     # read the file and delete the block of php code
    /<?php/ { drop = idx = 0; snarf = 1; }  # start of a block; start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    ### change the string between the slants to be something unique to the block you wish to delete. 
    /PHP_STRING/ { drop = 1; }    # magic string found, drop if we're in a php block

    snarf {                 # if buffering, hold the record until end of block reached.
        buffer[idx++] = $0;
        next;
    }

    { print; }              # not buffering, just print the record.
    '  "$file-" >"$file"
    if (( $? > 0 ))            # handle failure by putting the file back in place
    then
        echo "edit of $file failed" >&2
        mv "$file-" "$file"             # restore original
    else
        rm "$file-"               # worked, delete backup 
    fi
done

Shouldn't the first line be:

#!/bin/ksh

?? Although I tested this and it didn't work either...

Grumble!! I added comments as I posted it and my correct spelling introduced the bug. Change the comment

# magic string found, drop if we're in a php block

so that it doesn't have the single quote (') in it (change "we're" to "we are" or some such).

As for the cd command...
Does the directory "/home/username/" exist, or should username be the name of the user running the script, or did you change it to 'username' so as not to post the real name here? If you really have /home/username, then change username to $USER, or the real name.

There are two schools of thought on the "#!" line. My school of thought is to use #!/usr/bin/env with the parameter ksh, bash, etc. This allows the shell/interpreter to be found using my PATH, and not the hard-coded /usr/bin/ksh or whatever is coded. The advantage is that when I have a group of scripts that need to be tested with a particular version of the shell/interpreter, I need only to set up my PATH correctly and the proper version of the interpreter will be invoked for every one of them. I don't need to modify any script to point to the version I am testing under, nor do I need to install the new/beta/old version of the interpreter in /usr/bin or wherever.

The other side of the coin is to hard-code the path to the interpreter as you have pointed out. It works, but it is limited in my opinion.
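
To see the PATH-based lookup in action, here is a small sketch. It assumes /usr/bin/env exists and that the temporary directory allows executing files; "demo-interp" is a made-up interpreter name standing in for two versions of ksh or bash.

```shell
#!/bin/sh
# "demo-interp" is a made-up interpreter name, used only for this illustration.
work=$(mktemp -d)
mkdir "$work/v1" "$work/v2"
for v in v1 v2
do
    # Each fake interpreter just reports which copy was picked.
    printf '#!/bin/sh\necho "run by demo-interp %s"\n' "$v" > "$work/$v/demo-interp"
    chmod 755 "$work/$v/demo-interp"
done

# A script whose "#!" line leaves the choice of interpreter to PATH.
printf '#!/usr/bin/env demo-interp\n' > "$work/script"
chmod 755 "$work/script"

first=$(PATH="$work/v1:$PATH" "$work/script")
second=$(PATH="$work/v2:$PATH" "$work/script")
echo "$first"
echo "$second"
```

The script file is identical in both runs; only PATH changes, and with it the interpreter that env finds.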

I wrote "username" just to hide the real username; I'm using the real one in the script, don't worry.

OK, here's the script I used:

#!/usr/bin/env ksh
cd /home/username/public_html/tests/
find . -name "*.php" | while read file
do
    echo "munging: $file"             # nice to see progress as it works
    mv "$file" "$file-"      # back it up
    awk '     # read the file and delete the block of php code
    /<?php/ { drop = idx = 0; snarf = 1; }  # start of a block start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    ### change the string between the slants to be something unique to the block you wish to delete.
    /PHP_UNIQUE_CODE/ { drop = 1; }    # magic string found drop if we are in a php block

    snarf {                 # if buffering hold the record until end of block reached.
        buffer[idx++] = $0;
        next;
    }

    { print; }              # not buffering just print the record.
    '  "$file-" >"$file"
    if (( $? > 0 ))            # handle failure by putting the file back in place
    then
        echo "edit of $file failed" >&2
        mv "$file-" "$file"             # restore original
    else
        rm "$file-"               # worked delete backup
    fi
done

(My script filename is "newdelete2.ksh".) Still, when I execute it just by typing

newdelete2.ksh

it does not work (same problem as before), BUT I tried this

/newdelete2.ksh

and this

./newdelete2.ksh

and they both return the following:

[/]# /newdelete2.ksh
munging: ./footer.php
munging: ./home.php
munging: ./index.php

I've made a "test" directory (as you can see in the "cd /path/" line) with those 3 PHP files in it: footer, home and index.

Unfortunately nothing happens; no code was removed from those files.

Any more ideas? Thank you! You've already helped so much :)

---------- Post updated 03-05-12 at 11:57 AM ---------- Previous update was 03-04-12 at 01:32 PM ----------

Just found out something: when the PHP block I want to delete is in a position like the one below (notice the "<?php get_header(); ?>" that comes before the "<?php", on the same line):

<?php get_header(); ?><?php
  $sql = "SELECT * FROM articles WHERE id = '".$_GET['article']."'";
  $do->doQuery($sql);
  $article = $do->getRows();
  if(isset($_POST['add'])) {
    if(trim($_POST['nick']) != '') {
      $nick = trim($_POST['nick']);
    } else {
      $errorX['nick'] = 'Please enter your nickname.';
    }
    if(trim($_POST['comment']) != '') {
      $comment = trim($_POST['comment']);
    } else {
      $errorX['comment'] = 'Please enter a comment.';
    }
    if(empty($errorX)) {
      $sql = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".$_POST['website']."','".$_GET['article']."','".$nick."','".$comment."','".$email."')";
      $do->doQuery($sql);
      header('Location: '.$_SERVER['HTTP_REFERER']);
    }
  }
  ?>

it will only delete the "?>" at the end of the code, and leave the rest of the code intact.

And there are a lot of instances in which the PHP block appears in that position (with the "<?php" right next to some other piece of code, with no space between them).

So I edited the PHP test files, and placed the PHP block exactly like the one below:

<?php
  $sql = "SELECT * FROM articles WHERE id = '".$_GET['article']."'";
  $do->doQuery($sql);
  $article = $do->getRows();
  if(isset($_POST['add'])) {
    if(trim($_POST['nick']) != '') {
      $nick = trim($_POST['nick']);
    } else {
      $errorX['nick'] = 'Please enter your nickname.';
    }
    if(trim($_POST['comment']) != '') {
      $comment = trim($_POST['comment']);
    } else {
      $errorX['comment'] = 'Please enter a comment.';
    }
    if(empty($errorX)) {
      $sql = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".$_POST['website']."','".$_GET['article']."','".$nick."','".$comment."','".$email."')";
      $do->doQuery($sql);
      header('Location: '.$_SERVER['HTTP_REFERER']);
    }
  }
 ?>

and it worked, the whole PHP block was removed.

But now, how do I make it work when the "<?php" is on the same line as, and right next to, some other piece of code?

Thank you!

I figured you dummied in 'username', but I've also learned not to assume!

That's an interesting twist, and here is some revised code that should do the trick. You'll need to supply the unique string in the BEGIN block, as it's needed twice. On the off chance that the unique string appears inside an open/close pair that sits on a single line, that pair will be removed as well.

awk '
    BEGIN { unique_str = "unique thing"; }          # stick in the unique string here

    {
        partial = 0;
        while( match( $0, "<\\?php [^\\?]*\\?>" ) > 0 )         # complete beginning/ending on the line
        {
            if( index( substr( $0, RSTART, RLENGTH ), unique_str ) )        # if it contains the magic string
                printf( "%s", substr( $0, 1, RSTART-1 ) );      # print everything before it, and skip it
            else
                printf( "%s", substr( $0, 1, RSTART + (RLENGTH-1) ) );      # print everything including the begin and end  (edited)

            $0 = substr( $0, RSTART + RLENGTH  );

            partial = 1;
        }

    }

    /<?php$/ { drop = idx = 0; snarf = 1; } # start of a block; start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    match( $0, unique_str ) { drop = 1; }   # magic string found, drop if we are in a php block

    snarf {                 # if buffering, hold the record until end of block reached.
        buffer[idx++] = $0;
        if( partial )
            printf( "\n" );
        next;
    }

    { print; }              # not buffering, just print the record.

' 

Hope this works better for you.
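
For readers new to match(), this is a stripped-down sketch of just the same-line handling, applied to one made-up input line. It is not the full script above; the marker "drop_me" and the sample line are assumptions for the illustration, and [?] is used to match a literal question mark.

```shell
#!/bin/sh
line='<p>a</p><?php drop_me(); ?><p>b</p><?php keep_me(); ?>'

cleaned=$(printf '%s\n' "$line" | awk '
    BEGIN { unique_str = "drop_me" }     # made-up marker for this example
    {
        out = ""
        # match() sets RSTART (where the match begins) and RLENGTH (its length)
        while (match($0, /<[?]php [^?]*[?]>/) > 0) {
            block = substr($0, RSTART, RLENGTH)
            out = out substr($0, 1, RSTART - 1)       # text before the block
            if (index(block, unique_str) == 0)
                out = out block                       # keep blocks without the marker
            $0 = substr($0, RSTART + RLENGTH)         # continue with the rest of the line
        }
        print out $0
    }')
printf '%s\n' "$cleaned"
```

The pair containing drop_me is removed and the keep_me pair stays, so the line comes out as <p>a</p><p>b</p><?php keep_me(); ?>.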

As for needing ./scriptname to execute your script, that implies that the current directory is not in PATH. You can add '.' to your PATH or just type the additional './' at the front.
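
Both ways of starting a script can be compared with a throwaway example. The script name demo.sh and the scratch directory are made up, and the sketch assumes the temporary directory allows executing files.

```shell
#!/bin/sh
work=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo.sh"\n' > "$work/demo.sh"   # made-up script name
chmod 755 "$work/demo.sh"
cd "$work" || exit 1

with_prefix=$(./demo.sh)                 # explicit path: PATH is not consulted
with_path=$(PATH="$PATH:."; demo.sh)     # bare name: found because . was appended to PATH
echo "$with_prefix"
echo "$with_path"
```

Here "." is only added inside the command substitution, so the change does not outlive that one call. Typing the "./" prefix is the option that leaves PATH alone.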

---------- Post updated at 22:33 ---------- Previous update was at 22:10 ----------

Small revision. I realised that if something like this occurs

some text before block opening tag<?php

and the block is dropped, the text before the opening tag is also dropped. This code fixes that bug:

awk '
    BEGIN { unique_str = "unique thing"; }          # stick in the unique string here

    {
        while( match( $0, "<\\?php [^\\?]*\\?>" ) > 0 )     # complete beginning/ending on the line
        {
            if( index( substr( $0, RSTART, RLENGTH ), unique_str ) )        # if it contains the magic string
                printf( "%s", substr( $0, 1, RSTART - 1 ) );        # print everything before it, and skip it
            else
                printf( "%s", substr( $0, 1, RSTART + (RLENGTH-1) ) );      # print everything including the begin and end

            $0 = substr( $0, RSTART + RLENGTH  );  
        }

    }

    /<?php$/ { drop = idx = 0; snarf = 1; } # start of a block; start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }
        else
        {
            if( (i = index( buffer[0], "<?php" )) > 0 )    # if something before <?php, and we dropped the block, print the leading text
                printf( "%s\n", substr( buffer[0], 1, i-1 ) );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    match( $0, unique_str ) { drop = 1; }   # magic string found, drop if we are in a php block

    snarf {                 # if buffering, hold the record until end of block reached.
        buffer[idx++] = $0;
        if( partial )
            printf( "\n" );
        next;
    }

    { print; }              # not buffering, just print the record.
'

Wait, is this last script you sent the full thing? Or do I need to replace that part (from "awk" to the closing ') in the original script I'm using?

I'll be trying it now though, but please let me know.

Thanks!

EDIT: I replaced the awk code in the script with the new one; unfortunately it didn't work. The same thing happened: only the string that closes the PHP block ( ?> ) was removed, and the rest of the PHP block was left intact.

Even for a PHP block that is completely isolated from other pieces of code, it only removed the ?>

Script i used was:

#!/usr/bin/env ksh
cd /home/username/public_html/tests/
find . -name "*.php" | while read file
do
    echo "munging: $file"             # nice to see progress as it works
    mv "$file" "$file-"      # back it up
    awk '
    BEGIN { unique_str = "UNIQUE CODE"; }          # stick in the unique string here

    {
        while( match( $0, "<\\?php [^\\?]*\\?>" ) > 0 )     # complete beginning/ending on the line
        {
            if( index( substr( $0, RSTART, RLENGTH ), unique_str ) )        # if it contains the magic string
                printf( "%s", substr( $0, 1, RSTART - 1 ) );        # print evrything before it, and skip it
            else
                printf( "%s", substr( $0, 1, RSTART + (RLENGTH-1) ) );      # print everything including the begin and end

            $0 = substr( $0, RSTART + RLENGTH  );
        }

    }

    /<?php$/ { drop = idx = 0; snarf = 1; } # start of a block; start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }
        else
        {
            if( (i = index( buffer[0], "<?php" )) > 0 )    # if something before <?php, and we dropped the block, print the leading text
                printf( "%s\n", substr( buffer[0], 1, i-1 ) );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    match( $0, unique_str ) { drop = 1; }   # magic string found, drop if were in a php block

    snarf {                 # if buffering, hold the record until end of block reached.
        buffer[idx++] = $0;
        if( partial )
            printf( "\n" );
        next;
    }

    { print; }              # not buffering, just print the record.
    '  "$file-" >"$file"
    if (( $? > 0 ))            # handle failure by putting the file back in place
    then
        echo "edit of $file failed" >&2
        mv "$file-" "$file"             # restore original
    else
        rm "$file-"               # worked delete backup
    fi
done

Sorry for the confusion, yes I only pasted the awk portion figuring you could insert that into your script body.

Very strange. I took this to a different machine (FreeBSD) just to see if a different flavour of awk might barf, and its only complaint was that the question mark in the following line needs to be escaped (note the added backslash):

/\?>/ {                  # end of a block

The Gnu awk on my Linux host wasn't complaining about that. What version of awk do you have installed?

awk --version

should give that to you. I've been testing this with GNU Awk 3.1.6.
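
On the question-mark escaping: in awk's extended regular expressions an unescaped ? makes the preceding character optional, so the opening pattern is affected too. /<?php/ reads as "an optional <, then php" and matches any line that contains "php". A quick check with made-up input, using [?] as one portable way to ask for a literal question mark:

```shell
#!/bin/sh
# Count matching lines for the unescaped and the bracketed pattern.
input='plain php text
<?php
?>'

unescaped=$(printf '%s\n' "$input" | awk '/<?php/ { n++ } END { print n + 0 }')
escaped=$(printf '%s\n' "$input" | awk '/<[?]php/ { n++ } END { print n + 0 }')
echo "unescaped: $unescaped, escaped: $escaped"
```

The unescaped pattern counts two lines (it also hits "plain php text"), while the bracketed one counts only the real opening tag.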

To test a bit further....
I've pasted the test file I'm using below; it doesn't show any issues. I took the awk straight from your post just to be sure, and used the dummy "UNIQUE CODE" as well. The result, when I execute it, is that the middle section is dropped.

 <?php
   = "SELECT * FROM articles WHERE id = '".['article']."'";
  ->doQuery();
   = ->getRows();
  if(isset(['add'])) {
    if(trim(['nick']) != '') {
       = trim(['nick']);
    } else {
      ['nick'] = 'Please enter your nickname.';
    }
    if(trim(['comment']) != '') {
       = trim(['comment']);
    } else {
      ['comment'] = 'Please enter a comment.';
    }
    if(empty()) {
       = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".['website']."','".['article']."','".."','".."','".."')";
      ->doQuery();
      header('Location: '.['HTTP_REFERER']);
    }
  }
  ?>
<?php get_header(); ?><?php
   = "SELECT * FROM articles WHERE id = '".['article']."'";
  ->doQuery();
   = ->getRows();
  if(isset(['add'])) {
    if(trim(['nick']) != '') {
       = trim(['nick']);
    } else {
      ['nick'] = 'Please enter your nickname.';
    }
    if(trim(['comment']) != '') {
       = trim(['comment']);
    } else {  //"UNIQUE CODE"
      ['comment'] = 'Please enter a comment.';
    }
    if(empty()) {
       = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".['website']."','".['article']."','".."','".."','".."')";
      ->doQuery();
      header('Location: '.['HTTP_REFERER']);
    }
  }
  ?>
  <?php
   = "SELECT * FROM articles WHERE id = '".['article']."'";
  ->doQuery();
   = ->getRows();
  if(isset(['add'])) {
    if(trim(['nick']) != '') {
       = trim(['nick']);
    } else {
      ['nick'] = 'Please enter their nickname.';
    }
    if(trim(['comment']) != '') {
       = trim(['comment']);
    } else {
      ['comment'] = 'Please enter a comment.';
    }
    if(empty()) {
       = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".['website']."','".['article']."','".."','".."','".."')";
      ->doQuery();
      header('Location: '.['HTTP_REFERER']);
    }
  }
  ?>

What happens if you save just the awk in a file (let's say test_awk), the data in test_data, and try this:

ksh test_awk <test_data

Hi, the AWK version is:

GNU Awk 3.1.5

But here's the thing: I tested with the same 3 PHP blocks you were testing, and it worked. So the problem is probably with the PHP block I'm trying to remove.

I will PM you this PHP block so you can test with it, OK?

Please check your PM box.

---------- Post updated at 09:05 AM ---------- Previous update was at 08:35 AM ----------

Just found out you have PMs disabled. Could you enable them for a second?

I really can't post this PHP code here. Let me know!