Make python script ignore .htaccess

I just wrote a tiny script with the help of ghostdog74 to search all my files for special content phrases.

After a few modifications it now works, but one problem is left. The files are located in public_html folders, so there may also be .htaccess files among them.

So I skip those files when scanning, but if, for example, one of them has

<FilesMatch \.php$>
deny from all
</FilesMatch>

in it, the script can no longer scan the .php files either. So my question is whether it is possible to tell Python, the script, or the cron job that starts it to ignore what the .htaccess says.

Hopefully someone has an idea.

I don't understand what you want to do. If you want to skip .htaccess files, just use an "if" statement:

...
if filename != ".htaccess":  

Otherwise, show your whole code.

The problem is not skipping the .htaccess file; it is the effect those files seem to have on Python.

Here is the code:

#!/usr/bin/env python

import os

outfile = os.path.join("/home","user","public_html","myscanner","scans","scan_result.php")
logfile = os.path.join("/home","user","public_html","myscanner","scans","log_result.php")

datei=open(outfile,"w")
datei.close()
dateilog=open(logfile,"w")
dateilog.close()

root="/home"
for allfiles in os.listdir(root):
    if os.path.isdir(os.path.join(root, allfiles)):
    if "id" in allfiles:
        newpath = os.path.join(root,allfiles)
        for r,d,f in os.walk(newpath):
        if "public_html" in r:
                for files in f:
                if os.path.isfile(os.path.join(r,files)):
                    size=os.path.getsize(os.path.join(r,files))
                    if files.startswith("."):
                    break
                    else:
                    if size <= 2048000:
                         print files

To explain: the code first lists all entries in the "root" directory. It then keeps only the folders with "id" in their name and looks for public_html inside those directories.
Next it excludes all files whose names start with ".", which includes .htaccess, and finally it keeps only files no larger than 2048000 bytes.

That piece of code works fine and lists, for example, all .php files that pass those criteria. But if one of these folders contains a .htaccess with the following in it:

<FilesMatch \.php$>
deny from all
</FilesMatch>

The scanner is then not able to read the .php files. I tested it a few times: without the .htaccess it shows all files, including the .php files, but with that .htaccess in place the .php files are no longer shown. The other files, for example .html, are still shown.

I hope you understand what I mean. I am just starting with Python, so maybe there is a logical mistake in my code.

But your thread title says so.

You are opening the file handles and closing them again right away. I don't understand what you want to do here.

Check your indentation. (Are you sure it works?)

So you want to exclude .htaccess after all?

Now you have confused me. The code doesn't have any part that scans the contents of the files. It only gets the file size...

Check the permissions of the .htaccess and the permissions of the user running the Python script.
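
As a rough sketch of that check: os.access asks whether the user running the script may read a path. Apache applies .htaccess rules only to HTTP requests, so a file the script cannot open points to filesystem permissions instead. The helper name unreadable_files is made up for this illustration.

```python
import os


def unreadable_files(top):
    """Return the paths below top that the current user cannot read."""
    blocked = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # os.access tests read permission for the user running the script
            if not os.access(path, os.R_OK):
                blocked.append(path)
    return blocked
```

Calling unreadable_files("/home") from the same cron job that starts the scanner should show whether any files are blocked for that user.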

import os

outfile = os.path.join("/home","user","public_html","myscanner","scans","scan_result.php")
logfile = os.path.join("/home","user","public_html","myscanner","scans","log_result.php")

#datei=open(outfile,"w")
#datei.close()
#dateilog=open(logfile,"w")
#dateilog.close()

root="/home"
for allfiles in os.listdir(root):
    # only look at directories whose name contains "id"
    if os.path.isdir(os.path.join(root, allfiles)) and "id" in allfiles:
        newpath = os.path.join(root,allfiles)
        for r,d,f in os.walk(newpath):
            if "public_html" in r:
                for files in f:
                    size=os.path.getsize(os.path.join(r,files))
                    if files.startswith("."):
                        #break <<<----------- you are breaking out of the "for files in f" loop, which is not what you want. Use "continue" here.
                        continue
                    else:
                        if size <= 2048000:
                            print(files)

Thanks for the fast reply.

As always you found the problem, it was:

                    if files.startswith("."):
                        #break <<<----------- you are breaking out of the "for files in f" loop, which is not what you want. Use "continue" here.
                        continue

The break was my logical mistake; I had been searching for it the whole time.
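
For anyone else who runs into this, here is a small sketch of the difference, using a made-up file list (the names and the two helper functions are only for illustration). It suggests why files listed after a dotfile in the same directory went missing: break abandons the rest of the listing, while continue skips only the one name. Which files disappear depends on the order in which the directory happens to be listed.

```python
names = ["index.html", ".htaccess", "login.php", "style.css"]


def visible_with_break(filenames):
    shown = []
    for name in filenames:
        if name.startswith("."):
            break  # leaves the loop: nothing after the dotfile is looked at
        shown.append(name)
    return shown


def visible_with_continue(filenames):
    shown = []
    for name in filenames:
        if name.startswith("."):
            continue  # skips only this name and moves on to the next one
        shown.append(name)
    return shown


print(visible_with_break(names))     # ['index.html']
print(visible_with_continue(names))  # ['index.html', 'login.php', 'style.css']
```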

With the following piece of code I open the scanner's output files at the start to clear the old logs in them; if a file has gone missing, it is created again.

datei=open(outfile,"w")
datei.close()
dateilog=open(logfile,"w")
dateilog.close()
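
The same reset could be sketched with a with block, which closes the file automatically. Opening in "w" mode truncates an existing file and creates a missing one, although the parent folder still has to exist. The helper name reset_file is only for illustration.

```python
def reset_file(path):
    """Empty the file at path, creating it if it does not exist yet."""
    with open(path, "w"):
        pass  # opening in "w" mode already truncates; nothing to write
```

With that helper, the four lines above would become reset_file(outfile) and reset_file(logfile).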

Thanks for the hint about the next lines. I have only just started with Python and added them line by line without thinking about combining them. :rolleyes:

if os.path.isdir(os.path.join(root, allfiles)):    <<-------- if os.path.isdir(os.path.join(root, allfiles)) and "id" in allfiles
    if "id" in allfiles:

I added the next line because one test gave me an error after I had copied a folder into public_html that was a link to the mail folder. So I added that check to make sure the entry is really a file and not a folder or link. Is there a logical mistake in that as well?

if os.path.isfile(os.path.join(r,files)): 
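
One thing worth knowing here, offered as a sketch rather than a verdict on the script: os.path.isfile follows symbolic links, so it is false for folders and for broken links but true for a link that points to a regular file. If links should be left out entirely, os.path.islink can be added to the check. The helper name is_plain_file is made up for illustration.

```python
import os


def is_plain_file(path):
    """True only for a regular file that is not reached through a symlink."""
    # isfile() follows links, so islink() is needed to rule links out as well
    return os.path.isfile(path) and not os.path.islink(path)
```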

Thanks again for that fast help.