The problem is not how to skip the .htaccess file; it is the effect that file seems to have on my Python script.
Here is the code:
#!/usr/bin/env python
import os

outfile = os.path.join("/home","user","public_html","myscanner","scans","scan_result.php")
logfile = os.path.join("/home","user","public_html","myscanner","scans","log_result.php")
datei=open(outfile,"w")
datei.close()
dateilog=open(logfile,"w")
dateilog.close()
root="/home"
for allfiles in os.listdir(root):
    if os.path.isdir(os.path.join(root, allfiles)):
        if "id" in allfiles:
            newpath = os.path.join(root,allfiles)
            for r,d,f in os.walk(newpath):
                if "public_html" in r:
                    for files in f:
                        if os.path.isfile(os.path.join(r,files)):
                            size=os.path.getsize(os.path.join(r,files))
                            if files.startswith("."):
                                break
                            else:
                                if size <= 2048000:
                                    print files
To explain: the code first lists all entries in the "root" directory. It then keeps only the folders with "id" in their name and searches for public_html within those directories.
The next step is to exclude all files whose names start with ".", which includes .htaccess. Finally, a size filter keeps only files of at most 2048000 bytes.
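The filtering steps described above can be sketched as a small generator. This is only a sketch under assumptions: the function name `iter_candidate_files` and its parameters are hypothetical, and it uses `continue` to skip a single hidden file, which is what the description asks for. The script above uses `break` at that point, which leaves the loop over the current directory's files at the first dot-file it meets.

```python
import os

MAX_SIZE = 2048000  # size limit in bytes, taken from the script above


def iter_candidate_files(root, marker="id", webdir="public_html", max_size=MAX_SIZE):
    """Yield paths of non-hidden files of at most max_size bytes.

    Only top-level directories whose name contains `marker` are scanned,
    and only paths that contain `webdir` are considered.
    """
    for entry in sorted(os.listdir(root)):
        top = os.path.join(root, entry)
        if not os.path.isdir(top) or marker not in entry:
            continue
        for dirpath, dirnames, filenames in os.walk(top):
            if webdir not in dirpath:
                continue
            for name in filenames:
                if name.startswith("."):
                    continue  # skip this one file, keep scanning the rest
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and os.path.getsize(path) <= max_size:
                    yield path
```

Calling `list(iter_candidate_files("/home"))` would collect the matching paths; the names and defaults are placeholders and can be adapted.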
That piece of code works fine and lists, for example, all .php files that pass these criteria. The problem appears when one of these folders contains a .htaccess with the following content:
<FilesMatch \.php$>
deny from all
</FilesMatch>
With that file in place, the scanner no longer lists the .php files. I tested it a few times: without the .htaccess it shows all files, including the .php ones, but with that .htaccess present the .php files are no longer shown. The other files, for example .html, are still shown.
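One way to narrow this down is a small self-contained check in a throwaway temporary directory instead of the real /home tree. The sketch below is an assumption-laden experiment, not part of the scanner: the names `visible_names` and `run_check` are hypothetical. It lists a directory before and after a .htaccess with the rules above is written into it. Python's file functions do not interpret .htaccess rules (those are read by Apache), so both listings should be identical; if they are, the contents of the file are unlikely to be the cause, and the way the loop reacts to the file's name is worth a closer look.

```python
import os
import shutil
import tempfile

# Same rules as in the .htaccess shown above.
HTACCESS = "<FilesMatch \\.php$>\ndeny from all\n</FilesMatch>\n"


def visible_names(directory):
    """Return the sorted names of regular files that do not start with a dot."""
    return sorted(
        name for name in os.listdir(directory)
        if not name.startswith(".")
        and os.path.isfile(os.path.join(directory, name))
    )


def run_check():
    """List a scratch directory before and after adding a .htaccess file."""
    scratch = tempfile.mkdtemp()
    try:
        for name in ("index.php", "page.html"):
            with open(os.path.join(scratch, name), "w") as handle:
                handle.write("placeholder\n")
        before = visible_names(scratch)
        with open(os.path.join(scratch, ".htaccess"), "w") as handle:
            handle.write(HTACCESS)
        after = visible_names(scratch)
        return before, after
    finally:
        shutil.rmtree(scratch)  # always remove the scratch directory


if __name__ == "__main__":
    before, after = run_check()
    print(before)
    print(after)
```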
I hope this makes clear what I mean. I am just starting with Python, so maybe there is a logical mistake in my code.