The bash script below runs clamav on all files in DIR and produces virus-scan.log. My question: the portion in bold is supposed to move the infected files (the lines that are not OK) to /home/cmccabe/quarantine. Does the bash look correct? Thank you :).
virus-scan.log
Mon Jan 16 14:39:05 CST 2017
/home/cmccabe/Desktop/NGS/API/R_2017_01_13_14_46_04_user_S5-00580-25-Medexome/IonXpress_008_xx-xxx_R_2017_01_13_14_46_04_user_S5-00580-25-Medexome.bam.bai: OK
/home/cmccabe/Desktop/NGS/API/R_2017_01_13_14_46_04_user_S5-00580-25-Medexome/IonXpress_007_xx-xxx_R_2017_01_13_14_46_04_user_S5-00580-25-Medexome.bam: OK
/home/cmccabe/Desktop/NGS/API/R_2017_01_13_14_46_04_user_S5-00580-25-Medexome/IonXpress_007_xx-xxx_R_2017_01_13_14_46_04_user_S5-00580-25-Medexome.bam.bai: OK
#!/bin/bash
DIR=/home/cmccabe/Desktop/NGS/API
cd $DIR
line_no=$(ls | awk -F . '{print $NF}' | sort | uniq -c | awk '{print $2,$1}') # count folder type and store as variable
echo "The folders detected are:
$line_no"
# Get rid of old log file
rm $HOME/virus-scan.log 2> /dev/null
for FILE in $DIR;
do
# check file length is nonzero otherwise commands may be repeated
if [ -s $FILE ]; then
date > $HOME/virus-scan.log
clamscan -r $FILE >> $HOME/virus-scan.log
if grep -iq "OK" "${file}"; then
echo "echo nothing detected by scan"
else
if [[ -f "$f" ]]; then
mv -f "$f" /home/cmccabe/Desktop/API/$filename /home/cmccabe/quarantine
# rm -f "$f"
echo "The files infected have been moved to the folder at /home/cmccabe/quarantine"
fi
fi
done
Hi cmccabe, I think the script will need work.
First the script changes into the directory $DIR and then iterates in a for loop over one single value, the contents of the variable DIR, which is the name of the parent directory: /home/cmccabe/Desktop/NGS/API. Because clamscan also accepts directories as arguments, the command will eventually work, but no thanks to the script.
Likewise, [ -s $FILE ] tests that same directory again, so it serves no purpose and the condition will always be true.
Then a grep is performed, supposedly on that same directory as if it were a regular file, testing for a case-insensitive "OK" (which in itself is a very bad test, since it will easily give false positives). This will fail, because the argument is not a file but an empty string: the variable file is uninitialized (shell variable names are case-sensitive, so file is not FILE), and an empty string does not contain the characters OK.
So then it tests with [[ -f "$f" ]] whether the empty string (the uninitialized variable f is empty) is a file, which is not the case. Fortunately the rest of the code is therefore skipped; otherwise it would have moved the entire directory /home/cmccabe/Desktop/API to /home/cmccabe/quarantine.
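To make the points above concrete, here is a small sketch that can be run without clamscan. It assumes a throwaway directory created with mktemp (the names demo_dir, dir_loop, glob_loop and f_is_file are made up for this illustration) and only counts how often each loop body runs.

```shell
#!/bin/bash
# Hypothetical demo directory standing in for $DIR; not the poster's real path.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.bam" "$demo_dir/b.bam.bai"

# Looping over the variable itself: one pass, and FILE is the directory name.
dir_loop=0
for FILE in $demo_dir; do
    dir_loop=$((dir_loop + 1))
done

# Looping over a glob of the contents: one pass per directory entry.
glob_loop=0
for FILE in "$demo_dir"/*; do
    glob_loop=$((glob_loop + 1))
done

# An unset variable expands to the empty string, which is never a regular file.
# ([ -f ... ] and [[ -f ... ]] give the same answer for this case.)
unset f
if [ -f "$f" ]; then f_is_file=yes; else f_is_file=no; fi

printf 'loop over the directory name: %s pass(es)\n' "$dir_loop"
printf 'loop over the glob:           %s pass(es)\n' "$glob_loop"
printf 'is the unset f a file?        %s\n' "$f_is_file"

rm -rf "$demo_dir"
```

The same reasoning applies to the grep on "${file}": FILE and file are different variables, so the grep is handed an empty file name.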
I used some helpful suggestions from @MadeInGermany as well as yourself. I am not sure how to address the grep, though.
Thank you very much :).
#!/bin/bash
DIR=/home/cmccabe/Desktop/NGS/API
log=$HOME/virus-scan.log
{
echo "The extensions are"
ls | awk -F'\.' 'NF>1 {ext[$NF]++} END {for (i in ext) print ext,i}'
} > $log
scanned=0
for FILE in "$DIR"/*
do
# check file length is nonzero otherwise commands may be repeated
if [ -s "$FILE" ]; then
{
date
clamscan -r "$FILE"
} >> $log
((scanned++))
if grep -iq "OK" "${file}"; then
echo "echo nothing detected by scan"
else
if [[ -f "$f" ]]; then
mv -f "$f" /home/cmccabe/Desktop/API/$filename /home/cmccabe/quarantine
# rm -f "$f"
echo "The files infected have been moved to the folder at /home/cmccabe/quarantine"
fi
fi
done
[ $scanned -eq 0 ] && echo "nothing detected by scan" >> $log
What would happen with an infected file called This_file_OK_and_not_infected? I would suggest that your grep will ignore it.
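A quick way to see the false positive is to run both patterns against a made-up log. The two lines below only imitate the "path: status" layout seen in virus-scan.log; the paths and the signature name Example.Signature are placeholders, not real scanner output.

```shell
#!/bin/bash
# Fake log for illustration only; a real one would come from clamscan.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
/tmp/demo/clean_file.bam: OK
/tmp/demo/This_file_OK_and_not_infected: Example.Signature FOUND
EOF

# Substring test: "OK" also appears inside the infected file's name.
loose_matches=$(grep -ic "OK" "$sample_log")

# Anchored test: count only lines whose status field is FOUND.
found_matches=$(grep -c " FOUND$" "$sample_log")

echo "lines containing OK:   $loose_matches"
echo "lines ending in FOUND: $found_matches"

rm -f "$sample_log"
```

Anchoring on the status at the end of the line avoids depending on what the file happens to be called.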
I have this section of code reading the output:-
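while IFS= read -r line # -r keeps backslashes in paths intact
do
line="${line% FOUND}" # strip the trailing status word
virus_name="${line##*: }" # text after the last ": " is the signature name
file_name="${line%: *}" # text before the last ": " is the file path
((virus_count=$virus_count+1))
printf " %s\n" "${file_name}" # Output to screen
printf "%s\n" "${file_name}" >&3 # Output to log_file
done < <(grep " FOUND$" "$scan_log") 3>log_file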
Obviously scan_log is defined earlier and is written to by clamav.
This then gives me output to screen and in the file log_file with a list of infected files, which I then deal with.
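As one possible way of "dealing with" the list, here is a minimal sketch that moves every file reported as FOUND into a quarantine directory. The function name quarantine_found, its two arguments, and the fake log are assumptions made for this example; it uses a pipe into the loop rather than process substitution so that it prints the moved paths instead of keeping a counter.

```shell
#!/bin/bash
# quarantine_found LOG DEST (hypothetical helper, not from the thread):
# move each file that LOG reports as FOUND into DEST and print its old path.
quarantine_found() {
    scan_log=$1
    dest=$2
    mkdir -p "$dest" || return 1
    grep " FOUND$" "$scan_log" | while IFS= read -r line; do
        line=${line% FOUND}      # drop the trailing status word
        file_name=${line%: *}    # keep everything before the last ": "
        if [ -f "$file_name" ]; then
            mv -f -- "$file_name" "$dest"/ && printf '%s\n' "$file_name"
        fi
    done
}

# Demo with a fake log so that no scanner is needed.
# The directory is left in place for inspection; remove it with rm -rf "$work".
work=$(mktemp -d)
touch "$work/good.bam" "$work/bad.bam"
printf '%s\n' "$work/good.bam: OK" \
              "$work/bad.bam: Example.Signature FOUND" > "$work/scan.log"

moved=$(quarantine_found "$work/scan.log" "$work/quarantine")
echo "moved: $moved"
```

The clamscan man page also documents a --move=DIRECTORY option, which may make the parsing step unnecessary; it is worth checking against the installed version before relying on it.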
Does this help?
Robin
So if I am following correctly, something more like:
#!/bin/bash
DIR=/home/cmccabe/Desktop/NGS/API
log=$HOME/virus-scan.log
{
echo "The extensions are"
ls | awk -F'\.' 'NF>1 {ext[$NF]++} END {for (i in ext) print ext,i}'
} > $log
scanned=0
for FILE in "$DIR"/*
do
# check file length is nonzero otherwise commands may be repeated
if [ -s "$FILE" ]; then
{
date
clamscan -r "$FILE"
} >> $log
((scanned++))
while read line
do
line="${line% FOUND}"
virus_name="${line#* }"
file_name="${line%: *}"
((virus_count=$virus_count+1))
printf " %s\n" "${file_name}" # Output to screen
printf "%s\n" "${file_name}" >&3 # Output to log
done < <(grep " FOUND$" $scan_log) 3>log
echo "The files infected have been moved to the folder at /home/cmccabe/quarantine"
fi
fi
done
[ $scanned -eq 0 ] && echo "nothing detected by scan" >> $log
Thank you for your help :).
I'm not sure why you have the loop for FILE in "$DIR"/* when you follow it up with clamscan -r "$FILE". The -r flag asks clamscan to search recursively, so this calls clamscan once for each item in the directory. Can you not just run clamscan -r "$DIR" instead? I find that running clamscan has an overhead of several seconds as it loads the definitions, so you could be scanning for hours just from calling the process repeatedly. An alternative might be to list the files into another file and use that as input with the -f flag, e.g. clamscan -if /tmp/file_list.txt
I've added the -i flag to list only infected files, which might make reading the output easier.
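A rough sketch of the file-list approach follows. The helper names build_file_list and scan_listed_files and the demo files are invented for this example; the scan function is only defined, not called, so the sketch runs even where ClamAV is not installed.

```shell
#!/bin/bash
# build_file_list DIR LIST (hypothetical helper): write the path of every
# non-empty regular file under DIR to LIST, one per line.
build_file_list() {
    find "$1" -type f -size +0 > "$2"
}

# scan_listed_files LIST (hypothetical helper): a single clamscan start-up for
# the whole list; -i prints only infected files, -f reads paths from LIST.
scan_listed_files() {
    clamscan -i -f "$1"
}

# Demo directory with two non-empty files and one empty file.
demo_dir=$(mktemp -d)
file_list=$(mktemp)
printf 'data\n' > "$demo_dir/a.bam"
printf 'data\n' > "$demo_dir/b.bam.bai"
: > "$demo_dir/empty.txt"

build_file_list "$demo_dir" "$file_list"
list_count=$(wc -l < "$file_list")
echo "files queued for scanning: $list_count"
```

Calling scan_listed_files "$file_list" >> "$HOME/virus-scan.log" would then replace the per-file loop. The -size +0 test takes over the job of the [ -s "$FILE" ] check.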
You have the basis of some good code here though, keep going
Do you have a virus signature to test this with?
Robin