I need to delete a large set of files (close to 100K) from a directory, based on the IDs listed in an input file.

That would have been nice to know three pages ago...

while read ID
do
        echo "${ID}.jpg"
        echo "thumbnail/${ID}-tn.jpg"
done < inputfile | xargs 2> errlog

Thank you for the quick response. The deletions are happening as expected, but the script below is not capturing the error log or the list of deleted files as expected.

# Usage:  rm_list.sh list_file error_file  removed_file
#
while read ID
do
        [ -f "/photos/${ID}.jpg" ] && echo "/photos/${ID}.jpg"
        [ -f "/photos/thumbnail/${ID}-tn.jpg" ] && echo "/photos/thumbnail/${ID}-tn.jpg"
done < ID >list_file

if [ $# -ne 3 ]
then
   echo "Usage:"
   echo "   rm_list.sh list_file error_file removed_file"
   exit 1
fi
 
LIST=$1
ERR=$2
REM=$3
 
if ! [ -f $LIST ]
then
    echo "List file $LIST not found"
    exit 2
fi
 
xargs rm < $LIST 2> $ERR
nawk -F: 'NR==FNR{d[$1]++;next} !($0 in d)' $ERR $LIST > $REM

Thanks

You appear to have posted the code from the brief moment in time before I removed the [ -f filename ] && echo filename bits. You don't get missing file errors because it doesn't put missing files into the list in the first place.
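To illustrate the point, here is a minimal sketch of the unfiltered approach. The function name rm_from_ids is made up for this example, and it assumes the IDs contain no whitespace (they are 8-digit numbers in this thread) and that it is run from the directory holding the pictures. Because nothing is pre-filtered with [ -f ... ], rm itself reports each missing file on stderr, and that is what ends up in the error log.

```shell
# Hypothetical helper: rm_from_ids ERRLOG < idfile
# Deletes ID.jpg and thumbnail/ID-tn.jpg (relative to the current
# directory) for every ID read from standard input.
# Names are NOT pre-filtered, so rm complains about each missing file
# and those complaints are collected in ERRLOG.
rm_from_ids() {
    errlog=$1
    while read -r ID
    do
        echo "${ID}.jpg"
        echo "thumbnail/${ID}-tn.jpg"
    done | xargs rm 2> "$errlog"
}
```

Something like `cd /photos && rm_from_ids errlog < ID` would then leave the "No such file" messages in errlog. The exact wording of those messages depends on the rm implementation.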

Thank you for the update. So what can be done to log the errors? With the code below I get an output file listing the files to delete, but I also need an error file.

while read ID
do
        [ -f "/photos/${ID}.jpg" ] && echo "/photos/${ID}.jpg"
        [ -f "/photos/thumbnail/${ID}-tn.jpg" ] && echo "/photos/thumbnail/${ID}-tn.jpg"
done < ID >list_file

You quoted my correct code in your last post but didn't use it. Use my correct code.

The code below went nowhere: it hung for a long time and I killed the script. Can you please tell me how I can capture the error file? I am already able to capture list_file.

# Usage:  rm_list.sh list_file error_file  removed_file
#
while read ID
do
       echo "/pictures/${ID}.jpg"
       echo "/pictures/thumbnail/${ID}-tn.jpg"
done >list_file 

if [ $# -ne 3 ]
then
   echo "Usage:"
   echo "   rm_list.sh list_file error_file removed_file"
   exit 1
fi
 
LIST=$1
ERR=$2
REM=$3
 
if ! [ -f $LIST ]
then
    echo "List file $LIST not found"
    exit 2
fi
 
xargs rm < $LIST 2> $ERR
nawk -F: 'NR==FNR{d[$1]++;next} !($0 in d)' $ERR $LIST > $REM

I am able to capture list_file from the below code:

while read ID
do
        [ -f "/photos/${ID}.jpg" ] && echo "/photos/${ID}.jpg"
        [ -f "/photos/thumbnail/${ID}-tn.jpg" ] && echo "/photos/thumbnail/${ID}-tn.jpg"
done < ID >list_file

---------- Post updated at 01:43 PM ---------- Previous update was at 01:27 PM ----------

By the way, I am using this code with an input file named ID, which contains only 8-digit numbers. What is the file named inputfile in your command? If I run the same script, it says it cannot open inputfile.

If you're reading ID's from standard input, leave <inputfile off.

So how can I capture list_file (the files to be deleted) and error_file (the files not found), based on the ID file?

I can get list_file (the files to be deleted) if I use the code below.

while read ID
do
        [ -f "/photos/${ID}.jpg" ] && echo "/photos/${ID}.jpg"
        [ -f "/photos/thumbnail/${ID}-tn.jpg" ] && echo "/photos/thumbnail/${ID}-tn.jpg"
done < ID >list_file

Can you help me capture error_file?

No, you can't. Use the other code.

Read from whatever file you want with <, or nothing at all by leaving it off completely.
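As a rough sketch of what that means in practice: the loop below has no redirection of its own, so whoever calls it decides where the IDs come from. The function name names_for_ids is invented for this example.

```shell
# Hypothetical helper: print the two candidate paths for every ID read
# from standard input. There is no "< file" inside, so the caller
# chooses the input source.
names_for_ids() {
    while read -r ID
    do
        echo "${ID}.jpg"
        echo "thumbnail/${ID}-tn.jpg"
    done
}

# Either of these feeds the loop:
#   names_for_ids < ID
#   cat ID | names_for_ids
# With neither, the loop waits for input typed at the terminal.
```

A loop with no input at all will sit there waiting on the terminal, which looks the same as a hang.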

I've tried a lot of different ways to capture the files not found with the above command, with no luck. Can someone guide me in capturing the files from the input file (ID) that are not found?

In what way does it not work?

I have tried in the below way and have even tried with some find commands.

while read ID
do
        [ -f "/photos/${ID}.jpg" ] && echo "/photos/${ID}.jpg"
        [ -f "/photos/thumbnail/${ID}-tn.jpg" ] && echo "/photos/thumbnail/${ID}-tn.jpg"
done < ID >list_file 2>error_file

Yes, and in what way did they not work?

I only get output in list_file; error_file is empty. Am I doing it the right way? Please correct me if I am wrong.

I don't think you're really looking at the code.

while read ID
do
        [ -f "/photos/${ID}.jpg" ] && echo "/photos/${ID}.jpg"
        [ -f "/photos/thumbnail/${ID}-tn.jpg" ] && echo "/photos/thumbnail/${ID}-tn.jpg"
done < ID >list_file 2>error_file

This is the code I told you not to use.

I don't know why you'd expect error_file to get anything in it when all you're doing inside the loop is echo. Read it: there's no rm in there.
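A failing [ -f ... ] test prints nothing on stderr, so 2>error_file has nothing to catch. If the goal is a list of found files plus a list of missing ones without deleting anything yet, the missing names have to be written out explicitly. A minimal sketch, assuming a made-up function name split_ids and paths without whitespace:

```shell
# Hypothetical helper: split_ids DIR LIST_FILE ERROR_FILE < idfile
# Paths that exist go to LIST_FILE, paths that do not go to ERROR_FILE.
# Nothing is deleted here.
split_ids() {
    dir=$1 list=$2 err=$3
    while read -r ID
    do
        for f in "$dir/${ID}.jpg" "$dir/thumbnail/${ID}-tn.jpg"
        do
            if [ -f "$f" ]
            then
                echo "$f"          # found: goes to LIST_FILE via stdout
            else
                echo "$f" >&3      # missing: goes to ERROR_FILE via fd 3
            fi
        done
    done > "$list" 3> "$err"
}
```

For example, `split_ids /photos list_file error_file < ID`, and later `xargs rm < list_file` when it is time to delete. Files can of course appear or disappear between the two steps.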

I'm hoping the ID after the < on the last line is a valid filename. It's supposed to be the input file you're reading IDs from. If it's not, it's wrong.

Try this:

while read ID
do
        echo "${ID}.jpg"
        echo "thumbnail/${ID}-tn.jpg"
done < inputfile | xargs rm 2> errlog

I left off the 'rm' by accident. It must have been printing all the filenames, as xargs does when it isn't given a program to run. If you'd mentioned this, that would have helped discover the problem. But all you ever say is 'not working'...
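A quick way to see that behaviour, using two throwaway names:

```shell
# With no command given, xargs runs echo on its input, so the names are
# printed rather than removed.
printf 'a.jpg\nb.jpg\n' | xargs        # prints: a.jpg b.jpg

# With rm named explicitly, the same names become arguments to rm:
#   printf 'a.jpg\nb.jpg\n' | xargs rm 2> errlog
```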


Thank you for the post. Is there a way to get only the output file and error_file, with the rm operation done later?

while read ID
do
        echo "${ID}.jpg"
        echo "thumbnail/${ID}-tn.jpg"
done < inputfile | xargs rm 2> errlog

Here ID is my input file. I don't want the rm operation at this point; I want to capture the output file and then the files not found in error_file. The code below works perfectly for the output file, which lists all the file names to be deleted. Can I get error_file (the files not found) the same way?

while read ID
do
        echo "${ID}.jpg"
        echo "thumbnail/${ID}-tn.jpg"
done < ID >list_file

---------- Post updated 08-24-12 at 11:51 AM ---------- Previous update was 08-23-12 at 05:05 PM ----------

Is there a way to capture which files have been deleted by the while loop below?

while read ID
do
        echo "${ID}.jpg"
        echo "thumbnail/${ID}-tn.jpg"
done < inputfile | xargs rm 2> errlog

---------- Post updated at 11:52 AM ---------- Previous update was at 11:51 AM ----------

while read ID
do
        echo "${ID}.jpg"
        echo "files/${ID}-tn.jpg"
done < ID | xargs rm 2> errlog

The input file here is ID.

Not unless you have a psychic computer. How should it know which ones don't get deleted, before it's deleted them...?
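After rm has run, though, the result can be checked. One possible sketch, not the only way to do it: feed rm a list of paths, then log every listed path that no longer exists. The function name remove_and_log is made up, paths are assumed to contain no whitespace, and the list is assumed to hold only files that existed beforehand, since a path that was never there would also show up as "removed".

```shell
# Hypothetical helper: remove_and_log LIST_FILE REMOVED_FILE ERROR_FILE
# LIST_FILE holds one path per line. rm's complaints go to ERROR_FILE;
# afterwards, every listed path that is gone is written to REMOVED_FILE.
remove_and_log() {
    list=$1 removed=$2 err=$3
    xargs rm < "$list" 2> "$err"
    while read -r f
    do
        [ -e "$f" ] || echo "$f"
    done < "$list" > "$removed"
}
```

For example, `remove_and_log list_file removed_file error_file`.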