Delete files listed in text file

kmanivan82 · February 6, 2015, 5:48pm

Hi Team,

Here's the scenario,

I have a text file called "file_list.txt". Its content is as follows.
111.tmp
112.tmp
113.tmp
114.tmp

These files are located in the "workdir" directory, which also contains many other files. Only the files listed in file_list.txt should be deleted from workdir. Our current code is as follows.

while IFS= read -r line
do
	rm -f "$workdir/$line"
done < file_list.txt

But this while loop performs poorly. In production, the text file will list more than 100000 files. Can anyone suggest code that improves the performance?

Thanks

RudiC · February 6, 2015, 6:04pm

I'm afraid deleting 100000 files from a directory will take time, especially on a production system where other things are going on as well.
cd-ing to $workdir would not improve performance significantly; you might want to give

while read A; read B; read C; do echo rm -f $A $B $C; done < file

a try, as it saves two thirds of process creations. Or even

< file xargs -n8 echo rm -f

, eliminating 7 of every 8 process creations.
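To see the batching, here is a minimal sketch that runs the xargs variant against a scratch directory. The directory, the 20 dummy files, and the variable names are made up for the demo; only the xargs -n 8 call comes from the suggestion above.

```shell
#!/bin/sh
# Scratch directory standing in for $workdir (hypothetical test data).
workdir=$(mktemp -d)
list="$workdir/file_list.txt"

# Create 20 dummy files and a list naming them.
i=1
while [ "$i" -le 20 ]; do
    : > "$workdir/$i.tmp"
    echo "$i.tmp" >> "$list"
    i=$((i + 1))
done

# Dry run: echo shows the commands xargs would build, at most 8 names each.
dry_run=$(cd "$workdir" && xargs -n 8 echo rm -f < "$list")
printf '%s\n' "$dry_run"

# Real run: 3 rm processes instead of 20.
(cd "$workdir" && xargs -n 8 rm -f < "$list")

# Count the .tmp files left behind (the list file itself is not counted).
remaining=$(ls "$workdir" | grep -c '\.tmp$')
echo "remaining: $remaining"
```

The dry run prints three command lines (8, 8 and 4 names). Note that xargs splits its input on whitespace and interprets quotes, so this only suits file names without spaces or quote characters.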

kmanivan82 · February 6, 2015, 7:07pm

Thank you RudiC for your reply.

There is a small change in the requirement. The file names in "file_list.txt" will not actually have the .tmp extension, so I need to append ".tmp" to each name when deleting.

Can you help me out?

mjf · February 6, 2015, 9:21pm

kmanivan82,
To use RudiC's 2nd suggestion, try:

awk '{print $1".tmp"}' file_list.txt | xargs -n8 echo rm -f

Remove echo once you have tested.

RudiC,
It appears your 1st suggestion will not process the last 1 or 2 records/file names from the input file if the total number of records/file names in the file is not evenly divisible by 3.
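A small reproduction, assuming a hypothetical four-line input: the loop condition is the list "read A; read B; read C", and only the exit status of the last read decides whether the body runs, so a trailing partial group is silently dropped.

```shell
#!/bin/sh
# Hypothetical input: 4 names, which is not a multiple of 3.
tmp=$(mktemp)
printf '%s\n' a b c d > "$tmp"

# Second pass: read A gets "d", read B and read C hit end of file,
# the condition list fails, and "d" is never printed.
out=$(while read A; read B; read C; do echo "rm -f $A $B $C"; done < "$tmp")
printf '%s\n' "$out"   # prints: rm -f a b c

rm -f "$tmp"
```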

Don_Cragun · February 6, 2015, 10:36pm

You're right. But for the original problem:

while read A; do read B; read C; echo rm -f $A $B $C; done < file

would have worked.

With the new filename modification requirements, this approach needs more work.
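One possible way to do that extra work, offered as a sketch rather than a tested production script: the ${X:+word} expansion yields nothing when X is empty, so a short final group does not produce a bare ".tmp" argument. The temporary list file and its four names are assumptions for the demo.

```shell
#!/bin/sh
# Hypothetical list without the .tmp suffix, 4 entries (not a multiple of 3).
list=$(mktemp)
printf '%s\n' 111 112 113 114 > "$list"

out=$(
    while read -r A; do
        read -r B
        read -r C
        # ${X:+"$X.tmp"} expands to nothing when X is empty,
        # so missing names in the last group are simply skipped.
        echo rm -f ${A:+"$A.tmp"} ${B:+"$B.tmp"} ${C:+"$C.tmp"}
    done < "$list"
)
printf '%s\n' "$out"

rm -f "$list"
```

With this input the dry run prints "rm -f 111.tmp 112.tmp 113.tmp" followed by "rm -f 114.tmp". As before, remove echo once the output looks right.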

RudiC · February 7, 2015, 2:40am

Thanks. When testing with 8 lines, I had an extra empty line in my file which I didn't notice...

kmanivan82 · February 7, 2015, 4:36am

Thank you RudiC and mjf. The performance statistics are as follows.

using while loop
**** Script started - 2015-02-07 02:18:52 ****
**** Script ended at 2015-02-07 02:19:45 ****
Duration: 53 seconds to delete 5400 files

using the syntax given by mjf/RudiC
**** Script started - 2015-02-07 02:05:24 ****
**** Script ended at 2015-02-07 02:05:33 ****
Duration: 9 seconds to delete 5400 files

Thank you once again.

MadeInGermany · February 7, 2015, 10:34am

If -n 8 gives you a 6x speedup, then you should try without it, i.e. no limit.
Further, try appending .tmp with the help of -i:

<file_list.txt xargs -i rm -f {}.tmp

If -i is not available try

<file_list.txt xargs -I {} rm -f {}.tmp
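One caveat: with -I (and the older -i), xargs runs the command once per input line, so the batching gain is lost. A sketch that keeps the batching is to append the suffix with sed first and let xargs pack as many names per rm as the system allows. The scratch directory and the keep.tmp file below are made up for the demo, and the approach assumes file names without spaces or quote characters.

```shell
#!/bin/sh
# Scratch directory standing in for $workdir (hypothetical test data).
workdir=$(mktemp -d)
list="$workdir/file_list.txt"

# Listed names carry no .tmp suffix, as in the changed requirement.
for n in 111 112 113 114; do
    : > "$workdir/$n.tmp"
    echo "$n" >> "$list"
done
: > "$workdir/keep.tmp"   # not listed, so it must survive

# Append .tmp to every line, then hand the names to rm in large batches.
(cd "$workdir" && sed 's/$/.tmp/' file_list.txt | xargs rm -f)

ls "$workdir"
```

After the run only file_list.txt and keep.tmp should remain in the scratch directory.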