Removing a large number of temp files

Hi All,

I need to delete a huge number of temp files created at run time, approximately 16,700+ files. We never imagined the list of files would grow this big. Our script worked fine when the list was shorter, but now that the list is huge we get a "List too long" error.

 
 remvfile='/etc/usr/bin/removefilelist'
 rm -f `cat $remvfile`
 

The code is shown above. removefilelist is the file that contains the list of files to delete. Sample content of removefilelist is shown below.

 /etc/usr/workfile/filename1
 /etc/usr/workfile/filename2
 /etc/usr/workfile/filename3
 /etc/usr/workfile/filename4
 

We also don't know how many files will end up in the list; it will probably exceed 50,000. I would appreciate your help.

Thanks.

Try:

# Read one filename per line; quoting keeps names with spaces intact.
while IFS= read -r X; do
    rm -f -- "$X"
done < /etc/usr/bin/removefilelist

Thanks Zaxxon,

But I would like to avoid while loops because they would introduce performance issues.

Let's consider this, then:

xargs rm -f < "$remvfile"

I tested on 100,000 files in a temporary directory like this:-

i=0
until [ "$i" -gt 99999 ]
do
   i=$((i + 1))
   touch "$i"
done

ls -1 > /tmp/remvfile
xargs rm < /tmp/remvfile

Does this help? Far fewer rm commands are run, so it should be faster: xargs packs many names into each rm call instead of spawning a process for each file. The I/O overhead of the deletions themselves will still be the same. In my test, rm was actually run only five times.
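
If you want to see the batching for yourself, here is a minimal sketch. It assumes a POSIX sh with awk and xargs available; the paths in the list are made up and nothing is deleted, because a small sh -c command that prints its argument count stands in for rm. The exact number of batches depends on your system's limits, so treat the figure it prints as specific to your machine.

```shell
#!/bin/sh
# Sketch: count how many commands xargs launches for a long list.
# "list" is a hypothetical scratch file holding 100000 made-up paths.
list=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100000; i++) print "/tmp/example/file" i }' > "$list"

# Each run of the inner sh prints one line (its argument count),
# so the number of output lines equals the number of commands xargs ran.
batches=$(xargs sh -c 'echo "$#"' sh < "$list" | wc -l | tr -d ' ')
names=$(wc -l < "$list" | tr -d ' ')
echo "names: $names, commands run by xargs: $batches"

# The system limit on argument bytes, which the backtick version exceeded.
getconf ARG_MAX

rm -f "$list"
```

The backtick version tries to put all names on a single rm command line, which is what runs into the limit; xargs stays below it by splitting the list.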

Robin

It would probably be better to ensure that the directory that contains the temporary files has no other files in it, so that you can simply remove the whole directory and recreate it:

rm -r temp_directory
mkdir temp_directory
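
As a rough illustration of that idea, the sketch below uses a throwaway directory from mktemp -d in place of the real work directory (the name workdir and the 500 dummy files are assumptions for the demo only). Note that a recreated directory gets default ownership and permissions, which may differ from the original.

```shell
#!/bin/sh
# Sketch: keep scratch files in a directory of their own so cleanup is one rm -r.
# "workdir" is a hypothetical location created only for this demonstration.
workdir=$(mktemp -d)

# Simulate a run that leaves many temp files behind.
i=1
while [ "$i" -le 500 ]; do
    : > "$workdir/filename$i"
    i=$((i + 1))
done
before=$(ls "$workdir" | wc -l | tr -d ' ')

# Remove the directory and recreate it empty. No file names are passed
# on the command line, so the argument-length limit is never involved.
rm -r "$workdir"
mkdir "$workdir"
after=$(ls "$workdir" | wc -l | tr -d ' ')
echo "files before: $before, after: $after"

# Tidy up the demo directory itself.
rmdir "$workdir"
```

This prints "files before: 500, after: 0".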

I tend to use for loops for this kind of thing :-

for file in `cat /etc/usr/bin/removefilelist`
do
  rm -f "$file"
done

...which is a bad habit: it is a useless use of cat and a dangerous use of backticks, because the unquoted output is split on whitespace before the loop ever sees it.
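
To make the danger concrete, here is a small sketch with a made-up list containing one name with a space in it. It only counts what each loop sees and deletes nothing; the file name "old report" is purely illustrative.

```shell
#!/bin/sh
# Sketch: an unquoted command substitution splits names on whitespace.
# "list" is a hypothetical scratch file with a single entry: "old report".
list=$(mktemp)
printf '%s\n' 'old report' > "$list"

# The for loop receives two words, "old" and "report", not one name.
count_split=0
for f in $(cat "$list"); do
    count_split=$((count_split + 1))
done

# Reading line by line keeps the name in one piece.
count_lines=0
while IFS= read -r f; do
    count_lines=$((count_lines + 1))
done < "$list"

echo "words seen by for loop: $count_split, names seen by read: $count_lines"
rm -f "$list"
```

This prints "words seen by for loop: 2, names seen by read: 1". With rm in the loop body, the for version would try to delete two files that are not on the list. An entry containing a wildcard such as * would also be expanded by the shell.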


Hi.

I'm with Robin on this: use xargs. You can control how many items to process, how many characters, etc. I also like the idea from jgt of rm -rf <directory>, provided care is taken that no valuable files are present in the directory.
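
For instance, the sketch below shows the two controls I have in mind, -n (maximum number of arguments per command) and -s (maximum command-line size). It uses echo in front of rm so nothing is deleted, and the names a through g are placeholders. How -s counts bytes differs slightly between xargs implementations, so I only show that it splits the list, not where.

```shell
#!/bin/sh
# Sketch: limiting batch size with xargs; echo stands in for a real rm.
# "list" is a hypothetical scratch file with seven placeholder names.
list=$(mktemp)
printf '%s\n' a b c d e f g > "$list"

# -n 3: at most three names per command, so seven names need three commands.
by_count=$(xargs -n 3 echo rm -f < "$list")
printf '%s\n' "$by_count"

# -s 20: cap the size of each constructed command line at roughly 20 bytes.
# The seven names cannot fit in one such line, so several commands are run.
by_size=$(xargs -s 20 echo rm -f < "$list" | wc -l | tr -d ' ')
echo "commands run with -s 20: $by_size"

rm -f "$list"
```

The -n 3 part prints "rm -f a b c", "rm -f d e f" and "rm -f g" on separate lines.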

One problem can be filenames with shell-specific meta-characters in them, the most common being a space, like "t 10". Here are two methods for dealing with that situation:

#!/usr/bin/env bash

# @(#) s3       Demonstrate remove files in groups, xargs.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C /bin/ls xargs tr

# Name of the list file: first argument, defaulting to data2.

FILE=${1-data2}

# Create a number of files.

pl " Temporary files:"
rm -f t*
touch t{1..9}
/bin/ls t*

# Put temporary files into a list.
/bin/ls -1 t* > $FILE

pl " Input data file $FILE, columnized:"
column $FILE

pl " Results, xargs, default command is echo:"
xargs < $FILE

pl " Files are still present:"
/bin/ls t*
echo " Exit status from ls: $?"

pl " Results, change command to rm, files are now gone, expect message:"
xargs rm < $FILE
/bin/ls t*
echo " Exit status from ls: $?"

# Now with a file with a special character -- a space -- in it.
pl " Temporary files:"
rm -f t*
touch t{1..9} "t 10"
/bin/ls t*

pl " Results, not all files are gone:"
xargs rm < $FILE
/bin/ls t*
echo " Exit status from ls: $?"

# Again with a file with a special character -- a space -- in it.
pl " Temporary files:"
rm -f t*
touch t{1..9} "t 10"
/bin/ls t*

pl " List files with quotes around them, (displayed columnized):"
/bin/ls -Q t* |
tee $FILE |
column

pl " Results, files are now gone, expect message:"
xargs rm < $FILE
/bin/ls t*
echo " Exit status from ls: $?"

# Third time, with a file with a special character -- a space -- in it.
pl " Temporary files:"
rm -f t*
touch t{1..9} "t 10"
/bin/ls t*

pl " Add a null character to the end of each name:"
/bin/ls -1 t* |
tr '\n' '\0' |
tee $FILE 

pl " Results, files are now gone, expect message:"
xargs --null rm < $FILE
/bin/ls t*
echo " Exit status from ls: $?"

exit 0

producing:

$ ./s3

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.6 (jessie) 
bash GNU bash 4.3.30
/bin/ls ls (GNU coreutils) 8.23
xargs (GNU findutils) 4.4.2
tr (GNU coreutils) 8.23

-----
 Temporary files:
t1  t2  t3  t4  t5  t6  t7  t8  t9

-----
 Input data file data2, columnized:
t1      t2      t3      t4      t5      t6      t7      t8      t9

-----
 Results, xargs, default command is echo:
t1 t2 t3 t4 t5 t6 t7 t8 t9

-----
 Files are still present:
t1  t2  t3  t4  t5  t6  t7  t8  t9
 Exit status from ls: 0

-----
 Results, change command to rm, files are now gone, expect message:
/bin/ls: cannot access t*: No such file or directory
 Exit status from ls: 2

-----
 Temporary files:
t 10  t1  t2  t3  t4  t5  t6  t7  t8  t9

-----
 Results, not all files are gone:
t 10
 Exit status from ls: 0

-----
 Temporary files:
t 10  t1  t2  t3  t4  t5  t6  t7  t8  t9

-----
 List files with quotes around them, (displayed columnized):
"t 10"  "t1"    "t2"    "t3"    "t4"    "t5"    "t6"    "t7"    "t8"    "t9"

-----
 Results, files are now gone, expect message:
/bin/ls: cannot access t*: No such file or directory
 Exit status from ls: 2

-----
 Temporary files:
t 10  t1  t2  t3  t4  t5  t6  t7  t8  t9

-----
 Add a null character to the end of each name:
t 10t1t2t3t4t5t6t7t8t9
-----
 Results, files are now gone, expect message:
/bin/ls: cannot access t*: No such file or directory
 Exit status from ls: 2

See man pages for details.

Best wishes ... cheers, drl
