Issue with rm command on a terabyte file system

We have a production file system containing 6+ million files and more than 1 terabyte of data. When we try to delete selected files through a weekly script, the files are not deleted.

Please advise; any ideas are welcome.

What is the output from ls -l file... for the files you want to delete?

What is the output from ls -ld directory for the directory (or directories) containing the file(s) that you couldn't delete?

What diagnostics did rm file... produce for the files that were not removed?

What user and group privileges are in effect when the script is running? (I.e., what is the output from the command id?)

What shell is being used to run the script that is failing? What are the exact commands in the script that are failing? What operating system (including version) are you using?

Is this scheduled through cron, and if so, is your script called properly, and does it set up the appropriate environment before it starts doing its active work?
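If it is cron driven, bear in mind that cron supplies a very sparse environment. A minimal sketch only (the script name, schedule and paths here are made up for illustration):

# Hypothetical crontab entry: run the cleanup at 02:30 every Sunday
30 2 * * 0 /usr/local/bin/weekly_cleanup.sh >/var/log/weekly_cleanup.log 2>&1

And at the top of the script itself:

#!/bin/ksh
# cron does not read your login profile, so set PATH explicitly
PATH=/usr/bin:/bin:/usr/sbin:/sbin
export PATH
# Log the effective user and group (the id question above)
id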

I would guess that the rm request takes so long that you interpret the wait as the deletion not working, assuming that the job which deletes files 'worked' before.

Some filesystems do not perform well with very large numbers of files in a directory, and the response time can literally be an hour for a simple

ls /path/to/files/*.extension

command.
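One way to sidestep that (just a sketch; the path and extension are placeholders) is to let find stream the names rather than have the shell expand the wildcard into one enormous, sorted argument list:

find /path/to/files -type f -name '*.extension' -print

Note that, unlike the ls wildcard, this descends into subdirectories unless you prune them.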

For starters, please post the script and mention what Operating System and Shell you have.

At a guess, your script generates a command line which is too long for the Operating System.
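To illustrate that failure mode (the directory and pattern here are hypothetical), a wildcard that expands to millions of names typically fails with an "Argument list too long" error, whereas streaming the names avoids building any single command line:

# Likely to fail on a huge directory - the shell expands the wildcard
# into one enormous command line:
rm /path/to/files/*.tmp

# Counts the matching names without building that command line:
find /path/to/files -type f -name '*.tmp' -print | wc -l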

Bearing in mind that we know nothing about your Operating System or Shell, the generic solution format is something like this (in ksh), where the criterion is to delete files more than 180 days old:

find /cullable_tree/ -type f -mtime +180 -print | while IFS= read -r FILENAME
do
     if [ -f "${FILENAME}" ]
     then
          # List the candidate file first; delete only once the output looks right
          echo "${FILENAME}"
          # rm "${FILENAME}"
          # (Uncomment the rm when tested thoroughly)
     fi
done

The "if -f" covers circumstances where the filename contains weird characters.

This sort of construct never generates long command lines. It may not be fast, but it gets there in the end.

You could shorten this to a single-line find command (see the sketch below), but please could you (kppublicmail) answer the questions posed in the earlier responses? That will mean you get more useful suggestions.
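For completeness, the single-line version (check the list with -print before adding the -exec rm) would be along the lines of:

find /cullable_tree/ -type f -mtime +180 -exec rm {} \;

The -exec ... {} \; form runs rm once per file, which is not quick but never builds a long command line.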