Remove files from subdirectories given a list of filenames

Dear all,

I have a dir structure like

                                    main_dir
                  At_nn                Ag_js              Nf_hc .... 
           mcd32 mgd43...     mcd32 mgd43...         mcd32 mgd43...  

and each subdir (e.g. mcd32, mgd43) contains files.

Now I have a list (tobedelete.txt) of filenames (not complete paths). I want to delete only the files that are listed in tobedelete.txt.

This is what I am trying: first I go to the directory main_dir and run

find . -name tobedelete.txt -exec rm {};

but it doesn't work.
Any comments will be appreciated.
Cheers.

You could try the following:

for i in `cat tobedelete.txt`
do
find . -name ${i} -exec rm {} \;
done

Although, to be sure it's going to remove the right files, I'd do this first:

for i in `cat tobedelete.txt`
do
find . -name ${i} -print
done

Regards

Dave:)

There is a fundamental misunderstanding here. The "find" command posted will look for a file called "tobedelete.txt" and delete that file (not the files listed in that file).
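
To make the point concrete, here is a minimal sketch in a throwaway directory. The sandbox layout and the single data file are assumptions made up for illustration; they are not the poster's real tree. Note also that the original command ended with `{};`, whereas the terminator of `-exec` has to reach find as a separate, escaped `\;`.

```shell
# Throwaway sandbox: one list file and one data file that the list names.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/AAA/aa1"
touch "$sandbox/AAA/aa1/AAA_aa1_fileextra.txt"
printf '%s\n' 'AAA_aa1_fileextra.txt' > "$sandbox/tobedelete.txt"

# -name is compared against file names on disk; find never opens the list.
matches=$(cd "$sandbox" && find . -name tobedelete.txt -print)
echo "$matches"

rm -rf "$sandbox"
```

The only match is the list file itself, so adding `-exec rm {} \;` to this command would delete the list and nothing else.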

The code posted by "gull04" is scary.

Please post the actual directory structure in machine-readable format.

I've used this type of routine many times. In view of the original post made by yogeshkumkar, and given that I pointed out that he should first print the output of the find command, I'd just like to understand how my code was "scary".

I'm quite happy to admit that I'm as likely to screw up as the next guy, but in this case I can't really see what is wrong with the advice that I gave. As I'm always keen to learn, I'd appreciate your feedback.

I don't mean to be critical, I'm just interested to know. I wouldn't deliberately give someone poor advice, and I use "find" this way frequently. Based on the original post I thought the advice was OK; I was just trying to solve a problem for someone.

Perhaps I should have said that the command should be "rm -i", so that you can check each file prior to deletion.

for i in `cat tobedelete.txt`
do
find . -name ${i} -exec rm -i {} \;
done

Regards

Dave

Thanks for the comments, Methyl and Dave.
The answer by Dave doesn't work. It simply removes the content of tobedeleted.txt.

The actual dir structure is like

in the main_dir
there are subdirs like AAA, BBB, CCC, ....

inside AAA there are subdirs like aa1, aa2, aa3, ...
inside aa1 there are files like AAA_aa1_file1.txt, AAA_aa1_fileextra.txt, AAA_aa1_mod43a4.txt, ....
inside aa2 there are files like AAA_aa2_file1.txt, AAA_aa2_fileextra.txt, AAA_aa2_mod43a4.txt, ....

inside BBB there are subdirs like bb1, bb2, bb3, ...
inside bb1 there are files like BBB_bb1_file1.txt, BBB_bb1_fileextra.txt, BBB_bb1_mod43a4.txt, ....
inside bb2 there are files like BBB_bb2_file1.txt, BBB_bb2_fileextra.txt, BBB_bb2_mod43a4.txt, ....

and so on, up to many hundreds of files. I managed to get the filenames which I must delete.

e.g. the content of tobedeleted.txt is

AAA_aa1_fileextra.txt
BBB_bb1_file1.txt
BBB_bb1_mod43a4.txt
CCC_cc4_mfg54d2.txt
GGG_gg6_hd21s1.txt
and so on....

I hope there is a way!

Hi yogeshkumkar,

What does this command return?

for i in `cat tobedelete.txt`
do
find . -name ${i} -print
done

Just for information, the tobedeleted.txt file should be in the directory where you are running the script from, or you should enter the path to the file as well, such as "for i in `cat /path/to/file/tobedeleted.txt`".

Regards

Dave

Thanks Dave,

find: paths must precede expression
Usage: find [-H] [-L] [-P] [path...] [expression]

The file "tobedeleted.txt" is in the same dir where AAA, BBB, and so on are.

pwd: /my/path/main_dir/
find . -name "tobedeleted.txt" | xargs rm

It works only if I have the full path of each filename in tobedeleted.txt.

How can I create the full path for each filename?

Hi yogeshkumkar,

Not a problem, glad that I could help.

Regards

Dave

Hi yogeshkumkar,

What output do you get from the find with the "-print" statement?

Regards

Dave

@gull04
Your script contains a couple of common problems:
1) It uses "for" with an open-ended list.
This breaks when the command line gets too long.
Also, the "for" breaks if any filename contains a space character.
2) The "find" is not restricted to regular files (there is no "-type f").

We can turn the script round to make it a bit more robust.
For the diagnostic version we could display what we are searching for, because I too am having trouble understanding the content of "tobedelete.txt" (or is it "tobedeleted.txt"?). There is nothing in your code which would overwrite the file.

if [ -f "tobedelete.txt" ]
then
    cat tobedelete.txt | while read filename
    do
        echo "${filename}"
        find . -type f -name "${filename}" -print
        echo ""
    done
else
    echo "File does not exist: tobedelete.txt"
fi
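
Once the diagnostic output looks right, the same loop can be turned into a deletion step. The following is only a sketch under assumptions: the function name `remove_listed` and its `delete` switch are hypothetical, and it swaps `cat ... | while read` for `while IFS= read -r ... done < file`, which keeps leading blanks and backslashes in a name intact. Each name is still passed to `-name` as a pattern, so a listed name containing `*`, `?` or `[` would match more than itself.

```shell
# Hypothetical helper: print (default) or remove (third argument "delete")
# every regular file under search_root whose name is in list_file,
# which holds one bare filename per line.
remove_listed() {
    search_root=$1
    list_file=$2
    mode=${3:-dry-run}

    if [ ! -f "$list_file" ]
    then
        echo "File does not exist: $list_file" >&2
        return 1
    fi

    while IFS= read -r filename
    do
        # Skip blank lines in the list.
        [ -n "$filename" ] || continue
        if [ "$mode" = "delete" ]
        then
            # stdin is redirected so nothing in the loop body eats the list.
            find "$search_root" -type f -name "$filename" -exec rm -- {} + < /dev/null
        else
            find "$search_root" -type f -name "$filename" -print < /dev/null
        fi
    done < "$list_file"
}
```

Calling `remove_listed . tobedelete.txt` from main_dir would only print the matches; `remove_listed . tobedelete.txt delete` would remove them. Running the dry run first and reading its output is the whole point of the two modes.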

@yogeshkumkar
Please post what Operating System and version you have and what Shell you are using.

@methyl,

You're absolutely right in your approach to making the script more robust and I must admit there were several basic assumptions made by me that were possibly optimistic.

I did assume:

IFS='
'

And again I'd say that I took yogeshkumkar at face value when I gave the advice, not that that is any excuse for sloppiness. I should have tested for file types in case there was a link or the like, and should have quoted the variable to ensure that it worked with spaces etc.
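
The effect of that IFS assumption can be seen with a small experiment. This is a sketch with a made-up two-line list (the names `plain.txt` and `with space.txt` are illustrative only); it counts how many words the "for" loop sees under the default IFS and under a newline-only IFS.

```shell
# A temporary list in which one name contains a space.
list=$(mktemp)
printf '%s\n' 'plain.txt' 'with space.txt' > "$list"

# Default IFS (space, tab, newline): the second line splits into two words.
count_default=0
for i in $(cat "$list")
do
    count_default=$((count_default + 1))
done

# IFS set to a newline only: each line of the list stays one word.
old_ifs=$IFS
IFS='
'
count_newline=0
for i in $(cat "$list")
do
    count_newline=$((count_newline + 1))
done
IFS=$old_ifs

echo "default IFS: $count_default words; newline IFS: $count_newline words"
rm -f "$list"
```

Even with IFS restricted to a newline, the unquoted expansion is still subject to filename globbing, and the whole list still has to fit on one command line, so the "open-ended list" concern remains.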

I work in a fairly pressured production/DR/test/development environment, so I have four of every machine from M9000s to T5120s, where I always have the facility to test. Possibly it has made me a little cavalier.

I'll try to remember that not everyone can restore Pb-sized file systems in minutes if things go pear-shaped.

@yogeshkumkar
I believe that we are waiting for some answers.

@gull04
Somebody with your experience of decent sized Sun servers is most welcome to continue to contribute to this board.

Personally I have never messed with IFS. There is always a better way.
What is a "Pb sized file system"? To me a "Tb" is quite big enough thank you.

@methyl

Thanks for the vote of confidence. Pb = petabyte = 1024 Tb; dead easy with ZFS. BTW thanks for being patient with me on the forum. I'm prepared to try: "never been wrong, never done anything".

Regards

Dave

Thanks all,

I managed it. It is simple. First create the list with full paths, given a list of filenames, as

for i in `cat /path/to/the/filenames.txt`;
  do 
    find /path/to/the/parent/dir/to/search/ ${i} -print;
  done >& filenames_withfullpath.txt

and then simply delete the files from the list as

find /path/to/the/parent/dir/to/search/ -name "filenames_withfullpath.txt" | xargs cat | xargs rm

cheers,
Yogesh

@yogeshkumkar
DANGEROUS SCRIPT
The first "find" command posted would be a syntax error on some O/S because there is no "-name" parameter. On my O/S it produces a list of EVERY file and directory in the directory named in the first parameter. Thus your script is in danger of deleting EVERY file regardless of whether they appear in "filenames.txt".

I've commented about using "for" with open-ended lists in an earlier post.
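
The danger of the missing "-name" can be reproduced safely in a scratch directory. This is a sketch assuming a find that accepts several starting paths (GNU find does); the sandbox and the names `keep_me.txt` and `delete_me.txt` are invented for the demonstration.

```shell
# Scratch tree with one file that should survive and one that should not.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/AAA/aa1"
touch "$sandbox/AAA/aa1/keep_me.txt" "$sandbox/AAA/aa1/delete_me.txt"

# Without -name, "delete_me.txt" is treated as a second starting path.
# It does not exist in the current directory, so find complains (discarded
# here), but it still lists the entire tree under the first path.
listing_without=$(find "$sandbox" delete_me.txt -print 2>/dev/null || true)

# With -name, only the entry whose name matches is listed.
listing_with=$(find "$sandbox" -name delete_me.txt -print)

echo "without -name:"
echo "$listing_without"
echo "with -name:"
echo "$listing_with"

rm -rf "$sandbox"
```

The first listing includes `keep_me.txt` and the directories themselves; feeding such a list to `xargs rm` is how files that were never in "filenames.txt" get deleted.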

Thanks methyl,

You are right. I actually missed the -name parameter when I replied.

My OS is Linux.
Linux taiga 2.6.16.60-0.27-smp #1 SMP Mon Jul 28 12:55:32 UTC 2008 x86_64 x86_64 x86_64 GNU/Linux

Yesterday it worked, but now it doesn't. It doesn't show any full paths for the filenames in the file "filenames.txt".

Presumably the previous run deleted all the files.