Looping through all folders in a folder.

I am trying to write a script that loops through all the folders within a given folder.
My intention is to access each folder and rename each file ending with fna.gz with the name of the folder it resides in.

#!/bin/bash

cd /p/w/d/

for f in /p/w/d/*; do
   echo "$f"

done

I'm going to take this step by step. At this point, if I print "f" it prints the whole path including the parent folders, i.e. /p/w/d/folder1, instead of just folder1. So first, how can I get it to print just folder1?

find . -name '*fna.gz' | while IFS= read -r f ; do fl=${f%/*}; fld=${fl##*/} ; echo mv "$f" "$fl/${fld}_${f##*/}" ; done

Test first. To perform the rename, adjust the command as needed, then remove the echo.
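One way to test first without touching real data is to run the same loop against a throwaway tree. This is only a sketch: the scratch directory, the names folder1 and sample.fna.gz, and the function name plan_renames are made up for illustration.

```shell
#!/bin/bash
# Hypothetical scratch tree; folder1 and sample.fna.gz are invented names.
scratch=$(mktemp -d)
mkdir -p "$scratch/folder1"
touch "$scratch/folder1/sample.fna.gz"

cd "$scratch" || exit 1

# Dry run: print the mv commands instead of executing them.
plan_renames() {
    find . -name '*fna.gz' | while IFS= read -r f; do
        fl=${f%/*}       # path of the containing folder
        fld=${fl##*/}    # name of that folder only
        echo mv "$f" "$fl/${fld}_${f##*/}"
    done
}

planned=$(plan_renames)
echo "$planned"

# Clean up the scratch tree.
cd / && rm -rf "$scratch"
```

If the printed mv commands look right, dropping the echo inside the loop performs the rename.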


If you remove the full path from the for loop, what does it give you?

#!/bin/bash
cd /p/w/d/
for f in *; do
   echo "$f"
done
find /p/w/d -type f -name '*.fna.gz' | while IFS= read -r a
do
 d=${a%/*}
 mv "$a" "${a%.*}.${d##*/}"
done

It returns unwanted files and folders.

What were "unwanted folders" in the context of your post#1?


Thanks, your solution works. I'm just trying to decode what you've done.

So I see here that the find function looks for files of a given type i.e. .fna.gz.
And then the pipe ( | ) passes each file to be read.

 find . -name '*fna.gz' | while IFS= read -r f ; do 

Now as the files are being read something is going on that I can't make out.
You seem to be creating 2 variables, the first of which captures the name of the folder and the second of which captures the name of the file in that folder. Is this correct?
Is this a regex? What do the values {}, % and # mean?

fl=${f%/*}; fld=${fl##*/} ; 

And then here you rename the files.

mv "$f" "$fl/${fld}_${f##*/}" ;  

Yes, two variables are created (fl holds the path of the containing folder, fld holds just that folder's name), but not with regexes. This is bash "Parameter Expansion"; see the section of that name in man bash.
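To make the expansions concrete, here is a small sketch that applies each one to a single made-up path (./folder1/sample.fna.gz is hypothetical, as is the variable name "name") and prints the result.

```shell
#!/bin/bash
# Hypothetical path, used only to show what each expansion yields.
f=./folder1/sample.fna.gz

fl=${f%/*}       # remove the shortest suffix matching "/*"  -> ./folder1
fld=${fl##*/}    # remove the longest prefix matching "*/"   -> folder1
name=${f##*/}    # remove the longest prefix matching "*/"   -> sample.fna.gz

echo "$fl"
echo "$fld"
echo "$name"

# The target name built by the rename command above.
target="$fl/${fld}_${name}"
echo "$target"
```

The patterns after % and # are shell glob patterns (the same kind used for filename matching), not regular expressions.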


It lists all the directories from the root, even though I've changed directories to where I want the work to be done. It's okay @rdtx1's solution worked. I just need to build on that now.


What's the difference between 2 %'s and one %?

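See "Parameter Expansion" in man bash: a single % removes the shortest suffix that matches the pattern, while %% removes the longest. The same holds for # and ##, which work on prefixes.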
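A quick sketch of the difference, using a made-up file name (archive.fna.gz is hypothetical) that contains two dots so the shortest and longest matches differ:

```shell
#!/bin/bash
# Hypothetical file name with two dots.
f=archive.fna.gz

one=${f%.*}      # shortest suffix ".gz" removed       -> archive.fna
two=${f%%.*}     # longest suffix ".fna.gz" removed    -> archive
h1=${f#*.}       # shortest prefix "archive." removed  -> fna.gz
h2=${f##*.}      # longest prefix "archive.fna." removed -> gz

echo "$one"
echo "$two"
echo "$h1"
echo "$h2"
```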


My directory: /folder1/folder2/file.fna.gz

Following on from renaming the files of interest (fna.gz), I now need to loop through folder1.
As I loop through folder1, I first want to capture the first word of each folder name, up to the first underscore. For example, if the folder name is /Long_John_Silver/ then I am trying to capture only "Long".
Then I want to check whether a folder with the same name "Long" exists in a distant directory /another/directory/. If it does, I move on to the next folder in folder1; if it doesn't, I create a new folder with the captured name, i.e. "Long".

Here's my work so far. I am struggling with the part that captures "Long".

 
for fldr in /folder1/folder2/.fna.gz ; do

## Trying to regex capture here but I'm unclear on the syntax, there's expansion, grep  etc but don't know which to use. 
    new_fldr = ${fldr} 

## Then the test for existence.
    if [-f /another/directory/$new_fldr ]; then

        echo "Folder exists already, moving on..." ;

    else [ ! -f /another/directory/$new_fldr ];

        echo "Folder not present, creating folder..." ;

        cd /another/directory/ ;

        mkdir $new_fldr ;

Any help is much appreciated.

Without a done matching the do in:

for fldr in /folder1/folder2/.fna.gz ; do

you don't have a loop; you have a syntax error. If you fix the syntax error, you then have a loop that executes once with fldr having been set to the string /folder1/folder2/.fna.gz. There is no chance of finding the string Long, the string Long_John_Silver, or even just an <underscore> character in the value assigned to fldr! If you want to loop through a bunch of directories, you need to either provide a bunch of directories as separate operands on the for statement or use an operand to that for statement that provides a filename matching pattern with wildcard characters that will match a bunch of existing directory pathnames.
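As a rough illustration of a for loop driven by a filename matching pattern, the sketch below builds a throwaway directory tree and loops over its subdirectories. The scratch location and the directory names Long_John_Silver and Billy_Bones are assumptions made up for the example; the real layout in this thread is not known.

```shell
#!/bin/bash
# Hypothetical scratch tree standing in for the real parent directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/Long_John_Silver" "$scratch/Billy_Bones"

first_words=""
# The trailing slash in the pattern makes the glob match directories only.
for fldr in "$scratch"/*/; do
    fldr=${fldr%/}        # drop the trailing slash
    name=${fldr##*/}      # last path component, e.g. Long_John_Silver
    first=${name%%_*}     # text before the first underscore, e.g. Long
    echo "$first"
    first_words="$first_words $first"
done

# Clean up the scratch tree.
rm -rf "$scratch"
```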

An if statement such as:

if [-f /another/directory/$new_fldr ]; then

without a matching fi is not an if statement; it is another syntax error. If you add the missing fi in an appropriate place, you're then left with the condition in that if statement ([-f /another/directory/$new_fldr ]), which has both syntax and semantic errors. There have to be breaks between the utility name ([ in this case) and the operands you want to pass to that utility. You do not have a break between the [ and the -f. If you add a <space> between those to get the needed break, you are then left with a semantic error. The -f test primary tests whether or not the operand is a regular file, but your comments say that you're looking for a directory. The test primary to look for a directory is -d, not -f.
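For illustration only, a well-formed version of that test might look like the sketch below. The function name check_or_create is made up, and a temporary directory stands in for /another/directory so the example can be run safely.

```shell
#!/bin/bash
# Hypothetical stand-in for /another/directory.
target=$(mktemp -d)
new_fldr=Long

# check_or_create is an invented helper name, not from the original script.
check_or_create() {
    # Note the spaces around [ and ], and -d (directory) rather than -f.
    if [ -d "$1/$2" ]; then
        echo "Folder exists already, moving on..."
    else
        echo "Folder not present, creating folder..."
        mkdir "$1/$2"
    fi
}

first_run=$(check_or_create "$target" "$new_fldr")
second_run=$(check_or_create "$target" "$new_fldr")
echo "$first_run"
echo "$second_run"

# Clean up the stand-in directory.
rm -rf "$target"
```

The first call creates the folder and the second call finds it, so the two messages differ.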

The command new_fldr = ${fldr} is not an assignment statement in the bash shell command language. An assignment statement doesn't have any breaks between the variable name, the <equals-sign> character, and the value to be assigned to that variable. The command you have given is an attempt to execute a command named new_fldr with a 1st operand = and a 2nd operand /folder1/folder2/.fna.gz.
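A small sketch of the difference, using a made-up path (/folder1/Long_John_Silver is hypothetical) and assuming no command called new_fldr exists on the system:

```shell
#!/bin/bash
# Hypothetical value for illustration.
fldr=/folder1/Long_John_Silver

# Correct: no spaces around the equals sign.
new_fldr=${fldr}
echo "$new_fldr"

# With spaces, the shell tries to run a command named new_fldr.
# Assumption: no such command exists, so the attempt fails.
if ( new_fldr = "$fldr" ) 2>/dev/null; then
    spaced_ok=yes
else
    spaced_ok=no
fi
echo "$spaced_ok"
```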

If we knew what pathnames you were trying to match, we might be able to help you come up with a pathname matching pattern or a find statement that could be used to pipe pathnames (or only directory names) into a loop that would get what you want. Since you haven't given us any indication where these directories are located in the file hierarchy rooted in /folder1/folder2, there is no way that we can suggest code that might fix your problems.

My script works now.
I now need to add another step to the process, and I'll no doubt be back if I run into a stumbling block again.

Unless you REALLY need the logging echoes, by exploiting mkdir's -p option (which creates missing directories but ignores existing ones) and using bash's "parameter expansion" (as shown before), that can be reduced to

for fldr in /home/p995824/scripts/playground/genomes_2/*; do TMP="${fldr##*/}"; mkdir -p /home/p995824/programs/kmerid-master/ref/"${TMP%%_*}"; done