Finding and renaming files with exceptions

Hello all,

I am a new Ubuntu user (have to use it for work) and I am trying to learn and familiarize myself with commands that I will be using frequently.

I would like some help in how I can get a list of all files with certain keywords in the filename.
For example, I have a directory with numerous subdirectories that have a bunch of files in them. Two of those files contain the following format in the name: numericals_eddy_corrected.nii.gz and numericals_eddy_corrected_brain_mask.nii.gz .
I want to get a list of all the eddy_corrected.nii.gz and eddy_corrected_brain_mask.nii.gz files.

I am wondering what command I should type in the terminal to get a list, including the path, of all the files that have eddy_corrected in the name without the eddy_corrected_brain_mask files showing up as well, and vice versa. Is there a way to have the terminal write the list to a .txt file?

The second thing I need help with is renaming some files in those subdirectories. These files end in the extension .bvec or .bval and have absurdly long names with numbers in the format subjectnumberDTISiemensTClessnumericals.bvec/.bval, for example 1000047785DTISiemensTCless005.bvec.
How can I rename all the files in the subdirectories to something simpler or shorter like subjectnumber_DTI.bvec ?

I have heard of find and grep but I'm not sure how to go about combining commands to achieve what I need to do.
I will be required to do similar tasks in the future so I figured I should learn it once and for all. I would appreciate any help.

If I have posted this in the wrong forum, please feel free to move it to the correct location.

Thank you in advance.

Welcome to the UNIX & Linux Forums. We are here to help you.

Please search the forum for similar questions that have already been answered before posting a new question. Without knowing the format of the other filenames in your directories, I would suggest a find command that searches the file hierarchy rooted in the current working directory for filenames ending in exactly those two strings:

find . -name '*eddy_corrected.nii.gz' -o -name '*eddy_corrected_brain_mask.nii.gz'

If there aren't any other names containing eddy_corrected that end with nii.gz you could use the simpler:

find . -name '*eddy_corrected*nii.gz'
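If you want the two kinds of files in separate lists, as the original question asks, one find per suffix should do it. Here is a minimal sketch, assuming bash; it builds a throwaway tree with mktemp, and the directory names sub01 and sub02 and the numeric prefixes are made up purely for illustration:

```shell
#!/bin/bash
# Scratch tree with made-up names; nothing outside $scratch is touched.
scratch=$(mktemp -d)
mkdir -p "$scratch/sub01" "$scratch/sub02"
touch "$scratch/sub01/001_eddy_corrected.nii.gz" \
      "$scratch/sub01/001_eddy_corrected_brain_mask.nii.gz" \
      "$scratch/sub02/002_eddy_corrected.nii.gz" \
      "$scratch/sub02/002_eddy_corrected_brain_mask.nii.gz"
cd "$scratch" || exit 1

# Names ending in eddy_corrected.nii.gz; the mask files end differently, so they are skipped.
corrected=$(find . -name '*eddy_corrected.nii.gz' | sort)
# Names ending in eddy_corrected_brain_mask.nii.gz only.
masks=$(find . -name '*eddy_corrected_brain_mask.nii.gz' | sort)

printf 'corrected:\n%s\n' "$corrected"
printf 'masks:\n%s\n' "$masks"
```

Because -name patterns have to match the whole filename, the first pattern cannot match a name that ends in _brain_mask.nii.gz.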

The terminal application running in your window manager emulates a terminal from days of yore. It just emulates hardware: it doesn't run commands, it doesn't create files, it doesn't do anything except emulate a terminal device. When you start (or your system starts) the terminal application, you specify a program to run in that terminal window. Usually that program is a command interpreter such as bash, csh, ksh, or zsh. These command interpreters are known as shells, and the commands they recognize vary from shell to shell. All of the examples I'm suggesting in this post will work with bash, ksh, and zsh; some, but not all, will work in csh.
Adding > filename to the end of either of the above commands will redirect the output of that command to the file named filename (creating the file if it does not exist, and overwriting its previous contents if it does). Using >> filename instead will create the file if it does not exist and append the output to the end of the file if it does.
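To see the difference between > and >> for yourself, something like the following sketch can be run safely. It assumes bash and works entirely inside a scratch directory; the names a_eddy_corrected.nii.gz and list.txt are just placeholders:

```shell
#!/bin/bash
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch a_eddy_corrected.nii.gz

# '>' creates list.txt, or empties it first if it already exists.
find . -name '*eddy_corrected*nii.gz' > list.txt
find . -name '*eddy_corrected*nii.gz' > list.txt
overwrite_lines=$(wc -l < list.txt)

# '>>' creates the file if needed, otherwise adds to the end of it.
find . -name '*eddy_corrected*nii.gz' >> list.txt
append_lines=$(wc -l < list.txt)

echo "lines after two '>' runs: $overwrite_lines"
echo "lines after one more '>>' run: $append_lines"
```

The second count is larger than the first because >> kept what was already in the file.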

You could try something like:

find . -name '*DTISiemensTCless*.bvec' -o -name '*DTISiemensTCless*.bval' |
while read -r path
do	dir="${path%/*}"
	file="${path##*/}"
	subjectnumber="${file%%DTI*}"
	ext="${file##*.}"
	echo mv "$path" "$dir/${subjectnumber}DTI.$ext"
done

(this assumes that you want to preserve the .bvec or .bval at the ends of the pathnames). If the output shows mv commands with the arguments you want, remove the echo and rerun the command to actually rename the files. As written, the new names look like 1000047785DTI.bvec; for the subjectnumber_DTI.bvec form you asked about, use ${subjectnumber}_DTI in place of ${subjectnumber}DTI. If you want the new names of all of the files to have the extension .bvec, change the two lines:

	ext="${file##*.}"
	echo mv "$path" "$dir/${subjectnumber}DTI.$ext"

to be just:

	echo mv "$path" "$dir/${subjectnumber}DTI.bvec"
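If you would rather watch the real rename happen on files that don't matter, here is a sketch that does the whole thing in a scratch directory. It assumes bash; the directory name sub01 is invented, and the two filenames follow the example given in the question. The IFS= in front of read is my addition so that pathnames with leading or trailing blanks survive intact:

```shell
#!/bin/bash
scratch=$(mktemp -d)
mkdir -p "$scratch/sub01"
touch "$scratch/sub01/1000047785DTISiemensTCless005.bvec" \
      "$scratch/sub01/1000047785DTISiemensTCless005.bval"
cd "$scratch" || exit 1

find . -name '*DTISiemensTCless*.bvec' -o -name '*DTISiemensTCless*.bval' |
while IFS= read -r path
do
    dir="${path%/*}"
    file="${path##*/}"
    subjectnumber="${file%%DTI*}"
    ext="${file##*.}"
    # No echo here, so this really renames the scratch files.
    mv "$path" "$dir/${subjectnumber}DTI.$ext"
done

ls sub01
```

Once the listing looks right on the scratch files, the same loop (with the echo put back first) can be tried from the top of the real directory tree.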

This is a perfectly reasonable forum for this topic.

Could you explain the syntax of the find code you have provided? I had searched google and used the following which is a bit different from what you have and I am wondering if you could explain the difference so I can understand it better? find . -name '*eddy_corrected_brain.nii.gz*'

Ahh I see now. Sorry I'm still figuring out Linux and how things are referred to in this environment. I think the work computer uses bash.

Okay so to make sure I understand correctly, if I use find . -name '*eddy_corrected*nii.gz' > filename01 it will create a file with all of the results that show up? How do I make it so that the file is a .txt file? Do I just add > filename.txt ?

Again, if it's not too much trouble, could you explain the lines of code so I can understand what those words/commands are doing? It's just that at first glance, it all seems a bit daunting so perhaps if I know the purpose of each line/word/character - I can get a better idea of how the command works.

If I just want to find the .bvec and .bval files separately (run the command twice) do I change the code as such?

find . -name '*DTISiemensTCless*.bvec' |
while read -r path
do	dir="${path%/*}"
	file="${path##*/}"
	subjectnumber="${file%%DTI*}"
	ext="${file##*.}"
	echo mv "$path" "$dir/${subjectnumber}DTI.$ext"
done

Thank you again!

Hi azurite,
First, a suggestion: Don't be afraid to try things and watch what happens!

And, another suggestion: Instead of searching the web to try to figure out what a utility on your system will do, read the manual page for that utility on your system. For instance, the command:

man find

will show you the manual page for the find utility on your system.

You still haven't told us what shell you're using. The manual page for your shell will describe pipelines, the while loop construct, the read built-in utility, variable assignments, parameter expansions, and file redirections. And, the mv manual page will describe how it can be used to rename files.

Unlike Windows, UNIX systems determine the type of a file by its contents; not by its name.

The command: find . -name '*eddy_corrected_brain.nii.gz*' searches the file hierarchy rooted in the current directory (.) for files with names that contain the string eddy_corrected_brain.nii.gz preceded and followed by strings of 0 or more characters (*) and writes the pathname of each matching file on a line by itself. Note that I did NOT include an asterisk at the end of the pattern I used in the find commands I suggested because I thought you only wanted filenames ending in .gz; not filenames containing .gz followed by other arbitrary text.
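A quick way to see what the trailing asterisk changes is to try both patterns on two scratch files. This is only a sketch, assuming bash; the .bak name is a made-up example of "other arbitrary text" after .gz:

```shell
#!/bin/bash
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch 001_eddy_corrected_brain.nii.gz 001_eddy_corrected_brain.nii.gz.bak

# Trailing '*': anything may follow .gz, so both files are listed.
with_star=$(find . -name '*eddy_corrected_brain.nii.gz*' | sort)
# No trailing '*': the name must end in .gz, so the .bak file is left out.
without_star=$(find . -name '*eddy_corrected_brain.nii.gz' | sort)

printf 'with trailing *:\n%s\n' "$with_star"
printf 'without trailing *:\n%s\n' "$without_star"
```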

Since pathnames cannot contain NUL bytes, this output will be a text file unless the pathnames produced are longer than the number of bytes your system allows in a line in a text file. The number of bytes allowed on all UNIX and Linux systems is at least 2048. The actual number on your system can be found using the command:

getconf LINE_MAX

With no file redirection (as above) the output is displayed on your terminal. If you redirect the output as in any of the following:

find . -name '*eddy_corrected_brain.nii.gz' > file
find . -name '*eddy_corrected_brain.nii.gz' > file.txt
find . -name '*eddy_corrected_brain.nii.gz' > file.pdf

the output file will be a text file. By convention, storing text in a file with the filename extension .pdf is a VERY bad idea, but the name you choose for your output does not in any way affect the output placed in that file by the find utility.

If you want to see what a shell script is doing, turn on tracing and watch what it does while it is running. For example to add tracing to the script:

set -xv
find . -name '*DTISiemensTCless*.bvec' -o -name '*DTISiemensTCless*.bval' |
while read -r path
do	dir="${path%/*}"
	file="${path##*/}"
	subjectnumber="${file%%DTI*}"
	ext="${file##*.}"
	echo mv "$path" "$dir/${subjectnumber}DTI.$ext"
done
set +xv

The set -xv at the start of the script turns tracing on and the set +xv at the end of the script turns tracing off.
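If the full trace is too noisy at first, tracing a single line is a gentler start. A small sketch, assuming bash and a made-up sample pathname; it uses only -x, which prints each command to standard error after its expansions have been done (the -v option additionally echoes each input line as the shell reads it):

```shell
#!/bin/bash
path='./sub01/1000047785DTISiemensTCless005.bval'

set -x              # start tracing: each command is printed with a leading '+'
ext="${path##*.}"
set +x              # stop tracing

echo "extension: $ext"
```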

In the above pipeline, find prints the pathnames of all files whose names contain the string DTISiemensTCless and end with the string .bvec or (-o) the string .bval. The read command sets the variable path to the contents of one line of that output. Inside the while loop, parameter expansions and variable assignments set dir to the name of the directory specified in path, file to the last component of path, subjectnumber to the part of that file's name before the string DTI, and ext to the file's extension (not including the period). Finally, the mv command (once the echo in front of it has been removed) renames the file named by that line of output from find to a file in the same directory whose name is the original subject number followed by the string DTI, a period, and the original filename extension.
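The four parameter expansions can also be tried on their own, with no find and no files involved. A sketch assuming bash, using one sample pathname (the directory sub01 is invented; the filename is the one from the question):

```shell
#!/bin/bash
path='./sub01/1000047785DTISiemensTCless005.bvec'

dir="${path%/*}"               # remove the shortest trailing match of '/*'  -> ./sub01
file="${path##*/}"             # remove the longest leading match of '*/'    -> 1000047785DTISiemensTCless005.bvec
subjectnumber="${file%%DTI*}"  # remove the longest trailing match of 'DTI*' -> 1000047785
ext="${file##*.}"              # remove the longest leading match of '*.'    -> bvec
new="$dir/${subjectnumber}DTI.$ext"

printf '%s\n' "$dir" "$file" "$subjectnumber" "$ext" "$new"
```

A single % or # removes the shortest match and a doubled one removes the longest, which is why %% is used to cut everything from the first DTI onward.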

Hello,

Thank you for the explanations. I think I'm beginning to understand it a bit better now. I work on Ubuntu on the work computer so I can't really experiment right now lest I mess something up. I'm in the process of trying to install Ubuntu on my laptop via VirtualBox so hopefully I can start experimenting soon.
I think the work computer uses bash but I will check soon. Thank you for the instructions on how to do that.

I have one more question, I found this bit of commands in a tutorial that was somewhat related to my work and I was wondering if you could tell me what the command does?

[bash]mv *.bvec bvecs[/bash]
[bash]mv *.bval bvals[/bash]

Again, thank you for your assistance!

Hi,
You're welcome. First, note that the commands:

set -xv
find . -name '*DTISiemensTCless*.bv[ea][cl]' |
while read -r path
do	dir="${path%/*}"
	file="${path##*/}"
	subjectnumber="${file%%DTI*}"
	ext="${file##*.}"
	echo mv "$path" "$dir/${subjectnumber}DTI.$ext"
done
set +xv

will make ABSOLUTELY NO CHANGES to any files on your system unless you remove the echo. Let me expand a little on what I said earlier: There is NO WAY to learn how to run shell commands and write shell scripts other than to run shell commands and write and run shell scripts. Run the above commands and go through the trace output line by line. If you don't understand how the parameter expansions extract the directory (dir), filename (file), subject number (subjectnumber), and filename extension (ext) from the pathnames (path) found by the find command, refer to the bash man page. Then look at the mv command that is printed by the echo command.

And, for the commands:

[bash]mv *.bvec bvecs[/bash]
[bash]mv *.bval bvals[/bash]

... I repeat: You have to run shell commands to learn what shell commands do!

And, if you try running those commands you are very likely to get a diagnostic message saying that there is no command named [bash]mv unless there is a utility named amv, bmv, hmv, smv, or [bash]mv on your system. Once you practice running commands and learn how pathname expansions work, you'll understand why I gave that list of utility names AND you'll recognize that [bash] and [/bash] are tags used in the tutorial you were looking at indicating that stuff between those tags is a command to be given to the bash shell command language interpreter; not part of the text of the commands themselves.
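Here is a harmless way to watch that pathname expansion happen, using echo so that nothing is ever executed under those names. A sketch assuming bash with its default globbing options, run in an empty scratch directory; the files amv, smv, and xmv are empty placeholders:

```shell
#!/bin/bash
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Nothing in this directory matches yet, so the shell leaves the word as typed.
unmatched=$(echo [bash]mv)

# [bash] is a bracket expression: exactly one character out of b, a, s, or h.
touch amv smv xmv
matched=$(echo [bash]mv)

printf 'before: %s\nafter:  %s\n' "$unmatched" "$matched"
```

The file xmv is not listed because x is not one of the characters inside the brackets.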

Note however, that running a mv command will actually move files. So do this in a directory with test files you have set up with matching names; do not try this in a directory where you have files with names ending in .bvec or .bval and directories named bvecs and bvals unless you actually want to move all of the files with names ending in .bval to the directory named bvals and want to move all of the files with names ending in .bvec to the directory named bvecs .

If those directories don't exist, what happens will vary depending on how many files match those filename patterns. Why don't you look at the mv man page and tell me what will happen in each of the following cases when you run the command:

mv *.bvec bvecs

and:

  1. no file matches the pattern *.bvec and there is a directory named bvecs ,
  2. no file matches the pattern *.bvec and there is a regular file named bvecs ,
  3. no file matches the pattern *.bvec and there is no file named bvecs ,
  4. one file matches the pattern *.bvec and there is a directory named bvecs ,
  5. one file matches the pattern *.bvec and there is a regular file named bvecs ,
  6. one file matches the pattern *.bvec and there is no file named bvecs ,
  7. two or more files match the pattern *.bvec and there is a directory named bvecs ,
  8. two or more files match the pattern *.bvec and there is a regular file named bvecs , and if
  9. two or more files match the pattern *.bvec and there is no file named bvecs .
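If you want to check your answers without going near real data, a small helper along these lines may be useful. It is only a sketch, assuming bash; the function name try_case and the filenames s1.bvec, s2.bvec, and so on are invented here. Each call builds a fresh scratch directory, sets up one of the nine situations, runs the mv there, and shows the result:

```shell
#!/bin/bash
# try_case N_FILES TARGET_KIND
#   N_FILES      how many *.bvec files to create (0, 1, 2, ...)
#   TARGET_KIND  what bvecs should be beforehand: dir, file, or none
try_case() {
    scratch=$(mktemp -d) || return 1
    (
        cd "$scratch" || exit 1
        i=1
        while [ "$i" -le "$1" ]; do
            touch "s$i.bvec"
            i=$((i + 1))
        done
        case $2 in
            dir)  mkdir bvecs ;;
            file) touch bvecs ;;
        esac
        mv *.bvec bvecs            # the command from the tutorial
        echo "mv exit status: $?"
        ls -l                      # what the directory looks like afterwards
    )
}

try_case 1 dir    # case 4 in the list above
```

Changing the two arguments covers the other eight cases, for example try_case 0 none for case 3 or try_case 2 file for case 8.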

Hello,

Wow! Thank you for taking the time to write great explanations! I have printed it out and will re-read everything to really understand it.

Instead of creating new threads, is it okay if I keep posting new questions in this thread?

If so, I have a question on creating variables. I have tried to research and come up with a solution (to the text quoted below) on my own but I do not know if I am on the right track. If possible could you tell me if I am moving in the right direction?

I have come up with the following: though I am not sure what is meant by using the 'set' command.

#!/bin/bash
eddy_file=`cat eddy_corrected_brain.txt`
echo $eddy_file

The eddy_corrected_brain.txt is a text file with pathnames of all the files that match eddy_corrected_brain.nii.gz from the initial find command/results.

Thank you again!

If you're satisfied with a solution or an answer, please tag the thread "solved" (see the top of this thread). And, for a new problem or question, open a new thread with a meaningful title.

Regarding your new question above, I'm lost. Please rephrase in a new thread.

Thank you all for the help!