Bash to rename file after second occurrence of underscore

I am trying to use bash to remove the text after the second _ in the names of all files that end in .bam or .vcf. However, bash errors out while looking for the files in the directory. Thank you :).

files in directory

IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt

desired output

IonXpress_007.bam
IonXpress_008.bam
IonXpress_009.bam
IonXpress_007.vcf
IonXpress_008.vcf
IonXpress_009.vcf
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt

Bash

for file in /home/cmccabe/Desktop/NGS/test/$filename/{*.bam *.vcf}; do
   mv -- "$file" "${file%%_*_*}.bam .vcf"
done
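
A quick check in the shell hints at two reasons the loop above cannot work as written. This is only a sketch using one sample name copied from the listing; the variable names kept, literal and expanded are made up for illustration, and x.bam / x.vcf are placeholder words, not real files.

```bash
#!/usr/bin/env bash
file=IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam

# "%%" removes the LONGEST suffix matching "_*_*". That suffix starts at the
# first underscore, so only the text before it survives.
kept="${file%%_*_*}"
echo "$kept"

# Without a comma, bash does not treat {...} as a brace expansion,
# so the braces stay in the words as literal characters.
literal=$(echo {x.bam x.vcf})
echo "$literal"

# With a comma and no spaces, each alternative becomes its own word.
expanded=$(echo x.{bam,vcf})
echo "$expanded"
```

The target in the mv line is also a single quoted word, so every file would get a new name ending in ".bam .vcf" rather than keeping its own extension.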

Maybe this one:

file=IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
echo ${file//${file#*_[0-9][0-9][0-9]}}.bam

#Output 
IonXpress_008.bam

Have fun figuring out how it works. :smiley:
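
For anyone who would rather not puzzle it out, the one-liner can be unrolled into separate steps. This is a sketch of my reading of it, not the poster's own explanation; the names rest and prefix are invented here.

```bash
#!/usr/bin/env bash
file=IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam

# Step 1: "#" strips the shortest leading match of "*_[0-9][0-9][0-9]",
# i.e. everything up to and including "_008". What remains is the tail.
rest="${file#*_[0-9][0-9][0-9]}"
echo "$rest"

# Step 2: "//" deletes every occurrence of that tail from the full name.
# Quoting "$rest" makes bash treat it as literal text, not as a glob.
prefix="${file//"$rest"}"
echo "$prefix"

# Step 3: the extension was part of the tail, so it is appended again.
echo "${prefix}.bam"
```

Note that this relies on the second field being exactly three digits, which holds for the sample names but may not hold elsewhere.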

This one is a bit more flexible:

file=IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
file2="${file//${file#*_*_}}"
echo "${file2:0: -1}.bam"

#Output 
IonXpress_008.bam
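
The same idea can be wrapped in a small function that keeps whatever extension the file already has instead of hard-coding .bam. This is a sketch under a few assumptions: short_name is a made-up helper, the name is assumed to contain at least two underscores, and I have swapped "//" for "%" and the negative-length substring for "${stem%_}" (the negative length, as far as I know, needs bash 4.2 or later).

```bash
#!/usr/bin/env bash
# Hypothetical helper: print NAME cut after its second underscore-delimited
# field, with the original extension re-attached.
short_name() {
    local file=$1
    local ext="${file##*.}"        # text after the last dot, e.g. "bam"
    local rest="${file#*_*_}"      # text after the second underscore
    local stem="${file%"$rest"}"   # name minus that tail, e.g. "IonXpress_008_"
    printf '%s.%s\n' "${stem%_}" "$ext"   # drop the trailing "_", add extension
}

short_name IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
short_name IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
```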

Please try the following. If you like the result, change the echo mv to just mv so the renames are actually performed.

for f in /home/cmccabe/Desktop/NGS/test/${filename}/*.{bam,vcf}; do
 if [[ -e $f ]]; then
   p="${f%/*}"
   i="${f#$p/}"
   e="${f##*.}"
   ci="${i%%_[A-Z]*}"
   echo mv -v "$f" "${p}/${ci}.${e}"
 fi
done
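
To try the loop without touching real data, it can be run against a scratch directory first. This is a sketch under assumptions: the directory from mktemp stands in for the real one, the variable suffix is invented, and I have used nullglob in place of the [[ -e $f ]] test and a real mv in place of echo mv.

```bash
#!/usr/bin/env bash
# Scratch directory standing in for the real data directory (assumption).
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

# Create one sample of each file type, named like the files in the question.
suffix=_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome
touch "$dir/IonXpress_007$suffix".{bam,vcf,txt}

shopt -s nullglob              # a glob with no matches expands to nothing
for f in "$dir"/*.{bam,vcf}; do
    p="${f%/*}"                # directory part
    i="${f##*/}"               # file name without the directory
    e="${i##*.}"               # extension
    ci="${i%%_[A-Z]*}"         # cut at the first "_" followed by a capital letter
    mv -- "$f" "$p/$ci.$e"
done
shopt -u nullglob

ls "$dir"
```

The .txt file is never matched by the glob, so it keeps its long name.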

Hi.

I like the rename command:

rename perl-expression files

Here's how it could work:

#!/usr/bin/env bash

# @(#) s2       Demonstrate group rename with perl regular expression, rename.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C rename pass-fail dixf
dixf mv rename pass-fail dixf

FILE=${1-data1}
rm Ion*
E=expected-output.txt

pl " Input data file $FILE, creating files:"
touch $( cat $FILE )
ls Ion*

pl " Expected output:"
cat $E

pl " Results:"
# Pattern and sample match; ".*?" is a non-greedy match.
#
# IonXpress_007
# ^(.*?_.*?)
#              _MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome
#              _.*
#                                                                    .bam
#                                                                   [.]

rename 's/^(.*?_.*?)_.*[.]/$1./' *bam *vcf
( ls -1 Ion*bam
ls -1 Ion*vcf
ls -1 Ion*txt ) |
tee f1

pl " Verify results if possible:"
C=$HOME/bin/pass-fail
[ -f $C ] && $C || ( pe; pe " Results cannot be verified." ) >&2

exit 0

producing:

$ ./s2

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.6 (jessie) 
bash GNU bash 4.3.30
rename /usr/bin/rename using File::Rename version 0.20
pass-fail (local) 1.9
dixf (local) 1.19

mv      move (rename) files (man)
Path    : /bin/mv
Type    : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Help    : probably available with --help

rename  renames multiple files (man)
Path    : /usr/bin/rename
Type    : symbolic link to /etc/alternatives/rename ...)
Modules : (for perl codes)
 strict 1.08
 File::Rename   0.20
 Pod::Usage     1.63

pass-fail       Compare files, issue pass or fail verdict, <f1>, <expected-output.txt>. (what)
Path    : ~/bin/pass-fail
Length  : 90 lines
Type    : a /usr/bin/env bash script, ASCII text executable
Shebang : #!/usr/bin/env bash

dixf    Display information about executable file, script or binary. (what)
Path    : ~/bin/dixf
Length  : 153 lines
Type    : a /usr/bin/env bash script, ASCII text executable
Shebang : #!/usr/bin/env bash
Help    : probably available with --help,--man,--usage

-----
 Input data file data1, creating files:
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf

-----
 Expected output:
IonXpress_007.bam
IonXpress_008.bam
IonXpress_009.bam
IonXpress_007.vcf
IonXpress_008.vcf
IonXpress_009.vcf
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt

-----
 Results:
IonXpress_007.bam
IonXpress_008.bam
IonXpress_009.bam
IonXpress_007.vcf
IonXpress_008.vcf
IonXpress_009.vcf
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt

-----
 Verify results if possible:

-----
 Comparison of 9 created lines with 9 lines of desired results:
 Succeeded -- files (computed) f1 and (standard) expected-output.txt have same content.

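Different repositories may have different versions of rename. The one used above is the Perl-based File::Rename version; the rename shipped with util-linux takes a different syntax and does not accept a Perl expression.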
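
Where the Perl-based rename is not available, roughly the same match can be done with bash's own regex operator. This is a sketch, not a drop-in replacement: shorten is a made-up helper, and because bash regexes have no non-greedy ".*?", the pattern uses "[^_]*" to stop at each underscore instead.

```bash
#!/usr/bin/env bash
# Hypothetical helper: bash-regex counterpart of s/^(.*?_.*?)_.*[.]/$1./
shorten() {
    local name=$1
    # Group 1: first two underscore-separated fields. Group 2: the extension.
    local re='^([^_]*_[^_]*)_.*[.]([^.]*)$'
    if [[ $name =~ $re ]]; then
        printf '%s.%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        printf '%s\n' "$name"    # leave non-matching names untouched
    fi
}

shorten IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
shorten notes.txt
```

The printed name could then be passed to mv inside a loop over *.bam and *.vcf.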

Best wishes ... cheers, drl


Thank you all very much and enjoy your holiday :).