cmccabe
November 23, 2016, 7:20pm
1
I am trying to use bash to remove the text after the second _ in the names of specific files that end in .bam or .vcf. However, the loop below errors out because it cannot find the files in the directory. Thank you :).
files in directory
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
desired output
IonXpress_007.bam
IonXpress_008.bam
IonXpress_009.bam
IonXpress_007.vcf
IonXpress_008.vcf
IonXpress_009.vcf
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
Bash
for file in /home/cmccabe/Desktop/NGS/test/$filename/{*.bam *.vcf}; do
mv -- "$file" "${file%%_*_*}.bam .vcf"
done
joker
November 23, 2016, 7:39pm
2
Maybe this one:
file=IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
echo ${file//${file#*_[0-9][0-9][0-9]}}.bam
#Output
IonXpress_008.bam
Have fun figuring out how it works.
This one is a bit more flexible:
file=IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
file2="${file//${file#*_*_}}"
echo "${file2:0: -1}.bam"
#Output
IonXpress_008.bam
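The second variant can be read step by step: ${file#*_*_} strips the shortest prefix up to and including the second underscore, which leaves the unwanted tail; the outer expansion deletes that tail from the name; and ${file2:0: -1} drops the trailing underscore (a negative length needs a reasonably recent bash). Below is a minimal sketch of the same idea as a function. The name short_name and the guard for names with fewer than two underscores are assumptions added for illustration, not part of the post above.

```bash
#!/usr/bin/env bash
# short_name (hypothetical helper): keep the text before the second "_"
# and re-attach the original extension.
short_name() {
    local file=$1
    # Names without two underscores are printed unchanged.
    [[ $file == *_*_* ]] || { printf '%s\n' "$file"; return; }
    local ext=${file##*.}          # text after the last ".", e.g. "bam"
    local tail=${file#*_*_}        # strip the shortest prefix through the second "_"
    local head=${file%"_$tail"}    # remove "_" plus that tail from the end
    printf '%s.%s\n' "$head" "$ext"
}

# prints IonXpress_008.bam
short_name IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
```

Quoting the tail inside ${file%"_$tail"} makes bash treat it as literal text rather than as a glob pattern.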
Aia
November 24, 2016, 12:22am
3
Please try the following and, if you like the result, change the echo mv to just mv so that the rename actually runs.
for f in /home/cmccabe/Desktop/NGS/test/${filename}/*.{bam,vcf}; do
if [[ -e $f ]]; then
p="${f%/*}"
i="${f#$p/}"
e="${f##*.}"
ci="${i%%_[A-Z]*}"
echo mv -v "$f" "${p}/${ci}.${e}"
fi
done
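For context, the original loop most likely failed for two reasons. First, {*.bam *.vcf} has a space and no comma, so bash does not brace-expand it and ends up with two literal words that match no files. Second, ${file%%_*_*} removes the longest matching suffix, which starts at the first underscore and leaves only IonXpress. The sketch below shows the corrected pieces in a throwaway directory. The helper name rename_preview and the shortened file names are stand-ins, not taken from the thread.

```bash
#!/usr/bin/env bash
# rename_preview (hypothetical helper): print the mv commands instead of running them.
rename_preview() {
    local dir=$1 f base ext short
    for f in "$dir"/*.{bam,vcf}; do      # brace expansion needs a comma and no spaces
        [[ -e $f ]] || continue          # skip a pattern that matched nothing
        base=${f##*/}                    # file name without the directory
        ext=${base##*.}                  # extension after the last "."
        short=${base%%_[A-Z]*}           # cut at the first "_" followed by an uppercase letter
        echo mv -- "$f" "$dir/$short.$ext"
    done
}

# Demonstration with shortened stand-in names in a temporary directory.
demo=$(mktemp -d)
touch "$demo/IonXpress_007_MEVxx_R.bam" \
      "$demo/IonXpress_007_MEVxx_R.vcf" \
      "$demo/IonXpress_007_MEVxx_R.txt"
rename_preview "$demo"                   # lists the .bam and .vcf renames, skips the .txt
rm -r "$demo"
```

Cutting at _[A-Z] works for these names because the second field is numeric and the third starts with an uppercase letter; other naming schemes would need a different pattern.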
drl
November 24, 2016, 7:07am
4
Hi.
I like the rename command:
rename perl-expression files
Here's how it could work:
#!/usr/bin/env bash
# @(#) s2 Demonstrate group rename with perl regular expression, rename.
# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C rename pass-fail dixf
dixf mv rename pass-fail dixf
FILE=${1-data1}
rm Ion*
E=expected-output.txt
pl " Input data file $FILE, creating files:"
touch $( cat $FILE )
ls Ion*
pl " Expected output:"
cat $E
pl " Results:"
# Pattern and sample match; ".*?" is a non-greedy match.
#
# IonXpress_007
# ^(.*?_.*?)
# _MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome
# _.*
# .bam
# [.]
rename 's/^(.*?_.*?)_.*[.]/$1./' *bam *vcf
( ls -1 Ion*bam
ls -1 Ion*vcf
ls -1 Ion*txt ) |
tee f1
pl " Verify results if possible:"
C=$HOME/bin/pass-fail
[ -f $C ] && $C || ( pe; pe " Results cannot be verified." ) >&2
exit 0
producing:
$ ./s2
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution : Debian 8.6 (jessie)
bash GNU bash 4.3.30
rename /usr/bin/rename using File::Rename version 0.20
pass-fail (local) 1.9
dixf (local) 1.19
mv move (rename) files (man)
Path : /bin/mv
Type : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Help : probably available with --help
rename renames multiple files (man)
Path : /usr/bin/rename
Type : symbolic link to /etc/alternatives/rename ...)
Modules : (for perl codes)
strict 1.08
File::Rename 0.20
Pod::Usage 1.63
pass-fail Compare files, issue pass or fail verdict, <f1>, <expected-output.txt>. (what)
Path : ~/bin/pass-fail
Length : 90 lines
Type : a /usr/bin/env bash script, ASCII text executable
Shebang : #!/usr/bin/env bash
dixf Display information about executable file, script or binary. (what)
Path : ~/bin/dixf
Length : 153 lines
Type : a /usr/bin/env bash script, ASCII text executable
Shebang : #!/usr/bin/env bash
Help : probably available with --help,--man,--usage
-----
Input data file data1, creating files:
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.bam
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
-----
Expected output:
IonXpress_007.bam
IonXpress_008.bam
IonXpress_009.bam
IonXpress_007.vcf
IonXpress_008.vcf
IonXpress_009.vcf
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
-----
Results:
IonXpress_007.bam
IonXpress_008.bam
IonXpress_009.bam
IonXpress_007.vcf
IonXpress_008.vcf
IonXpress_009.vcf
IonXpress_007_MEVxx_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_008_MEVxy_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.txt
-----
Verify results if possible:
-----
Comparison of 9 created lines with 9 lines of desired results:
Succeeded -- files (computed) f1 and (standard) expected-output.txt have same content.
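Different repositories may have different versions of rename. The one used here is the Perl File::Rename version shown in the listing above; the util-linux rename shipped by some other distributions does not accept a Perl expression.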
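Where the Perl rename is not installed, the same pattern can be approximated in bash alone with [[ =~ ]]. This is a sketch under assumptions: the function name regex_short_name is hypothetical, and because POSIX extended regular expressions have no non-greedy .*?, the class [^_]* stands in for it.

```bash
#!/usr/bin/env bash
# regex_short_name (hypothetical helper): bash-only approximation of
#   s/^(.*?_.*?)_.*[.]/$1./
# using "[^_]*" in place of the non-greedy ".*?".
regex_short_name() {
    local re='^([^_]*_[^_]*)_.*[.]([^.]*)$'
    if [[ $1 =~ $re ]]; then
        # Group 1 is the text before the second "_", group 2 the extension.
        printf '%s.%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        printf '%s\n' "$1"    # no match: leave the name as it is
    fi
}

# prints IonXpress_009.vcf
regex_short_name IonXpress_009_MEVxz_R_2016_11_18_10_45_10_user_S5-00580-14-Medexome.vcf
```

The function only computes the new name; wiring it into a loop with mv is left out so nothing is renamed by accident.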
Best wishes ... cheers, drl
cmccabe
November 24, 2016, 8:23am
5
Thank you all very much and enjoy your holiday :).