In the bash below I am trying to ensure that all folders (represented by $folders) in a given directory are created. The file f1 contains the trimmed folder name somewhere (there will be multiple trimmed folders).
When that trimmed folder is found (represented by $S5), the contents of $2 are stored in $STRING. If $STRING is the same as $folders, then
that folder already exists in the directory and can be skipped (this will be the case most of the time). If $STRING is not the same as $folders, then the missing folder is created in the directory. The code seems to work as expected up to the if; that is, the variables are set correctly, but the actual match portion below is not working and I am not sure how to mkdir. Thank you :).
DIR=/home/cmccabe/folder
for RDIR in "$DIR"/R_2019* ; do ## # matching "R_2019*" to operate on desired directory and expand
TRIMSTR=${RDIR%%-v5.6*} ## trim folder match in RDIR from -v5.6 and store in TRIMSTR
mv "$RDIR" "$TRIMSTR" ## move trimmed folder to directory
S5=${TRIMSTR##*/} ## store run with no path as S5
# Iterate over each sample folder in run directory
cd "$TRIMSTR"
folders=folders=`ls -d *[0-9]*/` ## list only directories begining with digit
done
for f in "$DIR"/f1; do STRING=$(awk -v ref="/$S5/" 'match($0, ref) {print $2}' "$f"); done
nl=" ## added newline boundry seperator
"
if [[ $nl$folders$nl == *$n$STRING/$nl* ]] || continue # only execute file on match
else mkdir $STRING
fi
Directory structure where the variables come from:
/path/to/run/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6_xxx_xxx_xxx_xxx_xxx_xxx_xxxx
--- /path/to/run/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123 --- this is $TRIMSTR; its last path component is $S5
--- 00-1111-xxx-xxx 00-2222-yyy-yy 00-3333-zz-zz id test --- these are within $TRIMSTR; id and test are not stored in $folders
You have a done statement on line 11 that doesn't match anything.
S5=$(cut -d/ -f6 <<<"$TRIMSTR") ## store run with no path as S5
S5 will be empty because TRIMSTR doesn't contain 6 entries.
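A small sketch of why the field number matters, using a path shaped like the ones in this thread (the value is only illustrative). cut counts the empty string before the leading / as field 1, so the run name lands in field 5 here, while the ${TRIMSTR##*/} expansion does not depend on the depth of the path at all.

```shell
#!/usr/bin/env bash
# Illustrative path, shaped like the run directories in this thread.
TRIMSTR=/home/cmccabe/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123

# cut treats the empty string before the leading "/" as field 1,
# so this path has only five fields and field 6 comes back empty.
by_cut6=$(printf '%s\n' "$TRIMSTR" | cut -d/ -f6)
by_cut5=$(printf '%s\n' "$TRIMSTR" | cut -d/ -f5)

# Parameter expansion removes everything up to the last "/",
# so it keeps working if the directory is moved deeper or shallower.
by_expansion=${TRIMSTR##*/}

printf 'cut -f6:   [%s]\n' "$by_cut6"
printf 'cut -f5:   [%s]\n' "$by_cut5"
printf 'expansion: [%s]\n' "$by_expansion"
```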
if [[ $folders = ${STRING[*]} ]]; then # only execute file on match
folders will contain a newline-separated list of the folders with a digit in their name, and STRING is not an array. What are you trying to match here?
I made some edits to the code and am trying to match $TRIMSTR to the line in file f1. I think I should be looking for the pattern, as the name in f1 is longer but includes $TRIMSTR. When a match is found, the values in $2 up to the empty line (which always separates each block) should be read. I thought those were being read into the array $STRING, which is then compared to $folders, the sub-folders within $TRIMSTR. If ${STRING[*]} = $folders, the folder is already there and can be skipped. However, if they are not equal, it is not there and a new folder is made in $TRIMSTR. Thank you :).
for RDIR in "$DIR"/R_2019* ; do ## # matching "R_2019*" to operate on desired directory and expand
test -d "$RDIR" || continue ## is a directory, or jump to the next cycle
As you can see, it is continue, not next.
S5=${TRIMSTR##*/} ## strip path and store in S5
folders=`ls -d *[0-9]*/`
The trailing / ensures only directories match. But the trailing / is now part of the names.
To match a string in a list you can use a case-esac or as you tried, a [[ == ]]
The RHS takes the * wildcards.
For an exact match use boundaries. Here $folders is newline-separated, so we enclose the STRING in newlines, and we must enclose the LHS (folder list) in newlines so the first and last element can be matched.
nl="
"
if [[ $nl$folders$nl == *$nl$STRING$nl* ]]
If folder names have a trailing / you can add it to the STRING
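The boundary idea can be tried in isolation with a sketch like the following. The helper name in_list and the sample folder names are made up for illustration; the list is built by hand in the same newline-separated shape that ls -d *[0-9]*/ would produce.

```shell
#!/usr/bin/env bash
# Newline-separated folder list with trailing slashes (illustrative values).
nl=$'\n'
folders="00-1111-xxx-xxx/${nl}00-2222-yyy-yy/"

# in_list NAME (hypothetical helper): succeed only if NAME/ is a
# complete entry of $folders. The quoted part is matched literally,
# the unquoted * on either side are wildcards.
in_list() {
    [[ $nl$folders$nl == *"$nl$1/$nl"* ]]
}

in_list 00-1111-xxx-xxx && echo "00-1111-xxx-xxx: present"
in_list 00-1111         || echo "00-1111: absent (partial names do not match)"
in_list 00-3333-zz-zz   || echo "00-3333-zz-zz: absent"
```

Without the newline on both sides of the name, 00-1111 would match as a substring of 00-1111-xxx-xxx/, which is the case the boundaries are there to prevent.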
I updated the script in the original post with the helpful suggestions and added set -x. There are some syntax errors and the $STRING variable doesn't get populated, though it looks like the expected values are being used. Thank you :).
set -x
cmccabe@Satellite-M645:~$ DIR=/home/cmccabe/folder
+ DIR=/home/cmccabe/folder
cmccabe@Satellite-M645:~$
cmccabe@Satellite-M645:~$ for RDIR in "$DIR"/R_2019* ; do ## # matching "R_2019*" to operate on desired directory and expand
> TRIMSTR=${RDIR%%-v5.6*} ## trim folder match in RDIR from -v5.6 and store in TRIMSTR
> mv "$RDIR" "$TRIMSTR" ## move trimmed folder to directory
> S5=${TRIMSTR##*/} ## store run with no path as S5
> # Iterate over each sample folder in run directory
> cd "$TRIMSTR"
> folders=folders=`ls -d *[0-9]*/` ## list only directories begining with digit
> done
+ for RDIR in '"$DIR"/R_2019*'
+ TRIMSTR=/home/cmccabe/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123
+ mv /home/cmccabe/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6_xxx_xxx_xxx_xxx_xxx_xxx_xxxx /home/cmccabe/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123
+ S5=R_2019_00_00_00_00_00_xxxx_x0-00000-123
+ cd /home/cmccabe/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123
++ ls --color=auto -d 00-1111-xxx-xxx/ 00-2222-yyy-yy/
+ folders='folders=00-1111-xxx-xxx/
00-2222-yyy-yy/'
cmccabe@Satellite-M645:~/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123$ for f in "$DIR"/f1; do STRING=$(awk -v ref="/$S5/" 'match($0, ref) {print $2}' "$f"); done
+ for f in '"$DIR"/f1'
++ awk -v ref=/R_2019_00_00_00_00_00_xxxx_x0-00000-123/ 'match($0, ref) {print $2}' /home/cmccabe/folder/f1
+ STRING=
cmccabe@Satellite-M645:~/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123$ nl="
> "
+ nl='
'
cmccabe@Satellite-M645:~/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123$ if [[ $nl$folders$nl == *$n$STRING/$nl* ]] || continue # only execute file on match
> else mkdir $STRING
bash: syntax error near unexpected token `else'
cmccabe@Satellite-M645:~/folder/R_2019_00_00_00_00_00_xxxx_x0-00000-123$ fi
The syntax requires then between if and else.
And what does the for loop do with STRING?
You probably want
## newline boundary separator
nl="
"
for f in "$DIR"/f1
do
STRING=$(awk -v ref="/$S5/" 'match($0, ref) {print $2}' "$f")
if [[ $nl$folders$nl != *$nl$STRING/$nl* ]] ## create the folder only when it is not already in the list
then
mkdir "$STRING"
fi
done
Note that in [[ ]] the == operator is a glob match when the right-hand side is unquoted, and != is the opposite (no match). Some shells allow = instead of ==, but == is the standard in [[ ]].
In [ ], by contrast, the = operator is true if the strings are equal, and != is the opposite (not equal). Some shells allow == instead of =, but = is the standard in [ ].
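The difference is easy to check with a throwaway string. This is only a sketch; the variable names are made up for the demonstration.

```shell
#!/usr/bin/env bash
s="00-1111-xxx-xxx"

# Inside [[ ]], an unquoted right-hand side is treated as a glob pattern.
if [[ $s == 00-* ]]; then glob_unquoted=match; else glob_unquoted=nomatch; fi

# Quoting the right-hand side makes the * a literal character.
if [[ $s == "00-*" ]]; then glob_quoted=match; else glob_quoted=nomatch; fi

# Inside [ ], = is a plain string comparison. The pattern is quoted so
# the shell does not expand it against files in the current directory.
if [ "$s" = "00-*" ]; then test_equal=match; else test_equal=nomatch; fi

echo "[[ \$s == 00-* ]]   -> $glob_unquoted"
echo "[[ \$s == \"00-*\" ]] -> $glob_quoted"
echo "[ \"\$s\" = \"00-*\" ]  -> $test_equal"
```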
set -x showed the same as the previous post. I initially thought there might be multiple files, so I used a for loop, but quickly realized that wasn't going to work. Is there a better approach? Thank you :).
for f in "$DIR"/all ; do
> STRING=$(awk -v ref="/$S5/" 'match($0, ref) {print $2}' "$f")
> if [[ $nl$folders$nl != *$n$STRING/$nl* ]] ## only execute file on match
> then
> mkdir -p $STRING
> fi
> done
+ for f in '"$DIR"/f1'
++ awk -v ref=/R_2019_00_00_00_00_00_xxxx_x0-00000-123/ 'match($0, ref) {print $2}' /home/cmccabe/Desktop/new/f1
+ STRING=
+ [[
!= */
* ]]
+ mkdir -p
I made some edits to the code and am closer, but I am not performing the $STRING to $folder comparison correctly and the wrong folder gets created. I don't know if there is a better way, but hopefully I am closer. Thank you :).
DIR=/home/cmccbe/folder
for RDIR in "$DIR"/R_2019* ; do ## # matching "R_2019*" to operate on desired directory and expand
S5=${RDIR##*/} ## store run with no path as S5
TRIMSTR=${RDIR%%-v5.6*} ## trim folder match in RDIR from -v5.6 and store in TRIMSTR
mv "$RDIR" "$TRIMSTR" ## move trimmed folder to directory
cd "$TRIMSTR"
folders=`ls -d *[0-9]*/` ## list folders
STRING=$(awk -F '\n' -v RS="" -v ref="$S5" '$0 ~ ref {print $2}' "$DIR"/f1)
if [[ $folders == $STRING* ]] ## only execute folder on match
then
continue
if [[ $folders != $STRING* ]] ## only execute folder on mis-match
then
mkdir -p "$TRIMSTR"/variants
fi
fi
done
I made some edits to the code and am closer but am not performing the $STRING to $folder comparison correctly and the wrong folder gets created. Thank you :).
DIR=/home/cmccbe/folder
for RDIR in "$DIR"/R_2019* ; do ## # matching "R_2019*" to operate on desired directory and expand
S5=${RDIR##*/} ## store run with no path as S5
TRIMSTR=${RDIR%%-v5.6*} ## trim folder match in RDIR from -v5.6 and store in TRIMSTR
mv "$RDIR" "$TRIMSTR" ## move trimmed folder to directory
cd "$TRIMSTR"
folders=`ls -d *[0-9]*/` ## list folders
STRING=$(awk -F '\n' -v RS="" -v ref="$S5" '$0 ~ ref {print $2}' "$DIR"/f1)
if [[ $folders != $STRING* ]] ## only execute folder on mis-match
then
mkdir -p "$TRIMSTR"/$STRING/folder
fi
done
DIR=/home/cmccabe/Desktop/new --- define directory ---
for RDIR in "$DIR"/R_2019* ; do --- /home/cmccabe/Desktop/new/R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6_xxx_xxx_xxx_xxx_xxx_xxx_xxxx ---
S5=${RDIR##*/} --- R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6_xxx_xxx_xxx_xxx_xxx_xxx_xxxx ---
TRIMSTR=${RDIR%%-v5.6*} --- --- R_2019_00_00_00_00_00_xxxx_x0-00000-123 ---
mv "$RDIR" "$TRIMSTR" --- /home/cmccabe/Desktop/new/R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6_xxx_xxx_xxx_xxx_xxx_xxx_xxxx >>> /home/cmccabe/Desktop/new/R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6
cd "$TRIMSTR" --- look in /home/cmccabe/Desktop/new/R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6 ---
folders=`ls -d *[0-9]*/` --- 00-1111-xxx-xxx/ 00-2222-yyy-yy/ ---
STRING=$(awk -F '\n' -v RS="" -v ref="$S5" '$0 ~ ref {print $2}' "$DIR"/f1) --- match R_2019 from $S5 line in f1 and print 00-1111-xxx-xxx/ 00-2222-yyy-yy 00-3333-zz-zz to $STRING --
if [[ $folders == $STRING/* ]] ## only folder on match --- 00-1111-xxx-xxx/ 00-2222-yyy-yy are skipped as they match $STRING ---
then
continue
if [[ $folders != $STRING/* ]] ## only execute folder on mis-match --- 00-3333-zz-zz is not equal to $STRING ---
then
mkdir -p "$TRIMSTR"/variants --- 00-3333-zz-zz/variants is made in /home/cmccabe/Desktop/new/R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6 ---
fi
fi
done
I'm still a little confused on what you want to happen.
You end up with variables as follows:
folders
00-1111-xxx-xxx/
00-2222-yyy-yy/
string
zzzz_0005 00-1111-xxx-xxx
You seem to imply there is some sort of looping thru the folders variable and a comparison with string which then results in creating folders. However you are just performing a single comparison between the two variables.
Yes, that is correct. If the $folder variable does not match $STRING then it does not exist, so it is created (this will happen occasionally). If the $folder variable does match $STRING then it already exists and can be skipped (this will be the majority of the time). Thank you :).
DIR=/home/cmccbe/folder
for RDIR in "$DIR"/R_2019* ; do ## # matching "R_2019*" to operate on desired directory and expand
S5=${RDIR##*/} ## store run with no path as S5
TRIMSTR=${RDIR%%-v5.6*} ## trim folder match in RDIR from -v5.6 and store in TRIMSTR
mv "$RDIR" "$TRIMSTR" ## move trimmed folder to directory
cd "$TRIMSTR"
STRING=$(awk -F '\n' -v RS="" -v ref="$S5" '$0 ~ ref {print $2}' "$DIR"/f1)
for folder in *[0-9]*/ ; do
folder=${folder%/} ## remove trailing slash
if ! [[ " $STRING " == *\ $folder\ * ]]
then
## folder is not within $STRING so make variants folder and leave
mkdir -p "$TRIMSTR"/variants
break
fi
done
done
I added set -x and the $STRING variable looks to contain only one of the $2 values from f1 , instead of the three so the directory created is not the right one. Thank you very much :).
f1
yyy_009 00-0000-xxx-xxx-xxx
yyy_004 00-0011-xxx-xxx-xxx
R_2019_00_00_00_00_00_xxxx_x0-00000-200_yyyyyyy
zzzz_0000 00-3333-zz-zz --- not printed to $STRING currently, but should be for the comparison ---
zzzz_0005 00-1111-xxx-xxx --- printed to $STRING currently ---
zzzz_0003 00-2222-yyy-yy --- not printed to $STRING currently, but should be for the comparison ---
R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6_xxx_xxx_xxx_xxx_xxx_xxx_xxxx ---- this is matched ---
$folder
zzzz_0005 00-1111-xxx-xxx --- this is correct ---
zzzz_0003 00-2222-yyy-yy
so since
00-3333-zz-zz
is not in folder 00-3333-zz-zz/variants is created
The below shows that the $STRING variable has all the records in it and $folder contains 00-1111-xxx-xxx and 00-2222-yyy-yy, but no directory is created for 00-3333-zz-zz/variants. Since 00-3333-zz-zz is in $STRING but not in $folder, 00-3333-zz-zz/variants is what mkdir -p should create.
I also included what I think the awk line is doing, is it correct? Thank you :).
STRING=$(awk -F '\n' -v RS="" -v ref="$S5" '$0 ~ ref {d=split($0, val, " "); for(i=2;i<d;i+=2) printf "%s ",val[i]; printf "\n"}' "$DIR"/f1)
Use each newline as a separator and then a space for each record. Look for the $S5 variable in f1 and, once found, loop through the $2 values, printing them using val and storing them in $STRING separated by a space.
The awk code uses a blank line as the record separator and a newline as the field separator. It matches the record that contains ref, then splits that record on whitespace and prints every second word (the 2nd word of each line), separated by spaces.
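To check what the awk step collects, it can be run against a scratch copy of the data. The sketch below rests on assumptions: the sample file only imitates f1, with the blank line between blocks that the thread describes, and it swaps in index() for the regex match and splits line by line, so it is a variant of the command above rather than the same code.

```shell
#!/usr/bin/env bash
# Scratch file imitating f1 (contents are illustrative).
f1=$(mktemp)
cat >"$f1" <<'EOF'
yyy_009 00-0000-xxx-xxx-xxx
yyy_004 00-0011-xxx-xxx-xxx
R_2019_00_00_00_00_00_xxxx_x0-00000-200_yyyyyyy

zzzz_0000 00-3333-zz-zz
zzzz_0005 00-1111-xxx-xxx
zzzz_0003 00-2222-yyy-yy
R_2019_00_00_00_00_00_xxxx_x0-00000-123-v5.6_xxx
EOF

S5=R_2019_00_00_00_00_00_xxxx_x0-00000-123

# RS=""   : records are blocks separated by blank lines.
# FS="\n" : each line of a block is one field.
# For the block containing ref (literal substring test), print the second
# word of every line that has one; the run-name line has one word only.
STRING=$(awk -F '\n' -v RS="" -v ref="$S5" '
    index($0, ref) {
        for (i = 1; i <= NF; i++)
            if (split($i, w, " ") >= 2) printf "%s ", w[2]
        print ""
    }' "$f1")

echo "$STRING"
rm -f "$f1"
```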
I thought you wanted to create variants if a folder existed that wasn't in the STRING variable.
To take the names from STRING and check whether the folders exist, you would need this:
for folder in $STRING
do
## if $folder does not exist then make $folder/variants
[ -d "$folder" ] || mkdir -p "$folder"/variants
done
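The same loop can be tried safely in a scratch directory. This is a sketch under stated assumptions: run_dir, expected and created are made-up names, the folder names are illustrative, and the list is split into an array once rather than left to unquoted expansion.

```shell
#!/usr/bin/env bash
# Scratch directory standing in for the run folder, so nothing real is touched.
run_dir=$(mktemp -d)
mkdir "$run_dir/00-1111-xxx-xxx" "$run_dir/00-2222-yyy-yy"

# Names expected by f1, space-separated as the awk step prints them.
STRING="00-3333-zz-zz 00-1111-xxx-xxx 00-2222-yyy-yy "

# Split the list into an array once.
read -r -a expected <<<"$STRING"

created=()
for folder in "${expected[@]}"; do
    # Only folders that do not exist yet get a variants sub-folder.
    if [ ! -d "$run_dir/$folder" ]; then
        mkdir -p "$run_dir/$folder/variants"
        created+=("$folder")
    fi
done

printf 'created: %s\n' "${created[@]}"
rm -rf "$run_dir"
```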
Probably missing an obvious syntax error but another set of eyes will help. Thank you :).
DIR=/home/cmccabe/Desktop/new
for RDIR in "$DIR"/R_2019* ; do ## # matching "R_2019*" to operate on desired directory and expand
S5=${RDIR##*/} ## store run with no path as S5
TRIMSTR=${RDIR%%-v5.6*} ## trim folder match in RDIR from -v5.6 and store in TRIMSTR
mv "$RDIR" "$TRIMSTR" ## move trimmed folder to directory
cd "$TRIMSTR"
STRING=$(awk -F '\n' -v RS="" -v ref="$S5" '$0 ~ ref {d=split($0, val, " "); for(i=2;i<d;i+=2) printf "%s ",val[i]; printf "\n"}' "$DIR"/f1)
for folder in *[0-9]*/ ; do
folder=${folder%/} ## remove trailing slash
if ! [[ " ${STRING/[*]} " == *\ $folder\ * ]]
then
for folder in $STRING ; do
## if $folder does not exist then make
[ -d "$folder" ]
mkdir -p "$TRIMSTR"/"$folder"/variants
break
end
fi
done
done
done
bash: syntax error near unexpected token `fi'
cmccabe@xxx:~$ done
bash: syntax error near unexpected token `done'
cmccabe@xxx:~$ done
bash: syntax error near unexpected token `done'
cmccabe@xxx:~$ done
The intent was to replace the original for loop like this:
DIR=/home/cmccabe/Desktop/new
for RDIR in "$DIR"/R_2019* ; do ## # matching "R_2019*" to operate on desired directory and expand
S5=${RDIR##*/} ## store run with no path as S5
TRIMSTR=${RDIR%%-v5.6*} ## trim folder match in RDIR from -v5.6 and store in TRIMSTR
mv "$RDIR" "$TRIMSTR" ## move trimmed folder to directory
cd "$TRIMSTR"
STRING=$(
awk -F '\n' -v RS="" -v ref="$S5" '$0 ~ ref {d=split($0, val, " "); for(i=2;i<d;i+=2)
printf "%s ",val[i]; printf "\n"}' "$DIR"/f1 )
for folder in $STRING
do
## if $folder does not exist then make $folder/variants
[ -d "$folder" ] || mkdir -p "$folder"/variants
done
done