Bash for loop array

Hi there,

A bit new to bash and am having an issue with a for loop. I look for filenames in a specified directory, pull the date string from each one that meets certain criteria, and then would like to make a directory for each date found, like this:

search 20180101.gz 20180102.gz 20180103.gz
store 20180101 20180102 20180103 in an array
mkdir for each value found

The problem is that the mkdir part is only making a directory for the first value found.

This is the actual code:

for line in $(echo ${DPPATH}/${DBNAME}*.dmp.gz ${MTIMECMD} -type f |grep -Eo '[[:digit:]]{8}')
do
find ${DPPATH}/${DBNAME}*.dmp.gz ${MTIMECMD} -type f -exec /usr/local/bin/aws s3 mv {} "s3://"${S3BUCKET}/export_area/${DBNAME}/${line}/ \;
done

Any input would be greatly appreciated.

First off: you won't need an array at all, so is there any other reason (outside of you thinking you'd need one) for it?

If not: a for loop gets a list of values and runs the loop's body once for each of them. Here is an example:

for LOOPVAR in first second third fourth ; do
     echo "LOOPVAR is: $LOOPVAR"
done

The loop's body is the echo statement here, and as you can see, the variable LOOPVAR is assigned one value after the other. You can use not only a static list to fill this variable but also a (so-called) "fileglob": a pattern with wildcards, which will be expanded to a list of filenames matching this pattern:

for LOOPVAR in *dmp.gz ; do
     echo "LOOPVAR is: $LOOPVAR"
done

Run this in the right directory (I have omitted paths for readability) and you will see that LOOPVAR is assigned the matching filenames one after another.

Now you are left with two tasks: first, create the directory name from the filename (if I have read your problem statement correctly, you want to create a directory "foo" for each file named "foo.dmp.gz" found, yes?) and second, replace the "echo" command with the "mkdir" command. It is always good practice to first try such constructs with "echo" and only as the last step replace it with the real command.

For the change of the filename to the directory name we use "variable expansion", and you may want to read up on it. It is a versatile device and you should know about it:

for FILENAME in *dmp.gz ; do
     echo "FILENAME is: $FILENAME, DIRNAME is: ${FILENAME%.dmp.gz}"
done

${variable%pattern} means: expand to the content of variable, but with pattern (if present) stripped from the end of it: "foopattern" -> "foo".
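Here is a quick sketch of the main variants, using a made-up filename ("%" and "%%" strip from the end, "#" strips from the front; a single character means shortest match, a doubled one means longest match):

```shell
FILENAME="mydb_20180312.dmp.gz"
echo "${FILENAME%.dmp.gz}"   # strip the literal suffix      -> mydb_20180312
echo "${FILENAME%.*}"        # shortest ".*" from the end    -> mydb_20180312.dmp
echo "${FILENAME%%.*}"       # longest ".*" from the end     -> mydb_20180312
echo "${FILENAME#mydb_}"     # strip a prefix instead        -> 20180312.dmp.gz
```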

We have all in place now, so we put in our mkdir command:

for FILENAME in *dmp.gz ; do
     mkdir "${FILENAME%.dmp.gz}"
done

That's it. In case you want to create the directory somewhere else:

for FILENAME in *dmp.gz ; do
     mkdir "/some/where/else/${FILENAME%.dmp.gz}"
done
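One small caveat: if you run this a second time, mkdir will complain that the directories already exist. The -p flag suppresses that error (and also creates any missing parent directories). A self-contained sketch with made-up filenames in a scratch directory:

```shell
# Hypothetical demo: create some empty dump files in a scratch directory,
# then make one directory per file; -p keeps re-runs from failing.
cd "$(mktemp -d)"
touch db_20180312.dmp.gz db_20180314.dmp.gz
for FILENAME in *.dmp.gz ; do
     mkdir -p "${FILENAME%.dmp.gz}"
done
ls -d db_20180312 db_20180314
```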

I hope this helps.

bakunin


Thanks bakunin, I appreciate the reply.

I am still having trouble getting the loop to work for each date pulled from the filename using this code:

for line in $(echo ${DPPATH}/${DBNAME}*.dmp.gz ${MTIMECMD} -type f |grep -Eo '[[:digit:]]{8}')
do
find ${DPPATH}/${DBNAME}*.dmp.gz ${MTIMECMD} -type f -exec /usr/local/bin/aws s3 mv {} "s3://"${S3BUCKET}/export_area/${DBNAME}/${line%}/ \;
done

My expectation is:

  1. For each file (*.dmp.gz), pull the YYYYMMDD date older than MTIMECMD - this works.
  2. For each date matching the criteria, execute the /usr/local/bin/aws s3 mv command - this does not work. It does not 'break' from one filename date to the next:
+ echo /backup01/export_area/d1bebo/d1bebo_metadata_exp_schema_exp_01_20180312_1824.dmp.gz /backup01/export_area/d1bebo/d1bebo_metadata_exp_schema_exp_01_20180314_1130.dmp.gz -type f
+ find /backup01/export_area/d1bebo/d1bebo_metadata_exp_schema_exp_01_20180312_1824.dmp.gz /backup01/export_area/d1bebo/d1bebo_metadata_exp_schema_exp_01_20180314_1130.dmp.gz -type f -exec /usr/local/bin/aws s3 mv '{}' <path_removed>/20180312/ ';'
Completed 1 of 2 part(s) with 1 file(s) remaining
Completed 2 of 2 part(s) with 1 file(s) remaining
move: ./d1bebo_metadata_exp_schema_exp_01_20180312_1824.dmp.gz to <path_removed>/20180312/d1bebo_metadata_exp_schema_exp_01_20180312_1824.dmp.gz
Completed 1 of 2 part(s) with 1 file(s) remaining
Completed 2 of 2 part(s) with 1 file(s) remaining
move: ./d1bebo_metadata_exp_schema_exp_01_20180314_1130.dmp.gz to <path_removed>/20180312/d1bebo_metadata_exp_schema_exp_01_20180314_1130.dmp.gz
+ find '/backup01/export_area/d1bebo/d1bebo*.dmp.gz' -type f -exec /usr/local/bin/aws s3 mv '{}' <path_removed>/20180314/ ';'
find: ‘/backup01/export_area/d1bebo/d1bebo*.dmp.gz’: No such file or directory

Here, I find 2 files. One dated 20180312 and one 20180314. The first file is moved to the 20180312 directory as expected, but then, the second file is also moved to that directory. The piece of code then tries to move the file dated 20180314 to the 20180314 directory, but can't find it since it has already been moved.

Any suggestions would be greatly appreciated. Thanks again for the help.

Looking for a suggestion. Any input would be greatly appreciated. It is driving me nuts.

From your trace output we can see that the variable MTIMECMD is either unset or set to an empty string. We therefore know that your statement that step 1 ("pull the YYYYMMDD date older than MTIMECMD") works is wrong. The values extracted by your for loop are in no way related to any date (since no date is specified by that variable). If MTIMECMD had been set to something like:

MTIMECMD="! -newer pathname"

where pathname is the pathname of a file used to select gzipped dump files that are no newer than that file, you would still have the problem that the echo command in your for loop doesn't understand find primaries as a means of selecting the parameters to be echoed. And, even if it did, as you have already found, there is nothing in the find command inside your for loop that keeps files destined for one date's directory from being moved to a different date's directory.
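To see why, consider this stripped-down sketch (the path and reference file are made up): echo simply prints its arguments verbatim, so the -newer test is never applied, and grep happily pulls a date out of the unfiltered string:

```shell
# Hypothetical values standing in for the real variables:
MTIMECMD="! -newer /tmp/reference"
echo /backup/db_20180312.dmp.gz $MTIMECMD -type f
# The line above prints the words as-is; piping it through the same grep
# extracts 20180312 whether or not the file would pass the -newer test:
echo /backup/db_20180312.dmp.gz $MTIMECMD -type f | grep -Eo '[[:digit:]]{8}'
```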

Maybe something more like the following would work better. Note, however, that the following is totally untested.

IAm=${0##*/}
tmpfile=$IAm.$$
DBNAME="something"
DPPATH="/what/ever"
MTIMECMD="! -newer /some/pathname"
S3BUCKET="who_knows"

trap 'rm -f "$tmpfile"' EXIT

for datestamp in $(find "$DPPATH" -name "${DBNAME}*.dmp.gz" $MTIMECMD -type f | tee "$tmpfile" | grep -Eo '[[:digit:]]{8}' | sort -u)
do	# The following assumes that your aws command creates the needed datestamp directories for you.
	grep "$datestamp" "$tmpfile" | while IFS="" read -r path
	do	/usr/local/bin/aws s3 mv "$path" "s3://$S3BUCKET/export_area/$DBNAME/$datestamp/"
	done
done

Thanks a lot. I changed it up a little to reformat the date. This is the complete solution:

for datestamp in $(find "$DPPATH" -name "${DBNAME}*.dmp.gz" $MTIMECMD -type f | tee "$tmpfile" | grep -Eo '[[:digit:]]{8}' | sort -u)
do
        grep "$datestamp" "$tmpfile" | while IFS="" read -r path
        do /usr/local/bin/aws s3 mv "$path" "s3://$S3BUCKET/export_area/$DBNAME/${datestamp:0:4}-${datestamp:4:2}-${datestamp:6:2}/"
        done
done
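For anyone following along: the ${datestamp:0:4} pieces are bash's substring expansion, ${variable:offset:length} (a bash feature, not plain POSIX sh), which here reformats the raw YYYYMMDD stamp with dashes:

```shell
datestamp=20180312
echo "${datestamp:0:4}-${datestamp:4:2}-${datestamp:6:2}"   # -> 2018-03-12
```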

Thanks again.