In the directory /mnt/upload I have about 100,000 files (*.png) that were created during the last six months. Now I need to move them into the right folders, e.g.:
file created on 2014-10-10 move to directory /mnt/upload/20141010
file created on 2014-11-11 move to directory /mnt/upload/20141111
file created on 2014-08-01 move to directory /mnt/upload/20140801
So the script should first create a directory with the correct name (format +%Y%m%d) and then move all of that day's files (only files, not directories) into it.
Yes; not a very efficient one, though... Use it at your own risk. All commands are meant to be run from the /mnt/upload directory.
A proper backup of those png files (with modification timestamps preserved) is highly recommended before attempting the mass mv operation.
# get last modification date + filename
# sample output line: 2014-09-02 16:47:34.973268280 +0200 ./test.png
find . -maxdepth 1 -type f -name '*.png' -exec stat -c '%y %n' {} + >temp1.txt
# cut + reformat last modification date from YYYY-MM-DD to YYYYMMDD, cut filename
# sample output line: 20140902,test.png,
sed 's/^\(....\)-\(..\)-\(..\).*\/\(.*\)$/\1\2\3,\4,/' temp1.txt | sort >temp2.txt
# assembling and executing the mkdir + mv commands
# sample output line: mkdir -p 20140902; mv "test.png" 20140902
# note: filenames containing commas or double quotes will break this
# try without "| sh" first to see if the output is correct!
awk -F, '{ printf "mkdir -p %s; mv \"%s\" %s\n", $1, $2, $1 }' temp2.txt | sh
DEST=/path/to/
find /opt/camera/ -type f -printf '%TY%Tm%Td %p\n' |
while read -r DATE FILE
do
    [ -d "$DEST/$DATE" ] || mkdir -p "$DEST/$DATE"
    # Remove the 'echo' once you've tested and seen it does what you want.
    echo mv "$FILE" "$DEST/$DATE/"
done
I could make it more efficient, but the real inefficiency is going to be that one dir with 100,000 files, which there is absolutely no way to speed up... Each rename() has to lock and update that one huge directory.
The big time cost is finding and removing them from the source dir. I suspect it might be fastest to move each file found in order, so the directory entry is the first one checked, or nearly so, each time a file is moved.
You might hard link them rather than move them: a sort of half move, with no source-directory rewriting, but still a long search for the file entries in there. Once they are all linked, you can delete all the old links in the source dir.
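A sketch of that "half move", demonstrated in a throwaway temp dir rather than /mnt/upload: link every file into its date directory first, then drop the old names in one pass.

```shell
top=$(mktemp -d)
touch -d '2014-09-02 10:00:00' "$top/test.png"
cd "$top"
find . -maxdepth 1 -type f -name '*.png' -printf '%TY%Tm%Td %P\n' |
while read -r d f; do
    mkdir -p "$d"
    ln "$f" "$d/$f"     # second directory entry, same inode; no data is copied
done
# once every file is linked, remove the old links in the source dir
find . -maxdepth 1 -type f -name '*.png' -delete
```

The -maxdepth 1 on the final find matters: it removes only the top-level entries, never the freshly created links inside the date directories.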
I'd use a find -ls and derive the destination from that output in sed, rather than from the time arguments.
Wow, needs some indentation and formatting for ease of maintenance! You, too, deserve pretty code. It's an investment in your future.
Moving all the files for one destination in a single mv might have been an economy, extending the mkdir -p line.
#!/bin/bash

cd /mnt/upload/ || exit 1
d_last=''
ls -pl --time-style='+%Y%m%d' | awk '/^-.*\.png$/{print $6,$7}' | ( sort ; echo x x ) |
while read -r d f
do
    if [ "$d" != "$d_last" ]
    then
        if [ -n "$d_last" ]
        then
            mkdir -p "/somewhere/$d_last"
            mv $fs "/somewhere/$d_last"    # $fs deliberately unquoted: it is a list of names
        fi
        d_last=$d
        fs=$f
        continue
    fi
    fs="$fs $f"
done
Note the need for a dummy last line ('x x') to flush the last dir, and for presetting d_last to deal with the first line. Loops are neat except for dealing with the ends.
Bash/ksh/awk/Perl have associative arrays that could be used in place of the sort, with less of that loop-ends fiddling. All the file names would be stored under their date, and then at EOF you can walk the array's date keys, making dirs and moving files.
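A sketch of that associative-array variant in awk: group names by date, then emit one mkdir and one mv per directory at END. Shown on a throwaway temp dir, not the real /mnt/upload; filenames with spaces beyond the quoting used here, or with embedded quotes, are out of scope.

```shell
top=$(mktemp -d)
cd "$top"
touch -d '2014-09-02 10:00' a.png b.png
touch -d '2014-10-10 10:00' c.png
find . -maxdepth 1 -type f -name '*.png' -printf '%TY%Tm%Td %P\n' |
awk -v dest="$top" '
    { files[$1] = files[$1] " \"" $2 "\"" }     # group quoted filenames by date
    END {
        for (d in files)
            printf "mkdir -p \"%s/%s\" && mv %s \"%s/%s\"\n", dest, d, files[d], dest, d
    }' | sh
```

No sort, no dummy flush line, and only one mv process per destination directory instead of one per file.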