I am trying to remove files based on the MMDDYYYY date embedded in the file name, so that the directory always keeps the 3 most recent files by that date. "HHMM" is just a dummy value in this case. You won't have two files with different HHMM values on the same day.
When I run the script, it should delete OPEN_INV_01012011_1345.xls and OPEN_INV_01012011_3456.txt.
Note that immediately before the file extension we always have "MMDDYYYY_HHMM".
I am using the following script:
This is what I am trying to do:
#!/usr/bin/ksh
archivedir=/opt/data/files/archive
typeset -i MAX_ARCHIVE_COUNT
typeset -i archive_file_count
typeset -i remove_archive_count
cd $archivedir
for archpref in $(ls | sed 's/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]\.[^.]*$//' | sort | uniq)
do
    archive_file_count=$(ls -1t ${archpref}* | wc -l)
    MAX_ARCHIVE_COUNT=3
    remove_archive_count=${archive_file_count}-${MAX_ARCHIVE_COUNT}
    if [ ${remove_archive_count} -gt 0 ]
    then
        # List the files in date order (most recent first), suppress the first 3, and delete the rest
        rm $(ls -1rt | tail -${remove_archive_count})
    fi
done
for archpref in $(ls | sed 's/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]\.[^.]*$//' | sort | uniq)
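Try running the commands inside the parentheses, from ls through uniq, on the command line and see whether they pick up any files. Your pattern does not account for the "_HHMM" part that sits between the date and the extension, so I believe the sed command should look like: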
sed 's/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9]*\.[^.]*$//'
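To check what that pattern produces without touching the archive directory, you can feed it a few names by hand. This is only a quick sketch: strip_stamp is a made-up helper name, and the third file name is invented for illustration (the first two are the ones from the question).

```shell
#!/bin/sh
# Strip the trailing "MMDDYYYY_HHMM.ext" from each name read on stdin,
# leaving only the prefix. strip_stamp is a hypothetical helper name.
strip_stamp() {
    sed 's/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9]*\.[^.]*$//'
}

# Sample names: the first two come from the question, the third is invented.
printf '%s\n' \
    OPEN_INV_01012011_1345.xls \
    OPEN_INV_01012011_3456.txt \
    OPEN_INV_01022011_1200.xls |
    strip_stamp | sort -u
# prints: OPEN_INV_
```

Because the extension is stripped together with the timestamp, the .xls and .txt files end up with the same prefix, so they are counted as one set.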
I tried all your suggestions and it is still not working. It treats .txt and .xls as the same set and removes 5 files, leaving the count at 3.
The idea is to leave 3 files for each set (.xls and .txt).
I think the issue is with the "sort and uniq"
Currently it shows archnt=10 and remove_count=7, which is wrong because it combines both extensions.
It should actually be archnt=4 and remove_count=1 for each set.
If anyone can suggest some ideas or modify the sed command, that would be great.
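One possible approach, offered as a sketch rather than a tested fix for your exact directory: keep the extension as part of the set key, and order files by the date in the name instead of the modification time. MMDDYYYY does not sort correctly across a year boundary, so the sketch rearranges it to YYYYMMDD first. list_stale is a made-up function name, and it assumes file names contain no whitespace and always end in MMDDYYYY_HHMM.ext.

```shell
#!/bin/sh
# list_stale DIR KEEP   (hypothetical helper, not part of the original script)
# Print the names in DIR that fall outside the newest KEEP files of their set.
# A set is "prefix + extension"; age is taken from MMDDYYYY in the name.
# Assumes names contain no whitespace.
list_stale() {
    dir=$1
    keep=$2
    ls "$dir" |
    # Emit "prefix.ext YYYYMMDD filename" for names ending in MMDDYYYY_HHMM.ext
    sed -n 's/^\(.*\)\([0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9][0-9][0-9]\)_[0-9]*\.\([^.]*\)$/\1.\5 \4\2\3 &/p' |
    # Group by set, newest date first within each set
    sort -k1,1 -k2,2nr |
    # Skip the first KEEP lines of each set and print the remaining file names
    awk -v keep="$keep" '{ if (++seen[$1] > keep) print $3 }'
}

# Dry run first, to see what would be removed:
#   list_stale /opt/data/files/archive 3
# When the list looks right, delete from inside the directory:
#   cd /opt/data/files/archive && list_stale . 3 | xargs rm -f
```

With 4 .xls and 4 .txt files this should list exactly one file per set, which matches the archnt=4 and remove_count=1 you expect.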
My earlier suggestion was based on your existing script. A quick (easy) solution would be to create 2 for loops, instead of the single one that you have now. The first one should only deal with .txt and the second, with .xls. So in the first for loop, your sed command would look something like this:
sed 's/[0-9]*_[0-9]*\.txt$//'
Make sure your rm command only removes the .txt files. So your rm command should look like this: