Delete all files created in specific year

I have more than 200K files created in 2017 under a directory that is 50GB in size.
I want to delete all of these files in one shot.
Is there a faster option available with the find command to delete all of these files?

Hi,

In theory, this should be fairly straightforward. If you want to delete all files from 2017 and earlier (i.e. also files from 2016, 2015, etc - anything older than 2018), then you could do something like this:

find /path/to/files -type f -mtime +549 -exec rm -fv \{\} \;

That would find all files last modified more than 549 days ago (at the time of writing, the number of days since December 31st 2017) and remove them. Note that find tests the modification time here, not the creation time.
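Since a hard-coded day count goes stale the day after it is written, here is a minimal sketch of recomputing it. It assumes GNU date (the -d option is not in POSIX), and the variable names are just for illustration. It only prints the find command rather than running it.

```shell
# Assumption: GNU date; -d parses the cutoff date, +%s prints seconds since the epoch.
cutoff=2017-12-31
now_s=$(date +%s)
cutoff_s=$(date -d "$cutoff" +%s)

# Whole 24-hour periods between the cutoff and now.
days=$(( (now_s - cutoff_s) / 86400 ))

# Print the command for review instead of executing it.
echo "find /path/to/files -type f -mtime +$days -print"
```

Bear in mind that -mtime counts whole 24-hour periods, so files modified close to the cutoff may fall on either side of it.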

Now, if you really want to only remove files from 2017, and only 2017, then it's slightly more complicated, but not much so. Assuming your version of find supports it, this would do the trick:

find /path/to/files -type f -newermt 2017-01-01 \! -newermt 2018-01-01 -exec rm -fv \{\} \;

This will very specifically remove files whose modification time is newer than 1st January 2017, but not newer than 1st January 2018 (in other words - files from 2017).

Hope this helps. Please test carefully before running this with a live rm command - swap it out for an ls or switch out the -exec for a -print first just to be sure it really is going to catch what you want, and only what you want, before turning this loose on your filesystem for real.
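For what it's worth, a dry run along those lines could look like the sketch below. It builds a throwaway directory with three made-up files (the names and dates are purely illustrative) and uses -print so that nothing is deleted. It assumes a find that supports -newermt and a touch that accepts ISO-style -d dates.

```shell
# Throwaway sandbox with hypothetical files dated 2016, 2017 and 2018.
sandbox=$(mktemp -d)
touch -d 2016-06-15T12:00:00 "$sandbox/old.log"
touch -d 2017-06-15T12:00:00 "$sandbox/mid.log"
touch -d 2018-06-15T12:00:00 "$sandbox/new.log"

# -print instead of -exec rm: only list what would be selected.
matched=$(find "$sandbox" -type f -newermt 2017-01-01 ! -newermt 2018-01-01 -print)
echo "$matched"

# Remove the sandbox once the listing has been checked.
rm -rf "$sandbox"
```

Only the file dated 2017 should be listed; if anything else shows up, the date range needs another look before rm goes anywhere near it.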


Note that using:

find /path/to/files -type f -newermt 2017-01-01 \! -newermt 2018-01-01 -exec rm -fv \{\} \;

will invoke the rm utility once for each of the 200K files you want to remove. That is always going to be relatively slow. If you change that command to:

find /path/to/files -type f -newermt 2017-01-01 \! -newermt 2018-01-01 -exec rm -fv '{}' +

then the find utility will group together a large number of files to be processed by each invocation of the rm utility and run considerably faster.
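To make the difference visible, here is a small sketch that counts how many times the -exec command is started in each form. It swaps rm for a harmless sh -c 'echo call' stand-in, and the five files are made up for the test.

```shell
# Throwaway sandbox with five empty files.
sandbox=$(mktemp -d)
for i in 1 2 3 4 5; do : > "$sandbox/file$i"; done

# With \; the command is started once per file.
per_file=$(find "$sandbox" -type f -exec sh -c 'echo call' sh {} \; | wc -l)

# With + the file names are batched into as few invocations as possible.
batched=$(find "$sandbox" -type f -exec sh -c 'echo call' sh {} + | wc -l)

echo "per-file: $per_file, batched: $batched"
rm -rf "$sandbox"
```

With only five files the batched form should need a single invocation; with 200K files it will need more than one, but still far fewer than 200K.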

And, as stated before but not told how, don't use rm live until you have verified that find is correctly selecting the files you want to process. I would suggest using:

find /path/to/files -type f -newermt 2017-01-01 \! -newermt 2018-01-01 -exec echo rm -fv '{}' +

first, and, if that produces commands that have correctly identified the files you want to remove, rerun the command without the echo.


@drysdalk @Don Cragun

Thank you both for your solutions. I have tested the command below in a test directory containing a few files and it worked fine.

find /path/to/files -type f -newermt 2017-01-01 \! -newermt 2018-01-01 -exec rm -fv '{}' +

I was wondering if I can use the -delete option instead of rm to speed up the process. I am not really sure how much difference it will make.

find /path/to/files -type f -newermt 2017-01-01 \! -newermt 2018-01-01 -delete

You can use that, but be aware that it is NOT supported by standard-conforming versions of find. If you use it, you limit the usability of what you do to systems where the installed find command is the same as on your local system.

In general experience shows that it is better to sacrifice simplicity for portability, so my suggestion would be to use what Don Cragun and drysdalk suggested, even if what you found would work on your system. It is never too early to start healthy habits.

I hope this helps.

bakunin


While I agree with what bakunin said in post #5 in this thread, note that the find -newermt primary is not in the standards either.

This is why you are always asked to tell us the operating system and shell you're using when posting a thread in the UNIX for Beginners Questions and Answers forum. If we don't know what environment you're using, you're likely to get suggestions that won't work in your environment. Since what drysdalk suggested works in your environment, we will assume that you're currently using a BSD-based or GNU-based find utility. If you had been using a traditional UNIX system, you'd have gotten an error about the unknown primary -newermt.

Since -newermt worked, -delete will also probably work and will probably be slightly faster than -exec rm -fv '{}' + . (The difference between the speed of -delete and -exec rm -fv '{}' + will be MUCH smaller than the difference between -exec rm -fv '{}' + and -exec rm -fv '{}' \; .)
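If you do go with -delete, a sandbox run like the sketch below is a cheap way to confirm it behaves as expected first. It assumes a find that supports both -newermt and -delete; the file names and dates are made up.

```shell
# Throwaway sandbox with hypothetical files dated 2016, 2017 and 2018.
sandbox=$(mktemp -d)
touch -d 2016-06-15T12:00:00 "$sandbox/old.log"
touch -d 2017-06-15T12:00:00 "$sandbox/mid.log"
touch -d 2018-06-15T12:00:00 "$sandbox/new.log"

# Keep -delete last: find evaluates its expression left to right, so tests
# written after -delete would not restrict what gets removed.
find "$sandbox" -type f -newermt 2017-01-01 ! -newermt 2018-01-01 -delete

# Record what survived, then clean up.
remaining=$(ls "$sandbox" | tr '\n' ' ')
echo "left: $remaining"
rm -rf "$sandbox"
```

Only the 2016 and 2018 files should be left afterwards.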

Doing this in a manner just using standard features (and a little more accurately setting the start and end points) would require something more like:

trap 'rm -rf $$.start $$.end' EXIT
touch -d 2016-12-31T23:59:59.999999999 $$.start
touch -d 2017-12-31T23:59:59.999999999 $$.end
find /path/to/files -type f -newer $$.start ! -newer $$.end -exec rm -rf '{}' +
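One thing to watch with this approach is that the two reference files themselves have timestamps in or next to the target range, so it is safer to keep them outside the tree being searched. Below is a sketch of a dry run that does this. It uses mktemp -d for the scratch directories, which is an assumption about your system, and the file names are invented for the test.

```shell
# Scratch directories: one stands in for /path/to/files, one holds the
# reference files so that find never sees them.
sandbox=$(mktemp -d)
refs=$(mktemp -d)

# Reference files marking the last instant of 2016 and of 2017.
touch -d 2016-12-31T23:59:59.999999999 "$refs/start"
touch -d 2017-12-31T23:59:59.999999999 "$refs/end"

# Hypothetical test files dated 2016, 2017 and 2018.
touch -d 2016-06-15T12:00:00 "$sandbox/old.log"
touch -d 2017-06-15T12:00:00 "$sandbox/mid.log"
touch -d 2018-06-15T12:00:00 "$sandbox/new.log"

# Dry run: -print instead of -exec rm.
selected=$(find "$sandbox" -type f -newer "$refs/start" ! -newer "$refs/end" -print)
echo "$selected"

rm -rf "$sandbox" "$refs"
```

Only the 2017 file should be printed.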

Note, however, that although the timestamps being used here accurately reflect the end points you want to the nearest nanosecond (which is the timestamp precision stored in a standard stat structure), there is a reasonable likelihood that the last record entered into a logfile at the end of a year will not actually be written to that logfile until a few milliseconds (or even seconds) into the next year. And that last write into the logfile will determine the last modified timestamp of that file, not the string specifying the date in the name of the logfile. (This problem affects all of the methods we have talked about when using find -newer and -newermt. So beware that find may be doing exactly what you asked it to do, but it might not be exactly what you wanted it to do.)

And, if anyone edits a logfile, the modification time of that file may be years later than the name of the file would indicate!
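A quick way to see this pitfall for yourself is sketched below, with a made-up logfile name and a find that supports -newermt. The file is named for the last day of 2017 but was last written two seconds into 2018.

```shell
sandbox=$(mktemp -d)

# Hypothetical logfile: 2017 in the name, last modified just after midnight.
touch -d 2018-01-01T00:00:02 "$sandbox/app-2017-12-31.log"

# Count how many files the 2017 date range selects.
by_mtime=$(find "$sandbox" -type f -newermt 2017-01-01 ! -newermt 2018-01-01 -print | wc -l)
echo "matched by modification time: $by_mtime"

rm -rf "$sandbox"
```

The count should be zero: going by its modification time the file belongs to 2018, whatever its name says.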

I hope this further clarification helps you understand the limitations on what you're doing.


Thanks, Don Cragun, for the clarification. This is really helpful. Cheers!!