UNIX Script to clean files

Hello All,

I need a script that deletes files which are more than "X" days old. If possible, it should also write a log file of the deleted files for reference.

I come from a Windows background, hence I am finding this difficult. Any help is much appreciated.

Regards
Wert

Doing it in Windows means, for most people, relying on the GUI without knowing that there are utilities being used behind it, although you do have PowerShell and a command-line/script option. In Unix you can find some GUIs, but most people learn to use the commands and utilities and keep the GUI for sophisticated tasks.
I suggest you have a look at the man page of find, start working on your script, and show us what you have done. Then we can help you out or explain what is needed.
If you don't know how to write the script, start with its design and show us some pseudocode that we can discuss. We will then help you put it into a shell script.
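
As a starting point for that discussion, here is a minimal sketch in POSIX shell of what such a script might look like. It assumes that "old" refers to the modification time and that only regular files should be removed. The function name clean_old_files and its three arguments are made up for this illustration. Try it on a scratch directory first, never as root on a directory you do not know.

```shell
#!/bin/sh
# Sketch only: delete regular files whose modification time is more than
# "days" days old and append the path of each deleted file to a log file.
# clean_old_files and its parameters are hypothetical names.
clean_old_files() {
    dir=$1      # the directory to clean
    days=$2     # the "X" from the question
    log=$3      # the log file, best kept outside of $dir

    # Refuse to run with missing arguments or on something that is no directory.
    [ -n "$dir" ] && [ -n "$days" ] && [ -n "$log" ] || return 2
    [ -d "$dir" ] || return 2

    # -type f   : regular files only, directories are left alone
    # -mtime +N : modified more than N whole 24-hour periods ago
    # -exec rm  : remove the file; -print is reached only if rm succeeded
    find "$dir" -type f -mtime +"$days" -exec rm -f {} \; -print >> "$log"
}
```

A call might look like clean_old_files /path/to/dir 30 /path/to/cleanup.log, where both paths are placeholders. Leaving out the -exec part turns the same find command into a harmless listing of what would be deleted.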

Since I estimate that your knowledge of the Unix system is very limited so far, I warn you against just deleting any old unused file you can grab. Your request reminds me of my long-past Windows days, when things had to be cleaned up regularly in order to keep the OS roughly usable.

One should be familiar with which files can be deleted and which are better left untouched. If not, and if the deletions are done with root privileges, you can quite quickly render your system unusable.

So my advice: before learning how to delete, learn what to delete.

So if you want help, specify your request with more details:

  • What exactly do you want to achieve?
  • What files do you want to delete?
  • What is your environment (OS, version)?

First off, welcome to the forum as well as to the family of the most powerful OSes there are in the world. You will find that Unix - any Unix, including Linux - is a set of finely tuned tools, just like an orchestra is a set of highly trained musicians. Let the right conductor - you - step up and they will blow the audience away.

Let us start right here, at your premise. In UNIX every file has not one but several timestamps. There is:

  • creation time
  • modification time
  • access time

and they are all set independently. You open a text editor and write a new file: all three of these times are set. After some time you open the file in a text editor again and change something - only the modification time and the access time (because to change it you need to read it first) are changed. Some time after this you display the file's contents - only the access time is updated.

Also notice that many files on a UNIX system are important or even vital even if they are NOT updated regularly. For example, the configuration file for a web server is read when the web server starts, so its access time may be 2 months in the past if the server has been running for 2 months. You still shouldn't delete it, though, if you want to be able to start the web server again. (Notice that UNIX systems running for months or even years is - unlike with Windows systems - rather normal. I actually have customers complain to me if i want to restart their server once a year after some major OS update. "You restarted already last year, why now again?" - no, i do NOT exaggerate here, i heard, word for word, exactly this complaint. In the OS i work with the most - AIX, IBM's UNIX - it is even possible to do OS and kernel updates under load with no interruption of the service, for exactly these situations where customers complain about the necessity to reboot once a year or every other year.)

On the other hand, UNIX systems do not have "drives" but only one (uniform treelike) filesystem. So you may identify one or several branches in this tree where you want to start the cleaning operation and leave alone all the others.
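
To make the idea of working on one branch only a bit more tangible, here is a small sketch in POSIX shell that merely lists candidates below one starting directory and deletes nothing. The function name list_old_files is made up for this illustration, and the use of modification time is an assumption about what "old" should mean.

```shell
#!/bin/sh
# Sketch only: list, without deleting anything, the regular files below one
# directory that were modified more than "days" days ago.
# list_old_files is a hypothetical name.
list_old_files() {
    # $1 : the branch of the tree to look at
    # $2 : the age in days
    # -xdev keeps find on the filesystem the start directory lives on, so
    # other filesystems mounted further down the branch are not entered.
    find "$1" -xdev -type f -mtime +"$2" -print
}
```

Looking at such a list before any deletion takes place is a cheap way to check that only the intended branch is touched.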

In light of this you might want to rethink and restate your goals and we can discuss what might be done then.

I hope this helps.

bakunin

After having had some time to think this over i'd like to make some clarifications but, first off, thanks go to RudiC and drl for pointing out to me that what i wrote was misleading, especially for the beginner. So, here it goes:

As i said, every file has several timestamps which stand for different things. But first, let us clarify what a "file" is in UNIX: a file is - obviously - a bunch of data stored on a "filesystem". There are (unlike in Windows) different filesystems to choose from, but it basically comes down to storing the data somewhere in an organised form. But then there is a second aspect to this, because we also need "meta data" - data about the data stored this way - to really make use of it. For example, we need a "filename" by which we can address it. Notice that the file's name is NOT part of the file - it is part of the set of descriptive items stored about the file.

So, we have the "file" and we have its "meta data". In UNIX filesystems these meta data are stored in "inodes". Inodes have a structured, record-like format which can (and does) vary across filesystems but within the same filesystem they are uniform. The different timestamps i talked about are located in these inodes.
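
A quick way to see that a name is only a label pointing to an inode is to give one file two names and compare the inode numbers. The following is a sketch in POSIX shell. The helper names inode_of and two_names_demo are made up for this illustration, and it assumes a filesystem that supports hard links.

```shell
#!/bin/sh
# Sketch only: give one file two names and compare their inode numbers.
# inode_of and two_names_demo are hypothetical helper names.
inode_of() {
    # ls -i prints the inode number in front of the name; -d stops ls from
    # listing the contents if the argument happens to be a directory.
    ls -di "$1" | awk '{ print $1 }'
}

two_names_demo() {
    dir=$1                               # an existing scratch directory
    echo "some data" > "$dir/first"
    ln "$dir/first" "$dir/second"        # hard link: a second name, same inode
    echo "first : inode $(inode_of "$dir/first")"
    echo "second: inode $(inode_of "$dir/second")"
}
```

Both lines should report the same inode number: there is one file with one set of meta data, reachable under two names.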

I will skip (if you are interested - ask explicitly) the intricacies of the inode (there is a lot - filesystem tuning, links, rights management, ...) and concentrate on the timestamps of a file. Always keep in mind, though, that slight differences might occur across different filesystems. I will mostly forego these differences to give you an overall picture first. You may tell us which filesystem you have so we can add details to this picture in a follow-up:

There are three timestamps which are common throughout all filesystems:

  • Modify - the last time the file's data was modified
    This basically covers creation as well as modification. Some filesystems, as i said, have a separate creation time, but - as i learned, thanks to RudiC - most do not. This timestamp is also referred to as "mtime".
  • Access - the last time the file was read
    This is set whenever the file's data was accessed. Notice that a command like ls (list directory, the analogue of dir in Windows) will not access the file's data, but only its meta data. Every access to the data, though, includes an access to the meta data because even the location - where the data can be found - is part of the meta data. This is also referred to as "atime".
  • Change - the last time meta data of the file was changed
    From what i said it should be obvious by now that the meta data can be changed without changing the data itself. What such an operation might look like, though, is perhaps more difficult to envision, so here is an example: the access rights to the file are also stored in the inode. Therefore, when you change the access rights, only the meta data are changed, not the data itself. This is reflected in only this timestamp being changed. This timestamp is also called "ctime".
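
To look at these three timestamps on a real file, plain ls is enough. The following is a small sketch in POSIX shell. The function name show_times is made up for this illustration, and the exact date format ls prints depends on your system and locale.

```shell
#!/bin/sh
# Sketch only: print the three timestamps of one file using plain ls.
# show_times is a hypothetical name.
show_times() {
    # ls -l shows the mtime by default; adding -u switches the displayed
    # time to the atime, adding -c switches it to the ctime.
    # -d makes ls describe a directory itself instead of its contents.
    echo "mtime: $(ls -ld "$1")"
    echo "atime: $(ls -ldu "$1")"
    echo "ctime: $(ls -ldc "$1")"
}
```

If you run chmod on a file and call show_times on it before and afterwards, only the ctime line should change. The find command has matching tests for the same three timestamps: -mtime, -atime and -ctime.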

Thanks again to RudiC and drl for their valuable input in this.

I hope this helps.

bakunin