Before I delete a file in Unix, how can I check that no open file handle is still pointing to that file?

I know how to check whether any Unix process is using a file by running the 'lsof <fullpath/filename>' command.
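
For example (the path here is just an illustration):

$ lsof /var/adm/messages     # full path given; prints one line per process that has the file open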

I think using lsof is very expensive. Also, to make it accurate, we need to include the full path of the file.

Is there another command that can tell whether a file has a truly active open file handle pointing to it, or whether a file has a running PID associated with it?

fuser, which is more commonly available than lsof, does this, putting the PIDs on stdout and the file names and flags on stderr, so you can get the PID list easily. For example:

$ for p in $( fuser . 2>/dev/null )   # fuser prints just the PIDs on stdout
do
 echo
 ps -fp $p    # the full ps entry for each process holding the file open
 echo
 ptree $p     # and its process tree (ptree is Solaris; use pstree on Linux)
done
$

Nothing bad happens if you delete a file in use. It just persists on disk until the last thing using it quits, then is deleted for good.
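
A tiny demonstration with a throwaway file:

$ echo "still here" > /tmp/demo
$ exec 3< /tmp/demo     # hold an open handle on the file
$ rm /tmp/demo          # the name is gone, but the blocks stay allocated
$ cat <&3               # the data is still readable through the open handle
still here
$ exec 3<&-             # last handle closed; now the space is really freed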

There's no faster way to query the kernel about open files than querying the kernel about open files. fuser works roughly the same way as lsof, so it isn't really a slimmer/better workaround.

If you want to delete a file without losing space when some program holds it open forever, truncate it first. However, if the program then appends at its old, high file position, all the space up to that offset is reallocated and written as zeros, so there is some virtue in ensuring nobody has the file open for writing before you delete it.
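
In other words, something like this (path hypothetical):

$ : > /some/big/logfile     # truncate in place; open handles keep working, length drops to zero
$ rm /some/big/logfile      # a process that still holds it open now pins (almost) no space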

Thanks for pointing me to the "fuser" command.

Does someone have a solaris equivalent of this script?

Yes - it's "fuser".

And FWIW, assuming you're running on Solaris (as this is posted under the Solaris topic...), fuser on Solaris is a lot faster than lsof on Linux. The last time I looked at the lsof source code, it was doing nothing more than searching through the entire /proc file system looking for a match to the file(s) it was given. The Solaris implementation of fuser operates fully inside the kernel (see /usr/include/sys/utssys.h on a Solaris box for the actual system call that does the work). It seems to me that the Solaris kernel pretty much has to have a direct way of finding the processes that have a file open, since fuser returns so fast - way too fast to be searching, unless what it's searching is really small.

My observation is different. If we delete "catalina.out" from a Tomcat web server's logs folder, Tomcat will not create another log until we restart it, whereas other applications do create a new log file.

I am trying to find a safe way to do generic log file rotation, including logs like "catalina.out". I am also trying to learn whether a file is actively in use, to make a better decision before log/file rotation.

When I query using lsof I get this -->

What is the meaning of FD --> 1w 2w ?

  1. What is the difference between 1w and 2w (in lsof output) vs. <PID>o (the o in fuser output)?

Getting an app that does not like to open and close its log to move to a new file is impossible. Many servers are set up to close all logs and reread all configs when you signal them with SIGHUP. Others assume they can write forever, being written by HS kids! :smiley:
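
For example, for a daemon that advertises its PID (pidfile path hypothetical):

$ kill -HUP $(cat /var/run/mydaemon.pid)    # ask the daemon to close and reopen its logs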

You might be able to put a named pipe there and write your own process to read that named pipe and change logs periodically, as sketched below. Some apps may not accept a named pipe. If you put two servers under a load-sharing and reliability tool, you can take down one to reset its log and, when it is back up, recycle the other.
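
A rough sketch of the named-pipe idea, with hypothetical paths and a naive per-day split:

$ mkfifo /var/log/app.log     # the app opens and writes this as if it were a regular file
$ while :                     # reopen the pipe whenever the writer closes it
do
 while read -r line           # copy each line into a per-day log file
 do
  echo "$line" >> /var/log/app.$(date +%Y%m%d).log
 done < /var/log/app.log
done &

If the app goes down, the outer loop simply blocks on reopening the pipe until it comes back.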

This is slightly incorrect, at least when using UFS or ZFS. If the writing program seeks to the previous end-of-file location, the space from the beginning of the file to that location isn't filled with zeroes but left unallocated, creating a sparse file. If you read the unallocated range, the OS returns "virtual" zeroes, but they use no space at all on the disk.
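
You can see this for yourself (file name arbitrary) by writing one byte at a large offset and comparing the logical size with the allocated blocks:

$ dd if=/dev/zero of=/tmp/sparse bs=1 count=1 seek=10000000   # one real byte at a ~10MB offset
$ ls -l /tmp/sparse     # logical size: about 10MB
$ du -k /tmp/sparse     # allocated blocks: a few KB at most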

I was taken in by the zeros! You can close it, and there is still no space allocated? I guess it is all handled in the block allocation. It is an interesting resource in the 64-bit world, as you could have a very huge, sparse file with just a few blocks. Consider how handy that is in conjunction with mmap64().

This space isn't allocated whether the file is closed or not.

A simpler approach is to mmap /dev/zero, which is kind of an infinite sparse file by design.

The idea is that you could hash a key to calculate an address in a virtually huge but actually small file of unallocated blocks, and write or fetch a piece of data there, as long as you accept nulls as a miss on a fetch. You can write a sparse file; pages are allocated only as they are written into the sparse matrix. Current hash maps must reduce the key modulo the bucket count and use an array of pointers to indirectly find the bucket data, if the bucket has been created. Here, you can reduce the hash to 8 bytes and seek directly to your information within the sparse file. With mmap64, the seek is just reach out and touch it! Of course, your key generation must leave space for the bucket between possible 64-bit addresses. I guess if the bucket is occupied, you might permute the address in some standard way to pick a new bucket address in a place not likely occupied. Think stock symbols, 1-5 bytes uppercase, in a very time-dependent industry, like an electronic order book and matching engine, for instance.

Now that we know what you actually want, we can give you better advice.

You shouldn't be handling logfiles this way if you can possibly avoid it. Many daemons let you safely truncate the logfile:

: > /path/to/logfile

...and the daemon will start writing from the beginning again. Others will recreate the file when deleted. Others want SIGHUP sent to the daemon process to tell it to truncate or rotate its own logfiles.
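
For daemons in that first category, a copy-then-truncate rotation (paths hypothetical) keeps the daemon's open handle valid:

$ cp /path/to/logfile /path/to/logfile.$(date +%Y%m%d)   # snapshot what has been logged so far
$ : > /path/to/logfile                                   # truncate in place; the daemon keeps writing

Anything the daemon writes between the copy and the truncate is lost; that race is the usual trade-off of this approach.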

If the daemon is so badly written as to force you to hack around it with fuser instead of these methods, you should get it fixed or replaced.