Accidentally deleted /usr contents.

Hi

I have the following code, which was supposed to clean up a directory when the number of files in that directory exceeded 2. The code is given below.

 
#!/usr/bin/ksh
dir_num=`/usr/bin/find /var/.audit -type d | /usr/bin/wc -l`
if [ $dir_num -gt 2 ]
then
 oldest_file=`/usr/bin/ls -1t | /usr/bin/tail -1`
 /usr/bin/rm -rf $oldest_file
fi

When I executed the above code, something bad happened. Apparently most of my files in the /usr directory got deleted. Can anyone tell me how it happened?

I have the modified code:

 
 
#!/usr/bin/ksh
dir_num=`/usr/bin/find /var/.audit -type d | /usr/bin/wc -l`
if [ $dir_num -gt 2 ]
then
 oldest_file=`/usr/bin/ls -1t /var/.audit/ | /usr/bin/tail -1`
 /usr/bin/rm -rf /var/.audit/$oldest_file
fi

Is there any way to recover those files?

Thanks in advance...

Yes, you can recover from your system backup.

You see, I don't have a backup. What I'm doing now is copying all those files from another server with the same configuration.

Is there any other way?

What I really want to know is how it happened. How did all those files get deleted in the first place?

Yikes.

Difficult to say without knowing what find found, but a path name with spaces in it may cause unintended splitting whenever unquoted. /path/with /usr/ would split into '/path/with' and '/usr/' for instance.
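To make the splitting concrete, here is a small sketch that deletes nothing and only counts arguments. The path is the hypothetical one from above and count_args is a made-up helper, so treat it as an illustration rather than a diagnosis of what actually happened:

```shell
#!/bin/sh
# Demonstration only: the path is hypothetical and nothing is deleted.
target="/path/with /usr/"

# count_args (hypothetical helper) prints how many arguments it received.
count_args() {
    echo "$#"
}

unquoted=$(count_args $target)      # word splitting applies to the expansion
quoted=$(count_args "$target")      # quoting keeps the value as one word

echo "unquoted expansion: $unquoted argument(s)"
echo "quoted expansion:   $quoted argument(s)"
```

Had rm -rf been called in place of count_args, the unquoted form would have received '/usr/' as a separate argument.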

Why this script was running as root is another good soul-searching question. Preventing program bugs from getting too out of control is one reason file permissions exist, but root bypasses all that.

You were using find to count the directories under /var/.audit, but after that you ran ls in the working directory, which may have been pointing somewhere else (e.g. /usr?), took the oldest entry there and removed it with rm -rf. Anyhow, it should have removed just one entry, unless you ran it repeatedly to remove nearly the entire /usr dir.

Your second code snippet is far safer here...
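That said, the second snippet still leaves $oldest_file unquoted, and if it ever came back empty the command would become rm -rf /var/.audit/. A more defensive variant might look like the sketch below. The function name prune_oldest is my own invention, and taking the directory as an argument is an assumption made so it can be tried on a scratch directory first:

```shell
#!/bin/sh
# Sketch only: prune_oldest is a hypothetical helper, not part of any tool.
# It removes the oldest entry of a directory once the directory count exceeds 2.
prune_oldest() {
    dir=$1

    # Accept absolute paths only, so the working directory never matters.
    case $dir in
        /*) ;;
        *) echo "refusing: '$dir' is not an absolute path" >&2; return 1 ;;
    esac
    [ -d "$dir" ] || { echo "refusing: '$dir' is not a directory" >&2; return 1; }

    # Same count as the original script: find lists $dir itself as well.
    # tr strips the padding some wc implementations add.
    dir_num=$(find "$dir" -type d | wc -l | tr -d ' ')
    [ "$dir_num" -gt 2 ] || return 0

    # List the target directory explicitly, never the current directory.
    # (Parsing ls output breaks on names containing newlines.)
    oldest=$(ls -1t "$dir" | tail -n 1)

    # An empty name would turn the rm below into "rm -rf $dir/".
    [ -n "$oldest" ] || { echo "nothing to remove in '$dir'" >&2; return 1; }

    rm -rf "$dir/$oldest"
}
```

It would be called as prune_oldest /var/.audit, ideally after trying it on a throwaway directory.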

$oldest_file could also have contained a directory entry. Maybe the script was executed while in the root directory and /usr happened to be the oldest entry.

Well, find was intended to return a number: initially 2, then 3 on the next run, but never more than 3.

Another interesting fact is that $oldest_file should only have contained the file name and not any path.
The script was supposed to be executed automatically by the audomon daemon when the audit trail was switched to another file. So I have no idea from where the script was run, but I think the script would have run with a UID of 0, and thus all the problem.

If that was the case, then all files in /usr should have been deleted. But some files were not deleted, only a few.

This is why whenever I run rm -rf from a script, my first run is ALWAYS:

...
#rm -rf $something
echo "we will be removing: $something"
...

i.e. initially comment out the rm and have it echo what we'll be removing instead. If it looks sane, then go ahead and uncomment the rm.

Another lesson here - backups - but you have probably heard enough about that already...

In post #1 it says "most of my files in /usr got deleted", not a few. I assumed you perhaps had interrupted the script in the process..

But the script would have run only a few times, probably fewer than ten, before I noticed the files missing. So how come more than 300 files were deleted from /usr/bin alone, and some directories in /usr were removed altogether?

How many files were left in /usr/bin? Is it possible that some files were (re-)installed in /usr/bin after a previous removal? Then the script could have been run with the current directory set to /usr, since it was trying to remove the oldest entry, even directory trees, not just files...

I had nothing to do with the execution of the script; it was executed by the audomon daemon.

I don't think it would have reinstalled anything, because I don't think any provision exists to do so.
Only bin, lbin, sbin and lib were present in /usr after the deletion took place, and most files were missing from them too. In /usr/bin I think only 4 files were left. I think they were view, remsh and two others starting with 'r'.

Anyone who runs a script as root which deletes files is asking for trouble.

So, when the OP asks "what went wrong?", the answer is pretty straightforward.

The OP ran a script that removes files (as the superuser!) without testing the script first. It's that simple.

How do you test it?

Well, when the script runs, instead of removing the files, you simply write the files that would be deleted to a file. Then, you examine the file with the output of "files and directories that would have been deleted" and make sure the script does what you want. If all is OK, then enable the script to actually work (delete the files).
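As a rough sketch of that kind of test run (DRY_RUN, LOG_FILE and safe_rm are names I made up for illustration, not anything standard), the script could call a wrapper instead of rm directly:

```shell
#!/bin/sh
# Sketch of a dry run. DRY_RUN, LOG_FILE and safe_rm are hypothetical names.
DRY_RUN=${DRY_RUN:-1}                         # 1 = only record, 0 = really delete
LOG_FILE=${LOG_FILE:-/tmp/would_delete.log}

# safe_rm records each target and deletes only when DRY_RUN is 0.
safe_rm() {
    for target in "$@"; do
        if [ "$DRY_RUN" -eq 0 ]; then
            rm -rf "$target"
        else
            # Log the working directory too, since relative names depend on it.
            printf 'would remove: %s (cwd: %s)\n' "$target" "$PWD" >> "$LOG_FILE"
        fi
    done
}
```

With the default DRY_RUN=1, a call such as safe_rm "$oldest_file" only appends a line to the log. Once the log looks sane, running the script with DRY_RUN=0 enables the real deletion.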

Never, ever, run a script that deletes files (especially as root) without testing first and confirming the script is working as intended.

Edit: Note that ZB recommends a similar approach in his reply above.

I was not suggesting that the script might have reinstalled anything, but enquiring about the possibility that somebody else or some other process could have been installing or reinstalling something in /usr/bin after the script had previously removed the directory (perhaps even as a response to it, for example some form of configuration management).

Well, I learned that lesson the hard way.

NEVER DELETE ANYTHING AS SUPERUSER WITHOUT TESTING IT OUT FIRST!!

And thanks for the tip.

Yes, if you follow that rule, you will have a much happier life as a system admin.

Also, don't forget an equally important rule:

NEVER WORK ON PRODUCTION (CRITICAL) SYSTEMS AS SUPERUSER WITHOUT A FULL, CURRENT OFF-SERVER BACKUP IN PLACE.

I'm not sure if any such provision exists.

Those files may not have been deleted because they were in use by some other process. That's the conclusion that I have reached.

The script went ahead and deleted everything in /usr, as it selected that as the oldest entry, as you said earlier. Some files were not deleted because they were in use.

So some files were left behind and most files got deleted.

So maybe the rm command stopped when it tried to remove itself from /usr/bin or stopped working correctly after it removed itself from /usr/bin...

You said three files starting with "r" were left in /usr/bin; remsh was one of them but you didn't say whether rm was one of them.

Just another tip, as you have probably heard "have backups and test thoroughly when working as root" often enough by now for it to hit home:

If your "/usr/bin/ls" binary is already missing you can emulate it by using

echo *

and use the wildcard expansion of the shell instead as a makeshift-ls.
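If you want one entry per line and a marker for directories, the same idea can be stretched a little. This is only a sketch, and list_dir is a made-up name; it relies on the shell's own globbing and on echo and [ being built-ins, which I believe is the case in ksh and most other current shells:

```shell
#!/bin/sh
# Sketch: list_dir is a hypothetical stand-in for ls built on shell globbing.
# Like "echo *", it does not show hidden entries.
list_dir() {
    for entry in "${1:-.}"/*; do
        [ -e "$entry" ] || continue     # pattern matched nothing, or dangling link
        if [ -d "$entry" ]; then
            echo "$entry/"              # mark directories with a trailing slash
        else
            echo "$entry"
        fi
    done
}
```

Called without an argument it lists the current directory, and list_dir /usr/bin lists that directory instead.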

I hope this helps.

bakunin

Well, it may be as you said, but I forgot what the other two files were. I just remember that they started with 'r'.

Thanks for the tip. But I have already copied most of the lost files from another server with the same configuration, and it seems to be working.