Handling inodes on Solaris 9

Dear all,
Yesterday I had a big problem on Solaris 9: I could no longer write to /var. I checked the inode usage and saw that /var was at 100% with ifree = 0.
I deleted some unused files (such as old logs in /var/tmp and /var/log); now I have ifree=19641 and 99% iused:

root@ciy01 # df -F ufs -o i
Filesystem             iused   ifree  %iused  Mounted on
/dev/md/dsk/d10        11668  237100     5%   /
/dev/md/dsk/d50        69916  802404     8%   /usr
/dev/md/dsk/d20      2182215   19641    99%   /var
/dev/md/dsk/d40       157429 4348555     3%   /opt
/dev/md/dsk/d60           94  132962     0%   /export/home
/dev/md/dsk/d70           95 8476449     0%   /opt/oradata
root@ciy01 # df -g /var
/var               (/dev/md/dsk/d20   ):         8192 block size          1024 frag size  
36661588 total blocks   30028460 free blocks 29661846 available        2201856 total files
   19182 free files     22282260 filesys id  
     ufs fstype       0x00000004 flag             255 filename length

Honestly, I don't see any other files to delete on /var (there are a ton of files, but all of them are needed). What can I do now?
This is a production machine, so I have to be careful, and I cannot add any more disks. Is there a way to free up more inodes in /var?

BTW, this is what I see in /var:

root@ciy01 var# ls -laibr
total 112
     14131 drwxr-xr-x   3 root     bin          512 Mar 18  2008 yp
     14142 drwxr-xr-x   7 uucp     uucp         512 Mar 18  2008 uucp
     12824 drwxrwxrwt   3 root     sys          512 Jun 29 17:12 tmp
        11 -rwxrwxrwx   1 root     root         387 Mar  3  2010 test.sh
     14166 drwxr-xr-x   4 daemon   daemon       512 Mar 18  2008 statmon
     12789 drwxr-xr-x  12 root     bin          512 Mar 18  2008 spool
     12849 drwxr-xr-x   3 root     sys          512 Mar 18  2008 snmp
     14152 drwxr-xr-x   4 root     bin          512 Mar 18  2008 samba
     12785 drwxr-xr-x   3 root     bin          512 Mar 18  2008 saf
         4 drwxr-xr-x  12 root     sys          512 May 21  2009 sadm
     12784 drwxr-xr-x   7 root     sys          866 Jun 29 02:06 run
     30082 drwxr-xr-x   3 root     root        1536 Jun 29 07:00 reports
     12783 drwxrwxrwt   3 root     bin          512 Mar 31  2008 preserve
      8833 drwxr-xr-x   6 root     sys          512 Mar 25  2008 opt
     14140 drwxr-xr-x   3 root     sys          512 Mar 18  2008 ntp
     14129 drwxr-xr-x   2 root     sys          512 Mar 18  2008 nis
     12827 drwxr-xr-x   2 root     bin          512 Jun 22  2005 nfs
      8832 drwxr-xr-x   2 root     bin          512 Jun 22  2005 news
     22241 drwx------   2 root     root         512 Jun 29 02:50 net-snmp
      8829 drwxrwxrwt   3 root     mail         512 Apr  1  2008 mail
     14126 drwxrwxr-x   3 lp       lp           512 Mar 18  2008 lp
         3 drwx------   2 root     root        8192 Jun 22  2005 lost+found
      8822 drwxr-xr-x   2 root     sys          512 Mar 22  2011 log
     14128 drwxr-xr-x   2 root     sys          512 Jun 22  2005 ldap
      8818 drwxr-xr-x   3 root     bin          512 Mar 18  2008 ld
     12826 drwxr-xr-x   2 root     sys          512 Jun 22  2005 krb5
      8817 drwxr-xr-x   2 root     sys          512 Jun 22  2005 inet
     14138 drwxr-xr-x   3 root     sys          512 Mar 18  2008 imq
     14172 drwxrwxrwt   2 bin      bin          512 Jul 29  2005 home
     14155 drwxr-xr-x   4 root     other        512 Mar 18  2008 gnome
     96297 drwxrwxrwx   4 root     root         512 Jun 29 17:12 getxms
     12828 drwxr-xr-x   6 root     root         512 Jun 29 02:04 dt
     12841 drwxr-xr-x   5 root     sys          512 Mar 18  2008 dmi
      8816 drwxr-xr-x   2 root     sys          512 Nov  2  2009 cron
     14170 drwxr-xr-x   4 root     root         512 Mar 18  2008 crash
   1518456 drwxrwxrwx   2 root     root         512 Jun 29 17:02 checkxms
     12855 drwxr-xr-x   9 root     bin          512 Mar 18  2008 apache
      8070 drwxrwxr-x  10 root     sys         1024 Jun 29 02:56 adm
         2 drwxr-xr-x  32 root     root        1536 Jun 29 17:12 ..
         2 drwxr-xr-x  39 root     sys         1024 Jun 29 11:31 .

Please help, I worked all last night trying to solve this issue!!!! :wall:
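
A directory listing like the one above does not show where the inodes went, because every file, directory and link below each entry uses one. A minimal sketch for ranking the top-level subdirectories by entry count follows. It assumes bash or ksh and runs on a small scratch tree so it is safe to try; on the real system `top` would be /var, and the scratch file names are made up for illustration.

```shell
# Build a small scratch tree that mimics a /var layout (hypothetical contents).
scratch=$(mktemp -d)
top="$scratch/var"
mkdir -p "$top/log" "$top/spool/clientmqueue"
: > "$top/log/syslog"
i=1
while [ "$i" -le 20 ]; do
  : > "$top/spool/clientmqueue/qf$i"
  i=$((i + 1))
done

# For each top-level subdirectory, count everything below it.
# -xdev keeps find from crossing into other mounted file systems.
report=$(
  for d in "$top"/*/; do
    printf '%s %s\n' "$(find "$d" -xdev | wc -l)" "$d"
  done | sort -rn
)
echo "$report"

rm -rf "$scratch"
```

The subdirectory at the top of the report is the one to descend into next; repeating the same loop one level down narrows it to the offending directory.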

---------- Post updated at 11:26 AM ---------- Previous update was at 10:16 AM ----------

=========================================================

Well, I found tons of files under /var/spool/clientmqueue/
If I'm not wrong, this is output from mail. I believe this directory has not been cleaned out for a long time (at least 7 or 8 years).

The problem now is that I cannot delete the files inside it:

root@ciy01 clientmqueue# find /var/spool/clientmqueue/* -exec rm -f {} \;
Segmentation Fault - core dumped

or

# find /var/spool/clientmqueue/* -exec rm -f {} \;
/usr/bin/find: arg list too long

Could someone help me?

Try:

find /var/spool/clientmqueue -exec rm -f {} \;
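
The likely reason the `/*` form fails is that the shell expands the pattern into one argument per file before find even starts, and with this many files the list exceeds the system limit (which `getconf ARG_MAX` should report). Passing the directory itself avoids that. A small sketch of the difference on a scratch directory, assuming bash or ksh:

```shell
scratch=$(mktemp -d)
: > "$scratch/a"; : > "$scratch/b"; : > "$scratch/c"

# Glob form: the shell expands the pattern, so the command gets one argument per file.
set -- "$scratch"/*
glob_args=$#

# Directory form: the command gets a single start path and find reads the directory itself.
set -- "$scratch"
dir_args=$#
found=$(find "$scratch" -type f | wc -l)

echo "glob form passes $glob_args paths, directory form passes $dir_args"
echo "find still sees $found files"

rm -rf "$scratch"
```

With three files the difference is harmless; with well over a million, the glob form cannot even be executed.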

Thanks, I'm using the same command, except for the "-type" clause:

find /var/spool/clientmqueue -type f -exec rm {} \;

It's taking a long time, and I'm curious how many files are in the clientmqueue directory! :rolleyes:

BTW, it seems to be working:

Filesystem             iused   ifree  %iused  Mounted on
/dev/md/dsk/d20      1900704  301152    86%   /var

...it seems roughly 300000 files have been deleted so far :eek:

If you do not want empty directories (if any) lying around and consuming a significant number of inodes, you may want to run the following too:

find /var/spool/clientmqueue -depth -type d -exec rmdir '{}' \;
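
One thing to watch: the start directory itself matches `-type d`, so if the queue ends up completely empty, that command removes /var/spool/clientmqueue as well. A hedged sketch that keeps the top-level directory, shown on a scratch tree (the directory names are hypothetical stand-ins, and bash or ksh is assumed):

```shell
scratch=$(mktemp -d)
queue="$scratch/queue"            # stand-in for /var/spool/clientmqueue
mkdir -p "$queue/empty/nested" "$queue/busy"
: > "$queue/busy/msg"

# Run from inside the directory so the start point is "." and can be excluded by name.
# -depth visits children before parents, so nested empty directories are removed first.
# rmdir refuses to remove non-empty directories; those errors are discarded.
(cd "$queue" && find . -depth -type d ! -name . -exec rmdir {} \; 2>/dev/null)

# What is left: the queue directory itself and the one that still holds a file.
left=$(find "$queue" -type d | wc -l)
echo "directories remaining: $left"

rm -rf "$scratch"
```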

This would have been faster:

find /var/spool/clientmqueue -type f -exec rm -f {} +
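
The speedup comes from how often rm is started: with `\;` find runs one command per file, while with `+` it packs as many file names as fit into each command line. A small sketch that counts the invocations on a scratch directory (assuming bash or ksh; `sh -c 'echo run'` stands in for rm so nothing is deleted by the test itself):

```shell
scratch=$(mktemp -d)
i=1
while [ "$i" -le 50 ]; do
  : > "$scratch/f$i"
  i=$((i + 1))
done

# Each invocation prints one line, so the line count is the number of commands started.
per_file=$(find "$scratch" -type f -exec sh -c 'echo run' sh {} \; | wc -l)
batched=$(find "$scratch" -type f -exec sh -c 'echo run' sh {} + | wc -l)

echo "with \\; : $per_file invocations"
echo "with +  : $batched invocations"

rm -rf "$scratch"
```

For 1.8 million files that is the difference between 1.8 million fork/exec cycles and a few hundred or thousand, depending on the argument size limit.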

Does find on Solaris support + ? I thought that was a GNU-only feature.

That find functionality first appeared in SVR4 (released late 1988/early 1989). Solaris does indeed support it. HP-UX has had it for a long time as well. All of the BSD finds support it. POSIX included it in their 2004 specification.

It's discussed at the end of the rationale section and mentioned in the change history of the POSIX find specification.

Regards,
Alister

It was actually first implemented by David Korn in 1987 and was included in SVR4.0 / Solaris 2.0 which was jointly developed by AT&T and Sun Microsystems.

GNU implemented it later; before that, it only supported Dan Bernstein's -print0 | xargs -0 alternative.

Thanks, I finally got the total number of files in the clientmqueue directory. There are 1.800.000 files :eek: ... And deleting them is taking a lot of time (my command has been running for 9 hours and has deleted "only" 1.000.000 files!!!!)

I'm trying your command and I hope it's faster than mine! :wink:

---------- Post updated at 11:55 AM ---------- Previous update was at 02:32 AM ----------

Well, I finally deleted all 1.800.000 files successfully!!!!
But how can I stop new ones from being created? Probably by stopping the sendmail service (also at boot time)?

I tried without success:

# svcs sendmail 
bash: svcs: command not found
# service sendmail status
bash: service: command not found
# svcadm disable sendmail
bash: svcadm: command not found

svcs and svcadm are only available in Solaris 10 and newer. Anyway, that's not the proper approach. These mails are not created by the sendmail service but by applications wishing to send mail. The fact that they stay there is caused either by an improperly configured or disabled sendmail or, more likely, by a local recipient who just doesn't read them. Have a look at the mail contents to see what is creating them. A common root cause is cron jobs producing output. A simple workaround is redirecting their stdout and stderr to /dev/null.

You're right, the mail was created by some scripts in crontab. How can I redirect all the scripts' output to /dev/null?

This is what I see on the mails:

root@ciy01 clientmqueue# cat dfq5UM7EWH024848
Your "cron" job on ciy01
/var/opt/getstat.sh

produced the following output:

Local directory now /var/getstat

And also:

root@ciy01 clientmqueue# cat qfq5UM6Dtm024456
V6
T1341093973
K1341093974
N1
P30237
MDeferred: Connection refused by [127.0.0.1]
Fbs
$_root@localhost
${daemon_flags}c u
Sroot
Aroot@ciy01.
MDeferred: Connection refused by [127.0.0.1]
C:root
rRFC822; root@ciy01.
RPFD:root
H?P?Return-Path: <g>
H??Received: (from root@localhost)
        by ciy01. (8.12.10+Sun/8.12.9/Submit) id q5UM6Dtm024456
        for root; Sun, 1 Jul 2012 00:06:13 +0200 (CEST)
H?D?Date: Sun, 1 Jul 2012 00:06:13 +0200 (CEST)
H?F?From: Super-User <root>
H?x?Full-Name: Super-User
H?M?Message-Id: <201206302206.q5UM6Dtm024456@ciy01.>
H??To: root
H??Subject: Output from "cron" command
H??Content-Type: text
H??Content-Length: 115
.
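
With this many queued messages it helps to summarise which cron commands produced them rather than reading files one by one. A minimal sketch, assuming every df* body has the layout shown above with the command on its second line. It runs on fake files in a scratch directory; the file names and the second script path are made up. On Solaris the `FNR` variable probably needs nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk.

```shell
scratch=$(mktemp -d)
queue="$scratch/clientmqueue"     # stand-in for /var/spool/clientmqueue
mkdir -p "$queue"

# Three fake message bodies; two come from getstat.sh, one from a hypothetical other job.
for n in 1 2 3; do
  if [ "$n" -eq 3 ]; then job=/opt/example/other.sh; else job=/var/opt/getstat.sh; fi
  printf 'Your "cron" job on ciy01\n%s\n\nproduced the following output:\n\nsome output\n' "$job" \
    > "$queue/dfTEST$n"
done

# Line 2 of every df* file is the command cron ran; count how often each one appears.
summary=$(find "$queue" -type f -name 'df*' -exec awk 'FNR == 2' {} + | sort | uniq -c | sort -rn)
echo "$summary"

rm -rf "$scratch"
```

The commands at the top of the summary are the cron entries worth silencing first.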

Either add >/dev/null 2>&1 at the end of the lines calling /var/opt/getstat.sh in root's crontab:

EDITOR=vi crontab -e

or add the line exec >/dev/null 2>&1 once in the /var/opt/getstat.sh script, before it produces any output.
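
A small sketch of the second option, using a throwaway script in a scratch directory as a stand-in for /var/opt/getstat.sh (bash or ksh assumed). For the first option, the crontab line would look something like `0 * * * * /var/opt/getstat.sh >/dev/null 2>&1`, where the schedule fields are only an illustration.

```shell
scratch=$(mktemp -d)

# Hypothetical stand-in for /var/opt/getstat.sh.
cat > "$scratch/job.sh" <<'EOF'
#!/bin/sh
exec >/dev/null 2>&1      # from here on, stdout and stderr are discarded
echo "Local directory now /var/getstat"
echo "some warning" >&2
EOF

# Capture everything the script writes, the way cron would before mailing it.
captured=$(sh "$scratch/job.sh" 2>&1)
echo "captured ${#captured} characters"

rm -rf "$scratch"
```

Since cron only sends mail when a job produces output, an empty capture means nothing new lands in the queue. The trade-off is that real error messages are discarded too, so redirecting to a log file instead of /dev/null may be preferable for jobs that matter.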

Thank you, finally there's no more mail on spool directory. :b: