Solaris 10 svcs failures

Upon rebooting the Solaris 10 system, all the services went offline or uninitialized. If I break the SVM mirror and reboot the system from the raw device, all services come up. Once I recreate a fresh mirror (metadevices) and reboot, they go offline again. I need to run svcadm clear <service> to bring those services online, and this happens again on every reboot. What is the cause and how do I fix it? :mad:

bash-3.00# svcs -xv
svc:/system/filesystem/local:default (local file system mounts)
 State: maintenance since Wed Nov 18 17:05:09 2009
Reason: Start method exited with $SMF_EXIT_ERR_FATAL.
   See: http://sun.com/msg/SMF-8000-KS
   See: /var/svc/log/system-filesystem-local:default.log
Impact: 39 dependent services are not running:
        svc:/application/psncollector:default
        svc:/system/webconsole:console
        svc:/system/filesystem/autofs:default
        svc:/system/system-log:default
        svc:/application/management/seaport:default
        svc:/application/management/snmpdx:default
        svc:/application/management/dmi:default
        svc:/milestone/multi-user-server:default
        svc:/system/basicreg:default
        svc:/system/zones:default
        svc:/application/management/sma:default
        svc:/system/fpsd:default
        svc:/milestone/multi-user:default
        svc:/application/graphical-login/cde-login:default
        svc:/application/cde-printinfo:default
        svc:/network/smtp:sendmail
        svc:/system/dumpadm:default
        svc:/system/fmd:default
        svc:/network/ssh:default
        svc:/system/sysidtool:net
        svc:/network/rpc/bind:default
        svc:/network/nfs/nlockmgr:default
        svc:/network/nfs/client:default
        svc:/network/nfs/status:default
        svc:/network/nfs/cbd:default
        svc:/network/nfs/mapid:default
        svc:/application/sthwreg:default
        svc:/application/stosreg:default
        svc:/network/inetd:default
        svc:/system/sysidtool:system
        svc:/system/postrun:default
        svc:/system/filesystem/volfs:default
        svc:/platform/sun4u/dscp:default
        svc:/platform/sun4u/sckmd:default
        svc:/system/cron:default
        svc:/application/font/fc-cache:default
        svc:/system/boot-archive-update:default
        svc:/network/shares/group:default
        svc:/system/sac:default

svc:/network/rpc/gss:default (Generic Security Service)
 State: uninitialized since Wed Nov 18 17:05:01 2009
Reason: Restarter svc:/network/inetd:default is not running.
   See: http://sun.com/msg/SMF-8000-5H
   See: man -M /usr/share/man -s 1M gssd
Impact: 19 dependent services are not running:
        svc:/network/nfs/client:default
        svc:/system/filesystem/autofs:default
        svc:/system/webconsole:console
        svc:/system/system-log:default
        svc:/application/management/seaport:default
        svc:/application/management/snmpdx:default
        svc:/application/management/dmi:default
        svc:/milestone/multi-user-server:default
        svc:/system/basicreg:default
        svc:/system/zones:default
        svc:/application/management/sma:default
        svc:/system/fpsd:default
        svc:/milestone/multi-user:default
        svc:/application/graphical-login/cde-login:default
        svc:/application/cde-printinfo:default
        svc:/network/smtp:sendmail
        svc:/system/dumpadm:default
        svc:/system/fmd:default
        svc:/network/ssh:default

svc:/network/rpc/meta:default (SVM remote metaset services)
 State: uninitialized since Wed Nov 18 17:05:01 2009
Reason: Restarter svc:/network/inetd:default is not running.
   See: http://sun.com/msg/SMF-8000-5H
   See: man -M /usr/share/man -s 1M rpc.metad
Impact: 7 dependent services are not running:
        svc:/system/mdmonitor:default
        svc:/milestone/multi-user:default
        svc:/milestone/multi-user-server:default
        svc:/system/basicreg:default
        svc:/system/zones:default
        svc:/application/graphical-login/cde-login:default
        svc:/application/cde-printinfo:default

svc:/network/rpc/rstat:default (kernel statistics server)
 State: uninitialized since Wed Nov 18 17:05:01 2009
Reason: Restarter svc:/network/inetd:default is not running.
   See: http://sun.com/msg/SMF-8000-5H
   See: man -M /usr/share/man -s 1M rpc.rstatd
   See: man -M /usr/share/man -s 1M rstatd
Impact: 1 dependent service is not running:
        svc:/application/management/sma:default

svc:/network/rpc/smserver:default (removable media management)
 State: uninitialized since Wed Nov 18 17:05:02 2009
Reason: Restarter svc:/network/inetd:default is not running.
   See: http://sun.com/msg/SMF-8000-5H
   See: man -M /usr/share/man -s 1M rpc.smserverd
Impact: 1 dependent service is not running:
        svc:/system/filesystem/volfs:default

svc:/platform/sun4u/dcs:default (domain configuration server)
 State: maintenance since Wed Nov 18 17:05:09 2009
Reason: Restarting too quickly.
   See: http://sun.com/msg/SMF-8000-L5
   See: man -M /usr/share/man -s 1M dcs
   See: /var/svc/log/platform-sun4u-dcs:default.log
Impact: This service is not running.
bash-3.00#
bash-3.00# svcadm clear
bash-3.00# svcadm clear svc:/system/filesystem/local:default
bash-3.00# Reading ZFS config: done.
Nov 18 17:13:26 inetd[476]: Property 'user' of instance svc:/network/finger:default is missing, inconsistent or invalid
Nov 18 17:13:26 inetd[476]: Invalid configuration for instance svc:/network/finger:default, placing in maintenance
syslogd: line 22: WARNING: syslogsrvl could not be resolved
syslogd: line 23: WARNING: syslogsrv2 could not be resolved
syslogd: line 24: WARNING: secsrv1 could not be resolved
syslogd: line 25: WARNING: secsrv2 could not be resolved
Nov 18 17:02:34 ITL-X1CIGPX-001 rpc.metad: [ID 702911 daemon.error] Terminated
Nov 18 17:02:34 ITL-X1CIGPX-001 rpcbind: [ID 564983 daemon.error] rpcbind terminating on signal.
Nov 18 17:05:09 ITL-X1CIGPX-001 svc.startd[7]: [ID 748625 daemon.error] system/filesystem/local:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)
Nov 18 17:05:09 ITL-X1CIGPX-001 dcs: [ID 468488 daemon.error] <293> network initialization failed
Nov 18 17:05:09 ITL-X1CIGPX-001 dcs: [ID 468648 daemon.error] <298> network initialization failed
Nov 18 17:05:09 ITL-X1CIGPX-001 dcs: [ID 589038 daemon.error] <303> network initialization failed
Nov 18 17:05:09 ITL-X1CIGPX-001 dcs: [ID 588975 daemon.error] <311> network initialization failed
Nov 18 17:05:09 ITL-X1CIGPX-001 dcs: [ID 589200 daemon.error] <328> network initialization failed
Nov 18 17:05:09 ITL-X1CIGPX-001 dcs: [ID 589041 daemon.error] <333> network initialization failed
Nov 18 17:05:09 ITL-X1CIGPX-001 dcs: [ID 589201 daemon.error] <338> network initialization failed
Nov 18 17:05:09 ITL-X1CIGPX-001 dcs: [ID 589236 daemon.error] <369> network initialization failed
Nov 18 17:05:09 ITL-X1CIGPX-001 dcs: [ID 589077 daemon.error] <374> network initialization failed
Nov 18 17:05:09 ITL-X1CIGPX-001 svc.startd[7]: [ID 748625 daemon.error] platform/sun4u/dcs:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Nov 18 17:13:26 ITL-X1CIGPX-001 inetd[476]: [ID 702911 daemon.error] Property 'user' of instance svc:/network/finger:default is missing, inconsistent or invalid
Nov 18 17:13:26 ITL-X1CIGPX-001 inetd[476]: [ID 702911 daemon.error] Invalid configuration for instance svc:/network/finger:default, placing in maintenance
Nov 18 17:13:26 ITL-X1CIGPX-001 metadevadm: [ID 366239 daemon.error] Disk movement detected
Nov 18 17:13:26 ITL-X1CIGPX-001 metadevadm: [ID 947512 daemon.error] Updating device names in Solaris Volume Manager
Nov 18 17:13:28 ITL-X1CIGPX-001 sendmail[683]: My unqualified host name (ITL-X1CIGPX-001) unknown; sleeping for retry
Nov 18 17:13:28 ITL-X1CIGPX-001 sendmail[684]: My unqualified host name (ITL-X1CIGPX-001) unknown; sleeping for retry

bash-3.00#
bash-3.00# svcs -xv
svc:/platform/sun4u/dscp:default (DSCP Service)
 State: offline since Wed Nov 18 17:13:25 2009
Reason: Start method is running.
   See: http://sun.com/msg/SMF-8000-C4
   See: man -M /usr/share/man -s 1M prtdscp
   See: /var/svc/log/platform-sun4u-dscp:default.log
Impact: 2 dependent services are not running:
        svc:/system/fmd:default
        svc:/system/fpsd:default

svc:/application/print/server:default (LP print server)
 State: disabled since Wed Nov 18 17:05:01 2009
Reason: Disabled by an administrator.
   See: http://sun.com/msg/SMF-8000-05
   See: man -M /usr/share/man -s 1M lpsched
Impact: 1 dependent service is not running:
        svc:/application/print/rfc1179:default

svc:/system/webconsole:console (java web console)
 State: offline since Wed Nov 18 17:13:27 2009
Reason: Start method is running.
   See: http://sun.com/msg/SMF-8000-C4
   See: man -M /usr/share/man -s 1M smcwebserver
   See: /var/svc/log/system-webconsole:console.log
Impact: This service is not running.

svc:/network/finger:default (finger)
 State: maintenance since Wed Nov 18 17:13:26 2009
Reason: Restarter svc:/network/inetd:default gave no explanation.
   See: http://sun.com/msg/SMF-8000-9C
   See: man -M /usr/share/man -s 1M in.fingerd
   See: man -M /usr/share/man -s 1M fingerd
Impact: This service is not running.

svc:/platform/sun4u/dcs:default (domain configuration server)
 State: maintenance since Wed Nov 18 17:05:09 2009
Reason: Restarting too quickly.
   See: http://sun.com/msg/SMF-8000-L5
   See: man -M /usr/share/man -s 1M dcs
   See: /var/svc/log/platform-sun4u-dcs:default.log
Impact: This service is not running.
bash-3.00# svcs ssh
STATE          STIME    FMRI
online         17:13:27 svc:/network/ssh:default
bash-3.00# svcs ftp
STATE          STIME    FMRI
online         17:13:26 svc:/network/ftp:default
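
In the first listing above, only filesystem/local and dcs are actually in maintenance; the rpc services are uninitialized because their restarter (inetd) is not running. A minimal sketch for pulling just the maintenance entries out of `svcs -xv` output follows. The function name `maintenance_fmris` is made up for illustration, and it assumes the output layout shown in this thread (service header in column 1, indented dependents under "Impact:").

```shell
#!/bin/sh
# Hypothetical helper: list FMRIs that `svcs -xv` reports in maintenance.
# Reads svcs -xv output on stdin. Dependents listed under "Impact:" are
# indented, so only lines starting with "svc:/" in column 1 count as headers.
maintenance_fmris() {
    awk '
        /^svc:\// { fmri = $1; next }          # remember the service header
        /^ *State: maintenance/ { print fmri } # report it if in maintenance
    '
}

# Intended use on the affected host (not run here):
#   svcs -xv | maintenance_fmris
```

Each FMRI it prints is a candidate for `svcadm clear`, but clearing only hides the symptom until the next reboot; the log of the first failing service is where the cause should show up.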

Can you post the /etc/system file? Maybe something is missing there. Also, the vfstab with and without metadevices and the metadevice configuration might be useful...

All the basic checks have been done. Nothing is wrong. I'm well experienced in SVM, so no worries. I have 10 other machines of the same type and this doesn't happen on them. It's not really a problem with the SVM portion.

OK, I'll stop worrying...

It's time to migrate to ZFS ...

We are not going to do that for now. We will need to fix this.

DukeNude --> please do worry about solving this. I think you misunderstood my statement :slight_smile:

Migrating to ZFS would make the problem disappear, so it would be a kind of fix.
I doubt it can be anything other than an SVM configuration or mirror corruption issue.

Really, there is no problem with the SVM mirror. I already checked the filesystem and tried newfs, ufsdump, recreating the mirror, etc., so there is no more chance of that.

Is your mirror bootable? Is it SPARC or x86?

It's a SPARC M3000. Yes, the mirror does boot up to run level 3 and the login prompt appears. Metadevice state is Okay.

---------- Post updated at 11:24 PM ---------- Previous update was at 10:43 PM ----------

The following describes exactly the problem I faced. But to let you know, the log file does not show any useful info, nor do I see any symptoms of disk failure (even though I tried with the 2 other spare disks).

bash-3.00# iostat -En |grep Hard
c0t1d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
c0t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
c0t2d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
c0t3d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
c0t4d0           Soft Errors: 8 Hard Errors: 0 Transport Errors: 0
c3t6001438005DE95970000600001EB0000d0 Soft Errors: 1 Hard Errors: 0 Transport Errors: 0

Solaris[TM] 10 Operating System: Local filesystem issue may cause no network services on reboot

--------------------------------------------------------------------------------

Symptoms


On systems running the Solaris[TM] 10 Operating System (S10), local filesystem issues during a boot/reboot may prevent network services, such as inetd or sshd, from starting.


During a bootup/reboot, messages of this type may appear:

SunOS Release 5.10 Version Generic_118822-25 64-bit
Copyright 1983-2005 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Hostname: netlab36
SUNW,eri0 : 100 Mbps full duplex link up
checking ufs filesystems
/dev/rdsk/c1t1d0s0: is logging.
svc:/system/filesystem/local:default: WARNING: /sbin/mountall -l failed: exit status 1
May 26 21:02:42 svc.startd[7]: svc:/system/filesystem/local:default: Method "/lib/svc/method/fs-local" failed with exit status 95.
[ system/filesystem/local:default failed fatally (see 'svcs -x' for details) ]

You will need to be connected to the system via console or serial port to see any boot errors, since network services are disabled (headless systems).



Resolution


At least four items may be observed during this issue:


1. 'svcs -x' sees issue with: svc:/system/filesystem/local:default (local file system mounts)

2. 'inetd' not running

3. 'sshd' not running

4. multiuser milestone is not completed

If #1 is true, then look at /var/svc/log/system-filesystem-local:default.log and determine why one of the local filesystems cannot be mounted. Some possibilities:

filesystem requires fsck
cannot get to filesystem (i.e. no longer available)
bad disk
cannot get to disk due to bad HBA
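
One way to check the "cannot get to filesystem" case from the list above is to compare /etc/vfstab against the device nodes that actually exist after boot. This is only a sketch under stated assumptions: the function name `missing_vfstab_devices` is made up, it looks only at ufs entries marked mount-at-boot "yes", and it is written for bash (the shell used in this thread) because it relies on `[ -e ]`.

```shell
#!/bin/bash
# Hypothetical helper: print "device mountpoint" for every ufs entry that is
# mounted at boot but whose device node does not exist.
# Argument: path to a vfstab-format file (defaults to /etc/vfstab).
# vfstab fields: device-to-mount device-to-fsck mount-point FS-type
#                fsck-pass mount-at-boot mount-options
missing_vfstab_devices() {
    vfstab=${1:-/etc/vfstab}
    while read dev fsckdev mnt fstype pass atboot opts; do
        case "$dev" in
            ''|'#'*) continue ;;           # skip blank lines and comments
        esac
        [ "$fstype" = "ufs" ] || continue  # local UFS filesystems only
        [ "$atboot" = "yes" ] || continue  # only entries mounted at boot
        if [ ! -e "$dev" ]; then
            echo "$dev $mnt"
        fi
    done < "$vfstab"
}
```

An empty result does not rule out the other possibilities (a filesystem that needs fsck, or a failing disk or HBA); it only covers entries pointing at devices that are not there.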

Any help pls :o

What do the following report:

who -r 

and

svcs -a | grep network

And is this SPARC or x86 Solaris?

It's a SPARC. Network is uninitialized. Run level is 3.

Can you get to the GUI, or are you stuck in the CLI? What does this say:

cat /etc/system

And did you try a reconfiguration boot from OBP:

boot -r 

And if you have the GUI and everything is OK, but just the network is disabled, try

netservices open 

in the UNIX CLI.

I have tried everything you said, except for netservices open.
Is this a Solaris command?

yep it is

netservices limited disables all network services and leaves only SSH enabled

http://hub.opensolaris.org/bin/view/Community+Group+security/sbd

This command works in Solaris 10.

what does the following log file say:
/var/svc/log/system-filesystem-local:default.log

I had the same problem and it was due to zfs mount -a failing. It may not be the same problem you are having, since you said you are not using ZFS, but the log should show the error you are hitting, which may lead you in the right direction.
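
A rough sketch for pulling the likely failure lines out of that log follows. The function name `smf_log_errors` and the keyword list are assumptions, not anything SMF defines, and it expects a grep that supports -E (on Solaris 10 that would be /usr/xpg4/bin/grep rather than /usr/bin/grep).

```shell
#!/bin/sh
# Hypothetical helper: show the last 20 lines of an SMF service log that
# mention a failure. The keywords are a guess at what such lines contain.
# Argument: log path (defaults to the filesystem/local log from this thread).
smf_log_errors() {
    log=${1:-/var/svc/log/system-filesystem-local:default.log}
    grep -i -E 'fail|error|cannot|warning' "$log" | tail -20
}
```

If this prints nothing, read the whole log around the time of the last boot, since the failing command may word its message differently.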