I have two Solaris 10 T2000 systems.
Platform sun8 has newer firmware than sun7.
sun8/user$ prtdiag -v | grep OBP
OBP 4.30.4.b 2010/07/09 13:48
sun7/user$ prtdiag -v | grep OBP
OBP 4.30.4.a 2010/01/06 14:56
On sun8, the platform with the newer firmware (OBP 4.30.4.b), the Fault Management service (fmd) flips between online and offline every few seconds. The asterisk after "offline" marks an instance that is in transition.
sun8/user$ svcs fmd
STATE STIME FMRI
online 14:40:52 svc:/system/fmd:default
sun8/user$ svcs fmd
STATE STIME FMRI
offline* 14:40:55 svc:/system/fmd:default
sun8/user$ svcs fmd
STATE STIME FMRI
online 14:41:01 svc:/system/fmd:default
sun8/user$ svcs fmd
STATE STIME FMRI
offline* 14:41:04 svc:/system/fmd:default
The services that fmd depends on are all online.
sun8/user$ svcs -d fmd
STATE STIME FMRI
online Feb_02 svc:/system/filesystem/minimal:default
online Feb_02 svc:/system/sysevent:default
online Feb_02 svc:/network/rpc/bind:default
online Feb_02 svc:/system/dumpadm:default
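Neither svcs -xv nor the service log shows an actual error. The start method exits with status 0, and SMF then restarts the service because all of its processes have exited.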
sun8/user$ svcs -xv fmd
svc:/system/fmd:default (Solaris Fault Manager)
State: offline since Thu Feb 03 14:45:27 2011
Reason: Start method is running.
See: http://sun.com/msg/SMF-8000-C4
See: man -M /usr/share/man -s 1M fmd
See: /var/svc/log/system-fmd:default.log
Impact: This service is not running.
sun8/user$ tail /var/svc/log/system-fmd:default.log
[ Feb 3 14:45:09 Executing start method ("/usr/lib/fm/fmd/fmd") ]
[ Feb 3 14:45:15 Method "start" exited with status 0 ]
[ Feb 3 14:45:18 Stopping because all processes in service exited. ]
[ Feb 3 14:45:18 Executing stop method (:kill) ]
[ Feb 3 14:45:18 Executing start method ("/usr/lib/fm/fmd/fmd") ]
[ Feb 3 14:45:24 Method "start" exited with status 0 ]
[ Feb 3 14:45:27 Stopping because all processes in service exited. ]
[ Feb 3 14:45:27 Executing stop method (:kill) ]
[ Feb 3 14:45:27 Executing start method ("/usr/lib/fm/fmd/fmd") ]
[ Feb 3 14:45:33 Method "start" exited with status 0 ]
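To put a number on the restart loop, the log can be summarized with a small shell helper. This is only a sketch: the function name count_fmd_restarts is made up for illustration, and the default path is simply the log file quoted above.

```shell
# Hypothetical helper: summarize restart activity in an SMF service log.
# Counts how often the start method was launched and how often SMF
# reported that every process in the service had exited.
count_fmd_restarts() {
    # Default to the fmd log quoted above; any other log can be passed as $1.
    log="${1:-/var/svc/log/system-fmd:default.log}"
    starts=`grep -c 'Executing start method' "$log"`
    exits=`grep -c 'Stopping because all processes in service exited' "$log"`
    echo "starts=$starts exits=$exits"
}
```

Calling count_fmd_restarts with no argument reads the fmd log. Two counts that keep rising together would match the pattern seen here, where the daemon starts cleanly and then exits on its own.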
Steps already taken:
- upgraded firmware (the box had the same problem with the older firmware)
- disabled and enabled the service
- rebooted the box
A Google search turns up many reports of this problem, including an identical report on the OpenSolaris site, which indicated that it could not be reproduced.
What should I look at next to narrow down the issue?