Start /SYS on SUN SPARC does not start machine [SUN SPARC ENTERPRISE T-5240]

-> start /SYS

Are you sure you want to start /SYS (y/n)? y
Starting /SYS

-> show HOST

 /HOST
    Targets:
        bootmode
        diag
        domain

    Properties:
        autorestart = reset
        autorunonerror = false
        bootfailrecovery = poweroff
        bootrestart = none
        boottimeout = 0
        hypervisor_version = Hypervisor 1.7.4.a 2009/09/21 08:25
        macaddress = 00:21:28:76:4d:cc
        maxbootfail = 3
        obp_version = OBP 4.30.4 2009/08/19 07:25
        post_version = POST 4.30.4 2009/08/19 07:50
        send_break_action = (none)
        status = Powered off
        sysfw_version = Sun System Firmware 7.2.4.e 2009/09/21 09:50

    Commands:
        cd
        set
        show

Furthermore, the command below hangs the machine:

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y

Serial console started.  To stop, type #.

Hi,

What user account are you using to log on to the Service Processor? Also, remember that it can take some time with these servers before you see any output.

Regards

Gull04

When I press the power button on the machine, it loads and drops me at a prompt where I supply the root user name and password.

OK,

Then you can try the following steps.

-> set /SYS keyswitch_state=diag
-> start /SYS
-> start /SP/console

Please note any error messages.

After a while the system should come to the ok prompt.

You should then drop back to the Service Processor using #. and run the following command.

-> show /SP/faultmgmt -level all

Post any output here and please use the code tags to format it correctly.

Regards

Gull04

-> set /SYS keyswitch_state=diag
Set 'keyswitch_state' to 'diag'

-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y

Serial console started.  To stop, type #.

Serial console stopped.

-> show /SP/faultmgmt -level all

  /SP/faultmgmt
    Targets:

    Properties:

    Commands:
        cd
        show

Hi,

Can you post the output from a;

-> stop /SYS

Wait a couple of minutes and then;

-> start /SYS
-> start /SP/console

And show any output. Please use the code tags; these are used by clicking on the </> symbol above this window. This is a forum rule, and you must use them to help with the readability of your posts.

Regards

Gull04

-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y

Serial console started.  To stop, type #.

The console command appears to 'hang the machine' as you say because the main chassis is powered off. Only the service processor is running.

At some point you need to give the service processor the command:

power on

or, on other SPARC models:

poweron

before the main chassis will light up. There's no mystery about that but, right now, I can't figure out what boot stage you are at.
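
To check this from captured output rather than by eye, here is a minimal Python sketch. It assumes the text of `show /HOST` or `show /SYS` has been saved as a string; the function names are hypothetical and not part of ILOM. It only reads the `status` and `power_state` properties that appear in the outputs posted in this thread.

```python
import re

def parse_ilom_properties(show_output: str) -> dict:
    """Collect 'name = value' lines from captured ILOM `show` output."""
    props = {}
    for line in show_output.splitlines():
        match = re.match(r"\s*(\w+)\s*=\s*(.*\S)\s*$", line)
        if match:
            props[match.group(1)] = match.group(2)
    return props

def host_is_powered_off(props: dict) -> bool:
    """True if the captured properties report the main chassis as off."""
    # /SYS reports power_state, /HOST reports status (as seen in this thread).
    state = props.get("power_state") or props.get("status") or ""
    return state.lower() in ("off", "powered off")

# Sample taken from the `show HOST` output posted earlier.
sample = """\
    Properties:
        autorestart = reset
        maxbootfail = 3
        status = Powered off
"""
props = parse_ilom_properties(sample)
print(host_is_powered_off(props))
```

If this reports the host as off, a silent console is expected: there is nothing running on the host side to print anything.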

Hi,

OK, there may be no option here but to go for the nuclear option on this, so you can try the following;

-> stop -force /SYS

Failing that you could try;

-> stop -force -script /SYS

Please note that this will force a complete stop of the system, then you can run;

-> start /SYS
-> start /SP/console

Show any output; this should go through the POST for the server, so there may be quite a bit of output.

Regards

Gull04

--- Post updated at 11:22 AM ---

Hi Folks,

I was maybe being a bit naive here: I assumed that the server had been powered on and running and simply wouldn't restart.

If the machine has been forced off from the front panel, you will, as hicksd8 says, have to issue a power-on command.

Regards

Gull04

-> power on
Invalid command 'power' - type help for a list of commands.

-> poweron
Invalid command 'poweron' - type help for a list of commands.

Hi,

You may have to set the "bootfailrecovery" to "Powercycle" and increase the "maxbootfail" from "3".

Also you could go for the simple approach and start the "console" and press the power button on the front panel if you can get to it (It's recessed so you need to use a Biro to reach it).

Regards

Gull04

-> show HOST

 /HOST
    Targets:
        bootmode
        diag
        domain

    Properties:
        autorestart = reset
        autorunonerror = false
        bootfailrecovery = Powercycle
        bootrestart = none
        boottimeout = 0
        hypervisor_version = Hypervisor 1.7.4.a 2009/09/21 08:25
        macaddress = 00:21:28:76:4d:cc
        maxbootfail = 5
        obp_version = OBP 4.30.4 2009/08/19 07:25
        post_version = POST 4.30.4 2009/08/19 07:50
        send_break_action = (none)
        status = Powered off
        sysfw_version = Sun System Firmware 7.2.4.e 2009/09/21 09:50

    Commands:
        cd
        set
        show

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y

Serial console started.  To stop, type #.

1- I have set "bootfailrecovery" to "Powercycle" and increased "maxbootfail" from "3" to "5".

2- Ran the command start /SP/console
3- Pressed the power button after step 2
But no success.

Hi,

It might be worth waiting for a while at the end of the sequence in my previous post.

Can you post the output of;

show /SYS

You could also try a;

reset /SYS

Lastly you could possibly try;

-> start -script /SYS

This should force a powercycle if the reset command doesn't work. I'm not sure whether the -force switch works with start, and I don't have any boxes that old to try it on.

Regards

Gull04

@gull04

-> show /SYS

 /SYS
    Targets:
        SERVICE
        LOCATE
        ACT
        PS_FAULT
        TEMP_FAULT
        FAN_FAULT
        MB
        HDD0
        HDD1
        HDD2
        HDD3
        HDD4
        HDD5
        HDD6
        HDD7
        PDB
        PADCRD
        SASBP
        DVD
        TTYA
        USBBD
        FANBD0
        FANBD1
        VPS

    Properties:
        type = Host System
        ipmi_name = /SYS
        keyswitch_state = Diag
        product_name = T5240
        product_part_number = 602-0000-00
        product_serial_number = XXXXXXXXXXXXXXXXXXX
        product_manufacturer = SUN MICROSYSTEMS
        fault_state = OK
        power_state = Off

    Commands:
        cd
        reset
        set
        show
        start
        stop

-> reset /SYS
Are you sure you want to reset /SYS (y/n)? y
Performing reset on /SYS
Performing reset on /SYS failed
reset: Target already stopped

-> start -script /SYS
Starting /SYS

-> start -script -force /SYS
Starting /SYS

Same status / No success

Hi,

What have you got in the logs? Can you post the output of;

show /SP/logs/event/list

Regards

Gull04

Can you hear the main chassis fans running? These would be quite noisy. If not then I suspect that the output you have posted:

is telling the truth and that is the issue. Perhaps you need to re-seat the power plug on the motherboard.

What is the history of this box? Was it running and just stopped? Is it a new install? Or what?

Next time you do a:

-> start /SYS
-> start /SP/console

and get no response from console output can you type CTRL-Q just to ensure that the console serial port is not X-OFF'd.

Just a thought.
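
For anyone wondering why CTRL-Q helps: in software (XON/XOFF) flow control, CTRL-S sends the XOFF byte that pauses output and CTRL-Q sends the XON byte that resumes it. A small Python sketch of the byte values involved; `control_code` is just an illustrative helper name.

```python
def control_code(letter: str) -> int:
    """Return the ASCII control code produced by Ctrl+<letter>."""
    # A control character is the letter's code with the top bits masked off.
    return ord(letter.upper()) & 0x1F

XON = control_code("Q")   # DC1: resume transmission
XOFF = control_code("S")  # DC3: pause transmission

print(hex(XON), hex(XOFF))
```

So if the console session received a stray XOFF at some point, it will look frozen until an XON arrives, which is what typing CTRL-Q sends.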

I had a discussion with the datacenter guy who has been supplying me with the command outputs you asked for yesterday. The system was actually running fine and seems to have suddenly powered off. This is what he (the datacenter guy) does first whenever I ask him for any command output.

He runs show faulty:

-> show faulty
Target              | Property               | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/MB
/SP/faultmgmt/0/    | sunw-msg-id            | SUN4V-8001-MR
 faults/0           |                        |
/SP/faultmgmt/0/    | uuid                   | 16f520a9-8891-4a85-a4a5-c0fe225f
 faults/0           |                        | b36e
/SP/faultmgmt/0/    | timestamp              | Oct 16 19:13:16
 faults/0           |                        |
-> show faulty
Target              | Property               | Value
--------------------+------------------------+---------------------------------

then he runs

set /SYS/MB clear_fault_action=true

then he runs other commands like

start /SYS
start /SP/console
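
Since the fault record is wiped by clear_fault_action, it is worth saving the `show faulty` text first. The table wraps long targets and values onto continuation rows, so below is a rough Python sketch that stitches them back together. It assumes the column layout shown above; `parse_show_faulty` is a hypothetical helper, not an ILOM tool.

```python
def parse_show_faulty(table: str) -> list:
    """Rebuild (target, property, value) rows from wrapped `show faulty` text."""
    rows = []
    for line in table.splitlines():
        # Skip the header, the separator and anything that is not a table row.
        if "|" not in line or line.startswith(("Target", "-")):
            continue
        target, prop, value = (cell.strip() for cell in line.split("|", 2))
        if prop:
            rows.append([target, prop, value])
        elif rows:
            # Empty property column: this row continues the previous one.
            rows[-1][0] += target
            rows[-1][2] += value
    return [tuple(row) for row in rows]

# Sample rows copied from the output posted above.
sample = """\
Target              | Property               | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/MB
/SP/faultmgmt/0/    | sunw-msg-id            | SUN4V-8001-MR
 faults/0           |                        |
/SP/faultmgmt/0/    | uuid                   | 16f520a9-8891-4a85-a4a5-c0fe225f
 faults/0           |                        | b36e
"""
faults = parse_show_faulty(sample)
for row in faults:
    print(row)
```

The rebuilt rows keep the faulted FRU (/SYS/MB) and the message ID (SUN4V-8001-MR) together with the full UUID, which is the information needed to look the fault up later.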

Hi,

Do you see any POST output from;

-> start /SYS
-> start /HOST/console

You should see at the very least the ok prompt.

It may be that you have to set the keyswitch to the diagnostic setting, connect to the console, and try to capture the POST (Power-On Self-Test) output.

Regards

Gull04

Please find the POST output below.

-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS

-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS

-> start /HOST/console
start: Invalid target /HOST/console

-> set /SYS keyswitch_state=diag
Set 'keyswitch_state' to 'diag'

->

U-Boot 1.1.1

custom Sun Microsystems U-Boot 1.3 (Sep 21 2009 - 08:40:30) r48331

CPU:   MPC885ZPnn at 133 MHz: 8 kB I-Cache 8 kB D-Cache FEC present
Board: SPARC885
       Watchdog enabled
I2C:   ready
DRAM:
trying 128 MBytes
(128 MB SDRAM) 128 MB
Memory Tests: DA A1 A2 00

U-Boot 1.1.1

custom Sun Microsystems U-Boot 1.3 (Sep 21 2009 - 08:40:30) r48331

CPU:   MPC885ZPnn at 133 MHz: 8 kB I-Cache 8 kB D-Cache FEC present
Board: SPARC885
       Watchdog enabled
I2C:   ready
DRAM:
trying 128 MBytes
(128 MB SDRAM) 128 MB
FLASH: 32 MB
In:    serial
Out:   serial
Err:   serial
Net:   FEC ETHERNET
POST i2c  c  d 18 20 23 2a 2b 2d 2e 30 40 42 43 44 45 46 51 53 54 56 68 6a 6b 70 71 72 73 PASSED
POST cpu PASSED
POST ethernet PASSED
Booting linux in 5 seconds...
## Booting image at fe080000 ...
   Image Name:   Linux-2.4.22
   Image Type:   PowerPC Linux Kernel Image (gzip compressed)
   Data Size:    814430 Bytes = 795.3 kB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
do_bootm_linux():
  argv[0]=bootm
  argv[1]=0xfe080000
## Current stack ends at 0x07D38B60 => set upper limit to 0x00800000
No initrd
## cmdline at 0x007FFF00 ... 0x007FFF80
memstart    = 0x00000000
memsize     = 0x08000000
flashstart  = 0xFE000000
flashsize   = 0x02000000
flashoffset = 0x0004C000
sramstart   = 0x00000000
sramsize    = 0x00000000
EnvAddr     = 0x7C7D9B78
EnvSize     = 0xB3E9002C
banksize    = 0x02000000
banks       = 0x00000001
bankwidth   = 0x00000001
sectorsize  = 0x00020000
sectorcount = 0x00000100
booted      = 0xFE080000
boottype    = 0x00000000
primary     = 0xFFFFFFFF
pritype     = 0x00000000
secondary   = 0xFFFFFFFF
sectype     = 0x00000000
image0      = 0xFE000000
image1      = 0xFF000000
maximage    = 0x01000000
immr_base   = 0xF0000000
bootflags   = 0x00000001
intfreq     =    133 MHz
busfreq     = 66.500 MHz
ethaddr     = 00:21:28:76:4D:D5
IP addr     = 0.0.0.0
baudrate    =   9600 bps
## Transferring control to Linux (at address 00000000) ...
## parameters(007ffe80,00000000,00000000,007fff00,007fff80)
Linux version 2.4.22 (cb75630@sanpen-rh5-1) (gcc version 3.3.4) #2 Mon Sep 21 08:36:07 PDT 2009 r48532
On node 0 totalpages: 32768
zone(0): 32768 pages.
zone(1): 0 pages.
zone(2): 0 pages.
Kernel command line: root=/dev/mtdblock4 rootfstype=squashfs ro mtdparts=phys:384K(u1),128k(e1),1536K(k1),14M(r1),384K(u2),128K(e2),1536K(k2),14M(r2)
Decrementer Frequency = 498750000/60
m8xx_wdt: active wdt found (SWTC: 0xFFFF, SWP: 0x1)
m8xx_wdt: keep-alive trigger installed (PITC: 0x1000)
Calibrating delay loop... 132.71 BogoMIPS
Memory: 127624k available (1412k kernel code, 416k data, 68k init, 0k highmem)
Dentry cache hash table entries: 16384 (order: 5, 131072 bytes)
Inode cache hash table entries: 8192 (order: 4, 65536 bytes)
Mount cache hash table entries: 512 (order: 0, 4096 bytes)
Buffer cache hash table entries: 8192 (order: 3, 32768 bytes)
Page-cache hash table entries: 32768 (order: 5, 131072 bytes)
POSIX conformance testing by UNIFIX
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
Starting kswapd
Journalled Block Device driver loaded
squashfs: version 3.0 (2006/03/15) Phillip Lougher
JFFS2 version 2.1. (C) 2001 Red Hat, Inc., designed by Axis Communications AB.
CPM UART driver version 0.03
ttyS00 at 0x0100 is apty: 256 Unix98 ptys configured
Generic RTC Driver v1.07
eth0: FEC ENET Version 0.2, FEC irq 3, MII irq 6, addr 00:21:28:76:4d:d5
RAMDISK driver initialized: 16 RAM disks of 18432K size 1024 blocksize
eth0: Phy @ 0x0, type DM9161 (0x0181b881)
loop: loaded (max 8 devices)
physmap flash device: 2000000 at fe000000
 Amd/Fujitsu Extended Query Table v1.3 at 0x0040
number of CFI chips: 1
Using command line partition definition
Creating 8 MTD partitions on "Physically mapped flash":
mtdblock1: 0x00000000-0x00060000 : "u1"
mtdblock2: 0x00060000-0x00080000 : "e1"
mtdblock3: 0x00080000-0x00200000 : "k1"
mtdblock4: 0x00200000-0x01000000 : "r1"
mtdblock5: 0x01000000-0x01060000 : "u2"
mtdblock6: 0x01060000-0x01080000 : "e2"
mtdblock7: 0x01080000-0x01200000 : "k2"
mtdblock8: 0x01200000-0x02000000 : "r2"
i2c-core.o: i2c core module version 2.6.1 (20010830)
i2c-dev.o: i2c /dev entries driver module version 2.6.1 (20010830)
i2c-rpx.o: i2c MPC8xx module version 2.6.1 (20010830)
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 1024 buckets, 8Kbytes
TCP: Hash tables configured (established 8192 bind 16384)
ip_conntrack version 2.1 (1024 buckets, 8192 max) - 292 bytes per conntrack
ip_tables: (C) 2000-2002 Netfilter core team
ipt_recent v0.3.1: Stephen Frost <sfrost@snowman.net>.  http://snowman.net/projects/ipt_recent/
arp_tables: (C) 2002 David S. Miller
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
All bugs added by David S. Miller <davem@redhat.com>
VFS: Mounted root (squashfs filesystem) readonly.
Freeing unused kernel memory: 68k init
modprobe: modprobe: Can't open dependencies file /lib/modules/2.4.22/modules.dep (No such file or directory)
Creating /var tmpfs
Creating directories in /var...done.
Creating /var/log tmpfs...done.
Activating swap.
Calculating module dependencies... done.
Loading modules...
    fpga
Warning: loading /lib/modules/2.4.22/misc/fpga/fpga.o will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Platform ID      : 2
Minor revision   : 4
Major revision   : 5
FPGA power status : 0xd20008: 80 0xd20009: f4
  (SYS_POK_EN  !DC_POK  !POK_VMEM_CPU1  !POK_VMEM_CPU0  !POK_CORE_CPU1  !POK_CORE_CPU0  !POK_PSU1  !POK_PSU0  POK_VMEMWING_CPU1  !VMEMWING_CPU1_PRESENT  POK_VMEMWING_CPU0  !VMEMWING_CPU0_PRESENT  !POK_OIO )
BR2: 0xf2000000/0x1000000/0xf2000401/0xff0001fc
FPGA init OK: base f2000000, size 1000000 (major 120)
Module fpga loaded, with warnings
    fpga_flash
Warning: loading /lib/modules/2.4.22/misc/fpga_flash/fpga_flash.o will taint the kernel: no license
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Module fpga_flash loaded, with warnings
    immap
Warning: loading /lib/modules/2.4.22/misc/immap/immap.o will taint the kernel: no license
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Module immap loaded, with warnings
All modules loaded.
Checking all file systems...
fsck (busybox 1.9.0, 2009-09-21 08:40:59 PDT)
Setting kernel variables ...
kernel.core_pattern = /coredump/%h.%e.core
kernel.core_uses_pid = 1
net.ipv4.ip_local_port_range = 3100 7075
... done.
Mounting local filesystems...
Identifying DOC Device Type(G3/G4/H3) ...



Warning: /lib/modules/2.4.22/misc/tffs/tffs_h3.o symbol for parameter prio not found
Warning: loading /lib/modules/2.4.22/misc/tffs/tffs_h3.o will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
tffs: TrueFFS driver 7100.76
tffs: will not use IRQ
tffs: Looking for G4/P4 MDOC Devices while assuming IF_CFG=16.
tffs: Looking for G4/P4 MDOC Devices while assuming IF_CFG=8.
tffs: Flow: G4_docWindowBaseAddress Exit (NOT FOUND).
tffs: Looking for G3/P3 MDOC Devices while assuming IF_CFG=16.
tffs: Looking for G3/P3 MDOC Devices while assuming IF_CFG=8.
tffs: Looking for H1 DOC at address 0xcd14e000
tffs: DOCH found
tffs: Socket 0 in addr 0xf4000000
tffs: Device 0x0: size 0x1db00000 HW sector 0x200 (recommended 0x1000)
tffs: use major device number 100
Partition check:
 tffsa: tffsa1 tffsa2 tffsa3
tffs:     disk partition: dev_number=0x6401, 65534 sectors, start_sector=1
tffs:     disk partition: dev_number=0x6402, 65536 sectors, start_sector=65535
tffs:     disk partition: dev_number=0x6403, 112000 sectors, start_sector=131071
Module tffs_h3 loaded, with warnings
Loaded TFFS kernel module /lib/modules/2.4.22/misc/tffs/tffs_h3.o
kjournald starting.  Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS 2.4-0.9.19, 19 August 2002 on tffs(100,1), internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS 2.4-0.9.19, 19 August 2002 on loop(7,0), internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS 2.4-0.9.19, 19 August 2002 on loop(7,1), internal journal
EXT3-fs: loop(7,1): 4 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS 2.4-0.9.19, 19 August 2002 on loop(7,2), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS 2.4-0.9.19, 19 August 2002 on tffs(100,3), internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
Checking DOC size...
Mounted all disk partitions.

Checking configuration files state ...
Image date:  Mon Sep 21 09:58:34 PDT 2009       Image revision: 48331
Conf date:   Mon Sep 21 09:58:34 PDT 2009       Conf revision:  48331
Configuration files state good after upgrade.
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
Loaded IFC configuration file.

FRU probing time 7 sec
Created links for vbsc
Running ldconfig...done
Hostname: rdvcagz19.

Setting the System Clock using the Hardware Clock as reference...
System Clock set. Local time: Wed Oct 16 23:21:07 GMT 2019

Setting up networking...done.
Setting up IP spoofing protection: rp_filter.
Configuring network interfaces...eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
done.
Starting portmap daemon: portmap.
Initializing random number generator...done.
Starting vbsc daemon:   Done
Starting BBR daemon...
/dev/fpga open OK
bbrd started after 2 seconds.
Incrementing bootcount ... done (86)
populating memstore vars from disk
INIT: Entering runlevel: 3

rdvcagz19 login:
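
One line in the log above that may be worth a closer look is the "FPGA power status" flag list. Here is a small Python sketch that splits it into flags printed plainly and flags printed with a leading "!". Reading "!" as "signal not asserted" is an assumption on my part and not something taken from Sun documentation, and this was captured while the host was off, so it may simply be the normal standby picture.

```python
def split_power_flags(flag_line: str):
    """Split the parenthesised flag list into plain and '!'-prefixed names."""
    names = flag_line.strip().strip("()").split()
    asserted = [name for name in names if not name.startswith("!")]
    # Assumption: a leading '!' marks a signal that is not asserted.
    negated = [name[1:] for name in names if name.startswith("!")]
    return asserted, negated

# Flag list copied from the log above.
line = ("(SYS_POK_EN  !DC_POK  !POK_VMEM_CPU1  !POK_VMEM_CPU0  !POK_CORE_CPU1  "
        "!POK_CORE_CPU0  !POK_PSU1  !POK_PSU0  POK_VMEMWING_CPU1  "
        "!VMEMWING_CPU1_PRESENT  POK_VMEMWING_CPU0  !VMEMWING_CPU0_PRESENT  "
        "!POK_OIO )")

asserted, negated = split_power_flags(line)
print("plain:", asserted)
print("with '!':", negated)
```

Comparing this list against the same line captured after a start /SYS attempt would show whether any of the "!" entries, for example the two PSU ones, ever change.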


Hi,
This looks a bit strange to me. Can you log in and show the output of;

uname -a

This is a very old Linux Kernel that is running here and the output seems to be a little strange for a T5240.

Regards

Gull04