dumpadm question

Hello!

I am trying to force a crash dump on a running Solaris system so I can view the crash log.

I'm running:
SunOS my_box 5.10 Generic_138888-02 sun4u sparc SUNW,Sun-Fire-V240

Here's some output to give a better picture of my problem:

# savecore -L
savecore: dedicated dump device required

# dumpadm
Dump content: kernel pages
Dump device: /dev/dsk/c0t0d0s1 (swap)
Savecore directory: /var/crash/my_box
Savecore enabled: yes

I only have 1 disk at the moment, but I heard there were ways to work around this?

I tried following the URL below, but that does not work for me (I get this error):

# dumpadm -c kernel -d /export/home/swap
dumpadm: cannot use /export/home/swap as dump device: Block device required

#3284: How to capture a live system core dump without having a dedicated

I don't know much about ZFS, but I read I could create a ZFS volume and use it as a block device:

# mkfile 4096m /swap-file
# zpool create mypool /swap-file
# zfs create -V 2048m mypool/block
# dumpadm -c kernel -d /dev/zvol/dsk/mypool/block
dump is not supported on device '/dev/zvol/dsk/mypool/block': vdev type 'file' is not supported

Any pointers?

---------- Post updated at 04:59 PM ---------- Previous update was at 04:34 PM ----------

Just got it working!

I just needed to use s2 instead of s1.

Does anyone know why this works?

# dumpadm -c kernel -d /dev/dsk/c0t0d0s2
Dump content: kernel pages
Dump device: /dev/dsk/c0t0d0s2 (dedicated)
Savecore directory: /var/crash/my_box
Savecore enabled: yes

# savecore -L
dumping to /dev/dsk/c0t0d0s2, offset 65536, content: kernel
100% done: 64312 pages dumped, compression ratio 3.48, dump succeeded
System dump time: Sun Apr 19 16:55:53 2009
Constructing namelist /var/crash/my_box/unix.0
Constructing corefile /var/crash/my_box/vmcore.0
100% done: 64312 of 64312 pages saved

---------- Post updated at 05:16 PM ---------- Previous update was at 04:59 PM ----------

Maybe this should be added to another thread, but I thought this was related!

Well now that I have the crash dump, I want to view it!!

I know I can use the mdb command, but I was trying to go through that man page and I was getting lost!

Is there an alternative? Can anyone give me some simple dcmds so I can do SOMETHING with the crash dumps?

Any advice is greatly appreciated!
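For a first look, here is a minimal sketch that feeds a few read-only dcmds to mdb on stdin instead of typing them at the prompt. It assumes the dump files are unix.0 and vmcore.0 under /var/crash/my_box, as in the savecore output above. The function name first_look is made up for illustration; ::status, ::msgbuf, ::ps and ::cpuinfo are standard mdb dcmds, but run ::dcmds on your own system to see what is actually available.

```shell
# first_look [DIR] [N]: run a few read-only dcmds against unix.N / vmcore.N.
# first_look is a hypothetical helper name, not something shipped with Solaris.
first_look() {
    dir=${1:-/var/crash/my_box}   # savecore directory (assumed default)
    n=${2:-0}                     # dump sequence number, e.g. 0 for vmcore.0

    # Feeding dcmds on stdin makes mdb run them in order and exit at EOF.
    mdb "$dir/unix.$n" "$dir/vmcore.$n" <<'EOF'
::status
::msgbuf
::ps
::cpuinfo
EOF
}
```

Calling first_look with no arguments looks at /var/crash/my_box/unix.0 and vmcore.0. ::status summarises the dump, ::msgbuf prints the kernel message buffer, ::ps lists processes and ::cpuinfo shows what each CPU was running.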

Slice 2 is the whole disk; you risk wiping your operating system.

All you have to do is use a slice that is not a swapfs slice, e.g. /var/tmp or make a directory /export/home/dump.

You should never use slice 2. By the way, how much swap space did you assign?
Can you cat /etc/dumpadm.conf?

I hope you have an EFI label on that disk, or at least one with a non-standard Sun layout, because the normal Sun partitioning scheme uses overlapping partitions where, as others have noted, slice 2 overlaps everything on the disk.

If you used slice 2 and the disk is partitioned something like this:

format> verify

Primary label contents:

Volume name = <        >
ascii name  = <DEFAULT cyl 4424 alt 2 hd 255 sec 63>
pcyl        = 4426
ncyl        = 4424
acyl        =    2
bcyl        =    0
nhead       =  255
nsect       =   63
Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm    1047 - 4423       25.87GB    (3377/0/0) 54251505
  1       swap    wu       1 -  523        4.01GB    (523/0/0)   8401995
  2     backup    wm       0 - 4423       33.89GB    (4424/0/0) 71071560
  3 unassigned    wm     524 - 1046        4.01GB    (523/0/0)   8401995
  4 unassigned    wm       0               0         (0/0/0)           0
  5 unassigned    wm       0               0         (0/0/0)           0
  6 unassigned    wm       0               0         (0/0/0)           0
  7 unassigned    wm       0               0         (0/0/0)           0
  8       boot    wu       0 -    0        7.84MB    (1/0/0)       16065
  9 unassigned    wm       0               0         (0/0/0)           0

format> 

You may have just overwritten something important.

Check your partitions before you go any further. If you're lucky, you will have just wiped out some unused swap.
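One way to do that check without reading the table by eye is sketched below. It assumes the usual prtvtoc column order (partition, tag, flags, first sector, sector count, last sector) and skips slice 2, since on a standard Sun label it covers the whole disk anyway. The function name find_overlaps is made up for this example.

```shell
# find_overlaps: read prtvtoc-style output on stdin and report every pair of
# slices whose sector ranges intersect. Hypothetical helper, not a Solaris tool.
find_overlaps() {
    awk '
        BEGIN { n = 0 }
        /^\*/ || NF < 6 { next }      # skip prtvtoc comment/header lines
        $1 == 2         { next }      # slice 2: conventional whole-disk slice
        $5 + 0 == 0     { next }      # ignore zero-length slices
        { id[n] = $1; lo[n] = $4 + 0; hi[n] = $6 + 0; n++ }
        END {
            for (i = 0; i < n; i++)
                for (j = i + 1; j < n; j++)
                    # two ranges intersect when each starts before the other ends
                    if (lo[i] <= hi[j] && lo[j] <= hi[i])
                        printf "slice %s overlaps slice %s\n", id[i], id[j]
        }'
}
```

On the box itself this would be run as something like prtvtoc /dev/rdsk/c0t0d0s2 | find_overlaps. Any line it prints names two slices that share disk space, so writing a dump to one of them also writes over the other.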

Incredible, here is my dumpadm.conf:

DUMPADM_DEVICE=/dev/dsk/c0t0d0s2
DUMPADM_SAVDIR=/var/crash/my_box
DUMPADM_CONTENT=kernel
DUMPADM_ENABLE=yes

I have now taken all of your advice and made slice 3 my dump device:

DUMPADM_DEVICE=/dev/dsk/c0t0d0s3
DUMPADM_SAVDIR=/var/crash/my_box
DUMPADM_CONTENT=kernel
DUMPADM_ENABLE=yes

format> verify
Primary label contents:
Volume name = <        >
ascii name  = <FUJITSU-MAW3073NCSUN72G-1703 cyl 14087 alt 2 hd 24 sec 424>
pcyl        = 14089
ncyl        = 14087
acyl        =     2
nhead       =    24
nsect       =   424
Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm     413 - 14075       66.30GB    (13663/0/0) 139034688
  1       swap    wu       0 -   412        2.00GB    (413/0/0)     4202688
  2     backup    wm       0 - 14086       68.35GB    (14087/0/0) 143349312
  3 unassigned    wm       0 -   412        2.00GB    (413/0/0)     4202688
  4 unassigned    wm       0                0         (0/0/0)             0
  5 unassigned    wm   14076 - 14086       54.66MB    (11/0/0)       111936
  6 unassigned    wm       0                0         (0/0/0)             0
  7 unassigned    wm       0                0         (0/0/0)             0

Thanks for the help, guys.

So if there had been more going on in my system, the crash dump could potentially have exceeded my physical memory and swap space, and would then have started to overwrite data on s2 (which is everything, so technically anything after my swap)?

1 swap wu 0 - 412 2.00GB (413/0/0) 4202688

So any data after cylinder 412 would start to be overwritten by the crash dump?

Is this the reason using slice 2 is not recommended?

Slice 3 is using the same disk space as the swap partition.

You may have got away with the dump going to slice 2 because the whole disk starts with 2 GB of swap space, so as long as the dump was smaller than 2 GB the actual OS would be left untouched. But, as you say, if the dump went beyond cylinder 412 there would be corruption.
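To put rough numbers on that, here is a back-of-the-envelope sketch using the figures posted earlier (geometry hd 24 / sec 424, swap on cylinders 0-412, 64312 pages dumped at offset 65536). It assumes 512-byte sectors and 8192-byte pages (the usual sun4u page size, not stated in the thread), ignores compression so the result is an upper bound, and treats the dump as one contiguous write from the reported offset, which is a simplification. The function names are made up, and the shell needs $(( )) arithmetic (bash or ksh).

```shell
# All sizes are in 512-byte sectors to keep the numbers small.
# Function names are hypothetical, made up for this sketch.

# slice_sectors CYLINDERS HEADS SECTORS_PER_TRACK
slice_sectors() {
    echo $(( $1 * $2 * $3 ))
}

# dump_sectors PAGES PAGE_SIZE_BYTES: uncompressed upper bound for the dump
dump_sectors() {
    echo $(( $1 * ($2 / 512) ))
}

# stays_inside OFFSET_BYTES DUMP_SECTORS REGION_SECTORS
# Succeeds if a contiguous write starting at OFFSET_BYTES ends inside the region.
stays_inside() {
    [ $(( $1 / 512 + $2 )) -le "$3" ]
}

swap=$(slice_sectors 413 24 424)     # cylinders 0-412 at the start of the disk
dump=$(dump_sectors 64312 8192)      # 8192-byte pages are an assumption (sun4u)

if stays_inside 65536 "$dump" "$swap"; then
    echo "dump ($dump sectors) ends inside the first $swap sectors"
else
    echo "dump ($dump sectors) runs past the first $swap sectors"
fi
```

The swap figure comes out as 4202688 sectors, matching the Blocks column for slice 1, and the uncompressed dump is 1028992 sectors (about 500 MB), so under these assumptions this particular dump ends well inside the swap area.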

This article:
The Linux and Unix Menagerie: How Avoid Solaris Panics When Using Savecore With Veritas Volume Manager
says that Sun recommends using the swap device as your dump device, so s1 itself should be okay?

Thanks for the link, Tony.

Your post helped me understand it.

The link was interesting as well ^^