T4-2 - Memory DIMM issue - ldom config resets to factory-default

This basically means the LDoms that were on there are not starting (they are not even listed).

If I run ldm list-config, it shows the live config as the one for the next reboot. But, of course, on the next reboot it reverts to factory-default again.

I must admit I'm wondering if it's doing this because (with the one faulty DIMM) there is now not enough memory to serve the configured LDoms. Does this make sense?

A bit worrying if one DIMM failure can take out the entire host :(

What does

fmadm faulty

show? If replacement is needed, then do it. Once the system "thinks" a certain way about errors it is really hard to try to operate the system like the problem does not exist. As you are seeing. And in fact, doing so may cover up even more serious issues.

I'm not sure what you are actually seeing. Your response seems to me like you have no support contract more than anything else. Which is understandable, but very hard to work around sometimes.

I don't recommend this, but if you are truly desperate try using the

fmadm repair

command. I do not recommend it except as a last ditch desperation approach to getting a box going for a short time. Once the error occurs again, you are back to square one.

EDIT: let me put this another way - I had a system which showed bad memory, but when the tech searched the fmadm information, he found it was a PCI-e problem. At first blush, though, the system "thought" it was a DIMM. Let Oracle look at it. Don't decide on your own. My first take was wrong.

It is unlikely that a faulty DIMM would affect the SP in any way.

Configuration is saved there by issuing ldm add-spconfig <unique_name>

So, after you configure the system as you want it or change the existing configuration, you run the above command to save the work to the SP.

No need to reboot anything, but since work is saved, upon next reboot the configuration saved will be applied from SP (the latest saved).
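
A minimal sketch of that save step, in case it helps. The two function names and the timestamped naming scheme are my own invention, not anything ldm provides; the only real commands used are ldm add-spconfig and ldm list-config, both already mentioned in this thread. Run it as root on the control domain.

```shell
# Hypothetical helper: build a unique config name from a prefix and a timestamp.
# The default prefix "ldomcfg" is just an example.
spconfig_name() {
    prefix="${1:-ldomcfg}"
    printf '%s-%s\n' "$prefix" "$(date +%Y%m%d-%H%M%S)"
}

# Hypothetical helper: save the running configuration to the SP under a
# fresh name, then list what the SP now holds so the result can be checked.
save_spconfig() {
    name="$(spconfig_name "$1")"
    ldm add-spconfig "$name" && ldm list-config
}
```

Usage would be something like save_spconfig t4prod after each change you want to survive a power cycle. The SP only keeps a limited number of saved configs, so old ones may need removing from time to time.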

Did you save the config and then reboot?
I always save, since if not saved the list option says "next poweron", probably meaning complete power cycle (not init 6 or reboot).
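
To check which config the SP will pick up, the list output can be filtered. This is only a sketch: next_poweron_config is a made-up name, and it assumes the listing prints one config per line with a tag such as "[current]" or "[next poweron]" after the name. If no line carries the "[next poweron]" tag, it falls back to the one tagged "[current]".

```shell
# Hypothetical helper: reads the config listing on stdin and prints the name
# tagged "[next poweron]", or the one tagged "[current]" if no such tag exists.
# Assumed line format: "<name> [current]" or "<name> [next poweron]".
next_poweron_config() {
    awk '
        /\[next poweron\]/ { next_cfg = $1 }
        /\[current\]/      { cur_cfg = $1 }
        END {
            if (next_cfg != "") print next_cfg
            else if (cur_cfg != "") print cur_cfg
        }
    '
}
```

Usage: ldm list-config | next_poweron_config. If that prints factory-default while your domains are running, the live setup has not been saved to the SP.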

You should inspect, as Jim mentioned, fmadm faulty.
A properly configured system will report memory errors using that facility.
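
If you want to script that check, here is a rough sketch. It assumes fmadm faulty prints nothing at all when no faults are logged, so any non-blank output is treated as "something is faulted". has_fault_report is a hypothetical name of mine, and the sketch does not try to interpret the report itself.

```shell
# Hypothetical helper: exits 0 if stdin contains at least one non-blank line.
# Fed with `fmadm faulty` output, exit 0 would mean a fault report exists
# (under the assumption that a clean system prints nothing).
has_fault_report() {
    grep -q '[^[:space:]]'
}
```

Usage: fmadm faulty | has_fault_report && echo "faults logged, read the full report". The full report is still what you, or Oracle support, need to read before deciding what to replace.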

Regards
Peasant.

I am still not sure what is going on. Peasant has good advice.

Bottom line: Why do you think you have DIMM errors?