How to solve M5000 CPU "Deconfigured" state?

Hi Community,

I have one spare M5000 machine which was handled by the support team. They told me that it has failed completely.

I have checked the status. Previously it showed MBU_B as degraded. I updated to the latest firmware and reset the XSCF, and now it shows as Normal.

MBU_B Status:Normal; Ver:4401h; Serial:BD131500KA  ;
        + FRU-Part-Number:CF00541-4360 01   /541-4360-01          ;
        + Memory_Size:128 GB;
        + Type:2;
        CPUM#0-CHIP#0 Status:Normal; Ver:0601h; Serial:PP120405RT  ;
            + FRU-Part-Number:CA06761-D205 C3   /371-4932-03          ;
            + Freq:2.660 GHz; Type:48;
            + Core:4; Strand:2;
        CPUM#0-CHIP#1 Status:Normal; Ver:0601h; Serial:PP120405RT  ;
            + FRU-Part-Number:CA06761-D205 C3   /371-4932-03          ;
            + Freq:2.660 GHz; Type:48;
            + Core:4; Strand:2;
        CPUM#1-CHIP#0 Status:Normal; Ver:0601h; Serial:PP11320266  ;
            + FRU-Part-Number:CA06761-D205 C2   /371-4932-03          ;
            + Freq:2.660 GHz; Type:48;
            + Core:4; Strand:2;
        CPUM#1-CHIP#1 Status:Normal; Ver:0601h; Serial:PP11320266  ;
            + FRU-Part-Number:CA06761-D205 C2   /371-4932-03          ;
            + Freq:2.660 GHz; Type:48;
            + Core:4; Strand:2;
*       CPUM#2-CHIP#0 Status:Deconfigured; Ver:0601h; Serial:PP1038086R  ;
            + FRU-Part-Number:CA06761-D205 C0   /371-4932-03          ;
            + Freq:2.660 GHz; Type:48;
            + Core:4; Strand:2;
*       CPUM#2-CHIP#1 Status:Deconfigured; Ver:0601h; Serial:PP1038086R  ;
            + FRU-Part-Number:CA06761-D205 C0   /371-4932-03          ;
            + Freq:2.660 GHz; Type:48;
            + Core:4; Strand:2;
*       CPUM#3-CHIP#0 Status:Deconfigured; Ver:0601h; Serial:PP11050AZH  ;
            + FRU-Part-Number:CA06761-D205 C1   /371-4932-03          ;
            + Freq:2.660 GHz; Type:48;
            + Core:4; Strand:2;
*       CPUM#3-CHIP#1 Status:Deconfigured; Ver:0601h; Serial:PP11050AZH  ;
            + FRU-Part-Number:CA06761-D205 C1   /371-4932-03          ;
            + Freq:2.660 GHz; Type:48;
            + Core:4; Strand:2;

What does Status:Deconfigured mean?

Thanks & Regards,
Ben

As a result of a hardware fault, the two CPU modules (CPUM#2 and CPUM#3) are "deconfigured".

The CPUs themselves are not the problem, but some other faulty component is preventing them from working.

FYI, here's the information from the horse's mouth (Oracle):
Operation of the Server

Hi

Thanks for the reply.

But showhardconf is not showing anything as faulty for me.

And what is the output of showstatus?
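If only a saved copy of the showhardconf output is at hand, the flagged components can also be pulled out of it by filtering on the Status field. The following is a minimal sketch in Python, assuming the output has been captured as text in the same layout as the listings in this thread; the function name `flagged_components` and the sample excerpt are illustrative only and are not part of any XSCF tooling.

```python
import re

# Matches component lines such as
# "*       CPUM#2-CHIP#0 Status:Deconfigured; Ver:0601h; ..."
# Detail lines starting with "+" carry no Status field and are skipped.
COMPONENT = re.compile(r"^\*?\s*(?P<name>\S+)\s+Status:(?P<status>[^;]+);")


def flagged_components(showhardconf_text):
    """Return (name, status) pairs for every component whose status is not Normal."""
    flagged = []
    for line in showhardconf_text.splitlines():
        match = COMPONENT.match(line)
        if match and match.group("status").strip() != "Normal":
            flagged.append((match.group("name"), match.group("status").strip()))
    return flagged


# Excerpt assembled from the showhardconf listings posted in this thread.
SAMPLE = """\
MBU_B Status:Normal; Ver:4401h; Serial:BD131500KA  ;
        + Memory_Size:128 GB;
        CPUM#1-CHIP#1 Status:Normal; Ver:0601h; Serial:PP11320266  ;
*       CPUM#2-CHIP#0 Status:Deconfigured; Ver:0601h; Serial:PP1038086R  ;
*       MEMB#4 Status:Faulted; Ver:0101h; Serial:BF0841L98R  ;
*           MEM#0A Status:Deconfigured;
"""

for name, status in flagged_components(SAMPLE):
    print(name, status)
```

On the excerpt above this prints the three flagged entries (CPUM#2-CHIP#0, MEMB#4 and MEM#0A) and skips the ones reporting Normal.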

Hi

My showhardconf shows one memory board as faulted. Does that mean all the memory modules on that board have failed?

 
*       MEMB#4 Status:Faulted; Ver:0101h; Serial:BF0841L98R  ;
            + FRU-Part-Number:CF00541-0545 06   /541-0545-06          ;
*           MEM#0A Status:Deconfigured;
                + Code:ad0000000000000001HYMP151P72CP4-Y5  4141-00006006;
                + Type:4B; Size:4 GB;
*           MEM#0B Status:Deconfigured;
                + Code:ad0000000000000001HYMP151P72CP4-Y5  4141-00001007;
                + Type:4B; Size:4 GB;
*           MEM#1A Status:Deconfigured;
                + Code:ad0000000000000001HYMP151P72CP4-Y5  4141-00001016;
                + Type:4B; Size:4 GB;
*           MEM#1B Status:Deconfigured;
                + Code:ad0000000000000001HYMP151P72CP4-Y5  4141-00003014;
                + Type:4B; Size:4 GB;
*           MEM#2A Status:Deconfigured;
                + Code:ad0000000000000001HYMP151P72CP4-Y5  4141-00003054;
                + Type:4B; Size:4 GB;
*           MEM#2B Status:Deconfigured;
                + Code:ad0000000000000001HYMP151P72CP4-Y5  4141-00006022;
                + Type:4B; Size:4 GB;
*           MEM#3A Status:Deconfigured;
                + Code:ad0000000000000001HYMP151P72CP4-Y5  4141-03008008;
                + Type:4B; Size:4 GB;
*           MEM#3B Status:Deconfigured;
                + Code:ad0000000000000001HYMP151P72CP4-Y5  4141-00007003;
                + Type:4B; Size:4 GB;

How can I identify which memory module is causing this issue?

Regards
Ben

The first thing to try, in case it's simply a poor contact problem, is to take out each memory module and plug it back in. Then test again to see if the problem has gone.

Next, split the memory modules into two halves. Remove one half completely and test, then swap in the other half and test again to see which half contains the fault.

You can then swap individual modules around until you identify the faulty one.

Be sure to adhere to the memory configuration rules for your hardware. Read the manual section about memory configuration.
Also, take precautions not to let static electricity damage the electronics.
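
The halving procedure above is essentially a binary search. Here is a rough sketch of the logic in Python, under two assumptions: exactly one module is faulty, and any subset of modules can be tested on its own. On real hardware the memory configuration rules may not allow arbitrary subsets, so known-good modules might have to fill the remaining slots. The `passes` callback is hypothetical and stands for "install only these modules, power on, and check whether a fault is reported".

```python
def isolate_faulty(modules, passes):
    """Narrow a list of modules down to the single faulty one by halving.

    `passes(subset)` is a hypothetical callback that returns True when the
    machine reports no fault with only `subset` installed.
    Assumes exactly one module in `modules` is faulty.
    """
    candidates = list(modules)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if passes(half):
            # The tested half is clean, so the fault is in the remainder.
            candidates = candidates[len(half):]
        else:
            candidates = half
    return candidates[0]


# Simulated run: pretend MEM#2B is the bad module (an arbitrary choice).
slots = ["MEM#0A", "MEM#0B", "MEM#1A", "MEM#1B",
         "MEM#2A", "MEM#2B", "MEM#3A", "MEM#3B"]
bad = "MEM#2B"
print(isolate_faulty(slots, lambda subset: bad not in subset))
```

With eight modules this takes three test cycles instead of up to eight, which matters when every cycle means powering the machine down and reseating hardware.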