
Analyzing a Purple Screen of Death (PSOD)

Posted on June 18, 2019

 1) What is a PSOD?

A Purple Screen of Death (PSOD) is a diagnostic screen with white type on a purple background that is displayed when the VMkernel of an ESX/ESXi host experiences a critical error, becomes inoperative and terminates any virtual machines that are running.

 

You can see this screen on the console of the server. To view it, either connect a monitor in the datacenter or connect remotely through the server's out-of-band management (iLO, iDRAC, IMM, depending on your vendor).


 

2) Why does a PSOD occur?

A PSOD is a kernel panic. The ESXi kernel (VMkernel) triggers this safety measure when it hits an unrecoverable error and continuing to run would put services and VMs at high risk, for example when the host detects that its state has become corrupted. The host then displays the purple screen with a long message and an error code. A common cause is a hardware failure (RAM or CPU), which normally raises a Machine Check Exception (MCE) or a non-maskable interrupt (NMI) error.

3) Impact of a PSOD

When the kernel panics, the host crashes and all services terminate immediately. The VMs are not shut down gracefully but are abruptly powered off. If the host is part of a cluster with HA configured, these VMs are restarted on the other hosts in the cluster. If the host is part of a vSAN cluster, the PSOD affects vSAN as well.

4) Analyze the PSOD message

The first step is to take a screenshot. You can do this remotely (IMM, iLO, or iDRAC, depending on the vendor).


Seven pieces of information to note on the PSOD screen:

1- The product and build number.

2- The error message.

3- The physical CPU registers at the time of the error.

4- The physical CPU.

5- The host uptime.

6- The stack trace (the state of the VMkernel at the time of the error).

7- The core dump.
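The seven items above can be treated as a checklist when writing up an incident. The following is only a minimal sketch: the `PsodRecord` class, its field names, and the `missing_items` helper are hypothetical names chosen for illustration, not part of any VMware tool.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PsodRecord:
    """One field per item to note on the PSOD screen (names are illustrative)."""
    product_and_build: Optional[str] = None
    error_message: Optional[str] = None
    cpu_registers: Optional[str] = None
    physical_cpu: Optional[str] = None
    host_uptime: Optional[str] = None
    stack_trace: Optional[str] = None
    core_dump: Optional[str] = None


def missing_items(record: PsodRecord) -> List[str]:
    """Return the names of checklist items that have not been filled in yet."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


if __name__ == "__main__":
    # Only the error message has been copied from the screenshot so far.
    record = PsodRecord(
        error_message="#PF Exception type 14 in world 136:helper0-0 @ 0x4a8e6e"
    )
    print("Still to capture:", ", ".join(missing_items(record)))
```

Filling in the record straight from the screenshot makes it harder to forget an item, such as the uptime or the core dump location, before the host is rebooted.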

 

A few PSOD errors and the matching KB articles:

Error | KB Article
LINT1/NMI (motherboard nonmaskable interrupt), undiagnosed | Using hardware NMI facilities to troubleshoot unresponsive hosts (1014767)
Panic requested by one or more 3rd party NMI handlers | Using hardware NMI facilities to troubleshoot unresponsive hosts (1014767)
COS Error: Oops | Understanding an “Oops” purple diagnostic screen (1006802)
Lost Heartbeat | Understanding a “Lost Heartbeat” purple diagnostic screen (1009525)
ASSERT bora/vmkernel/main/pframe_int.h:527 | Understanding ASSERT and NOT_IMPLEMENTED purple diagnostic screens (1019956)
NOT_IMPLEMENTED /build/mts/release/bora-84374/bora/vmkernel/main/util.c:83 | Understanding ASSERT and NOT_IMPLEMENTED purple diagnostic screens (1019956)
Spin count exceeded (iplLock) - possible deadlock | Understanding a “Spin count exceeded” purple diagnostic screen (1020105)
PCPU 1 locked up. Failed to ack TLB invalidate | Understanding a Failed to ack TLB invalidate purple diagnostic screen (1020214)
#GP Exception(13) in world 4130:helper13-0 @ 0x41803399e303 | Understanding Exception 13 and Exception 14 purple diagnostic screen events (1020181)
#PF Exception type 14 in world 136:helper0-0 @ 0x4a8e6e | Understanding Exception 13 and Exception 14 purple diagnostic screen events (1020181)
Machine Check Exception: Unable to continue | Decoding Machine Check Exception (MCE) output after a purple screen error (1005184)
Hardware (Machine) Error | Decoding Machine Check Exception (MCE) output after a purple screen error (1005184)
PCPU: 1 hardware errors seen since boot (1 corrected by hardware) | Decoding Machine Check Exception (MCE) output after a purple screen error (1005184)
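The error-to-KB mapping above can be turned into a small lookup. This is a sketch only: the KB numbers and titles come from the table, while the substrings used for matching and the `lookup_kb` function name are assumptions made for this example.

```python
from typing import Optional, Tuple

# (substrings to look for, KB title, KB number), taken from the table above.
# Which substrings identify each error is an assumption of this sketch.
KB_RULES = [
    (("LINT1/NMI", "3rd party NMI handlers"),
     "Using hardware NMI facilities to troubleshoot unresponsive hosts", 1014767),
    (("COS Error: Oops",),
     "Understanding an \"Oops\" purple diagnostic screen", 1006802),
    (("Lost Heartbeat",),
     "Understanding a \"Lost Heartbeat\" purple diagnostic screen", 1009525),
    (("ASSERT", "NOT_IMPLEMENTED"),
     "Understanding ASSERT and NOT_IMPLEMENTED purple diagnostic screens", 1019956),
    (("Spin count exceeded",),
     "Understanding a \"Spin count exceeded\" purple diagnostic screen", 1020105),
    (("Failed to ack TLB invalidate",),
     "Understanding a Failed to ack TLB invalidate purple diagnostic screen", 1020214),
    (("#GP Exception", "#PF Exception"),
     "Understanding Exception 13 and Exception 14 purple diagnostic screen events", 1020181),
    (("Machine Check Exception", "Hardware (Machine) Error",
      "hardware errors seen since boot"),
     "Decoding Machine Check Exception (MCE) output after a purple screen error", 1005184),
]


def lookup_kb(error_line: str) -> Optional[Tuple[int, str]]:
    """Return (KB number, KB title) for the first matching rule, or None."""
    for needles, title, number in KB_RULES:
        if any(needle in error_line for needle in needles):
            return number, title
    return None


if __name__ == "__main__":
    line = "#PF Exception type 14 in world 136:helper0-0 @ 0x4a8e6e"
    print(lookup_kb(line))
```

An error that is not in the table returns `None`, in which case the full error message is the better search term for the knowledge base.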

Check the logs. A few log files that help with PSOD analysis:

Component | Location | What it is
System messages | /var/log/syslog.log | Contains all general log messages and can be used for troubleshooting.
VMkernel | /var/log/vmkernel.log | Records activities related to virtual machines and ESXi. Most PSOD-relevant entries will be in this log, so pay special attention to it.
ESXi host agent log | /var/log/hostd.log | Contains information about the agent that manages and configures the ESXi host and its virtual machines.
VMkernel warnings | /var/log/vmkwarning.log | Records activities related to virtual machines. Watch for log entries related to heap exhaustion (Heap WorkHeap).
vCenter agent log | /var/log/vpxa.log | Contains information about the agent that communicates with vCenter, so you can use it to spot tasks that were triggered by vCenter and might have caused the PSOD.
Shell log | /var/log/shell.log | Contains a record of all commands typed, so you can correlate the PSOD with a command that was executed.
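As a rough illustration of how these logs could be searched, the sketch below scans the files listed above for a few strings that appear in this article. The keyword list, the function names, and the idea of running it against a copy of the logs are assumptions made for this example; adjust them to your environment.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

# Log locations from the table above.
LOG_FILES = [
    "/var/log/syslog.log",
    "/var/log/vmkernel.log",
    "/var/log/hostd.log",
    "/var/log/vmkwarning.log",
    "/var/log/vpxa.log",
    "/var/log/shell.log",
]

# Strings mentioned in this article; the choice of keywords is an assumption.
KEYWORDS = ("NMI", "Machine Check Exception", "WorkHeap",
            "Spin count exceeded", "Lost Heartbeat")


def find_hits(lines: Iterable[str],
              keywords: Iterable[str] = KEYWORDS) -> List[Tuple[int, str]]:
    """Return (line number, text) for every line that contains a keyword."""
    wanted = tuple(keywords)
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(word in line for word in wanted):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_logs(paths: Iterable[str] = LOG_FILES) -> Dict[str, List[Tuple[int, str]]]:
    """Scan each existing log file and collect its matching lines."""
    results = {}
    for name in paths:
        path = Path(name)
        if not path.is_file():
            continue  # skip logs that are not present on this system
        with path.open(errors="replace") as handle:
            hits = find_hits(handle)
        if hits:
            results[name] = hits
    return results


if __name__ == "__main__":
    for name, hits in scan_logs().items():
        print(name)
        for number, text in hits:
            print(f"  {number}: {text}")
```

The match is a plain case-sensitive substring test, which keeps the output short; widen the keyword list if the error on your purple screen is not covered.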

 

Thanks, I hope you like it.

Rajiv Pandey.

 

 

 
