VMware HA has been rewritten from the ground up for vSphere 5.0. Some of the important features to bring to light are as follows:
• Provides a foundation for increased scale and functionality
• Eliminates common issues (DNS resolution)
• Multiple Communication Paths
• Can leverage storage as well as the management network for communications
• Enhances the ability to detect certain types of failures and provides redundancy
• IPv6 Support
• Enhanced Error Reporting
• One log file per host eases troubleshooting efforts
• Enhanced User Interface
• Enhanced Deployment Mechanism
One of the major changes with VMware HA 5.0 is the rewrite of the underlying code. In 4.x the agent was AAM, which stands for “Automated Availability Manager”. It was responsible for communicating resource information, HA properties and virtual machine states to the other nodes in the cluster, and it also handled failure and isolation heartbeats. vSphere 5.0 no longer has the AAM agent; it has been replaced by the FDM agent, or Fault Domain Manager. This agent is important because the Primary/Secondary node concept has also gone, replaced by a Master/Slave concept in which FDM plays a major part. There is now only one Master in the cluster, which is the host whose FDM agent holds the Master role; the FDM agents on all other nodes take the Slave role. One of the Slave nodes can be promoted to Master if the original Master node fails.
The Master monitors the availability of the ESXi 5 hosts and also gathers information on VM availability. It monitors all Slave nodes, and if a Slave host fails, all of the VMs on that node are restarted on another node.
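As a rough illustration of the restart behaviour described above, here is a minimal Python sketch. It is not how FDM is implemented: the `plan_restarts` function and the inventory layout are hypothetical, and the round-robin placement is only a placeholder assumption, since real placement depends on the resources available on the remaining hosts.

```python
from itertools import cycle


def plan_restarts(inventory, failed_host):
    """Map each VM from the failed host to a surviving host.

    inventory: dict of host name -> list of VM names (hypothetical layout).
    Round-robin placement is a placeholder assumption, not HA's real logic.
    """
    survivors = sorted(h for h in inventory if h != failed_host)
    if not survivors:
        raise RuntimeError("no surviving hosts available for restart")
    targets = cycle(survivors)
    # Walk the failed host's VMs in order, handing each to the next survivor.
    return {vm: next(targets) for vm in inventory.get(failed_host, [])}


if __name__ == "__main__":
    cluster = {"esx01": ["vm-a", "vm-b", "vm-c"], "esx02": ["vm-d"], "esx03": []}
    print(plan_restarts(cluster, "esx01"))
```

The point of the sketch is only that the Master needs to know which VMs were on the failed host and which hosts are still alive, which is exactly the information it gathers from the Slaves.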
If the Master node fails, a re-election takes place and the host with access to the largest number of datastores is elected as the new Master. There is a good reason for this: a new feature allows HA to communicate via datastores for heartbeating.
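The election rule can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the FDM election protocol: the `Host` class and `elect_master` function are hypothetical names, and breaking ties by host name is an assumption made here only so the result is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    datastores: set = field(default_factory=set)


def elect_master(hosts):
    """Return the host that can reach the largest number of datastores."""
    if not hosts:
        raise ValueError("cannot elect a master from an empty cluster")
    # Tie-break on host name: an assumption for this sketch only.
    return max(hosts, key=lambda h: (len(h.datastores), h.name))


if __name__ == "__main__":
    cluster = [
        Host("esx01", {"ds1"}),
        Host("esx02", {"ds1", "ds2", "ds3"}),
        Host("esx03", {"ds1", "ds2"}),
    ]
    print(elect_master(cluster).name)  # esx02 sees the most datastores
```

Preferring the host with the most datastores makes sense because that host has the best chance of reaching the other hosts through datastore heartbeats.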
This secondary communication channel through datastores is known as Heartbeat Datastores. It is not used in normal situations; it is only used if the primary network goes down. The secondary channel allows the Master to stay aware of all Slave nodes and of the VMs running on those hosts. The Heartbeat Datastores also help determine whether a host has become isolated or whether a network partition has occurred for that host.
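To make the distinction between a failed, isolated and partitioned host more concrete, here is a simplified Python sketch of the decision from the Master's point of view. The names `HostState` and `classify_host` are hypothetical, and the `isolation_flag` input is an assumption standing in for whatever signal tells the Master that the host has declared itself isolated; the real FDM logic has more inputs than this.

```python
from enum import Enum


class HostState(Enum):
    LIVE = "live"
    ISOLATED = "isolated"
    PARTITIONED = "partitioned"
    FAILED = "failed"


def classify_host(network_heartbeat, datastore_heartbeat, isolation_flag=False):
    """Classify a Slave host from the heartbeats the Master can observe."""
    if network_heartbeat:
        # The primary network is healthy, so the datastore channel is not needed.
        return HostState.LIVE
    if not datastore_heartbeat:
        # Silent on both channels: treat the host as failed.
        return HostState.FAILED
    # Still heartbeating to the datastore, so the host is running but unreachable.
    return HostState.ISOLATED if isolation_flag else HostState.PARTITIONED


if __name__ == "__main__":
    print(classify_host(False, True))                       # HostState.PARTITIONED
    print(classify_host(False, True, isolation_flag=True))  # HostState.ISOLATED
    print(classify_host(False, False))                      # HostState.FAILED
```

Without the datastore channel the last three cases would all look the same to the Master, which is why the secondary channel matters.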
The Master node also reports state to vCenter. The Slaves monitor the state of their running VMs and send this information to the Master, and the Master's heartbeats tell the Slaves that it is alive. The Slaves in turn send heartbeats to the Master, and if the Master should fail, that is when the re-election process occurs. vCenter will know when a new Master has been elected, because the new Master informs vCenter once the election process has finished.