Failover Clustering - Google Cloud
Failover cluster is a group of independent computers, working together to increase the availability and scalability of clustered roles. The cluster services are connected by physical cables and by software. If one or more of the cluster nodes fails, another node begins to provide service.
Introduction of Failover Clustering
It is a group of servers that work to maintain the high availability services and applications.
If one of the servers, or nodes fails, another node in the cluster can take over its workload without any downtime, this is called failover.
It is widely used in Windows Server on Google Cloud Platform (GCP). If a cluster node fails then another node can take over running the software applications.
This is an architecture example of failover clustering. The example system contains the following three servers:
Primary compute engine of VM instances running Windows Server 2016.
Second instance, configured to match the primary instance.
An AD domain name server (DNS). This server:
Provides a Windows domain.
Resolves hostnames to IP addresses.
Hosts the file share witness
Create a network.
Install Windows Server 2016 on two Compute Engine VMs.
Install and configure Active Directory on an instance of Windows Server.
Setup the failover cluster and including a file share witness for the quorum and a role for the workload.
Setup the internal load balancer phase.
Test all tasks of failover operation to verify that the cluster is working.
Why Failover Clustering
HA (High Availability) file share storage for applications such as Microsoft SQL Server and Hyper-V virtual machines
In this Highly available clustered roles that run on physical servers or on virtual machines that are installed on servers running Hyper-V.
What happens during a failover
In this Windows failover cluster changes the status of the active node to indicate that it has failed.
Failover clustering moves any cluster resources and roles from the failing node to the best node, as defined by a quorum. This action includes moving the associated IP addresses.
Failover clustering broadcasts the ARP packets to notify hardware-based network routers that the IP addresses have moved. For this scenario, GCP networking ignores these packets.
After that, this Compute Engine agent on the VM for the failing node changes its response to the health check from 1 to 0, because the VM no longer hosts the IP address specified in the request.
This Compute Engine agent on the VM for the newly active node likewise changes its response to the health check from 0 to 1.
The internal load balancer stops route traffic to the failing node and instead routes traffic to the newly active node.
Advance features of Failover Clustering
Cluster Operating System Rolling Upgrade - It enables the operating panel to upgrade the operating system of the cluster nodes from Windows Server 2012 R2 to a newer version without stopping the Hyper-V or Scale-Out File Server workloads.
Storage Replica - It is a new feature that enables storage-agnostic, block-level, synchronous replication between servers or clusters for disaster recovery, as well as stretching of a failover cluster between sites.
Cloud Witness - It is a new type of Failover Cluster quorum witness in Windows Server 2016 that leverages Microsoft Azure as the arbitration point.
Virtual Machine Load Balancing - It facilitates the seamless load balancing of virtual machines across the nodes in a cluster. These nodes are identified based on virtual machine Memory and CPU utilization on the node.
Workgroup and Multi-domain clusters - Windows Server 2016 breaks down these barriers and introduces the ability to create a Failover Cluster without Active Directory dependencies. Create failover clusters in the following configurations:
Single-domain Clusters - All cluster nodes joined to the same domain.
Multi-domain Clusters - Clusters nodes which are members of different domains.
Workgroup Clusters - Clusters with nodes which are member servers or workgroup.