3.4 Configuring Clustering

You can cluster the CloudAccess appliance. By default, the appliance is a single-node cluster, but CloudAccess supports up to five nodes in a cluster. You add a node to the cluster by selecting Join Cluster during the initialization process.

3.4.1 Advantages of Clustering

Clustering in CloudAccess offers several advantages. Most of these advantages are available only if you configure an L4 switch or round-robin DNS. An L4 switch is the better solution.

Disaster Recovery: Adding nodes to the cluster provides disaster recovery for your appliance. If one node stops running or becomes corrupted, you can promote another node to master.

High Availability for Authentications: CloudAccess provides high availability for authentication and the single sign-on service when you use an L4 switch in conjunction with clustering. Users can still authenticate when some nodes in the cluster have problems, because the L4 switch sends authentication requests only to the nodes with which it can communicate.

Load Balancing: You can configure the L4 switch to distribute authentication requests across the nodes so that no single node receives all of the requests while the other nodes sit idle.

Scalability: Configuring an L4 switch with clustering increases the scalability of CloudAccess. Each node added to the cluster increases the number of concurrent logins the cluster can handle.

3.4.2 Adding Nodes to the Cluster

CloudAccess supports up to five nodes in a cluster. You add nodes to the cluster through the initialization process; you perform all other cluster management tasks on the Admin page.

To add a node to the cluster:

  1. Verify that the cluster is healthy.

    • All nodes must be running and communicating.

    • All components must be in a green state.

    • All failed nodes must be removed from the cluster.

    For more information about verifying that your cluster is healthy, see Section 13.4, Troubleshooting Different States. For a scripted spot check of node health, see the sketch after this procedure.

  2. Download and deploy a new virtual machine (VM) for the new node.

    For more information, see Section 2.4, Deploying the Appliance.

  3. Initialize the appliance. Select Join Cluster as the first step to initialize the new node, then follow the on-screen prompts.

    For more information, see Section 2.6, Initializing the Appliance.

    When initialization is complete, the browser is redirected to index.html and a login page appears.

  4. Log in to index.html and verify that the new node appears in the cluster. Wait until all spinner icons stop and all components are green before performing any other tasks.

    Adding the node to the cluster starts several background processes. This final step can take up to an hour to complete.

  5. After the node is added to the cluster, register the node. For more information, see Section 3.2, Registering the Appliance.
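For the health check in step 1, you can supplement the Admin page with a quick scripted probe. The following is a minimal sketch, assuming the heartbeat endpoint described in Section 3.4.5, placeholder node addresses, the third-party requests library, and a self-signed appliance certificate (so verification is disabled). It confirms only that each node responds; the Admin page remains the authoritative view of component health.

    # Spot-check each node's heartbeat endpoint (see Section 3.4.5).
    # Node addresses are hypothetical; adjust them for your environment.
    import requests

    NODES = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

    def node_is_healthy(address):
        url = "https://%s/osp/h/heartbeat" % address
        try:
            # The appliance certificate is assumed to be self-signed,
            # so verification is disabled for this spot check.
            response = requests.get(url, verify=False, timeout=5)
        except requests.RequestException:
            return False
        # A healthy node answers HTTP 200 with the text "Success".
        return response.status_code == 200 and "Success" in response.text

    for address in NODES:
        state = "healthy" if node_is_healthy(address) else "NOT healthy"
        print("%s: %s" % (address, state))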

3.4.3 Promoting a Node to Master

The first node that you install is the master node of the cluster by default. The master node runs provisioning, reporting, approvals, and policy mapping services. You can promote any node to become the master node.

To promote a node to master:

  1. Verify that the cluster is healthy.

    • All nodes must be running and communicating.

    • All components must be in a green state.

    • All failed nodes must be removed from the cluster.

    For more information about verifying that your cluster is healthy, see Section 13.4, Troubleshooting Different States. For a scripted spot check, see the sketch at the end of Section 3.4.2.

  2. Verify that all nodes in the cluster are running the same version of CloudAccess. If any nodes need to be updated, update them before you switch the master node. For more information, see Section 12.4, Updating the Appliance.

  3. Take a snapshot of the cluster.

  4. On the Admin page, click the node that you want to become the master node, then click Promote to master.

    An M appears on the node icon, indicating that it is now the master node. This process might take a while to complete. Wait for the node spinner icons to stop and the Health indicators to turn green before proceeding with any additional configuration changes. For a scripted wait, see the sketch at the end of this section.

The services move from the old master to the new master. The old master is now just a regular node in the cluster.

WARNING:

  • If the old master node is down when you promote another node to master, remove the old master from the cluster, then delete it from the host server. Otherwise, the appliance sees two master nodes and becomes corrupted.

  • When you switch the master node, both logging and reporting start over on the new master. The historical logs are lost. The reporting data is also lost, unless you are using Sentinel Log Manager. For more information, see Section 10.2, Integrating with Sentinel Log Manager.
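Because promotion can take a while, the heartbeat endpoint from Section 3.4.5 offers a way to wait for the new master to come back up rather than watching the Admin page manually. This is a minimal sketch under the same assumptions as the health check in Section 3.4.2 (placeholder address, requests library, self-signed certificate). A passing heartbeat confirms only that the node is serving requests, so still verify on the Admin page that all components are green.

    # Poll the promoted node's heartbeat until it reports healthy,
    # or give up after a timeout. Address and intervals are assumptions.
    import time
    import requests

    NEW_MASTER = "10.0.0.12"  # hypothetical address of the promoted node

    def wait_until_healthy(address, timeout_seconds=3600, interval_seconds=30):
        url = "https://%s/osp/h/heartbeat" % address
        deadline = time.time() + timeout_seconds
        while time.time() < deadline:
            try:
                response = requests.get(url, verify=False, timeout=5)
                if response.status_code == 200 and "Success" in response.text:
                    return True
            except requests.RequestException:
                pass  # the node may be restarting services during promotion
            time.sleep(interval_seconds)
        return False

    if wait_until_healthy(NEW_MASTER):
        print("Node reports healthy; confirm component status on the Admin page.")
    else:
        print("Node did not report healthy in time; check the Admin page.")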

3.4.4 Removing a Node from the Cluster

You can remove a node from the cluster if something is wrong with it. However, ensure that you use the following steps to remove the node properly. If you simply delete a node from the cluster, the appliance removes the node from the interface, but the virtual image still exists and continues to run. Leaving the virtual image running allows users to authenticate to a node that no longer appears on the Admin page.

NOTE: After you remove a node, you cannot add the same VM instance back into the cluster. You must delete this instance of the appliance from your host server, then deploy another instance to the host server to add a node back into the cluster.

To remove a node from the cluster:

  1. (Conditional) If the node you are removing is the master node, promote another node to be master before you remove the old node. For more information, see Section 3.4.3, Promoting a Node to Master.

  2. (Conditional) If you are using an L4 switch, delete the node from the L4 switch. For more information, see the L4 switch documentation.

  3. On the Admin page, click the node you want to remove from the cluster.

  4. Click Remove from cluster.

    The Admin page immediately shows that the node is gone, but it takes some time for the background processes to finish.

  5. Stop the virtual image on the host server, then delete the instance of the node from the host server.
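Because a deleted-but-still-running virtual image lets users authenticate against a node that no longer appears on the Admin page, it is worth confirming that the removed node has actually stopped responding. A minimal sketch, again assuming the heartbeat endpoint from Section 3.4.5, a placeholder address, the requests library, and a self-signed certificate:

    # Confirm that a removed node no longer answers its heartbeat.
    import requests

    REMOVED_NODE = "10.0.0.13"  # hypothetical address of the removed node

    try:
        response = requests.get("https://%s/osp/h/heartbeat" % REMOVED_NODE,
                                verify=False, timeout=5)
        print("WARNING: node still responds (HTTP %d); verify that its "
              "virtual image was stopped and deleted." % response.status_code)
    except requests.RequestException:
        print("Node is unreachable, as expected after removal.")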

3.4.5 Configuring an L4 Switch for Clustering

If you want high availability or load balancing, you must configure an L4 switch for the CloudAccess appliance. An L4 switch can be configured in many different ways. Use the following recommendations to configure the L4 switch to work with the appliance:

  • Heartbeat: Use the following URL to define the heartbeat for the L4 switch:

    https://appliance_ip_address/osp/h/heartbeat
    

    The L4 switch uses the heartbeat to determine whether the nodes in the cluster are up and working properly. When a node is healthy, the heartbeat URL returns the text message Success with an HTTP 200 response code. The scripted health checks in Sections 3.4.2 through 3.4.4 exercise this same endpoint.

  • Persistence: Also known as sticky sessions, persistence ensures that all subsequent requests from a client are sent to the same node. To enable this behavior, select SSL session ID persistence when you configure the L4 switch.

Session persistence ensures that the same real server handles both the CloudAccess login and the subsequent application single sign-on. Using the same server allows caching across a series of related transactions, which can improve server performance and reduce transaction latency. It also removes the delay that can occur when the client sends a request to a new node instead of reusing its existing session with the same node. To ensure that transactions from the same client are forwarded to the same real server in a load-balanced cluster configuration, consider the following options:

  • You can set the L4 switch to use IP-based persistence, which uses the IP address of the user’s device to maintain an affinity between the user session and the same real server in the cluster. IP-based persistence fails if the device’s IP address changes between requests, such as when a mobile device changes networks during a session. It also fails if all user devices come through a proxy service, where all transactions appear to come from the same IP address.

  • You can set the L4 switch to use sticky-bit persistence. However, sticky-bit persistence is problematic for L4 switches that do not support stickiness, and sticky sessions do not work when browsers are set to disable cookies.

  • You can use a proxy approach for the identity provider nodes that does not depend on the L4 configuration. However, this solution can quickly become chatty.
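To illustrate why SSL session ID persistence is more robust than IP-based persistence, the following sketch models an L4 switch’s node selection as a hash of an affinity key. It is illustrative only: a real L4 switch implements this internally, and the node names, addresses, and session ID shown here are placeholders.

    # Illustrative model of L4 affinity: hash an affinity key to pick
    # a node deterministically. Not a real switch implementation.
    import hashlib

    NODES = ["node-1", "node-2", "node-3"]  # hypothetical cluster members

    def pick_node(affinity_key):
        digest = hashlib.sha256(affinity_key.encode("utf-8")).digest()
        return NODES[digest[0] % len(NODES)]

    # IP-based persistence: affinity breaks when the device changes
    # networks, because the key itself changes.
    print(pick_node("198.51.100.7"))  # first request
    print(pick_node("203.0.113.42"))  # same user on a new network

    # SSL session ID persistence: the session ID is stable across
    # requests, so the same node is selected even if the IP changes.
    session_id = "placeholder-tls-session-id"
    print(pick_node(session_id))
    print(pick_node(session_id))  # always the same node

Because the session ID survives a network change while the source IP does not, the session-keyed client continues to reach the same node for the duration of its session.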