DDF Application & Configuration Clustering

Introduction

An essential part of the DDF Clustering solution is the ability to manage applications and configurations across cluster nodes. The following documentation describes how to set up a set of DDF server nodes in a cluster and how an administrator can manage the applications and configurations deployed to them.

Setup of the DDF Cluster

Initial Setup

The goal of the DDF Cluster solution is to keep applications and configurations synchronized on every DDF server. When setting up a DDF server, it is recommended that only one server is set up per machine (virtual or non-virtual). In other words, each physical machine or virtual machine (VM) should run only one instance of DDF. This keeps the configuration of the DDF Cluster as simple as possible and allows the DDF server to utilize the resources of the entire system. It is also recommended that all DDF servers be configured identically, meaning that every DDF server platform starts with the same configurations and applications. Likewise, ensure that all physical and virtual systems running the DDF servers have the same system configuration (e.g. operating system, memory, CPU).

 

Starting and Configuring New DDF Server Node Clusters

 

Before starting your DDF cluster nodes, you must first set up your node network configurations. See the Configuring DDF Clustering For Unicast & Multicast section for more information. To view all of the available nodes, navigate in your browser to the DDF Web Console and click on the tab labeled “Clustered Groups”. Under the group “default”, you should see all running DDF nodes that have been clustered. It is also possible to move all instances into a named group. In order to perform this action, you must have access to one of the DDF command line shells directly, or use the Gogo console located at http://<serverip>:8181/system/console/gogo, where serverip is the IP address or host name of one of the DDF nodes.

 

The first step is to create a new group. To create a cluster group named “mygroup”, execute the following command:

cluster:group-create mygroup

This command creates a new cluster group that does not yet contain any nodes. Execute the following command to view all groups:

cluster:group-list

You should see the following output:

Group Members
* [default ] [192.168.1.110:5701* ]
[mygroup ] []

As you can see, there are now two groups, the default group and the new group that you have just created. We now need to move the DDF node out of the “default” group and into “mygroup”. Execute the following commands:

cluster:group-join mygroup 192.168.1.110:5701

cluster:group-quit default 192.168.1.110:5701

The first command adds the DDF node to the new group, and the second command removes it from the “default” group. Running cluster:group-list again should now show the following output:

Group Members
[default ] []
* [mygroup ] [192.168.1.110:5701* ]

These commands should be executed on all “production” nodes located in the default group. Note: The default group always remains and cannot be deleted. It is possible for DDF nodes to be separated into multiple groups, and for one DDF node to exist in multiple groups at the same time. These are advanced topics that are addressed in additional documentation.
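For example, if a second production node were still running in the default group under the hypothetical ID 192.168.1.111:5701, it would be moved into “mygroup” with the same pair of commands:

cluster:group-join mygroup 192.168.1.111:5701

cluster:group-quit default 192.168.1.111:5701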

 

Adding an Additional DDF Server node to the Cluster

There may be a need to add an additional DDF server node after an existing DDF cluster has already been configured and deployed. When adding nodes, the new node must match the existing nodes in terms of applications and configurations. It is therefore good practice to copy one of the existing nodes and push that copy to a new virtual machine or server instance, which provides a more stable transition of the new node into the cluster. Once you have set up the new node, you can follow the instructions above to add it to the cluster group, if needed.
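Once the copied node is up and running, it should appear in the cluster node list (see Checking For Active Nodes below) and can then be joined to the existing group using the same commands shown earlier. The node ID below is only a hypothetical example:

cluster:node-list

cluster:group-join mygroup 192.168.1.112:5701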

 

Managing Applications

Application management within the cluster is designed to function as if you were managing a single instance of DDF. All DDF applications are handled at the feature level, so the Features management console can be used to manage all applications within the DDF Cluster. The Features management console can be accessed by navigating to http://<serverip>:8181/system/console/features in a web browser, where serverip is the IP address or host name of one of the DDF nodes. You may be prompted for credentials (username and password) to access the console.

 

From the Features console, the following functions are available:

  • View a listing of the available features and repositories in the cluster

  • Add new repositories to the cluster

  • Remove existing repositories from the cluster

  • Install features from repositories into the cluster

  • Uninstall or remove features from the cluster
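The same operations can also be performed from one of the DDF command line shells using the Cellar feature commands. The following is a minimal sketch, using a feature name and repository URL taken from the examples later in this document; exact command names and options may vary with the Cellar version in use:

cluster:feature-list mygroup

cluster:features-url-add mygroup mvn:ddf.catalog.kml/catalog-kml-app/2.1.0/xml/features

cluster:feature-install mygroup catalog-opensearch-endpoint

cluster:feature-uninstall mygroup catalog-opensearch-endpoint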

 

For more information on using the console, refer to the DDF User documentation.

Managing Configurations

Just as with application management, managing configurations within a cluster is designed to function as if you were configuring a single instance of DDF. There are two areas where DDF configurations can be modified; most modifications will occur within the DDF Configurations management console. The management console can be accessed by navigating to http://<serverip>:8181/system/console/configMgr in a web browser, where serverip is the IP address or host name of one of the DDF nodes. You may be prompted for credentials (username and password) to access the console.

 

From the Configuration console, the following functions are available:

  • View a listing of the available configurations

  • Edit configuration values in the cluster

  • Unbind the configuration from the bundle

  • Remove the configuration from the cluster
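Configurations can also be inspected and modified from one of the DDF command line shells using the Cellar config commands. The following is a minimal sketch; the PID and property are taken from the cluster:config-list output shown later in this document, and command availability may vary with the Cellar version in use:

cluster:config-list mygroup

cluster:config-propset mygroup org.ops4j.pax.url.mvn org.ops4j.pax.url.mvn.useFallbackRepositories false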

 

For more information on using the console, refer to the DDF User documentation.

 

Controlling Feature & Configuration Synchronization 

There may be instances where certain configurations should remain local to a particular DDF node. This behavior can be controlled through the Cellar groups configuration. To open this configuration, navigate to http://<serverip>:8181/system/console/configMgr in a web browser, where serverip is the IP address or host name of one of the DDF nodes. You may be prompted for credentials (username and password) to access the console. Within the configuration list, search for the configuration named “org.apache.karaf.cellar.groups” and click on it to view or edit it. Within the configuration you will see many properties listed in the following format:

[cluster group name].[resource type (e.g. features, config)].[whitelist or blacklist].[inbound or outbound] = [values]

These properties allow you to control the synchronization of features and configurations through blacklists and whitelists. If you do not want a specific feature or configuration to propagate throughout your cluster group, you can put it into a blacklist. By default, for all cluster groups, all features are in the whitelist:

mygroup.features.whitelist.outbound = *

with the exception of “cellar”:

mygroup.features.blacklist.outbound = cellar

As for configurations, by default all configurations are whitelisted with the exception of the following:

mygroup.config.blacklist.outbound = org.apache.felix.fileinstall*, org.apache.karaf.cellar.groups, org.apache.karaf.cellar.node, org.apache.karaf.management, org.apache.karaf.shell, org.ops4j.pax.logging
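For example, to keep a hypothetical configuration with the PID my.local.node.settings from propagating out of the node, it could be appended to the existing outbound blacklist:

mygroup.config.blacklist.outbound = org.apache.felix.fileinstall*, org.apache.karaf.cellar.groups, org.apache.karaf.cellar.node, org.apache.karaf.management, org.apache.karaf.shell, org.ops4j.pax.logging, my.local.node.settings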

Once you have made your changes, you can save the configuration by pressing the “Save” button.

Additional Details

Configuring DDF Clustering For Unicast & Multicast

 

By default, DDF clustering utilizes TCP-IP unicast for discovering other DDF nodes. The hazelcast.xml file located under <DDF root>/etc/ contains the port and address configurations for network setup. The TCP-IP unicast mode allows for manual configuration and control of the initial clustering. This configuration is also beneficial for cases where a particular network cannot support multicast or multicast has been disabled. There is also a configuration that allows auto-discovery of DDF nodes using multicast as the transport. The hazelcast.xml file is configured as follows to allow TCP-IP unicast discovery of cluster nodes:

<join>
  <multicast enabled="false">
    <multicast-group>224.2.2.3</multicast-group>
    <multicast-port>54327</multicast-port>
  </multicast>
  <tcp-ip enabled="true">
    <interface>127.0.0.1</interface>
  </tcp-ip>
  <aws enabled="false">
    <access-key>my-access-key</access-key>
    <secret-key>my-secret-key</secret-key>
    <region>us-east-1</region>
  </aws>
</join>

As you can see, the multicast option has been set to false and the tcp-ip option is set to true. All systems that will participate in the cluster need to have their IP addresses listed within the tcp-ip interface section shown above. These modifications must be made on each node, and it is recommended that the nodes be restarted after the hazelcast.xml file has been changed.
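For example, for a two-node cluster whose members use the hypothetical addresses 192.168.1.110 and 192.168.1.111, the tcp-ip section on each node would list both addresses:

  <tcp-ip enabled="true">
    <interface>192.168.1.110</interface>
    <interface>192.168.1.111</interface>
  </tcp-ip>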

The following hazelcast.xml configuration would be used for multicast auto-discovery:

<join>
  <multicast enabled="true">
    <multicast-group>224.2.2.3</multicast-group>
    <multicast-port>54327</multicast-port>
  </multicast>
  <tcp-ip enabled="false">
    <interface>127.0.0.1</interface>
  </tcp-ip>
  <aws enabled="false">
    <access-key>my-access-key</access-key>
    <secret-key>my-secret-key</secret-key>
    <region>us-east-1</region>
  </aws>
</join>

As you can see, the multicast option has been set to true and the tcp-ip option is set to false. The multicast group and port can be specified in the file as shown above. These modifications must be made on each node, and it is recommended that the nodes be restarted after the hazelcast.xml file has been changed.

Verifying Synchronized DDF Nodes

In most cases, the DDF system console should provide you with a listing of all features, repositories, and configurations that are installed on the cluster. There are times, however, when the cluster can become out of sync; this may occur if a system has been offline for some time. One way to verify the synchronized lists of the cluster is to run cluster commands from the command line. In order to perform these actions, you must have access to one of the DDF command line shells directly, or use the Gogo console located at http://<serverip>:8181/system/console/gogo, where serverip is the IP address or host name of one of the DDF nodes. Once at the command line, execute the following command to see the list of deployed features for your cluster:

cluster:feature-list mygroup

This command will list the available features for your cluster group “mygroup”.

 

Features for cluster group mygroup
Status Version Name
[installed ] [2.1.0 ] ddf-service-ddms-transformer
[installed ] [2.2.0 ] catalog-opensearch-endpoint
......

To view the cluster group's configurations, execute the following command:

cluster:config-list mygroup

This command will show all shared configurations among the cluster group “mygroup”.

----------------------------------------------------------------
Pid: org.ops4j.pax.url.mvn
Properties:
org.ops4j.pax.url.mvn.useFallbackRepositories = false
service.pid = org.ops4j.pax.url.mvn
org.ops4j.pax.url.mvn.disableAether = true
----------------------------------------------------------------
Pid: org.apache.karaf.webconsole
Properties: ....

 

The following command will list all repositories associated with the cluster group “mygroup”:

cluster:features-url-list mygroup

Output similar to the following will be displayed:

mvn:org.apache.cxf.karaf/apache-cxf/2.7.2/xml/features
mvn:org.apache.activemq/activemq-karaf/5.6.0/xml/features
mvn:ddf.catalog.kml/catalog-kml-app/2.1.0/xml/features
mvn:ddf.mime.tika/mime-tika-app/1.0.0/xml/features

If, for any reason, any of the lists above do not match the features, repositories, or configurations shown in the DDF system consoles, the following command can be executed:

cluster:sync

This command should synchronize the DDF node with the rest of the cluster.

Checking For Active Nodes

Checking whether a node is active can be done using the node ping command. In order to use this command, you must have access to one of the DDF command line shells. A list of nodes can be shown by executing the following command:

cluster:node-list

The command should show the following output:

ID Host Name Port
* [192.168.1.110:5701 ] [192.168.1.110 ] [ 5701]

The output shows the ID, host name, and port of each active DDF node in the cluster. The asterisk indicates the node on which you are currently accessing the shell. Now that you have a listing of node IDs, you can use them to ping other nodes. Execute the following command, supplying the ID of the node you want to ping:

cluster:node-ping 192.168.1.110:5701

The following result will print until you press Ctrl-C:

PING 192.168.1.110:5701
from 1: req=192.168.1.110:5701 time=9 ms
from 2: req=192.168.1.110:5701 time=4 ms
from 3: req=192.168.1.110:5701 time=2 ms
from 4: req=192.168.1.110:5701 time=3 ms
from 5: req=192.168.1.110:5701 time=4 ms
from 6: req=192.168.1.110:5701 time=2 ms
from 7: req=192.168.1.110:5701 time=2 ms
^C

The output will provide you with a typical ping result showing connectivity and response times.

 
