Configuring JGroups and Alfresco Clusters

Document created by resplin Employee on Jun 6, 2015

This page is obsolete.

The official documentation is at: http://docs.alfresco.com





Introduction


*** JGroups functionality was introduced in Enterprise V3.1 ***

*** Cache communication using JGroups is an Enterprise specific feature not available on Alfresco Community ***

Alfresco requires servers to discover each other on a network in order to set up cluster communications.  Before V3.1, this discovery was done using a UDP multicast message (provided by EHCache); servers in the cluster picked up the message and used the information to set up communication between their caches.

The lack of a more flexible cluster discovery process limited some installations, which is why JGroups was integrated into the repository.  JGroups is a toolkit for multicast communication between servers.  It allows inter-server communication using a highly configurable transport stack, which includes UDP and TCP protocols.  Additionally, JGroups manages the underlying communication channels and cluster entry and exit.

This page covers the options available for configuring JGroups and other Alfresco-specific cluster options for V3.1.


Core JGroups Support


JGroups is supported in both the Open and Enterprise code lines, but is only used for cache communication in Enterprise.  The core setup of JGroups is therefore common to both code streams.


JGroups Configuration


NB: The JGroups cluster will only initialize if the following property is defined:

 alfresco.cluster.name=<CLUSTERNAME>

Once the cluster name is specified, JGroups uses that name to uniquely distinguish inter-server communication.  This allows machines to use the same protocol stacks, but to ignore broadcasts from different clusters.

The default JGroups configuration files are located at <configRoot>/alfresco/jgroups/alfresco-jgroups-XYZ.xml, where XYZ is the name of the protocol being used.  There is a separate configuration file for each stack.   Switch between the default TCP and UDP stacks using:

 alfresco.jgroups.defaultProtocol=<STACKNAME>

By default the UDP stack is used. You must explicitly configure JGroups to use TCP.

You can also point to a completely new configuration file of your own making using:

 alfresco.jgroups.configLocation=classpath:some-classpath.xml
 alfresco.jgroups.configLocation=file:some-file-path.xml

The JGroups configuration files accept parameters.  For example, see <configRoot>/alfresco/jgroups/alfresco-jgroups-TCP.xml:



<config>
    <TCPPING timeout='3000'
             initial_hosts='${alfresco.tcp.initial_hosts:localhost[7800]}'
             port_range='${alfresco.tcp.port_range:3}'
             num_initial_members='2'/>
</config>

The variable substitution is provided natively by JGroups; properties are taken from the JVM's system properties.  To allow JGroups properties to be set in the Alfresco properties files as well as directly on the command line, Alfresco has a bean that pushes the JGroups properties from the Alfresco properties into the JVM system properties:

<configRoot>/alfresco/core-services-context.xml:



    <bean id='jgroupsPropertySetter' class='org.alfresco.config.SystemPropertiesSetterBean' init-method='init'>
        <property name='propertyMap'>
            <map>
                <entry key='jgroups.bind_addr'>
                    <value>${alfresco.jgroups.bind_address}</value>
                </entry>
                <entry key='jgroups.bind_interface'>
                    <value>${alfresco.jgroups.bind_interface}</value>
                </entry>
                <entry key='alfresco.tcp.start_port'>
                    <value>${alfresco.tcp.start_port}</value>
                </entry>
                <entry key='alfresco.tcp.initial_hosts'>
                    <value>${alfresco.tcp.initial_hosts}</value>
                </entry>
                <entry key='alfresco.tcp.port_range'>
                    <value>${alfresco.tcp.port_range}</value>
                </entry>
                <entry key='alfresco.udp.mcast_addr'>
                    <value>${alfresco.udp.mcast_addr}</value>
                </entry>
                <entry key='alfresco.udp.mcast_port'>
                    <value>${alfresco.udp.mcast_port}</value>
                </entry>
            </map>
        </property>
    </bean>

If you change or extend the JGroups configuration and prefer NOT to set properties through system properties (-D options), override this bean or add a new bean in your extension location.  Values set directly on the JVM using -D options always take precedence.
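
The flow described above can be sketched in a few lines of Java.  This is a hypothetical illustration, not the source of org.alfresco.config.SystemPropertiesSetterBean or of JGroups' own substitution code: the class name JGroupsPropertySketch and both method names are invented here.  It assumes that a value already present in the target properties (for example, one supplied with -D) is left untouched, and that a placeholder has the form ${name:default}.

```java
import java.util.Map;
import java.util.Properties;

/**
 * Illustrative sketch only (hypothetical names); this is NOT Alfresco's
 * SystemPropertiesSetterBean or the JGroups substitution code.
 */
final class JGroupsPropertySketch {

    /**
     * Copies each configured value into the target properties unless a value
     * is already present there, so an existing (-D style) value wins.
     */
    static void pushUnlessPresent(Map<String, String> configured, Properties target) {
        for (Map.Entry<String, String> entry : configured.entrySet()) {
            String value = entry.getValue();
            if (value == null || value.isEmpty()) {
                continue; // nothing configured for this key
            }
            if (target.getProperty(entry.getKey()) == null) {
                target.setProperty(entry.getKey(), value);
            }
        }
    }

    /**
     * Resolves a single "${name:default}" placeholder against the given
     * properties. Text that is not a placeholder is returned unchanged.
     */
    static String resolve(String raw, Properties props) {
        if (!raw.startsWith("${") || !raw.endsWith("}")) {
            return raw;
        }
        String body = raw.substring(2, raw.length() - 1);
        int colon = body.indexOf(':');
        String name = colon < 0 ? body : body.substring(0, colon);
        String fallback = colon < 0 ? null : body.substring(colon + 1);
        String value = props.getProperty(name);
        return value != null ? value : fallback;
    }
}
```

In a real JVM the target would be System.getProperties(); the sketch takes a Properties argument so that it can be tried without changing global state.  Resolving ${alfresco.tcp.port_range:3} against an empty Properties object returns the fallback after the colon.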


JGroups Properties


  • General
    • alfresco.jgroups.bind_address (default: ): The address to bind local sockets to
    • alfresco.jgroups.bind_interface (default: ): The interface to bind local sockets to
  • UDP Stack
    • alfresco.udp.mcast_addr (default: 230.0.0.1): The multicast address to broadcast on
    • alfresco.udp.mcast_port (default: 4446): The port to use for UDP multicast broadcasts
    • alfresco.udp.ip_ttl (default: 2): The multicast 'time to live' to control the packet scope
  • TCP Stack
    • alfresco.tcp.start_port
      • The port that the server will start listening on
      • default: 7800
      • e.g.: 7800
    • alfresco.tcp.initial_hosts
      • A list of hosts and start ports that must be pinged.  This can list all potential members of the cluster, including the current server and servers that might not be available.  The port listed in square brackets is the port to start pinging.
      • default: localhost[7800]
      • e.g.: HOST_A[7800],HOST_B[7800]
    • alfresco.tcp.port_range
      • The number of increments to make to each host's port number during scanning.  Each host has a port number in square brackets e.g. 7800.  If the host does not respond to the ping message, the port number will be increased and another attempt made.
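
To make the interaction between alfresco.tcp.initial_hosts and alfresco.tcp.port_range concrete, the following Java sketch expands a host list into the ordered host:port candidates.  It is a hypothetical illustration, not the JGroups TCPPING implementation: the class InitialHostsSketch is invented here, and it reads the description above literally (the start port, then port_range further increments per host).  The exact number of ports JGroups probes may differ by version, and the real scan stops increasing the port once a host responds.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only (hypothetical names); not JGroups' TCPPING code. */
final class InitialHostsSketch {

    /**
     * Expands a list such as "HOST_A[7800],HOST_B[7800]" into host:port
     * candidates: the start port first, then portRange increments per host.
     */
    static List<String> candidates(String initialHosts, int portRange) {
        List<String> result = new ArrayList<>();
        for (String entry : initialHosts.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int open = trimmed.indexOf('[');
            int close = trimmed.indexOf(']');
            if (open < 1 || close < open) {
                throw new IllegalArgumentException("Expected HOST[PORT] but got: " + trimmed);
            }
            String host = trimmed.substring(0, open);
            int startPort = Integer.parseInt(trimmed.substring(open + 1, close));
            for (int i = 0; i <= portRange; i++) {
                result.add(host + ":" + (startPort + i));
            }
        }
        return result;
    }
}
```

Under these assumptions, HOST_A[7800],HOST_B[7800] with a port range of 1 yields HOST_A:7800, HOST_A:7801, HOST_B:7800 and HOST_B:7801, in that order.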

Clustering the Repository


What Has Changed?


  • JGroups is now used to send the initial broadcast messages announcing a server's availability.
  • After initial setup, it is possible to have a server enter a cluster by setting a single property: alfresco.cluster.name.

Steps to Initiate Clustering


  1. Activate the ehcache-custom.xml.sample.cluster
    • This has been modified and you should take the latest version that ships with V3.1.
  2. Set the required properties using overrides (e.g. custom-repository.properties) or system properties (Java -D options)
    • alfresco.cluster.name: The name of the cluster.  This is new and critical.  Without it, neither JGroups nor index tracking will be enabled.
    • Any JGroups properties, especially if the TCP stack is required
    • index.recovery.mode: Set this to AUTO to ensure indexes are refreshed properly on startup
    • index.tracking.cronExpression: This does not need to be set and is 0/5 * * * * ? by default.  The index tracking code will not activate unless the cluster name has been set!
  3. Configure the content stores
    • This has not changed from previous versions

If You Do Not Want JGroups


It is still possible to use the EHCache multicast discovery.  Replace the cacheManagerPeerProviderFactory in your EHCache custom config file as follows:

<extensionRoot>/alfresco/extension/ehcache-custom.xml:



    <cacheManagerPeerProviderFactory
            class='net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory'
            properties='peerDiscovery=automatic,
                        multicastGroupAddress=230.0.0.1,
                        multicastGroupPort=4446'/>

You will still need to set the alfresco.cluster.name property in order to activate index tracking.


Logging


  • org.alfresco.enterprise.repo.cache.jgroups
    • INFO: Watch entry and exit of cluster members
    • DEBUG: Verbose output on heartbeat messages sent and received by machines in the cluster as well as the above

Is it Working?


The cluster checks have not changed; see Testing the Cluster.




Cluster node discovery when using a TCP stack


When IP multicasting is not enabled, or cannot be used for other reasons, you must use the TCP stack config file.
This stack config defaults to using TCPPING, which lists nodes statically in the config through the alfresco.tcp.initial_hosts property.

However, this is cumbersome when you want to add new nodes, because of the configuration changes, node restarts, and so on that it involves.



JGroups provides alternatives for node discovery when IP multicasting cannot be used, such as:


  • TCPGOSSIP: an external lookup service.  Its disadvantage is that it is an external process to maintain, which might not be redundant.  (Available in Alfresco Enterprise 3.1+, as it ships JGroups 2.6.2)
  • FILE_PING: easy to set up; nodes automatically add and remove their information in a shared directory.  It is probably a bit slower than network messaging, but node discovery usually does not happen very often.  (Available in Alfresco Enterprise 3.1.2+, as it ships JGroups 2.8.0-b2)
  • JDBC_PING: uses a database for adding and removing node information.  You can specify either DB connection properties or a JNDI datasource.  (Available in Alfresco Enterprise 3.4.1+, as it ships JGroups 2.11.1.Final)
  • S3_PING: Amazon EC2 specific; places node information in an S3 bucket.  (Available in Alfresco Enterprise 3.4.1+, as it ships JGroups 2.11.1.Final)

When using the FILE_PING mechanism, each node adds its information to the shared directory on startup when it joins the cluster, and removes it when it leaves.
New node membership is still printed in the log file as usual, and you can monitor the directory for node membership, per channel.
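
The idea behind shared-directory discovery can be sketched as follows.  This is a hypothetical illustration, not the JGroups FILE_PING implementation: the class SharedDirDiscoverySketch, the ".node" file suffix, and the one-sub-directory-per-cluster layout are assumptions made for the example, and the real protocol's file names and contents differ.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only (hypothetical names and layout); not JGroups' FILE_PING. */
final class SharedDirDiscoverySketch {

    private final Path clusterDir;

    /** Assumes one sub-directory per cluster (channel) under the shared location. */
    SharedDirDiscoverySketch(Path sharedLocation, String clusterName) {
        this.clusterDir = sharedLocation.resolve(clusterName);
    }

    /** On join: record the node's address in a file named after the node. */
    void join(String nodeName, String address) {
        try {
            Files.createDirectories(clusterDir);
            Files.write(clusterDir.resolve(nodeName + ".node"),
                    address.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** On leave: remove the node's file so other nodes stop seeing it. */
    void leave(String nodeName) {
        try {
            Files.deleteIfExists(clusterDir.resolve(nodeName + ".node"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Discovery: read the address of every node currently registered. */
    Set<String> members() {
        Set<String> addresses = new TreeSet<>();
        if (!Files.isDirectory(clusterDir)) {
            return addresses; // no node has joined yet
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(clusterDir, "*.node")) {
            for (Path file : files) {
                addresses.add(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return addresses;
    }
}
```

Because membership is only as current as the files in the directory, every node must see the same directory contents, which is why the location has to be reachable by all nodes.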




Example (pre 3.4.11 / 4.1.1)


To use FILE_PING instead of TCPPING when using the TCP config:


  • On each node, copy the default .../WEB-INF/classes/alfresco/jgroups/alfresco-jgroups-TCP.xml to, for example, .../shared/classes/alfresco/extension/jgroups/custom-alfresco-jgroups-TCP.xml
  • Replace TCPPING with FILE_PING, as in the example below.  The file ping location is shared amongst all nodes, so it must be a directory accessible by all of them: a local directory if all nodes are on the same machine, or (most likely) a shared network directory if nodes are on different hosts:

    ...

    <FILE_PING location='/mnt/jgroups' />
    ...

  • Reference the new config file in alfresco-global.properties, for example:

alfresco.jgroups.defaultProtocol=TCP
alfresco.jgroups.configLocation=classpath:alfresco/extension/jgroups/custom-alfresco-jgroups-${alfresco.jgroups.defaultProtocol}.xml




Example (3.4.11+ / 4.1.1+)


In 3.4.11+ and 4.1.1+, this mode is included as an option out of the box.  You no longer need to copy or modify the JGroups XML config.
You can simply configure it in alfresco-global.properties, for example:



alfresco.jgroups.defaultProtocol=TCP-FPING
alfresco.fping.shared.dir=${dir.root}/jgroups




