
WCMQS Part I

Posted by fmalagrino Employee Apr 28, 2017

What is WCMQS?

Alfresco Web Quick Start is a set of website design templates and sample architecture, built on the Alfresco Share content management and collaboration framework.

With Quick Start, developers can rapidly build customized and dynamic web applications with powerful content management features for business users, without having to start from scratch.

Using standard development tools developers can quickly deploy the comprehensive content management capabilities of Alfresco to build new and innovative web applications. Developed using the Spring framework with Alfresco Surf, the Web Quick Start allows developers to easily extend Alfresco to add new features to support the demands of the business.

Why is it good to know WCMQS?

  • WCMQS is a powerful component for creating a website
  • You can customize your website as you wish
  • You can create web scripts in JavaScript or Java
  • You can use JavaScript frameworks such as AngularJS or ExpressJS, libraries such as jQuery, and responsive frameworks such as Bootstrap or Foundation

 

How do you install WCMQS?

 

There are two ways to install WCMQS in your Alfresco application:

1) If you use the installer, tick the Web Quick Start checkbox.

2) If you are installing Alfresco manually using the WAR files, you only need to add the WCMQS WAR to your Alfresco installation.

 

To check that WCMQS has been installed correctly on your Alfresco installation, you can create a collaboration site:

 

 

 

After you have created your collaboration site, it is time to add the Web Quick Start dashlet.

 

The screen captures below explain how to add a dashlet in Alfresco.

 

 

Choose "Add Dashlet"

 

 

Drag the "Web Quick Start" into one of the columns

 

The added dashlet will grant you access to two prototype websites: 

  • Finance (single language)
  • Government (multilanguage)

 

In my example, I select Finance.

 

After you select one of the two prototypes, an example website is created inside the document library.

 

 

Configuring WCMQS:

 

After the installation it is time to configure WCMQS.

 

By default, WCMQS creates two folders: one for editorial content and one for live content:

 

The Editorial and Live folders must have two different hosts configured as properties. By default, Editorial uses the host "localhost", while Live is configured to "127.0.0.1". Both are configured on port 8080 with the context name "wcmqs".

 

 

 

 

 

If you have a different port or want a different name, click Editorial or Live and modify the properties.

 

In my example, the port is 8888 (because my Alfresco runs on port 8888) and my website will be called test.

 

 

The same configuration applies to Live (apart from the host name):

 

 

Once you have decided what the site is called, go to tomcat/webapps/, copy the wcmqs.war and rename the copy to the name of your site (site.war, in my case test.war). Alfresco will create your new site. Restart Tomcat, and now it is time to test your site by going to "host":"port"/"context" of the site; in my case that is localhost:8888/test/.

 

 

If you have done everything correctly, you should see the following site:

 

The aim of this blog is to introduce you to Enterprise Integration Patterns and to show you how to create an application to integrate Alfresco with an external application…in this case we will be sending documents on request from Alfresco to Box based on CMIS queries. We will store both the content and the metadata in Box.

 

1.    Enterprise Integration Patterns

EIP (Enterprise Integration Patterns) defines a language consisting of 65 integration patterns (http://www.enterpriseintegrationpatterns.com/patterns/messaging/toc.html) to establish a technology-independent vocabulary and a visual notation to design and document integration solutions.

Why EIP? Today's applications rarely live in isolation. Architecting integration solutions is a complex task.

The lack of a common vocabulary and body of knowledge for asynchronous messaging architectures makes it difficult to avoid common pitfalls.

For example the following diagram shows how content from one application is routed and transformed to be delivered to another application. Each step can be further detailed with specific annotations.

 

 

  • Channel Patterns describe how messages are transported across a Message Channel. These patterns are implemented by most commercial and open source messaging systems.
  • Message Construction Patterns describe the intent, form and content of the messages that travel across the messaging system.
  • Routing Patterns discuss how messages are routed from a sender to the correct receiver. Message routing patterns consume a message from one channel and republish it, usually without modification, to another channel based on a set of conditions.
  • Transformation Patterns change the content of a message, for example to accommodate different data formats used by the sending and the receiving system. Data may have to be added, taken away or existing data may have to be rearranged.
  • Endpoint Patterns describe how messaging system clients produce or consume messages.
  • System Management Patterns describe the tools to keep a complex message-based system running, including dealing with error conditions, performance bottlenecks and changes in the participating systems.

 

The following example shows how to maintain the overall message flow when processing a message consisting of multiple elements, each of which may require different processing.
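As an illustration (not taken from the original post), this kind of flow can be sketched very compactly with an integration framework such as Apache Camel, which is introduced in the next section. The endpoint names below are hypothetical placeholders:

// Illustrative sketch only: a message containing several order items is split,
// each item is routed to its own processing channel, and the route then continues
// with the original message. A full composed message processor would also
// aggregate the individual replies back into a single message.
from("jms:orders")
    .split(body().tokenize("\n"))    // one sub-message per order item
        .to("jms:orderItems")        // route each element to its own processing
    .end()
    .to("jms:processedOrders");      // continue the overall flow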

               

 

2.    Apache Camel

Apache Camel (http://camel.apache.org/) is an integration framework whose main goal is to make integration easier. It implements many of the EIP patterns and allows you to focus on solving business problems, freeing you from the burden of plumbing.

At a high level, Camel is composed of components, routes and processors, all of which are contained within the CamelContext.

 

The CamelContext provides access to many useful services, the most notable being components, type converters, a registry, endpoints, routes, data formats, and languages.

 

  • Components: A Component is essentially a factory of Endpoint instances. To date, there are over 80 components in the Camel ecosystem that range in function from data transports to DSLs, data formats, and so on, e.g. cmis, http, box, salesforce, ftp, smtp, etc.
  • Endpoints: An Endpoint is the Camel abstraction that models the end of a channel through which a system can send or receive messages. Endpoints are usually created by a Component and are usually referred to in the DSL via their URIs, e.g. cmis://cmisServerUrl[?options].
  • Routes: The steps taken to send a message from one endpoint to another endpoint.
  • Type Converters: Camel provides a built-in type-converter system that automatically converts between well-known types. This system allows Camel components to easily work together without having type mismatches.
  • Data Formats: Allow messages to be marshaled to and from binary or text formats to support a kind of Message Translator, e.g. gzip, json, csv, crypto, etc.
  • Registry: Contains a registry that allows you to look up beans, e.g. a bean that defines the JDBC data source.
  • Languages: To wire processors and endpoints together to form routes, Camel defines a DSL. DSLs include, among others, Java, Groovy, Scala and Spring XML.
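To make these concepts a little more concrete, here is a minimal, self-contained sketch (not part of the original post) showing a CamelContext with one route defined in the Java DSL; the file paths are placeholders:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

// Minimal illustrative example: one CamelContext, one route, two file endpoints.
public class MinimalRouteExample {
    public static void main(String[] args) throws Exception {
        DefaultCamelContext context = new DefaultCamelContext();
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                // Consume files from an input folder, log them and copy them to an output folder
                from("file:/tmp/in?noop=true")
                    .log("Received file: ${file:name}")
                    .to("file:/tmp/out");
            }
        });
        context.start();        // starts the route and its endpoints
        Thread.sleep(10000);    // keep the JVM alive long enough to process some files
        context.stop();
    }
}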

 

3.    Building an Integration Application

 

The aim of the application is to send documents on request from Alfresco to Box. We will store both the content and the metadata in Box.

To build an EIP Application we are going to use:

  • Maven to build the application
  • Spring-boot to run the application
  • Apache Camel to integrate Alfresco and Box

 

The full source code is available on GitHub: https://github.com/miguel-rodriguez/Alfresco-Camel

 

The basic message flow is as follows:

 


 

 

3.1          Maven

Apache Maven (https://maven.apache.org/) is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information.

 

3.1.1    Maven Pom.xml

For our project, the pom.xml brings in the required dependencies such as Camel and ActiveMQ. The pom.xml file looks like this:

 

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

   

    <groupId>support.alfresco</groupId>

    <artifactId>camel</artifactId>

    <name>Spring Boot + Camel</name>

    <version>0.0.1-SNAPSHOT</version>

    <description>Project Example.</description>

 

    <!-- Using Spring-boot 1.4.3 -->

    <parent>

        <groupId>org.springframework.boot</groupId>

        <artifactId>spring-boot-starter-parent</artifactId>

        <version>1.4.3.RELEASE</version>

    </parent>

 

    <!-- Using Camel version 2.18.1 -->

    <properties>

        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>

        <camel-version>2.18.1</camel-version>

        <app.version>1.0-SNAPSHOT</app.version>

    </properties>

 

    <!-- Spring -->

    <dependencies>

        <dependency>

            <groupId>org.springframework.boot</groupId>

            <artifactId>spring-boot-starter-web</artifactId>

        </dependency>

 

        <!-- The Core Camel Java DSL based router -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-core</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- Camel Spring support -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-spring</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- Camel Metrics based monitoring component -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-metrics</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- Camel JMS support -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-jms</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- ActiveMQ component for Camel -->

        <dependency>

            <groupId>org.apache.activemq</groupId>

            <artifactId>activemq-camel</artifactId>

        </dependency>

 

        <!-- Camel CMIS which is based on Apache Chemistry support -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-cmis</artifactId>

            <version>2.14.1</version>

        </dependency>

 

        <!-- Camel Stream (System.in, System.out, System.err) support -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-stream</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- Camel JSON Path Language -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-jsonpath</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

       <!-- Apache HttpComponents HttpClient - MIME coded entities -->

        <dependency>

            <groupId>org.apache.httpcomponents</groupId>

            <artifactId>httpmime</artifactId>

        </dependency>

 

        <!-- Camel HTTP (Apache HttpClient 4.x) support -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-http4</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- Camel SQL support -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-sql</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- Camel Zip file support -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-zipfile</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- Support for PostgreSQL database -->

        <dependency>

            <groupId>org.postgresql</groupId>

            <artifactId>postgresql</artifactId>

            <exclusions>

                <exclusion>

                    <groupId>org.slf4j</groupId>

                    <artifactId>slf4j-simple</artifactId>

                </exclusion>

            </exclusions>

        </dependency>

 

        <!-- Camel Component for Box.com -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-box</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- Camel script support -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-script</artifactId>

            <version>${camel-version}</version>

        </dependency>

 

        <!-- A simple Java toolkit for JSON -->

        <dependency>

            <groupId>com.googlecode.json-simple</groupId>

            <artifactId>json-simple</artifactId>

            <version>1.1.1</version>

            <!--$NO-MVN-MAN-VER$-->

        </dependency>

 

        <!-- XStream is a Data Format which to marshal and unmarshal Java objects to and from XML -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-xstream</artifactId>

            <version>2.9.2</version>

        </dependency>

 

        <!-- Jackson XML is a Data Format to unmarshal an XML payload into Java objects or to marshal Java objects into an XML payload -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-jackson</artifactId>

            <version>2.9.2</version>

        </dependency>

 

        <!-- test -->

        <dependency>

            <groupId>org.apache.camel</groupId>

            <artifactId>camel-test</artifactId>

            <version>${camel-version}</version>

            <scope>test</scope>

        </dependency>

 

        <!-- logging -->

        <dependency>

            <groupId>commons-logging</groupId>

            <artifactId>commons-logging</artifactId>

            <version>1.1.1</version>

        </dependency>

 

        <dependency>

            <groupId>org.apache.logging.log4j</groupId>

            <artifactId>log4j-api</artifactId>

            <scope>test</scope>

        </dependency>

 

        <dependency>

            <groupId>org.apache.logging.log4j</groupId>

            <artifactId>log4j-core</artifactId>

            <scope>test</scope>

        </dependency>

 

        <dependency>

            <groupId>org.apache.logging.log4j</groupId>

            <artifactId>log4j-slf4j-impl</artifactId>

            <scope>test</scope>

        </dependency>

 

        <!--  monitoring -->

        <dependency>

            <groupId>org.springframework.boot</groupId>

            <artifactId>spring-boot-starter-remote-shell</artifactId>

        </dependency>

 

        <dependency>

            <groupId>org.jolokia</groupId>

            <artifactId>jolokia-core</artifactId>

        </dependency>

 

    </dependencies>

    <build>

        <plugins>

            <plugin>

                <groupId>org.springframework.boot</groupId>

                <artifactId>spring-boot-maven-plugin</artifactId>

            </plugin>

        </plugins>

    </build>

</project>

 

 

3.2          Spring Boot

 

Spring Boot (https://projects.spring.io/spring-boot/) makes it easy to create stand-alone, production-grade Spring based Applications that you can "just run". Most Spring Boot applications need very little Spring configuration.

 

Features

  • Create stand-alone Spring applications
  • Embed Tomcat, Jetty or Undertow directly (no need to deploy WAR files)
  • Provide opinionated 'starter' POMs to simplify your Maven configuration
  • Automatically configure Spring whenever possible
  • Provide production-ready features such as metrics, health checks and externalized configuration

 

3.2.1       Spring Boot applicationContext.xml

We use applicationContext.xml to define the Java beans used by our application. Here we define the beans for Box connectivity, database connectivity, ActiveMQ and Camel. For the purposes of this application we only need ActiveMQ and Box connectivity.

 

<?xml version="1.0" encoding="UTF-8"?>

<beans xmlns="http://www.springframework.org/schema/beans" xmlns:jdbc="http://www.springframework.org/schema/jdbc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd

        http://www.springframework.org/schema/jdbc http://www.springframework.org/schema/jdbc/spring-jdbc-3.0.xsd

        http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd">

   

 <!-- Define configuration file application.properties -->

    <bean id="placeholder" class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">

        <property name="locations">

            <list>

                <value>classpath:application.properties</value>

            </list>

        </property>

        <property name="ignoreResourceNotFound" value="false" />

        <property name="searchSystemEnvironment" value="true" />

        <property name="systemPropertiesModeName" value="SYSTEM_PROPERTIES_MODE_OVERRIDE" />

    </bean>

   

    <!--  Bean for Box authentication. Please note you need a Box developer account -->

    <bean id="box" class="org.apache.camel.component.box.BoxComponent">

        <property name="configuration">

            <bean class="org.apache.camel.component.box.BoxConfiguration">

                <property name="userName" value="${box.userName}" />

                <property name="userPassword" value="${box.userPassword}" />

                <property name="clientId" value="${box.clientId}" />

                <property name="clientSecret" value="${box.clientSecret}" />

            </bean>

        </property>

    </bean>

 

    <!-- Define database connectivity -->

    <bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">

        <property name="driverClassName" value="org.postgresql.Driver" />

        <property name="url" value="jdbc:postgresql://localhost:5432/alfresco" />

        <property name="username" value="alfresco" />

        <property name="password" value="admin" />

    </bean>

   

    <!-- Configure the Camel SQL component to use the JDBC data source -->

    <bean id="sql" class="org.apache.camel.component.sql.SqlComponent">

        <property name="dataSource" ref="dataSource" />

    </bean>

   

    <!-- Create a connection to ActiveMQ -->

    <bean id="jmsConnectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">

        <property name="brokerURL" value="tcp://localhost:61616" />

    </bean>

   

    <!-- Create Camel context -->

    <camelContext id="camelContext" xmlns="http://camel.apache.org/schema/spring" autoStartup="true">

        <routeBuilder ref="myRouteBuilder" />

    </camelContext>

   

    <!-- Bean defining Camel routes -->

    <bean id="myRouteBuilder" class="support.alfresco.Route" />

</beans>

 

3.2.2       Application.java

The Application class is used to run our Spring application.

 

package support.alfresco;

import org.springframework.boot.SpringApplication;
import org.springframework.context.annotation.ImportResource;

@ImportResource("applicationContext.xml")
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}

 

3.2.3       Route.java

In the Route.java file we define the Camel routes to send traffic from Alfresco to Box.

The code below shows the routes that execute the CMIS query, download the content and properties, compress them, and upload them to Box.

 

//////////////////////////////////////
// Download Alfresco documents      //
//////////////////////////////////////
from("jms:alfresco.downloadNodes")
    .log("Running query: ${body}")
    .setHeader("CamelCMISRetrieveContent", constant(true))
    .to(alfrescoSender + "&queryMode=true")
    // Class FileContentProcessor is used to store the files in the filesystem together with the metadata
    .process(new FileContentProcessor());

//////////////////////////////////////////////
// Move documents and metadata to Box       //
//////////////////////////////////////////////
from("file:/tmp/downloads?antInclude=*")
    .marshal().zipFile()
    .to("file:/tmp/box");

from("file:/tmp/metadata?antInclude=*")
    .marshal().zipFile()
    .to("file:/tmp/box");

from("file:/tmp/box?noop=false&recursive=true&delete=true")
    .to("box://files/uploadFile?inBody=fileUploadRequest");
 

Let’s break it down…

 

1. We read request messages containing a CMIS query from an ActiveMQ queue

from("jms:alfresco.downloadNodes")

 

For example, a CMIS query to get the nodes in a specific folder looks like…

SELECT * FROM cmis:document WHERE IN_FOLDER ('workspace://SpacesStore/56c5bc2e-ea5c-4f6a-b817-32f35a7bb195') and cmis:objectTypeId='cmis:document'

 

For testing purposes we can fire the message requests directly from the ActiveMQ admin UI (http://127.0.0.1:8161/admin/).
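Alternatively, you can push a test message from code. The snippet below is a hypothetical example (not part of the original application) that uses Camel's ProducerTemplate, assuming camelContext is the running CamelContext defined in applicationContext.xml:

// Hypothetical test snippet: send a CMIS query to the alfresco.downloadNodes queue.
// Assumes the "jms" component is configured as in applicationContext.xml.
ProducerTemplate template = camelContext.createProducerTemplate();

String cmisQuery = "SELECT * FROM cmis:document "
        + "WHERE IN_FOLDER('workspace://SpacesStore/56c5bc2e-ea5c-4f6a-b817-32f35a7bb195') "
        + "AND cmis:objectTypeId='cmis:document'";

template.sendBody("jms:alfresco.downloadNodes", cmisQuery);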

 

 

2. We send the CMIS query to Alfresco defined as “alfrescoSender”

.to(alfrescoSender + "&queryMode=true")

 

3. The Alfresco sender endpoint is defined in application.properties

 

and mapped to the "alfrescoSender" variable in Route.java:

public static String alfrescoSender;

@Value("${alfresco.sender}")

public void setAlfrescoSender(String inSender) {

        alfrescoSender = inSender;

}

   

4. We store the files retrieved by the CMIS query in the filesystem, using the FileContentProcessor class for that job

.process(new FileContentProcessor());
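The FileContentProcessor class itself lives in the GitHub repository linked above; the sketch below is a simplified, purely illustrative stand-in showing the general shape of a Camel Processor that writes the incoming message body to /tmp/downloads (the real class also stores the metadata, and the body returned by the camel-cmis query endpoint may need additional unwrapping):

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

// Simplified illustration only; see the GitHub repository for the actual implementation.
public class FileContentProcessor implements Processor {
    @Override
    public void process(Exchange exchange) throws Exception {
        // Fall back to the message id if no file name header is present
        String fileName = exchange.getIn().getHeader(Exchange.FILE_NAME, String.class);
        if (fileName == null) {
            fileName = exchange.getIn().getMessageId();
        }
        InputStream content = exchange.getIn().getBody(InputStream.class);
        if (content != null) {
            Files.copy(content, Paths.get("/tmp/downloads", fileName),
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }
}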

 

5. Zip the content file and the metadata file 

from("file:/tmp/downloads?antInclude=*")

.marshal().zipFile()

.to("file:/tmp/box");

                                       

from("file:/tmp/metadata?antInclude=*")

.marshal().zipFile()

.to("file:/tmp/box");

 

6. And finally upload the content to Box 

from("file:/tmp/box?noop=false&recursive=true&delete=true")

.to("box://files/uploadFile?inBody=fileUploadRequest");

 

 

 4.    Building and Running the application

To build the application using Maven, we execute the following command:

mvn clean install

 

To run the application execute the following command:

mvn spring-boot:run

5.    Monitoring with Hawtio

Hawtio (http://hawt.io) is a pluggable management console for Java stuff which supports any kind of JVM, any kind of container (Tomcat, Jetty, Karaf, JBoss, Fuse Fabric, etc), and any kind of Java technology and middleware.

Hawtio can help you visualize routes with real-time updates on message metrics.

 

 

You can get statistical data for each individual route.

 

 

I hope this basic introduction to EIP and Apache Camel gives you some idea on how to integrate different applications using the existing end points provided by Apache Camel.

If you need to create a custom properties file and you want to use it in Process Services, you can do so using a Java Delegate or a Spring bean.

 

In my case, I used a Java Delegate.

 

First of all, create your custom properties file and upload it to the following folder:

 

/alfresco/process-services-1.6.0/tomcat/webapps/activiti-app/WEB-INF/classes/

 

Now it is time to create a Java delegate. But first, it is important to understand the command that loads the properties file:

 

It is just one line of code, and it is the following:

InputStream inputStream = this.getClass().getClassLoader().getResourceAsStream("generic.properties");

With this line of code, you tell the Java delegate which properties file to load.

Once it is loaded, you can read the properties stored inside the file and use them in Process Services.

String host = "";

String username = "";

String password = "";

Properties properties = new Properties();

properties.load(inputStream);

host = properties.getProperty("generic.host");

username   = properties.getProperty("generic.username");

password = properties.getProperty("generic.password");

execution.setVariable("host", host);

execution.setVariable("username", username);

execution.setVariable("password", password);

Let's explain the code a bit.

 

First, as a best practice, we create the strings and set their values to empty; otherwise you can get a NullPointerException.

 

After that, we load our custom properties file into the Properties object.

 

Then we set each string to the value of the corresponding property from the properties file.

 

In our example, generic.host will have some value inside the property file.

 

After we have assigned the strings all the values that we need from the properties file, we can set them as process variables:

 

execution.setVariable("host", host);

 

This piece of code means that you have a variable called host inside your process, and you are assigning it the string value of the host entry from the properties file (generic.host).

 

After you finish the code, it is time to export it as a JAR and add the JAR to the following path:

 

/alfresco/process-services-1.6.0/tomcat/webapps/activiti-app/WEB-INF/lib/

 

To apply your new Java delegate, you need to create a service task and enter the name of your class in the Class field. Below are a few screenshots showing how to do this:

 

1. A very simple example workflow

the process

 

 

2. Configure the global variable:

 

3. The variables set in step 2.

 

 

4. Set up the Java delegate (adding the class name to the "class" field):

 

 

I created two forms to test that my Java delegate is loaded correctly.

 

If the host variable is empty, the process should go one way; if it is set correctly, it should go to the form that displays all the information from the properties file.

 

How to configure this:

 

1. Create the condition on one arrow:

 

Condition on arrow

 

2. Enter the condition (host: not empty) that leads to the "success" form:

 

 

3. And the second condition:

 

 

 

If the host is not empty the process should go to the human task with the following form:

 

 

The reference form here is "success", and this is how it looks:

 

 

Otherwise, if there is some problem in the Java delegate, it will go to the other form, called "wrong":

 

 

 

For testing you can use the following snippets:

 

Java code:

package my.Test;

import java.io.InputStream;
import java.util.Properties;

import org.activiti.engine.delegate.DelegateExecution;
import org.activiti.engine.delegate.JavaDelegate;


public class GenericCustomProperties implements JavaDelegate {

    String host = "";
    String username = "";
    String password = "";

    public void execute(DelegateExecution execution) throws Exception {
        InputStream inputStream = this.getClass().getClassLoader()
                .getResourceAsStream("generic.properties");
        Properties properties = new Properties();
        properties.load(inputStream);
        host = properties.getProperty("generic.host");
        username = properties.getProperty("generic.username");
        password = properties.getProperty("generic.password");
        execution.setVariable("host", host);
        execution.setVariable("username", username);
        execution.setVariable("password", password);
    }
}

Please be sure to create a Java project with a package called my.Test containing a class called GenericCustomProperties.java.

 

Now that you have created this Java class, export it as a JAR and upload it to the following path: /alfresco/process-services/tomcat/webapps/activiti-app/WEB-INF/lib/

 

As for the custom properties, you can create a file called generic.properties, open it and write the following snippet:

generic.host=127.0.0.1:8080
generic.username=username
generic.password=password

 

The file needs to be located in the following path:

 

/alfresco/process-services-1.6.0/tomcat/webapps/activiti-app/WEB-INF/classes/

Load balancing a network protocol is quite common nowadays. There are plenty of ways to do it for HTTP, for instance, and generally speaking all "single flow" protocols can be load-balanced quite easily. However, some protocols are not as simple as HTTP and require several connections. This is exactly the case with FTP.

 

Reminder: FTP modes

Let's take a deeper look at the FTP protocol in order to better understand how we can load-balance it. For an FTP client to work properly, two connections must be opened between the client and the server:

  • A control connection
  • A data connection

The control connection is initiated by the FTP client to TCP port 21 on the server. The data connection, on the other hand, can be created in different ways. The first way is through an "active" FTP session. In this mode the client sends a "PORT" command, which randomly opens one of its network ports and instructs the server to connect to it using port 20 as the source port. This mode is usually discouraged, or even prevented by server configuration, for security reasons (the server initiates the data connection to the client). The second FTP mode is the "passive" mode. When using passive mode, the client sends a "PASV" command to the server. In response, the server opens a TCP port and sends the port number and IP address as part of the PASV response so the client knows which socket to use. Modern FTP clients usually try this mode first if the server supports it. There is a third mode, the "extended passive" mode. It is very similar to the "passive" mode, but the client sends an "EPSV" command (instead of "PASV") and the server responds with only the number of the TCP port that has been chosen for the data connection (without sending the IP address).
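As a side note (not from the original post), the difference between the modes is easy to see from the client side. With the Apache Commons Net library, for example, a client explicitly switches to passive mode before opening any data connection; the host and credentials below are placeholders:

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPFile;

// Illustrative example: the control connection goes to port 21, and
// enterLocalPassiveMode() makes the client send PASV so that the data
// connection is opened from the client to the port announced by the server.
public class PassiveFtpExample {
    public static void main(String[] args) throws Exception {
        FTPClient ftp = new FTPClient();
        ftp.connect("ftp.example.com", 21);    // control connection
        ftp.login("user", "password");
        ftp.enterLocalPassiveMode();           // request PASV data connections
        for (FTPFile file : ftp.listFiles()) { // listing uses a data connection
            System.out.println(file.getName());
        }
        ftp.logout();
        ftp.disconnect();
    }
}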

 

Load balancing concepts

So now that we know how FTP works, we also know that load-balancing FTP requires balancing both the control connections and the data connections. The load balancer must also make sure that data connections are sent to the right backend server, the one which replied to the client command.

 

Alfresco configuration

From the ECM side, there is not much to do, but there are some prerequisites:

  • Alfresco nodes must belong to the same (working) cluster
  • Alfresco nodes must be reachable from the load balancer on the FTP ports
  • No FTP-related properties should have been persisted in the database

The Alfresco configuration presented below is valid for both load balancing methods presented later. Technically, not every bit of this Alfresco configuration is required, depending on the method you choose, but applying the configuration as shown will work in both cases.

First of all, you should prefer setting FTP options in the alfresco-global.properties file, as the Alfresco cluster nodes need different settings, which you cannot set using either the admin console or the JMX interface.

If you have already set FTP parameters using JMX (or the admin console), those parameters are persisted in the database and need to be removed from there (using the "revert" action in JMX, for example).

Add the following to your alfresco-global.properties and restart Alfresco:

 

### FTP Server Configuration ###
ftp.enabled=true
ftp.port=2121
ftp.dataPortFrom=20000
ftp.dataPortTo=20009

 

The ftp.dataPortFrom and ftp.dataPortTo properties need to be different on all servers. So if there were two Alfresco nodes, alf1 and alf2, the properties for alf2 could be:

ftp.dataPortFrom=20010
ftp.dataPortTo=20019

 

Load balancing with LVS/Keepalived

 

Keepalived is a Linux-based load-balancing system. It wraps the IPVS (also called LVS) software stack from the Linux-HA project and offers additional features like backend monitoring and VRRP redundancy. The schema below shows how Keepalived proceeds with FTP load balancing. It tracks the control connection on port 21 and dynamically handles the data connections using a Linux kernel module called "ip_vs_ftp", which inspects the control connection in order to know the port that will be used to open the data connection.

 

 

Configuration steps are quite simple.

 

First install the software:

sudo apt-get install keepalived

Then create a configuration file using the sample:

sudo cp /usr/share/doc/keepalived/samples/keepalived.conf.sample /etc/keepalived/keepalived.conf

Edit the newly created file in order to add a new virtual server and the associated backend servers:

 

virtual_server 192.168.0.39 21 {

    delay_loop 6

    lb_algo rr

    lb_kind NAT

    protocol TCP

    real_server 10.1.2.101 2121 {

        weight 1

        TCP_CHECK {

            connect_port 2121

            connect_timeout 3

        }

    }

    real_server 10.1.2.102 2121 {

        weight 1

        TCP_CHECK {

            connect_port 2121

            connect_timeout 3

        }

    }

}

In a production environment you will most certainly want to use an additional VRRP instance to ensure a highly available load balancer. Please refer to the Keepalived documentation in order to set that up or just use the example given in the distribution files.

The example above defines a virtual server that listens on socket 192.168.0.39:21. Connections sent to this socket are redirected to the backend servers using a round-robin algorithm (others are available) after masquerading the source IP address. Additionally, we need to load the FTP helper in order to track FTP data connections:

 

echo 'ip_vs_ftp' >> /etc/modules

It is important to note that this setup leverages the FTP kernel helper, which reads the content of FTP frames. This means that it does not work when FTP is secured using SSL/TLS.

 

Secure FTP load-balancing

 

Before you go any further:

 

This method has a huge advantage: it can handle FTPS (SSL/TLS). However, it also has a big disadvantage: it does not work when the load balancer behaves as a NAT gateway (which is basically what HAProxy does).
This is mainly because, at the moment, Alfresco does not comply with the necessary prerequisites for secure FTP to work.

 

Some FTP clients may work even with this limitation. It may happen to work if the server is using IPv6, or for clients using the "Extended Passive Mode" on IPv4 (which is normally used for IPv6 only). To better understand how, please see FTP client and passive session behind a NAT.

 

This means that what's below will mostly only work with the macOS ftp command line and probably no other FTP client!

Don't spend time on it; use the previous method if you need other FTP clients or if you have no control over which FTP client your users have.

 

Load balancing with HAProxy

 

This method can also be adapted to Keepalived using iptables mangling and "fwmark" (see Keepalived secure FTP), but you should only need it if you are bound to FTPS, as plain FTP is much better handled by the previous method.

HAProxy is a modern and widely used load balancer. It provides similar features to Keepalived and much more. Nevertheless, HAProxy is not able to track data connections as related to the overall FTP session. For this reason we have to trick the FTP protocol in order to provide connection consistency within the session. Basically, we will split the load balancing into several parts:

  • control connection load-balancing
  • data connection load balancing for each backend server

So if we have two backend servers, as shown in the schema below, we will create three load balancing connection pools (let's call them that for now).

 

 

First install the software:

sudo apt-get install haproxy

HAProxy has the notion of "frontends" and "backends". Frontends allow you to define specific sockets (or sets of sockets), each of which can be linked to a different backend. So we can use the configuration below:

frontend alfControlChannel

    bind *:21

    default_backend alfPool

frontend alf1DataChannel

    bind *:20000-20009

    default_backend alf1

frontend alf2DataChannel

    bind *:20010-20019

    default_backend alf2

backend alfPool

    server alf1 10.1.2.101:2121 check port 2121 inter 20s

    server alf2 10.1.2.102:2121 check port 2121 inter 20s

backend alf1

    server alf1 10.1.2.101:2121 check port 2121 inter 20s

backend alf2

    server alf2 10.1.2.102:2121 check port 2121 inter 20s

 

So in this case, the frontend that handles the control connection load balancing (alfControlChannel) alternately sends requests to all the backend servers (alfPool). Each server (alf1 & alf2) will negotiate a data transfer socket on a different frontend (alf1DataChannel & alf2DataChannel). Each of these frontends will only forward data connections to the single corresponding backend (alf1 or alf2), thus making the load balancing sticky. And... job done!

About three years ago, Alfresco created a new branch of our support and services organization, the Premier Services team.  Since then, the Premier Services team has grown into a first class global support organization, handling many of Alfresco's largest and most complex strategic accounts.  Our group consists of some of Alfresco's most seasoned and senior support staff and has a presence in APAC, EMEA and the US serving customers worldwide.  Today we start a new chapter in our journey.

 

One of the benefits of working with large accounts is the breadth and depth of problems we get to help them solve.  Premier Services accounts tend to be those with extensive integrations, demanding uptime, reliability and performance requirements, complex business environments and product extensions.  We are launching our blog to share best practices, interesting insights, code / configuration examples and problem solving tips that arise from our work.  This blog will also serve as a platform for sharing new service offerings and updates to existing offerings, and to give our customers and community members some insights into the direction that our service offerings will take as they evolve.  For our inaugural blog post, we'd like to talk about three recent changes to the Alfresco Premier Services offerings which we think will help reduce confusion about what we deliver and give our customers some extra value.  

 

First, you may have noticed that our web site has been updated as a part of Alfresco's Digital Business Platform launch.  As a part of this update we have started the process of merging two of our premier services offerings.  Previously we offered an On-site Services Engineer (OSE) and a Remote Services Engineer (RSE).  Going forward we are combining these into a single Premier Services Engineer (PSE) offering.  We can still deliver it on-site or remote, and the pricing has not changed.  Where there were differences in the services, we've taken the more generous option and made it the default.  For example, OSE and RSE service used to come with a different number of Alfresco University Passports.  Going forward all PSE customers will get five passports included with their service.  Future passports issued to Premier Services accounts will also include a certification voucher.

 

The other changes in our service are additions intended to help with a pair of common customer requests.  It is common for customers to start to staff up at the beginning of an Alfresco project, and we are often asked to help evaluate potential hires.  In support of this request we have created a set of hiring profiles that identify the core skills that we find will enable a new hire to come up to speed quickly on the Alfresco platform.  These hiring profiles and assessments are available now to all Premier Services accounts.  A second major change is the addition of Alfresco Developer Support to Premier Services accounts on the Alfresco Digital Business Platform.  What this means is that if you are a Premier Services customer and are using Alfresco Content Services 5.2+ AND Alfresco Process Services powered by Activiti 1.6+, you will get Developer Support for two of your support contacts included with your service at no additional cost.  This is a huge addition to the Premier Services portfolio, and we're excited to be able to offer it in conjunction with our peers on the dev support team.

 

Stay tuned for more from the Premier Services team, our next blog posts will take an in-depth look at some challenging technical issues.