Using AWS OpenSearch with Alfresco Search Enterprise 3.1


angelborroy
Alfresco Employee

WeeMeng Chong, Snr Partner Solution Architect, Partner Core Horizontals, AWS

Angel Borroy, Developer Evangelist, Hyland

 

When Hyland released Alfresco Content Services 7.1 and Alfresco Search Enterprise 3.0, customers gained the speed and scalability of Elasticsearch when searching their Alfresco repository. However, to get the benefit of these two powerful open source technologies, customers need to provide their own Elasticsearch server.

Amazon OpenSearch Service is the AWS-managed service that lets you run OpenSearch clusters without having to worry about managing, monitoring, and maintaining your infrastructure, or having to build in-depth expertise in operating OpenSearch clusters. Amazon OpenSearch can be used as the Elasticsearch server for Alfresco Search Enterprise.

This post describes a simple deployment to show the ease with which Amazon OpenSearch Service can be used with Alfresco Content Services (ACS) and Alfresco Search Enterprise. It can serve as a starting point for production grade deployments that require high availability and durability, but it is not one itself. Please contact your AWS and/or Hyland account team for assistance in designing a production grade deployment.

 

Overview of solution

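Alfresco Content Services is deployed with Docker Compose on an EC2 instance. The AWS services used, detailed in step 2 below, are Amazon EC2, Amazon MQ, Amazon RDS, Amazon EFS, Amazon OpenSearch Service, and AWS KMS, all inside a single VPC.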

Figure 1 - AWS architecture

 

Walkthrough

There are several ways to deploy an ACS system. This post uses the Docker Compose method, and the concepts described are applicable to any of the other deployment methods. CloudFormation is used to deploy most of the AWS services used in this post.

This is a hands-on walkthrough; therefore, this post assumes familiarity with Git, containerization with Docker, ACS administration and general AWS cloud concepts, specifically on CloudFormation, EC2, VPC, KMS, and RDS.

High-level steps:

  1. Create a KMS key for the region you’ll be deploying in.
  2. Deploy AWS services with provided CloudFormation template, updating resources as suitable for your environment.
  3. Install Docker and Docker Compose on the EC2 Alfresco server.
  4. Clone the ACS with Amazon OpenSearch project from GitHub.
  5. Create a database for ACS in Amazon RDS.
  6. Configure ACS for your environment.
  7. Start ACS.
  8. Ingest documents.
  9. Happy searching!

Link to GitHub repo: https://github.com/AlfrescoLabs/alfresco-opensearch-aws

 

Prerequisites

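For this walkthrough, you need the following:

  • An AWS account with permission to create a KMS key and deploy the CloudFormation stack.
  • Quay.io credentials provided by Hyland.
  • An Alfresco license file provided by Hyland.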

 

Detailed steps

1. Create a KMS key

The Amazon OpenSearch domain that uses fine-grained access control requires a KMS key.

Create the KMS key in the region you’ll be deploying in. At minimum, the account used for the CloudFormation deployment must be specified in the key as both an administrator and a user.
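
The key can be created in the KMS console or from the command line. The following is a minimal sketch using the AWS CLI, assuming the CLI is installed and configured with credentials for the account that will deploy the CloudFormation stack. The function name and description text are illustrative and not part of the project. The key is created with the default key policy, so review its administrators and users afterwards.

```shell
# Sketch: create a symmetric KMS key in the given region and print its ARN.
# Assumes a configured AWS CLI; the function name and description are illustrative.
create_opensearch_kms_key() {
  kms_region=$1   # the region you are deploying in, for example eu-west-1
  aws kms create-key \
    --region "$kms_region" \
    --description "Key for the Alfresco OpenSearch domain" \
    --query 'KeyMetadata.Arn' \
    --output text
}

# Usage (not run here): create_opensearch_kms_key eu-west-1
```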

2. Deploy AWS services on CloudFormation

This CloudFormation template deploys the following resources for the AWS managed services used in this ACS deployment. The values listed below for each service are those that are sensible for this post; values that are not listed depend on your environment.

When specifying a name for the CloudFormation stack, be sure to use only lowercase letters, because the stack name is used to form the properties of some of the AWS resources deployed in this stack.
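
A quick check before deploying can catch a stack name with uppercase letters. This is a small sketch, not part of the project: is_lowercase_stack_name is a hypothetical helper that accepts names starting with a lowercase letter and containing only lowercase letters, digits, and hyphens.

```shell
# Returns 0 when the name starts with a lowercase letter and contains only
# lowercase letters, digits, and hyphens; returns 1 otherwise.
is_lowercase_stack_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9-]*$'
}

# Usage: is_lowercase_stack_name "acs-opensearch" && echo "stack name is fine"
```

The stack name acs-opensearch is only an example.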

It takes approximately 18 minutes to deploy the stack.

  • One VPC
    • One public subnet
      • One internet gateway
    • One private subnet
    • Two security groups
      • Security group 1 that has no restriction for resources within the same security group
      • Security group 2 that allows incoming TCP traffic on:
        • Port 8888 from your workstation’s public IP
        • Ports 8080 and 22 from any IPv4 address
  • One EC2 Alfresco server
    • An M6i instance with at least 8 GiB of memory is suggested.
    • Deploy in a public subnet
  • One Amazon MQ broker
    • ActiveMQ engine
    • Simple authentication and authorization
    • Deploy in a private subnet
  • One Postgres server in RDS
    • Password authentication
    • No public access
    • Deploy in a private subnet group
  • One file server in EFS
    • Mount target in public subnet for Alfresco server
  • One Amazon OpenSearch Service domain
    • VPC only access
    • Deploy in a private subnet
    • Enable fine-grained access control
    • Use fine-grained access control for domain access policy
    • Create a master user

After the CloudFormation stack is deployed, switch to the Outputs tab of your stack in the CloudFormation management console. You’ll need the values displayed there for the rest of this post.
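
The same values can also be read from a terminal. The sketch below assumes a configured AWS CLI; stack_output and stack_output_query are hypothetical helper names, and the stack name in the usage line is an example.

```shell
# Build the JMESPath expression that selects one output value by its key.
stack_output_query() {
  printf "Stacks[0].Outputs[?OutputKey=='%s'].OutputValue" "$1"
}

# Print the value of one output of a deployed stack.
# $1 is the stack name, $2 is the output key shown on the Outputs tab.
stack_output() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "$(stack_output_query "$2")" \
    --output text
}

# Usage (not run here): stack_output acs-opensearch DatabaseEndPoint
```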

Figure 2 - CloudFormation Outputs

3. Install Docker and Docker Compose in the EC2 Alfresco server

Docker and Docker Compose are used in this post to simplify the deployment of ACS.

i. Log into the EC2 Alfresco server.


Figure 3 - AWS EC2 console

ii. Install Docker by following this guide and Docker Compose standalone by following this guide.

A convenience script to install both is provided, 1-install-docker.sh. Reboot the server after executing the script. If the server is shut down and then restarted, it is highly likely that the public IP of the server will change.
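
After the reboot, it is worth confirming that both tools are on the PATH before continuing. A minimal sketch follows; missing_commands is a hypothetical helper, not part of the project.

```shell
# Print each listed command that is not found on PATH.
# Returns 1 if at least one command is missing, 0 otherwise.
missing_commands() {
  missing_status=0
  for cmd_name in "$@"; do
    if ! command -v "$cmd_name" >/dev/null 2>&1; then
      printf '%s\n' "$cmd_name"
      missing_status=1
    fi
  done
  return "$missing_status"
}

# Usage: missing_commands docker docker-compose || echo "install the tools listed above"
```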

iii. After the reboot, log into Quay.io with your Hyland provided credentials.

A convenience script to log in is provided, 2-login-quay.sh.
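
For reference, a manual login could look like the sketch below. It is not the content of the provided script. QUAY_USER and QUAY_PASSWORD are hypothetical environment variable names holding your Hyland provided credentials, and the password is passed on standard input so that it does not end up in the shell history.

```shell
# Sketch: log in to Quay.io without putting the password on the command line.
# QUAY_USER and QUAY_PASSWORD are hypothetical variable names.
login_quay() {
  printf '%s' "$QUAY_PASSWORD" |
    docker login quay.io --username "$QUAY_USER" --password-stdin
}

# Usage (not run here): QUAY_USER=... QUAY_PASSWORD=... login_quay
```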

4. Clone the Alfresco AWS OpenSearch Git project

i. Log into the EC2 Alfresco server.

ii. Install a Git client.

In the remote terminal session, use yum to install a Git client.

sudo yum install -y git

iii. Clone the Git repository at https://github.com/AlfrescoLabs/alfresco-opensearch-aws into the EC2 Alfresco server.

In the remote terminal session, use git to clone the project.

cd ~ && git clone https://github.com/AlfrescoLabs/alfresco-opensearch-aws

5. Create a database for ACS in RDS

The requirements for the database are listed in steps 6 to 8 of Alfresco’s PostgreSQL database on Amazon RDS guide. Use a PostgreSQL client you are familiar with to fulfil the requirements.

Information about the database server is obtained through the CloudFormation stack’s Outputs tab (Figure 2). Copy the value listed for DatabaseEndPoint to connect to the database server.

i. Log into the EC2 Alfresco server.

ii. In the created RDS database server, create a database named alfresco, owned by a fully privileged user alfresco with the password alfresco.

A convenience script to create the alfresco database is provided, 3-create-database.sh. Pass in the DatabaseEndPoint string from the CloudFormation stack’s output as a parameter to the script.
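
If you prefer to run the statements yourself, the sketch below prints SQL that matches the description above. It is not the content of the provided script. alfresco_db_sql is a hypothetical helper, and the master user name passed to it is an assumption; use the one configured for your RDS instance. The master user is granted membership of the alfresco role because RDS master users are not superusers and need that membership to create a database owned by another role.

```shell
# Print SQL that creates the alfresco role and a database owned by it.
# $1 is the RDS master user name (an assumption; use the one from your stack).
alfresco_db_sql() {
  cat <<EOF
CREATE ROLE alfresco LOGIN PASSWORD 'alfresco';
GRANT alfresco TO "$1";
CREATE DATABASE alfresco OWNER alfresco;
GRANT ALL PRIVILEGES ON DATABASE alfresco TO alfresco;
EOF
}

# Usage (not run here), with a hypothetical master user "postgres" and
# DATABASE_ENDPOINT set to the DatabaseEndPoint value:
#   alfresco_db_sql postgres | psql -h "$DATABASE_ENDPOINT" -U postgres -d postgres
```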

6. Configure ACS for your environment

Mount the EFS filesystem onto /efs on the Alfresco server. Update the Docker Compose environment file with values for your environment. Environment information about the stack created in CloudFormation is available on the CloudFormation stack’s Outputs tab (Figure 2).

i. Log into the EC2 Alfresco server.

ii. Mount the Amazon EFS filesystem.

A convenience script to mount the EFS filesystem is provided, 4-mount-efs.sh. Pass in the FileSystemMount string from the CloudFormation stack’s output as a parameter to the script.
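
A manual mount over NFS could look like the following sketch. It assumes that the FileSystemMount value is the DNS name of the EFS file system and that an NFS client is installed on the server. It is not the content of the provided script. The mount options are the ones AWS documents for mounting EFS without the EFS mount helper.

```shell
# NFS options documented by AWS for mounting EFS without the EFS mount helper.
EFS_NFS_OPTS="nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"

# Mount the file system at /efs.
# $1 is assumed to be the EFS DNS name taken from the FileSystemMount output.
mount_efs() {
  sudo mkdir -p /efs &&
    sudo mount -t nfs4 -o "$EFS_NFS_OPTS" "$1:/" /efs
}

# Usage (not run here): mount_efs "$FILE_SYSTEM_MOUNT"
```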

iii. Update the .env file with values for your environment.

Specify endpoint keys for AWS services.

a. POSTGRES_ENDPOINT = DatabaseEndPoint

b. ACTIVEMQ_SERVER = AmazonMQEndPoint

c. ACTIVEMQ_PORT = 61617

d. ELASTICSEARCH_SERVER_NAME = OpenSearchDomainEndpoint

e. TRUSTSTORE_NAME, TRUSTSTORE_PASS, TRUSTSTORE_TYPE

A truststore for AWS eu-west-1 region has been provided in the example. To use this truststore, remove the placeholder name and use the values provided for the sample truststore.

For other regions, create a truststore that contains the public certificates for the region.

i. Place the truststore in the keystores directory.

ii. Update the truststore values in the .env file.
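
Editing the .env file by hand works fine. For repeatable setups, the keys can also be set from a script. The sketch below assumes the file uses plain KEY=value lines; set_env_value is a hypothetical helper, not part of the project.

```shell
# Set KEY=value in an env file, replacing an existing line for KEY or
# appending a new one. Values must not contain backslashes, because awk
# interprets escape sequences in -v assignments.
set_env_value() {
  env_key=$1
  env_value=$2
  env_path=$3
  env_tmp=$(mktemp) || return 1
  awk -v k="$env_key" -v v="$env_value" '
    BEGIN { FS = "="; found = 0 }
    $1 == k { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$env_path" > "$env_tmp" && mv "$env_tmp" "$env_path"
}

# Usage (not run here): set_env_value ACTIVEMQ_PORT 61617 .env
```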

A convenience script to build the truststore is provided, 5-build-truststore.sh. Pass in the OpenSearchDomainEndpoint string from the CloudFormation stack’s output as a parameter to the script.
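
To illustrate what building a truststore involves, here is a sketch that fetches the certificate presented by the OpenSearch endpoint and imports it into a PKCS12 truststore. It is not the content of the provided script. It assumes openssl and keytool (shipped with a JDK) are available, and it must run from inside the VPC because the domain has VPC only access. The alias, file name, and password are illustrative. Importing only the server certificate means the truststore has to be rebuilt when that certificate is renewed.

```shell
# Sketch: import the certificate presented by the endpoint into a PKCS12 truststore.
# $1 host name of the OpenSearch endpoint (without https://)
# $2 truststore file to create or update, for example keystores/truststore.pkcs12
# $3 truststore password
build_truststore() {
  ts_host=$1
  ts_file=$2
  ts_pass=$3
  ts_cert=$(mktemp) || return 1
  # Fetch the server certificate and convert it to PEM, then import it.
  openssl s_client -connect "$ts_host:443" -servername "$ts_host" </dev/null 2>/dev/null |
    openssl x509 -outform PEM > "$ts_cert" &&
    keytool -importcert -noprompt -alias opensearch \
      -file "$ts_cert" -keystore "$ts_file" \
      -storetype PKCS12 -storepass "$ts_pass"
  ts_status=$?
  rm -f "$ts_cert"
  return "$ts_status"
}

# Usage (not run here): build_truststore "$OPENSEARCH_ENDPOINT" keystores/truststore.pkcs12 "$TRUSTSTORE_PASS"
```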

7. Start ACS

Run the provided Docker Compose file with the recreate option to start Alfresco Content Services.

i. Log into the EC2 Alfresco server.

ii. Ensure that the Alfresco license file provided by Hyland has been placed in the alfresco directory.

iii. Execute Docker Compose from the alfresco directory to run ACS.

docker-compose up --build --force-recreate -d

iv. It takes about 20 minutes for the containers to start up and a further 2 minutes for the Alfresco application to start. Monitor the startup by tailing the Docker logs, and wait for the log message containing “Alfresco Content Services started (Enterprise).”

docker-compose logs -f
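
Instead of watching the logs, a script can poll for the startup message. The sketch below is a generic helper, not part of the project: wait_for_log_line and WAIT_INTERVAL are hypothetical names. It reruns a command that prints the logs until the text appears or the timeout in seconds is reached.

```shell
# Seconds to sleep between checks; override by setting WAIT_INTERVAL beforehand.
WAIT_INTERVAL=${WAIT_INTERVAL:-10}

# wait_for_log_line TEXT TIMEOUT_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until its output contains TEXT (fixed string match).
# Returns 0 when the text is found, 1 when the timeout is reached first.
wait_for_log_line() {
  wait_text=$1
  wait_timeout=$2
  shift 2
  wait_elapsed=0
  while [ "$wait_elapsed" -lt "$wait_timeout" ]; do
    if "$@" 2>/dev/null | grep -Fq -- "$wait_text"; then
      return 0
    fi
    sleep "$WAIT_INTERVAL"
    wait_elapsed=$((wait_elapsed + WAIT_INTERVAL))
  done
  return 1
}

# Usage (not run here), from the alfresco directory, waiting up to 30 minutes:
#   wait_for_log_line "Alfresco Content Services started" 1800 docker-compose logs --no-color
```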

v. Use a web browser to access the Alfresco repository with the Alfresco Digital Workspace application. The URL is on the CloudFormation stack’s Outputs tab as AlfrescoWorkspaceURL.

Figure 4 - Alfresco Digital Workspace browser client

 

Cleaning up

To avoid incurring future charges, delete the CloudFormation stack and schedule the KMS key for deletion.
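
From the command line, the cleanup could be sketched as follows, assuming a configured AWS CLI. cleanup_deployment is a hypothetical helper name. The 7 day pending window is the shortest that KMS allows; choose a longer one if you may still need the key.

```shell
# Sketch: delete the stack, wait for the deletion to finish, then schedule
# the KMS key for deletion.
# $1 is the CloudFormation stack name, $2 is the KMS key ID or ARN.
cleanup_deployment() {
  aws cloudformation delete-stack --stack-name "$1" &&
    aws cloudformation wait stack-delete-complete --stack-name "$1" &&
    aws kms schedule-key-deletion --key-id "$2" --pending-window-in-days 7
}

# Usage (not run here): cleanup_deployment acs-opensearch "$KMS_KEY_ARN"
```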

 

Conclusion

This post walks through the salient points in configuring Alfresco Search Enterprise to use Amazon OpenSearch Service as the provider for Elasticsearch. The architecture of services shown in this post is not meant for production use. Contact your Hyland and/or AWS account representatives for production deployment assistance if needed.

 

Authors' bio

 

Angel Borroy is a Developer Evangelist at Hyland. Over the last 20 years, Angel has participated in different open source communities: producing blog posts, sample projects, and video tutorials, speaking at conferences, and organizing community events.

 

 

 

 

 

WeeMeng Chong is a Senior Partner Solution Architect leading the Business Application segment in Partner Core Horizontals at AWS. Most of his career was in professional services, consulting for several large enterprise software vendors.

About the Author
Angel Borroy is Hyland Developer Evangelist. Over the last 15 years, he has been working as a software architect on Java, BPM, document management and electronic signatures. He has been working with Alfresco during the last years to customize several implementations in large organizations and to provide add-ons to the Community based on Record Management and Electronic Signature. He writes (sometimes) on his personal blog http://angelborroy.wordpress.com. He is (proud) member of the Order of the Bee.