Read more about the changes and new features introduced with Solr 6 here.
In this post we will share more information about setting up an Alfresco Enterprise/Solr sharded search index. If you haven't already, see this post for more info on installing Solr 6.
When an index grows too large to be stored on a single search server, it can be distributed across multiple search servers. This is known as sharding. The distributed/sharded index can then be searched using Alfresco/Solr's distributed search capabilities. Alfresco/Solr offers several methods for routing documents and ACLs to shards.
In this post we will focus on the out-of-the-box approach, which is sharding by Node ID. In Alfresco/Solr's configuration this is referred to as DBID sharding, because the DBID field holds the Node ID in the Solr index. With DBID sharding, documents are routed to shards based on a hash of the document's Node ID. Hashing on the Node ID is a simple approach that relies on randomness to distribute documents evenly across the shards.
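To make the routing idea concrete, here is a minimal sketch of hash-based shard selection. It is an illustration only: the function name `shard_for_dbid` is hypothetical, and `cksum` (a CRC) stands in for whatever hash Alfresco/Solr actually applies to the DBID, so the shard numbers it prints will not match a real cluster.

```shell
#!/usr/bin/env bash
# Illustrative only: cksum is a stand-in hash, NOT the hash Alfresco/Solr uses.

# shard_for_dbid DBID SHARD_COUNT
# Prints a shard number in the range 0 .. SHARD_COUNT-1.
shard_for_dbid() {
  local dbid="$1" shards="$2"
  if [ "$shards" -le 0 ]; then
    echo "shard count must be positive" >&2
    return 1
  fi
  local hash
  # cksum prints "<crc> <byte count>"; keep only the CRC.
  hash=$(printf '%s' "$dbid" | cksum | cut -d' ' -f1)
  echo $(( hash % shards ))
}

# Route a few hypothetical Node IDs across 4 shards.
for dbid in 1001 1002 1003 1004; do
  echo "DBID $dbid -> shard $(shard_for_dbid "$dbid" 4)"
done
```

The same Node ID always maps to the same shard, and different IDs spread across the shards about as evenly as the hash allows.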
When using the DBID sharding approach, all ACLs are indexed on each shard. This ensures that the ACL for each node is co-located on the same shard as the node, which is required for proper access control enforcement. The DBID sharding method is ideal for use cases where there are a large number of nodes but a smaller number of ACLs.
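A rough back-of-the-envelope sketch shows why this trade-off favors many nodes and few ACLs. It assumes nodes are spread perfectly evenly and that every ACL is repeated on every shard, as described above; the function name `per_shard_entries` and the sample numbers are hypothetical.

```shell
#!/usr/bin/env bash
# Rough estimate only: assumes an even node distribution across shards.

# per_shard_entries NODE_COUNT ACL_COUNT SHARD_COUNT
# Prints the approximate number of index entries held by one shard:
# its share of the nodes (rounded up) plus a full copy of the ACLs.
per_shard_entries() {
  local nodes="$1" acls="$2" shards="$3"
  echo $(( (nodes + shards - 1) / shards + acls ))
}

# 1,000,000 nodes and 1,000 ACLs over 4 shards.
per_shard_entries 1000000 1000 4   # prints 251000
```

Because the ACL term does not shrink as shards are added, a repository with very many ACLs gains less from this method than one where nodes dominate.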
Follow the steps described below to complete the sharding setup and test.
For each Solr install:
At the same level as the solrhome directory there will be a solr directory.
Change into the solr directory and run the following command:
./bin/solr start
This will start Solr on the default port (8983).
To start Solr on a different port, run:
./bin/solr start -p PORT_NUMBER
Replace PORT_NUMBER with the port you will be starting Solr on.
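When several Solr installs need different ports, it can help to generate the start command and the matching admin URL together. The sketch below only prints strings and does not start anything; the helper names `solr_start_command` and `solr_admin_url`, the host `localhost`, and port 8984 are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Helpers that only build strings; nothing is started here.
DEFAULT_SOLR_PORT=8983

# solr_start_command [PORT]
# Prints the start command, adding -p only for a non-default port.
solr_start_command() {
  local port="${1:-$DEFAULT_SOLR_PORT}"
  if [ "$port" = "$DEFAULT_SOLR_PORT" ]; then
    echo "./bin/solr start"
  else
    echo "./bin/solr start -p $port"
  fi
}

# solr_admin_url HOST [PORT]
# Prints the admin screen URL for that instance.
solr_admin_url() {
  local host="$1" port="${2:-$DEFAULT_SOLR_PORT}"
  echo "http://$host:$port/solr"
}

# Two hypothetical instances on the same host.
for port in 8983 8984; do
  echo "$(solr_start_command "$port")   # admin: $(solr_admin_url localhost "$port")"
done
```

Run each printed command from inside the solr directory of the corresponding install.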
Open a browser and go to the Solr admin screen:
http://hostname:port/solr
You will see a Solr 6 admin screen without any cores created.
This will create Solr cores for each shard and shard replica on the index servers that have been registered. The cluster is now created and will begin tracking the Alfresco repository and indexing documents and ACLs across the sharded index.
Please let us know how you get on: leave a comment or email harry.peek@alfresco.com.