Alfresco under load not able to give new tickets or log off users?

darminm
Active Member

Hello - we have an issue where, once we go above roughly 100 users, we start seeing the errors below, and our application can no longer authenticate against Alfresco for any API calls, new tickets, or logoff events.


2017-01-19 03:49:28,254 ERROR [org.alfresco.util.transaction.TransactionSupportUtil] After completion (committed) TransactionalCache exception
org.alfresco.error.AlfrescoRuntimeException: 00191108829 Failed to transfer updates to shared cache

2017-01-19 04:59:02,914 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] Exception from executeScript - redirecting to status template error: [CONCURRENT_MAP_PUT] Redo threshold[90] exceeded! Last redo cause: REDO_MAP_OVER_CAPACITY, Name: c:cache.ticketsCache
com.hazelcast.core.OperationTimeoutException: [CONCURRENT_MAP_PUT] Redo threshold[90] exceeded! Last redo cause: REDO_MAP_OVER_CAPACITY, Name: c:cache.ticketsCache

4 Replies
janv
Alfresco Employee

Re: Alfresco under load not able to give new tickets or log off users?

Please give more detail about your environment (including the exact build versions of Alfresco, the database, the O/S, etc.). Is this using Alfresco Share with Alfresco Platform?

Is this Community or an Enterprise cluster? See also [ACE-5184] Tomcat 7 classloader serializes authentication ticket retrieval - Alfresco JIRA (to see if there is any correlation).

Thanks,

Jan

darminm
Active Member

Re: Alfresco under load not able to give new tickets or log off users?

Hi Jan,

We use Alfresco Enterprise 5.0.3.1.

It runs on Windows Server 2012 R2.

We have two Alfresco nodes running in the cluster.

We do not use Alfresco Share but a third party user interface.

We have also logged a ticket for this with Alfresco Support, but wanted to see whether anyone in the community has run into this issue as well.

Thanks

janv
Alfresco Employee

Re: Alfresco under load not able to give new tickets or log off users?

Thanks for the details. In addition to your support ticket (and any community feedback), please take a look at ACE-5184 in case there is any correlation.

Regards,

Jan

afaust
Master

Re: Alfresco under load not able to give new tickets or log off users?

Unfortunately, the Hazelcast cache for tickets (like most other caches) has been configured to replicate data synchronously to the other cluster nodes. This can cause problems on one member to propagate to the others. For example, when a cluster node is suffering from excessive GC overhead, the resulting delays can trigger timeouts in the cluster communication and, in the worst case, even cause the cluster to dissolve.

In your case, it would be interesting to get more information out of the Hazelcast layer at the time of these errors, in particular why redos have to be performed. I have previously been able to analyze internal issues with (the older version of) Hazelcast by setting the appropriate logger (com.hazelcast) to DEBUG via the Alfresco Support Tools addon. The resulting log output can then be passed on to Alfresco Support or posted here for additional analysis.
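
If changing the level at runtime via the Support Tools addon is not an option, the same logger can probably also be enabled statically. The fragment below is only a sketch: it assumes the log4j 1.x properties syntax that Alfresco 5.0 uses and an extension file such as custom-log4j.properties on the shared classpath (the file name and location are assumptions, so please check them against your installation). A restart would be needed for it to take effect.

```properties
# Hypothetical file: <tomcat>/shared/classes/alfresco/extension/custom-log4j.properties
# Assumption: Alfresco picks up *-log4j.properties files from the extension directory.

# Raise the Hazelcast logger to DEBUG to capture details about redo operations
log4j.logger.com.hazelcast=debug
```

DEBUG output from Hazelcast can be very verbose under load, so it is best to switch the logger back once the errors have been captured.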