Re: Alfresco under load not able to give new tickets or log off users?

darminm · ‎19 Jan 2017

Hello, we have an issue: once we go above roughly 100 users, we start seeing the errors below, and our application can no longer authenticate against Alfresco for any API calls, new tickets or logoff events.

2017-01-19 03:49:28,254 ERROR [org.alfresco.util.transaction.TransactionSupportUtil] After completion (committed) TransactionalCache exception
org.alfresco.error.AlfrescoRuntimeException: 00191108829 Failed to transfer updates to shared cache

2017-01-19 04:59:02,914 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] Exception from executeScript - redirecting to status template error: [CONCURRENT_MAP_PUT] Redo threshold[90] exceeded! Last redo cause: REDO_MAP_OVER_CAPACITY, Name: c:cache.ticketsCache
com.hazelcast.core.OperationTimeoutException: [CONCURRENT_MAP_PUT] Redo threshold[90] exceeded! Last redo cause: REDO_MAP_OVER_CAPACITY, Name: c:cache.ticketsCache

janv · ‎19 Jan 2017

Please give more detail about your environment (including exact build versions of Alfresco, database, O/S etc.). Is this using Alfresco Share with Alfresco Platform?

Is this a Community or an Enterprise cluster? See also [ACE-5184] Tomcat 7 classloader serializes authentication ticket retrieval - Alfresco JIRA (to see if there is any correlation).

Thanks,

Jan

darminm · ‎19 Jan 2017

Hi Jan,

We use Alfresco 5.0.3.1 enterprise version.

It runs on Windows Server 2012 R2.

We have two Alfresco nodes running in the cluster.

We do not use Alfresco Share but a third party user interface.

We have also logged a ticket for this with Alfresco Support, but wanted to see whether anyone in the community has had this issue as well.

Thanks

janv · ‎19 Jan 2017

Thanks for the details. In addition to your support ticket (and any community feedback), please take a look at ACE-5184 in case there is any correlation.

Regards,

Jan

afaust · ‎19 Jan 2017

Unfortunately, the Hazelcast cache for tickets (like most other caches) has been configured to replicate data synchronously to the other cluster nodes. This can cause issues on one node to propagate to other members of the cluster. For example, when a cluster node suffers from excessive GC overhead, the resulting delays can be long enough to trigger communication timeouts and, in the worst case, can even cause the cluster to dissolve.

In your case it would be interesting to get more information out of the Hazelcast layer at the time of these errors, in particular why redos have to be performed. I was previously able to analyze internal issues with (the older version of) Hazelcast by setting the appropriate logger (com.hazelcast) to DEBUG via the Alfresco Support Tools addon. The resulting logs can then be passed on to Alfresco Support or posted here for additional analysis.
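
Once such logs are collected, it may help to summarize which caches and redo causes show up most often before passing them on. Below is a minimal sketch in Python, assuming the log lines follow the format of the Hazelcast error quoted in the first post. The function name `summarize_redo_errors` and the regular expression are illustrative only and are not part of Alfresco or Hazelcast.

```python
import re
from collections import Counter

# Matches messages shaped like the one quoted in this thread, e.g.
# "[CONCURRENT_MAP_PUT] Redo threshold[90] exceeded! Last redo cause:
#  REDO_MAP_OVER_CAPACITY, Name: c:cache.ticketsCache"
# The pattern is an assumption derived from that single example.
REDO_RE = re.compile(
    r"\[(?P<op>[A-Z_]+)\] Redo threshold\[(?P<threshold>\d+)\] exceeded! "
    r"Last redo cause: (?P<cause>[A-Z_]+), Name: (?P<name>\S+)"
)


def summarize_redo_errors(lines):
    """Count (operation, redo cause, cache name) triples across log lines.

    Only the first match per line is counted, so a line that repeats the
    same message (as the webscript error line does) is counted once.
    """
    counts = Counter()
    for line in lines:
        match = REDO_RE.search(line)
        if match:
            key = (match.group("op"), match.group("cause"), match.group("name"))
            counts[key] += 1
    return counts
```

To use it, pass any iterable of log lines, for example an open file handle on your Alfresco log (the path depends on your installation), and inspect `counts.most_common()` to see which cache and redo cause dominate.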
