Out Of Memory during Bulk Upload via CMIS: Memory Leak?

Question asked by ad-int-en on Jun 1, 2011
Latest reply on Jun 20, 2011 by rmacian

We are running Alfresco 3.4.d (64-bit) on two platforms, CentOS 5.6 and Windows Server 2008 R2, and are performing mass-import tests to check the stability of the Alfresco platform. So far the results do not look very promising.
On CentOS we ran into OutOfMemory exceptions either after transferring about 5,000 files via CMIS with 5-10 threads (including setting some aspects and creating folders via CMIS) or after about 10,000 files via FTP with 10 parallel upload threads.

Transferring via FTP looks more promising so far. On CentOS we were able to transfer about 10k files, but then Alfresco's FTP service seemed to hang and stopped responding; only a restart of Alfresco fixed the problem.
On Windows, with the standard configuration of Alfresco 3.4.d Community, we have so far been able to transfer 30k+ files via 10 FTP threads without any issues; the test is still running, though.
A CMIS test on Windows also failed with an OutOfMemory exception, just like on CentOS.

When we created a heap dump, we saw that Lucene was using most of the memory while working on the different indexes. In general that is OK, but the fact that we always run into an OOM no matter how much memory we give the JVM suggests there is a memory leak somewhere in the implementation (especially when using the CMIS interface, for some reason). The heap memory graph always looks much the same: a sawtooth in which the amount of memory freed by the GC gets smaller and smaller, until the GC runs all the time to free just a minimal amount of memory, which finally leads to the OOM death of the JVM.
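To make the pattern described above concrete: a plain sawtooth is normal, but a leak shows up as the post-GC low points creeping upward over time. Below is a minimal sketch of that check, assuming heap usage has been sampled at regular intervals (for example from a monitoring tool or GC log). The function names and the sample numbers are made up for illustration and are not our measured data.

```python
def post_gc_floors(samples):
    """Return the local minima of a heap-usage series,
    i.e. the readings taken right after a collection freed memory."""
    floors = []
    for prev, cur, nxt in zip(samples, samples[1:], samples[2:]):
        if cur < prev and cur <= nxt:
            floors.append(cur)
    return floors


def floor_slope(floors):
    """Least-squares slope of the floors, in heap units per GC cycle.
    A slope that stays clearly positive hints at retained memory."""
    n = len(floors)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(floors) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(floors))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical heap readings in MB (illustrative only).
healthy = [100, 300, 500, 110, 310, 510, 105, 305, 505, 108, 300]
leaking = [100, 300, 500, 200, 400, 500, 300, 450, 500, 400, 480]

if __name__ == "__main__":
    for name, series in (("healthy", healthy), ("leaking", leaking)):
        floors = post_gc_floors(series)
        print(name, floors, floor_slope(floors))
```

In the second series the floor rises with every cycle, which is the shape we keep seeing before the OOM. A rising floor alone does not prove a leak (caches that are still filling look the same at first), so this is only a way to quantify the graph, not a diagnosis.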

At the moment we are running a 260,000-file upload via FTP, which is the biggest batch we have to be able to import into the DMS. If this works via FTP, we have at least found a way to import the data, but in production we have to use the CMIS interface for smaller but steady batches 24/7, and with the current configuration and the possible memory leak this is probably not going to work.

On CentOS we set the maximum number of open files to 32k (using ulimit -n 32000) and tried different memory settings, from the standard configuration with Xmx768m up to tweaked memory and GC settings with up to Xmx5G.
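Since ulimit only affects the shell it is called in and the processes started from that shell, it is worth confirming that the limit really applies to the running Alfresco JVM. A minimal sketch, assuming a Linux kernel that exposes /proc/<pid>/limits; the function names are illustrative and the PID has to be looked up separately:

```python
def max_open_files(limits_text):
    """Return the (soft, hard) 'Max open files' values found in the text
    of /proc/<pid>/limits, or None if the row is missing.
    Values are kept as strings because a limit may read 'unlimited'."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: Max open files <soft> <hard> files
            soft, hard = line.split()[3:5]
            return soft, hard
    return None


def read_limits(pid):
    """Read the open-file limits of a running process by PID (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return max_open_files(fh.read())
```

Calling read_limits with the PID of the Tomcat/Alfresco java process should report 32000 for both values if the setting was picked up.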

Any hints on how to configure Alfresco properly, especially for bulk imports via CMIS, would be highly appreciated. We have 8 GB RAM, 8 CPU cores and a 64-bit system (at the moment running on Windows Server 2008, since it looked more stable in the tests so far).
We followed the hints from the Alfresco presentation "Scale Your Alfresco Solutions" by Alfresco Product Manager Mike Farman, as well as all the other hints we could find on the net, but without success. If there is in fact a memory leak, no JVM setting will help at all.

Thanks in advance for your help.