How to increase performance of content creation?

Question asked by rajd on Aug 30, 2009
Hello all,

For some time now we have been using Alfresco 3.2 Community Edition to store PDF files. We have over 2 million PDF files (growing by over 10,000 each day), so fast content insertion is a requirement. On our system (Alfresco in an ESX virtual machine with 9 2.33 GHz Xeon cores and 16 GB of memory), we can do about 1000 inserts per hour, using either the CIFS or the FTP interface. To get that far, we tuned Java (8 GB of memory) and MySQL (4 GB of memory, lots of caching, as little disk I/O as possible). Unfortunately, 1000 per hour is not enough for us, so we're looking for a way to speed this process up (in fact, we want CIFS performance close to "real CIFS speed").
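To put these numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above; the function names are made up for illustration, and the bulk-load estimate assumes the roughly 2 million existing files would all have to go through the same insert path at the same rate.

```python
# Back-of-the-envelope throughput figures, using the numbers from the post.
INSERTS_PER_HOUR = 1000        # measured rate over CIFS or FTP
NEW_FILES_PER_DAY = 10_000     # daily growth
EXISTING_FILES = 2_000_000     # assumption: all of these need loading at the same rate


def seconds_per_insert(rate_per_hour: float) -> float:
    """Average wall-clock time spent on one insert."""
    return 3600 / rate_per_hour


def hours_needed(file_count: int, rate_per_hour: float) -> float:
    """Hours required to insert file_count files at the given rate."""
    return file_count / rate_per_hour


if __name__ == "__main__":
    print(f"Seconds per insert: {seconds_per_insert(INSERTS_PER_HOUR):.1f}")
    print(f"Hours per day for daily growth: "
          f"{hours_needed(NEW_FILES_PER_DAY, INSERTS_PER_HOUR):.1f}")
    print(f"Days to load existing files: "
          f"{hours_needed(EXISTING_FILES, INSERTS_PER_HOUR) / 24:.1f}")
```

In other words, each insert takes about 3.6 seconds on average, the daily growth alone keeps the system busy for about 10 hours, and a full load of 2 million files at this rate would take roughly 83 days.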

From what I've read online, I suspect the indexing process (Lucene?) is the bottleneck. Is there any way to tune that as well?

When inserting files, I notice that none of the 8 CPUs is ever more than 50% utilized (the average is probably around 15% per core). Memory usage never exceeds 60%…

All tips are welcome :-)

Best, RajD