I am planning to re-index a couple of terabytes of data in Alfresco.
To prepare, I am first re-indexing a smaller data set, benchmarking it, and tuning it as far as I can.
Once I have measured performance on the smaller set, I will extrapolate the re-indexing time for the full data set.
To build the benchmark and get an initial idea of how long the re-index will take, I have been counting the number of Alfresco documents. (Is this the right approach, or should we consider data size or something else?)
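To make the extrapolation concrete, here is a minimal sketch of the arithmetic. It assumes re-indexing time scales roughly linearly with document count, which is exactly the assumption the question above leaves open. The function name and the sample figures are hypothetical, not measurements.

```python
def estimate_reindex_seconds(sample_docs: int, sample_seconds: float, total_docs: int) -> float:
    """Linearly extrapolate a benchmark run to the full repository.

    Assumption: every document costs about the same time to index,
    so the total time grows in proportion to the document count.
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    seconds_per_doc = sample_seconds / sample_docs
    return total_docs * seconds_per_doc


if __name__ == "__main__":
    # Hypothetical numbers: 100,000 documents re-indexed in one hour,
    # extrapolated to a repository of 5,000,000 documents.
    estimate = estimate_reindex_seconds(100_000, 3600.0, 5_000_000)
    print(f"Estimated re-index time: {estimate / 3600:.1f} hours")
```

If the sample set has a different mix of file sizes or types than the full repository, the same calculation could be repeated per unit of data size instead of per document.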
What is the most accurate way to count the documents in Alfresco? The options I can think of are:
1) Count the rows in the alf_node table, filtered to the workspace store.
2) Write Java or JavaScript code to count all documents of type cm:content.
3) Query Solr for the cm:content type.
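For option 3, a rough sketch of what the Solr count could look like. It relies on the standard Solr JSON response, which reports the match count in `response.numFound`, and asks for zero rows so that only the count comes back. The handler URL in the usage comment and the `TYPE:"cm:content"` query string are assumptions about this installation. The default Alfresco Solr setup is also typically secured with SSL client certificates, which this sketch does not handle.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_count_url(base_url: str, query: str = 'TYPE:"cm:content"') -> str:
    """Build a Solr query URL that returns only the hit count (rows=0)."""
    params = urlencode({"q": query, "rows": 0, "wt": "json"})
    return f"{base_url}?{params}"


def parse_num_found(body: str) -> int:
    """Extract the total match count from a standard Solr JSON response."""
    return int(json.loads(body)["response"]["numFound"])


def count_documents(base_url: str, query: str = 'TYPE:"cm:content"') -> int:
    """Run the count query against Solr and return the number of matches."""
    with urlopen(build_count_url(base_url, query)) as resp:
        return parse_num_found(resp.read().decode("utf-8"))


# Usage (the URL is an assumed example; adjust host, port, and handler):
# print(count_documents("http://localhost:8080/solr4/alfresco/afts"))
```

The count from Solr only reflects what is currently indexed, so it may differ from the database count if the index is stale or incomplete.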
We are using Alfresco 5.2.2 Enterprise with Solr 4.