
Full reindex reports 100% completed, but the index is not really complete.

Question asked by rchamy on Apr 26, 2010
Latest reply on Aug 20, 2010 by sergiofigueras
Good afternoon (sorry for my English).

I'm new to Lucene. I'm working with Alfresco Labs 3.0 (and Lucene 2.1.0). Over the last few months I noticed that the indexes were incomplete: searches for some documents returned no results, even though I am sure the documents were uploaded successfully (via ACP), the database contains their information, and they show up on other nodes (I have 3 nodes). I had to do a full reindex to recover the index and solve this problem. The reindex took about 10 hours for 860.00 transactions. Once the recovery was completed, I used the Luke toolbox to inspect the index and noticed that it is also incomplete (some documents that exist on other nodes are not found by Lucene when I search), even though the index recovery log reports that the recovery was successful and 100% complete.

I ran another full reindex on the same node (deleting the lucene_index folder first). When it completed, I could find the documents that I couldn't see before, but now there are other documents that I can't find, although I could find them after the first full reindex! If I do yet another full reindex, I will see some documents that were missing before, but other documents will be lost, and so on. I'm lost, because I can't find any pattern in why some documents show up and others don't: each rebuild loses documents, but not always the same ones. My documents are in Spanish.
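To make the inconsistency between two rebuilds concrete, the sets of indexed document IDs can be compared directly. The following is only a minimal sketch, assuming the IDs from each rebuild have been exported to a plain text file with one ID per line (for example by copying them out of Luke by hand). The file format and the helper names `load_ids` and `compare_id_sets` are hypothetical and are not part of Alfresco or Lucene.

```python
def compare_id_sets(first_rebuild, second_rebuild):
    """Return (IDs only in the first rebuild, IDs only in the second rebuild)."""
    first, second = set(first_rebuild), set(second_rebuild)
    return sorted(first - second), sorted(second - first)


def load_ids(path):
    """Read one ID per line, skipping blank lines (hypothetical export format)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


if __name__ == "__main__":
    # Made-up IDs for illustration only; real ones would come from load_ids().
    only_first, only_second = compare_id_sets(["a", "b", "c"], ["b", "c", "d"])
    print("only in first rebuild:", only_first)
    print("only in second rebuild:", only_second)
```

If both lists come back non-empty after every rebuild, that would match what I am seeing: each rebuild drops a different subset of documents rather than the same ones.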

Any suggestions would be appreciated. My Lucene configuration is:

index.recovery.mode=AUTO
# index.recovery.mode=FULL
# Parameters added during installation
index.recovery.stopOnError=false
index.recovery.maximumPoolSize=5
index.tracking.cronExpression=0/5 * * * * ?
index.tracking.adm.cronExpression=${index.tracking.cronExpression}
index.tracking.avm.cronExpression=${index.tracking.cronExpression}
index.tracking.maxTxnDurationMinutes=10
index.tracking.reindexLagMs=10000
index.tracking.maxRecordSetSize=1000
index.tracking.maxTransactionsPerLuceneCommit=100
index.tracking.disableInTransactionIndexing=false
lucene.indexer.batchSize=1000
lucene.indexer.mergeFactor=10
lucene.query.maxClauses=10000
lucene.indexer.maxMergeDocs=100000
lucene.indexer.minMergeDocs=1000
#lucene.indexer.maxFieldLength=0 -> We don't want to index the content of the doc, only the metadata sent in an XML file inside the ACP file (the ACP contains a PDF and an XML file).

Many thanks.

Regards.
