Lucene-based search is slow as well?

Question asked by panokhin on Jun 8, 2006
Latest reply on Jul 24, 2006 by andy
Based on the advice I've received, I've redone the search performance testing using Alfresco's native Lucene-based API.

The model is like this: Site->Buyer->Buyer->Notice->LangProperty (please see for the complete model definition)
There are about 5600 LangProperties, 5600 Notices and 800 Buyers.
The computer has two AMD Opteron 252 processors and 8 GB of memory, and connects to a remote Oracle instance over the local network.
I'm using Alfresco SDK version 1.3 with minor modifications for Oracle and our custom model.
I ran this query, while logged in as "admin":
"PATH:\"//dg:notices/dg:langProperties\" AND @dg\\:propText:\"loan\""
and the results (around 300 objects) came back in 60 to 70 seconds on different runs.
Similar queries on our own DB schema using Oracle Intermedia with the same hardware setup mostly return in under 8 seconds.

Here are statistics for different queries:

"PATH:\"//dg:notices/dg:langProperties\" AND @dg\\:propText:\"loan\""
Duration: 63463 ms
Size: 331

"PATH:\"//dg:notices/dg:langProperties\" AND @dg\\:propText:\"solicitation\""
Duration: 1013 ms
Size: 4

"PATH:\"//dg:notices/dg:langProperties\" AND @dg\\:propText:\"water\""
Duration: 39908 ms
Size: 218
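
For anyone who wants to reproduce figures like these, here is a minimal sketch of a timing harness. It is not the code used for the numbers above. `SearchTimer`, `Measurement` and the `Supplier` stand-in are hypothetical names, and the actual Alfresco search call would have to be plugged in where the dummy list is returned.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper: times one search call and records the hit count. */
final class SearchTimer {

    /** One measurement: elapsed wall-clock milliseconds and number of results. */
    static final class Measurement {
        final long durationMs;
        final int size;

        Measurement(long durationMs, int size) {
            this.durationMs = durationMs;
            this.size = size;
        }
    }

    /** Runs the supplied search once and measures how long it takes. */
    static Measurement time(Supplier<List<?>> search) {
        long start = System.nanoTime();
        List<?> results = search.get();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        return new Measurement(elapsedMs, results.size());
    }

    public static void main(String[] args) {
        // Assumption: this dummy list stands in for the real search call.
        Measurement m = time(() -> Arrays.asList("a", "b", "c"));
        System.out.println("Duration: " + m.durationMs);
        System.out.println("Size: " + m.size);
    }
}
```

Whether the timed section covers only the query or also the retrieval of each result changes what the duration means, so it is worth stating which one was measured.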

It looks to me as though query time grows in direct proportion to the number of objects in the result.
I would also say that 200-300 objects in a result is typical for our system, and a one-minute response time is unacceptable for an interactive service.
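
To make the proportionality concrete, here is a small sketch that divides each measured duration by its result count. It assumes the durations are in milliseconds, which matches the 60 to 70 seconds reported for the "loan" query. `PerResultCost` is a hypothetical name.

```java
/** Hypothetical helper: cost per returned object for the three queries above. */
final class PerResultCost {

    /** Milliseconds spent per returned object. */
    static double msPerResult(long durationMs, int size) {
        return (double) durationMs / size;
    }

    public static void main(String[] args) {
        String[] terms = {"loan", "solicitation", "water"};
        long[] durationsMs = {63463, 1013, 39908}; // assumed to be milliseconds
        int[] sizes = {331, 4, 218};

        for (int i = 0; i < terms.length; i++) {
            System.out.printf("%-12s %.1f ms per result%n",
                    terms[i], msPerResult(durationsMs[i], sizes[i]));
        }
    }
}
```

All three queries come out at roughly 180 to 250 ms per returned object, which is what a cost dominated by per-result work would look like.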

As far as I understand, 7 to 8 seconds is about the longest most people will wait in front of a monitor.
Based on the above statistics, search times under 8 seconds are achievable only for results of about 50 objects or fewer. And I'd guess that the average search result in a big system would be bigger than that.
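
The estimate can be sanity-checked by fitting a straight line through the three measurements and solving for the result count that fits into 8 seconds. This is only a rough sketch: three data points are very few, the durations are assumed to be milliseconds, and `SearchBudget` and its methods are hypothetical names.

```java
/** Hypothetical helper: linear fit of duration against result count. */
final class SearchBudget {

    /** Least-squares line t = intercept + slope * n; returns {intercept, slope}. */
    static double[] fitLine(double[] n, double[] t) {
        double meanN = 0, meanT = 0;
        for (int i = 0; i < n.length; i++) {
            meanN += n[i];
            meanT += t[i];
        }
        meanN /= n.length;
        meanT /= t.length;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < n.length; i++) {
            sxy += (n[i] - meanN) * (t[i] - meanT);
            sxx += (n[i] - meanN) * (n[i] - meanN);
        }
        double slope = sxy / sxx;
        return new double[] {meanT - slope * meanN, slope};
    }

    /** Largest result count whose predicted duration stays within the budget. */
    static double maxResultsWithin(double budgetMs, double[] line) {
        return (budgetMs - line[0]) / line[1];
    }

    public static void main(String[] args) {
        double[] sizes = {331, 4, 218};
        double[] durationsMs = {63463, 1013, 39908}; // assumed to be milliseconds

        double[] line = fitLine(sizes, durationsMs);
        System.out.printf("slope: %.1f ms per result%n", line[1]);
        System.out.printf("results within 8 s: %.0f%n", maxResultsWithin(8000, line));
    }
}
```

With these three points the fit gives a slope of roughly 190 ms per result and a budget in the low 40s, which is in the same ballpark as the ~50 objects estimated above.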

So the question is: do you think there is any way to cut down these search times? And if so, how?