
Alfresco + XML + Lucene

Question asked by egabbud on Aug 10, 2006
Latest reply on Sep 17, 2007 by andy
Hello!

Alfresco is a great product, but it certainly loses a lot of its advantages when it indexes XML as plain text only.

We need to be able to store XML documents and search for all documents in which the search terms appear inside a given XML node. We don't need full XPath support, just the parent node. For example, a search for toto@footer, where footer is an XML node of our document.

The complete XML structure is not known in advance and will certainly evolve with time.

We think this could easily be done at indexing time by dynamically adding a Lucene field for each XML tag encountered, like Alfresco does for metadata indexing.
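
To make the idea more concrete, here is a minimal sketch in plain Java, using only the JDK and no Alfresco or Lucene classes. The names `XmlFieldExtractor` and `extractFields` are made up for illustration. The sketch walks the XML and collects, for each tag name, the text found directly inside elements with that name; the assumption is that each map entry would then be added to the Lucene document as one field at indexing time.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not an Alfresco class.
public class XmlFieldExtractor {

    /**
     * Returns one entry per tag name. The value is the text found directly
     * inside elements with that name (parent node only, no full XPath).
     */
    public static Map<String, String> extractFields(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Map<String, StringBuilder> buffers = new LinkedHashMap<>();
        collect(doc.getDocumentElement(), buffers);

        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> entry : buffers.entrySet()) {
            fields.put(entry.getKey(), entry.getValue().toString().trim());
        }
        return fields;
    }

    // Recursively appends the direct text of each element to the buffer for its tag name.
    private static void collect(Element element, Map<String, StringBuilder> buffers) {
        StringBuilder buffer =
                buffers.computeIfAbsent(element.getTagName(), name -> new StringBuilder());
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                String text = child.getNodeValue().trim();
                if (!text.isEmpty()) {
                    buffer.append(text).append(' ');
                }
            } else if (type == Node.ELEMENT_NODE) {
                collect((Element) child, buffers);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><body>hello world</body><footer>toto</footer></doc>";
        // Each printed pair stands for one field that would be added to the index.
        extractFields(xml).forEach((tag, text) -> System.out.println(tag + " = " + text));
    }
}
```

Since the tag names are discovered while parsing, nothing about the XML structure has to be known in advance. A search like toto@footer would then presumably become a query for "toto" restricted to the field created for footer.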

The question: what is the cleanest way to implement this? Which Alfresco classes should be overridden? At the moment it is quite difficult to understand how Alfresco interacts with Lucene.

Thanks for any advice!

P.S. The problem is quite similar to http://forums.alfresco.com/viewtopic.php?t=277, but simpler in our case. Is Alfresco going to do something about indexing XML documents?
