Indexing of PDF custom fields?

Question asked by pascalsartoretti on Jul 13, 2006
Latest reply on May 7, 2007 by amarendrakt
I would like to store scanned images in Alfresco, together with some metadata (either recognized by OCR or entered by manual indexing). We usually do this by creating a PDF that contains the metadata in custom fields (e.g. "Customer" => "ACME Inc.").
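
To make "custom fields" concrete: they are extra key/value pairs in the PDF document information dictionary, next to the standard keys such as Title or Producer. The following is only a minimal sketch, assuming an unencrypted, uncompressed dictionary with plain literal strings (no escapes, no nested parentheses). The sample bytes and the `custom_fields` helper are hypothetical and for illustration only; a real indexer would need a proper PDF parser.

```python
import re

# Hypothetical, hand-written fragment of a PDF document information dictionary.
SAMPLE_INFO = b"<< /Title (Scan 0001) /Customer (ACME Inc.) /Producer (Scanner) >>"

# Standard keys of the information dictionary; anything else counts as custom.
STANDARD_KEYS = {
    "Title", "Author", "Subject", "Keywords", "Creator",
    "Producer", "CreationDate", "ModDate", "Trapped",
}

# Matches "/Key (value)" pairs with simple literal strings only
# (assumption: no escape sequences and no nested parentheses).
PAIR = re.compile(rb"/([A-Za-z0-9_]+)\s*\(([^()\\]*)\)")


def custom_fields(info_dict: bytes) -> dict:
    """Return the non-standard key/value pairs found in the dictionary bytes."""
    fields = {}
    for key, value in PAIR.findall(info_dict):
        name = key.decode("ascii")
        if name not in STANDARD_KEYS:
            fields[name] = value.decode("latin-1")
    return fields


print(custom_fields(SAMPLE_INFO))
```

These extracted pairs are what I would like to see end up in the search index along with the document text.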

Problem: Alfresco doesn't seem to index the PDF custom fields. I first suspected Lucene, but it seems (?) that the PDF is not handled by Lucene itself but by a separate converter such as XPDF.

Hence my questions:

1- How could Alfresco index this metadata as well? By using another converter?
2- Does anybody see another workaround?

I know that I could store the data and metadata in two separate files (TIFF + XML), but it would be much better to have them in a single file.