Indexing of PDF files and EMail messages

Question asked by marian on Aug 25, 2007
Latest reply on Feb 4, 2008 by fthamura

I have installed Alfresco Community 2.1 and have been playing with it for
a few days.

Apparently, in my installation, indexing of the bodies of PDF files and of
email messages (exported from MS Outlook as .msg files) does not work.

I can upload a msg file and the author is filled from the sender of the
message and description is filled from the subject of the message.
The message can then be found using searches for words from these
fields, but not from the body.

The mail shows up in the 'nitf' special search, so apparently the
transformation failed. There is no indication of such a failure in the log (on
the console, since I started Alfresco from the batch file).

Is indexing of the mail body possible at all? Do I need to configure
something to make it work? What can I configure to debug the failure?

The same is true for PDF files: I can upload PDFs, and title and author are
prepopulated, but the document is not returned for searches on content. It is
also not returned by any of the nitf, nicm or nint searches.
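
For reference, this is roughly how I understand the three special searches.
A minimal sketch, assuming the markers are stored as plain tokens in the TEXT
field and mean what is listed below (my understanding, not verified against
the 2.1 source); the helper name marker_query is hypothetical and only builds
the query strings, it does not talk to Alfresco:

```python
# Marker tokens and their assumed meanings (assumption, not confirmed
# against the Alfresco 2.1 source).
MARKERS = {
    "nint": "not indexed, no transformation available",
    "nitf": "not indexed, transformation failed",
    "nicm": "not indexed, content missing",
}


def marker_query(marker: str) -> str:
    """Build a Lucene query string for one marker token (hypothetical helper)."""
    if marker not in MARKERS:
        raise ValueError(f"unknown marker: {marker}")
    return f'TEXT:"{marker}"'


if __name__ == "__main__":
    # Print each query next to its assumed meaning.
    for name, meaning in MARKERS.items():
        print(f"{marker_query(name)}  ({meaning})")
```

The .msg file matches the nitf query; the PDF matches none of the three.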

My document is very simple and small, contains only plain text and was
created from MS Word through a Ghostscript-based PDF printer. The
Word document itself is correctly indexed.

Any ideas on how to debug this issue, or pointers to further reading on the
system, are very much appreciated.

Ciao, MM