
Indexing Issue | Email (application/vnd.ms-outlook - msg) with Attachment | Uploaded via Explorer

Question asked by milochanzy on Feb 26, 2015
Latest reply on Feb 27, 2015 by milochanzy
Hi,

I'm struggling to get one thing sorted out in vanilla Alfresco 4.2.1.7. I'm uploading an email that has attachments via "explorer". The content and metadata of the email itself (application/vnd.ms-outlook - msg) are being indexed, but the attachment content is not. The attachment is a simple Word 2010 document (but it could be Excel, PDF, or plain text).

I browsed through the implementation and I see that Tika's OutlookExtractor is used to parse the email and its attachments. The URL (http://localhost:8080/alfresco/service/mimetypes?mimetype=application/vnd.ms-outlook#application/vnd.ms-outlook) gives the following:

application/vnd.ms-outlook - msg
Extractors: org.alfresco.repo.content.metadata.MailMetadataExtracter
Transformable To:
application/xhtml+xml = org.alfresco.repo.content.transform.MailContentTransformer
text/html = org.alfresco.repo.content.transform.MailContentTransformer
text/plain = org.alfresco.repo.content.transform.MailContentTransformer
text/xml = org.alfresco.repo.content.transform.MailContentTransformer
Transformable From: Cannot be generated from anything else
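
For anyone who wants to reproduce this check for other mimetypes, here is a minimal sketch that builds the same web script URL and optionally fetches the report. The class name `MimetypeReportUrl`, the helper names, and the base URL are only illustrative assumptions; the `/service/mimetypes?mimetype=` path is the one from the URL above. The mimetype is URL-encoded because values such as application/xhtml+xml contain a `+`, which would otherwise be read as a space in a query string. Depending on the setup, the web script may ask for authentication, which this sketch does not handle.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

class MimetypeReportUrl {

    // Builds the mimetypes web script URL for one mimetype.
    // baseUrl is an assumption, e.g. "http://localhost:8080/alfresco".
    static String build(String baseUrl, String mimetype) {
        try {
            String encoded = URLEncoder.encode(mimetype, "UTF-8");
            return baseUrl + "/service/mimetypes?mimetype=" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a standard JVM.
            throw new IllegalStateException(e);
        }
    }

    // Fetches the report as raw text (no authentication handling).
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        String url = build("http://localhost:8080/alfresco", "application/vnd.ms-outlook");
        System.out.println(url);
        // Pass any argument to actually call the (assumed local) server.
        if (args.length > 0) {
            System.out.println(fetch(url));
        }
    }
}
```

Running it without arguments only prints the URL; with an argument it also requests the report from the local server.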

So ideally the email attachment should get indexed too, right? But it's not :|. How can I fix this? Please let me know if you'd like to see any specific config file I've used. I have not modified anything from the default implementation, however.

I'm using Tika 1.5 and POI 3.10.

Thanks,
Milan.
