
Metadata Extraction MS Word

Question asked by johnpelquingua on Jan 13, 2014
Latest reply on Jan 16, 2014 by johnpelquingua
Hi All,

How can I customize Alfresco Share to extract the following metadata from an MS Word document?


/**
* Office file format Metadata Extracter.  This extracter uses the POI library to extract
* the following:
* <pre>
*   <b>author:</b>             –      cm:author
*   <b>title:</b>              –      cm:title
*   <b>subject:</b>            –      cm:description
*   <b>createDateTime:</b>     –      cm:created
*   <b>lastSaveDateTime:</b>   –      cm:modified
*   <b>comments:</b>
*   <b>editTime:</b>
*   <b>format:</b>
*   <b>keywords:</b>
*   <b>lastAuthor:</b>
*   <b>lastPrinted:</b>
*   <b>osVersion:</b>
*   <b>thumbnail:</b>
*   <b>pageCount:</b>
*   <b>wordCount:</b>
* </pre>
*/


For example, if I want to extract just the keywords of an MS Word document, what steps should I take?

I have followed the steps in this tutorial (http://wiki.alfresco.com/wiki/Metadata_Extraction), but they haven't gotten me anywhere.

Can you please advise?

Your help is very much appreciated.


Best Regards,
JP
