Custom Metadata Extracter with custom action

Question asked by ingdade on Dec 18, 2013
Hello,
I have to deal with PDF documents of various types.
I configured my test model with two types, each of which has different metadata.

Since my PDFs are text-based, I want to extract custom metadata from them, but the metadata differs for each type.

I was able to write a custom metadata extracter by studying the documentation and the samples by http://forums.alfresco.com/users/jpotts.

The problem is that, as in all of the samples, I override the standard PDF extractor, so I'm not able to choose which extractor to use.

I noticed that in Alfresco each extractor is associated with a mimetype and is registered in the metadataExtracterRegistry.
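
To make the limitation concrete, here is a minimal sketch of a registry that is keyed only by mimetype. It is a simplified model and not Alfresco's actual MetadataExtracterRegistry: the names SketchExtracter and MimetypeRegistrySketch are hypothetical, and the "last registration wins" rule is an assumption made for illustration. The point is only that a lookup by mimetype alone cannot tell apart two document types that share the PDF mimetype.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an extracter; NOT Alfresco's MetadataExtracter interface.
interface SketchExtracter {
    String name();
}

// Simplified registry (assumption): one extracter per mimetype, later registration wins.
class MimetypeRegistrySketch {
    private final Map<String, SketchExtracter> byMimetype = new HashMap<>();

    void register(String mimetype, SketchExtracter extracter) {
        byMimetype.put(mimetype, extracter);
    }

    // The lookup key is the mimetype only; the document's content type plays no role.
    SketchExtracter getExtracter(String mimetype) {
        return byMimetype.get(mimetype);
    }

    public static void main(String[] args) {
        MimetypeRegistrySketch registry = new MimetypeRegistrySketch();
        registry.register("application/pdf", () -> "standardPdf");
        registry.register("application/pdf", () -> "enhancedPdf");
        // Whatever the document type is, the same extracter comes back for every PDF.
        System.out.println(registry.getExtracter("application/pdf").name());
    }
}
```

In this model the second registration simply replaces the first, which mirrors the situation described above where the custom extracter takes over from the standard one.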

Is it possible to avoid the auto-detected extracter and specify which extractor to use?

I need more than one extractor for the PDF mimetype.
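
One way to picture this requirement is a lookup keyed by the document's content type rather than by its mimetype. The sketch below is plain Java with hypothetical names (TypeBasedSelectionSketch, my:invoice, my:contract, and the metadata keys); it does not use Alfresco's API and only illustrates the selection idea, under the assumption that the type of the node is known before extraction runs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical selector: picks an extracter by content type name instead of mimetype.
class TypeBasedSelectionSketch {
    // Each "extracter" here is just a function from document text to raw metadata.
    private final Map<String, Function<String, Map<String, String>>> extractersByType = new HashMap<>();

    void register(String typeName, Function<String, Map<String, String>> extracter) {
        extractersByType.put(typeName, extracter);
    }

    // Returns an empty map when no extracter is registered for the given type.
    Map<String, String> extract(String typeName, String text) {
        Function<String, Map<String, String>> extracter = extractersByType.get(typeName);
        if (extracter == null) {
            return Collections.emptyMap();
        }
        return extracter.apply(text);
    }

    public static void main(String[] args) {
        TypeBasedSelectionSketch selector = new TypeBasedSelectionSketch();
        // Two extracters for the same mimetype, distinguished only by type name.
        selector.register("my:invoice", text -> Collections.singletonMap("invoiceNumber", text.trim()));
        selector.register("my:contract", text -> Collections.singletonMap("contractParty", text.trim()));
        System.out.println(selector.extract("my:invoice", " INV-001 "));
    }
}
```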

I thought I could write a custom action that invokes a particular extractor, so I tried, without success, to write a custom action by duplicating org.alfresco.repo.action.executer.ContentMetadataExtracter:
https://svn.alfresco.com/repos/alfresco-open-mirror/alfresco/COMMUNITYTAGS/V4.2d/root/projects/repository/source/java/org/alfresco/repo/action/executer/ContentMetadataExtracter.java

I think the extractor is invoked and reads the custom metadata, but it doesn't write the properties in the model, maybe because the mapping is missing.
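
A small sketch of why a missing mapping would produce exactly this symptom, assuming the extracter works as a two-step process: raw key/value pairs are extracted first, and only keys that have a mapping entry are turned into model properties. The class MappingSketch and the names invoiceNumber and my:invoiceNumber are hypothetical; this is not Alfresco's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the raw-metadata-to-property mapping step.
class MappingSketch {
    // Keeps only raw values whose key has an entry in the mapping,
    // renaming each key to its mapped property name.
    static Map<String, String> mapRawToProperties(Map<String, String> raw,
                                                  Map<String, String> mapping) {
        Map<String, String> properties = new HashMap<>();
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            String propertyName = mapping.get(entry.getKey());
            if (propertyName != null) {
                properties.put(propertyName, entry.getValue());
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        Map<String, String> raw = new HashMap<>();
        raw.put("invoiceNumber", "INV-001");

        // With an empty mapping the raw value is read but nothing is written.
        System.out.println(mapRawToProperties(raw, new HashMap<>()).isEmpty());

        Map<String, String> mapping = new HashMap<>();
        mapping.put("invoiceNumber", "my:invoiceNumber");
        System.out.println(mapRawToProperties(raw, mapping).get("my:invoiceNumber"));
    }
}
```

If the real extracter follows a similar pattern, an instance created outside the usual configuration could extract values correctly and still produce no properties.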

I instantiated the extractor with a "new" statement, changing

MetadataExtracter extracter = metadataExtracterRegistry.getExtracter(mimetype);

to

EnhancedPdfExtracter extracter = new EnhancedPdfExtracter();

and then calling extracter.extract(..), even though my custom metadata extracter only has an extractRaw method.

Can anyone help me, or does anyone have an idea for using more than one PDF extracter?