
transform pdf, return new version

Question asked by abruzzi on Apr 11, 2014
Latest reply on Apr 14, 2014 by romschn
We have an OCR server (ABBYY Recognition Server) that can do its work over a SOAP connection. I've built a PDF-to-TXT transformation. The script called by the transformation first looks for a text layer; if one exists, it extracts and returns it with pdftotext, skipping the OCR server. If there is no text layer, it sends the PDF through the OCR server over SOAP and gets back text, which is returned to Alfresco for indexing.
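The fallback logic described above can be sketched roughly as follows. This is only an illustration of the decision, not the actual transformation script: `extractText`, `readTextLayer`, and `runOcr` are hypothetical names, with `readTextLayer` standing in for the pdftotext call and `runOcr` for the SOAP request to the OCR server.

```javascript
// Sketch of the "text layer first, OCR as fallback" decision.
// readTextLayer and runOcr are hypothetical callbacks supplied by the caller:
// the first would wrap pdftotext, the second the SOAP call to the OCR server.
function extractText(pdfPath, readTextLayer, runOcr) {
  var embedded = readTextLayer(pdfPath) || "";
  // Whitespace-only output (newlines, form feeds between pages) is treated
  // as "no text layer".
  if (embedded.replace(/\s+/g, "").length > 0) {
    return { source: "text-layer", text: embedded };
  }
  // No usable text layer: fall back to the OCR server.
  return { source: "ocr", text: runOcr(pdfPath) };
}
```

Passing the two steps in as callbacks keeps the decision separate from the pdftotext and SOAP details, so either one can be swapped out or stubbed for testing.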

That all works great. However, the OCR server can also return a PDF with the OCR text embedded as a text layer. What I'm hoping for is a PDF-to-PDF transformation, triggered by a rule or run manually (with a webscript, perhaps?), that sends the PDF out, gets the new PDF back, and then saves it as a new minor version of the original document.

I've done a lot with Alfresco Explorer 3.1.2, but Share is very new to me, and our new setup is going into Share, so I'm not sure where to begin. I can see that the JavaScript API has some versioning capability:

http://docs.alfresco.com/4.2/topic/com.alfresco.enterprise.doc/references/API-JS-Versions.html

But I'm not sure where to start with this.  Any suggestions?
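
For reference, the versioning step on its own might look something like the sketch below, written against the repository JavaScript API's node methods (`hasAspect`, `addAspect`, `checkout`, `checkin`, `cancelCheckout`). It is untested against a real repository and rests on assumptions: `saveAsMinorVersion` is a hypothetical helper name, and `ocrPdf` is assumed to be a node that already holds the PDF returned by the OCR server. How that PDF gets into the repository is not covered here.

```javascript
// Sketch for a repository-side script (for example one run by a rule).
// "ocrPdf" is a hypothetical node that already contains the OCR'd PDF.
function saveAsMinorVersion(original, ocrPdf, comment) {
  // Versioning has to be enabled, otherwise check-in keeps no history.
  if (!original.hasAspect("cm:versionable")) {
    original.addAspect("cm:versionable");
  }
  var workingCopy = original.checkout();
  try {
    // Copy the OCR'd content onto the working copy.
    workingCopy.properties.content.write(ocrPdf.properties.content);
    // Second argument false requests a minor rather than a major version.
    return workingCopy.checkin(comment, false);
  } catch (e) {
    // Release the lock on the original if anything goes wrong.
    workingCopy.cancelCheckout();
    throw e;
  }
}
```

In a rule script this could be called as `saveAsMinorVersion(document, ocrPdf, "Added OCR text layer")`, where `document` is the node the rule fired on.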
