Auto-population from OCR documents into a content model

Question asked by chaitanya on Apr 24, 2012
Latest reply on Aug 1, 2012 by wmay

We have built a small application on Alfresco for abstracting legal documents. To capture the abstracted data (on a page containing text boxes, text areas, etc.), we have written our own content model and services for the business logic. The model and the services are incorporated into the Alfresco core. No workflow has been created; instead, the logic is built around changing the properties of documents. All properties are defined in the content model.

Now we want to integrate this with an OCR tool. We want Alfresco to pick up the OCR documents and batch them based on certain input criteria (similar to a query), and also to auto-populate some of the content from the (unstructured) OCR documents into the pages created using the content model.
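To make the requirement concrete, here is a minimal Python sketch of the two steps we have in mind: pulling a few fields out of raw OCR text, then grouping documents into batches by one of those fields. It is not Alfresco code and does not use any Alfresco API. The property names (`legal:partyName`, `legal:effectiveDate`, `legal:docType`), the regular expressions, and the sample texts are all assumptions made up for illustration; real documents would need far more robust extraction.

```python
import re
from collections import defaultdict

# Hypothetical property names; a real content model would define its own.
FIELD_PATTERNS = {
    "legal:partyName": re.compile(r"between\s+(.+?)\s+and\b", re.IGNORECASE),
    "legal:effectiveDate": re.compile(
        r"effective\s+(?:as\s+of\s+)?(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE
    ),
    "legal:docType": re.compile(r"\b(lease|deed|mortgage)\b", re.IGNORECASE),
}


def extract_properties(ocr_text):
    """Return the properties found in the text; fields with no match are omitted."""
    props = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            props[name] = match.group(1).strip()
    return props


def batch_by(documents, key):
    """Group (doc_id, props) pairs by one property; docs missing it go under None."""
    batches = defaultdict(list)
    for doc_id, props in documents:
        value = props.get(key)
        batches[value.lower() if value else None].append(doc_id)
    return dict(batches)


# Made-up sample OCR output, including one scan where nothing is recognisable.
samples = {
    "doc-1": "This LEASE is made between Acme Corp and John Doe, effective as of 01/03/2012.",
    "doc-2": "Mortgage agreement between First Bank and Jane Roe.",
    "doc-3": "Illegible scan ~~ #### ....",
}

extracted = [(doc_id, extract_properties(text)) for doc_id, text in samples.items()]
batches = batch_by(extracted, "legal:docType")
```

Documents where a field cannot be found simply lack that property and land in the `None` batch, which is where we would expect manual abstraction to take over.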

I want to understand whether this (batching and auto-population) is possible in Alfresco. If someone has achieved this, please share your experience of how accurate the auto-populated data was and how successful the implementation has been, especially when reading unstructured documents (OCRed PDF, TIFF).