Hi,
I need some help. How can I extract data (i.e. name and date of birth) from scanned PDF files and convert them into a single document based on name and date of birth using Alfresco bulk import? I am using Alfresco 5.0.
You cannot do this with Alfresco out of the box; you need additional third-party tools. First, OCR the scanned image so that it is converted into text. Then use one or more "zone" features of the OCR software to read the text from a standard area of the document into the name and date-of-birth properties.
If your PDF has already been converted into text and your OCR software does not have the ability to read from a zone, then I suppose you could write your own code that would parse the text to extract the name and date-of-birth.
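As a rough illustration of the "write your own code" route, here is a minimal Python sketch that parses OCR text for the two values. It is not Alfresco code and it makes several assumptions that may not hold for your documents: that the text contains labelled lines such as "Name: ..." and "Date of Birth: ..." (or "DOB: ..."), and that dates are written day/month/year. The names `Person`, `NAME_RE`, `DOB_RE`, and `extract_name_dob` are hypothetical and exist only for this example.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional


class Person(NamedTuple):
    name: str
    date_of_birth: date


# Assumed label layout; adjust these patterns to match the real OCR output.
NAME_RE = re.compile(
    r"^\s*Name\s*[:\-]\s*(?P<name>.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)
DOB_RE = re.compile(
    r"^\s*(?:Date\s+of\s+Birth|DOB)\s*[:\-]\s*"
    r"(?P<dob>\d{1,2}[/.\-]\d{1,2}[/.\-]\d{4})",
    re.IGNORECASE | re.MULTILINE,
)


def extract_name_dob(text: str) -> Optional[Person]:
    """Return the name and date of birth found in OCR text, or None."""
    name_match = NAME_RE.search(text)
    dob_match = DOB_RE.search(text)
    if not name_match or not dob_match:
        return None

    # Normalise separators so a single date format can be parsed.
    raw_dob = re.sub(r"[.\-]", "/", dob_match.group("dob"))
    try:
        # Assumption: day/month/year ordering.
        dob = datetime.strptime(raw_dob, "%d/%m/%Y").date()
    except ValueError:
        return None  # e.g. an OCR misread produced an impossible date

    # Collapse runs of whitespace that OCR often leaves inside names.
    name = " ".join(name_match.group("name").split())
    return Person(name, dob)
```

Calling `extract_name_dob(ocr_text)` on each document's text gives you a (name, date of birth) pair, or `None` when either value cannot be found, which you could then use to group documents that belong to the same person. OCR output is often noisy, so expect to tune the patterns against real samples.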
There are several people in the community who have integrated various OCR solutions with Alfresco, so you should be able to find something that will work for you.