This addon provides a repository action to extract OCR text from images (TIFF) or plain PDFs in Alfresco. A Share action is also exposed in the Document Library component.
License The plugin is licensed under the LGPL v3.0.
State Current addon release is 2.3.1.
Compatibility The current version has been developed using Alfresco 5.2 and Alfresco SDK 3.0.2, although it should also run in Alfresco 5.1, 5.0 and 4.2 (as it is developed using Alfresco SDK 3.0).
Browser compatibility 100% supported
Supported OCR software
- pdfsandwich (http://www.tobias-elze.de/pdfsandwich/)
- OCRmyPDF (https://github.com/jbarlow83/OCRmyPDF)
- Windows.Media.OCR (https://www.nuget.org/packages/Microsoft.Windows.Ocr/) as local service
Languages Currently the Share interface is provided in English, Spanish and Brazilian Portuguese. The catalog of supported OCR languages depends directly on the selected OCR software (Tesseract OCR or Windows.Media.OCR). No original Alfresco resources have been overwritten.
This addon was presented at BeeCon 2016. You can find additional details at Integrating a simple OCR in Alfresco (http://beecon.buzz/talks/?id=20160125005)
|License Type||GNU Library or "Lesser" General Public License (LGPL)|
|Project Page||GitHub - keensoft/alfresco-simple-ocr: Simple OCR action for Alfresco|
|Download Page||Releases · keensoft/alfresco-simple-ocr · GitHub|
|Extension Points||Public API,Behavior,Content Model,Content Policy,Web Script|