alfocr
An ACS Optical Character Recognition Module
This module and its companion engine provide the core features required to bring OCR capability to your Alfresco system. It uses the new Alfresco Transform framework to implement an OCR engine. The OCR engine relies on several command-line tools to extract, convert, clean, deskew, and finally perform optical character recognition on a scanned PDF or TIFF document. Through rendition configuration, it can produce plain text, hOCR, or a PDF document with correctly positioned embedded text. Once the rendition is created, you can use rules or other extensions to perform further processing.
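The processing order described above can be pictured with a minimal sketch. It is illustrative only and is not the module's implementation: the stage names are taken from the description, while `run_pipeline`, `Job`, and the output format keys (`text`, `hocr`, `pdf`) are hypothetical names chosen for this example. No actual command-line tool is invoked; each stage is only recorded.

```python
from dataclasses import dataclass, field
from typing import List

# Stage order as named in the description; the real engine delegates each
# step to an external command-line tool, which this sketch does not do.
STAGES = ("extract", "convert", "clean", "deskew", "ocr")

# Hypothetical keys for the three rendition outputs mentioned in the text.
OUTPUT_FORMATS = ("text", "hocr", "pdf")


@dataclass
class Job:
    source: str
    output_format: str
    history: List[str] = field(default_factory=list)


def run_pipeline(source: str, output_format: str) -> Job:
    """Walk a scanned document through the stages in order (placeholder only)."""
    if not source.lower().endswith((".pdf", ".tif", ".tiff")):
        raise ValueError("expected a scanned PDF or TIFF document")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    job = Job(source, output_format)
    for stage in STAGES:
        # Placeholder: record the stage instead of running a tool.
        job.history.append(stage)
    return job


if __name__ == "__main__":
    job = run_pipeline("scan.tiff", "hocr")
    print(" -> ".join(job.history), "=>", job.output_format)
```

In this sketch the requested rendition only selects the final output format; the earlier stages are the same for every format, which is an assumption rather than something the description states.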
Owner | https://www.inteligr8.com |
Versions | ACS v6.1+ (community or enterprise) |
License Type | Proprietary; free for testing or home/personal use |
Project Page | https://www.inteligr8.com/#/products/ocr |
Download Page | Contact Inteligr8: sales@inteligr8.com |
Tags | alfresco acs ats ate tengine pdf tiff ocr hocr |
Component Type | Alfresco Platform JAR Module & Alfresco Transform Engine |
Extension Points | Alfresco Java Public API & Alfresco Transform Engine Base |
Installation | ACS Platform: JAR Module (sideloaded) & rendition configuration; Alfresco Transform Engine: Spring Boot app or Docker container |
Products | ACS, ATS |
This is an early version of the product. Although it is fully functional and stable, it focuses strictly on performing OCR. The module is expected to gain many more features over time, based on feedback.