Hi,
I am working with Alfresco Community 5.2, and my client now needs OCR functionality in Alfresco. I set this up using the simple OCR add-on with pdfsandwich, and it is working fine. However, I need to improve the output quality using options such as -resolution and -rgb.
I changed the alfresco-global.properties file as shown below.
ocr.command=/usr/bin/pdfsandwich
ocr.output.verbose=true
ocr.output.file.prefix.command=-o
ocr.extra.commands=-resolution 600 -rgb -verbose -lang spa+eng+fra
When I then ran OCR through Alfresco, it failed with the error below.
java.lang.RuntimeException: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 09220022 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /usr/bin/pdfsandwich -resolution 600 -rgb -verbose -lang spa+eng+fra /home/administrator/alfresco-dms/tomcat/temp/Alfresco/OCRTransformWorker_source_8261423176645436134.pdf -o /home/administrator/alfresco-dms/tomcat/temp/Alfresco/OCRTransformWorker_source_8261423176645436134_ocr.pdf
succeeded: false
exit code: 2
out: pdfsandwich version 0.1.7 Checking for convert: convert -version Version: ImageMagick 7.0.5-2 Q16 x86_64 2017-04-04 http://www.imagemagick.org Copyright: © 1999-2017 ImageMagick Studio LLC License: http://www.imagemagick.org/script/license.php Featur
err: tesseract: /home/administrator/alfresco-dms/common/lib/libtiff.so.5: no version information available (required by /usr/lib/liblept.so.5) tesseract 3.04.01 leptonica-1.73 libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.56 : libtiff 4.
at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:183) at es.keensoft.alfresco.ocr.OCRExtractAction.executeImpl(OCRExtractAction.java:119) at org.alfresco.repo.action.executer.ActionExecuterAbstractBase.execute(ActionExecuterAbstractBase.java:273) at org.alfresco.repo.action.ActionServiceImpl.directActionExecution(ActionServiceImpl.java:856) at org.alfresco.repo.action.executer.CompositeActionExecuter.executeImpl(CompositeActionExecuter.java:73) at org.alfresco.repo.action.executer.ActionExecuterAbstractBase.execute(ActionExecuterAbstractBase.java:273) at org.alfresco.repo.action.ActionServiceImpl.directActionExecution(ActionServiceImpl.java:856) at org.alfresco.repo.action.ActionServiceImpl.executeActionImpl(ActionServiceImpl.java:757) at org.alfresco.repo.action.ActionServiceImpl.executeAction(ActionServiceImpl.java:581) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) at org.alfresco.repo.security.permissions.impl.AlwaysProceedMethodInterceptor.invoke(AlwaysProceedMethodInterceptor.java:41) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at org.alfresco.repo.security.permissions.impl.ExceptionTranslatorMethodInterceptor.invoke(ExceptionTranslatorMethodInterceptor.java:53) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at 
org.alfresco.repo.audit.AuditMethodInterceptor.invoke(AuditMethodInterceptor.java:166) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:96) at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:260) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:94) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204) at com.sun.proxy.$Proxy44.executeAction(Unknown Source) at org.alfresco.repo.rule.RuleServiceImpl.executeAction(RuleServiceImpl.java:1278) at org.alfresco.repo.rule.RuleServiceImpl.executeRule(RuleServiceImpl.java:1272) at org.alfresco.repo.rule.RuleServiceImpl.executePendingRuleImpl(RuleServiceImpl.java:1218) at org.alfresco.repo.rule.RuleServiceImpl.executePendingRule(RuleServiceImpl.java:1161) at org.alfresco.repo.rule.RuleServiceImpl.executePendingRulesImpl(RuleServiceImpl.java:1127) at org.alfresco.repo.rule.RuleServiceImpl.executePendingRules(RuleServiceImpl.java:1100) at org.alfresco.repo.rule.RuleTransactionListener.beforeCommit(RuleTransactionListener.java:64) at org.alfresco.util.transaction.TransactionSupportUtil$TransactionSynchronizationImpl.doBeforeCommit(TransactionSupportUtil.java:535) at org.alfresco.util.transaction.TransactionSupportUtil$TransactionSynchronizationImpl.doBeforeCommit(TransactionSupportUtil.java:514) at org.alfresco.util.transaction.TransactionSupportUtil$TransactionSynchronizationImpl.beforeCommit(TransactionSupportUtil.java:479) at org.springframework.transaction.support.TransactionSynchronizationUtils.triggerBeforeCommit(TransactionSynchronizationUtils.java:95) at 
org.springframework.transaction.support.AbstractPlatformTransactionManager.triggerBeforeCommit(AbstractPlatformTransactionManager.java:925) at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:738) at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:724) at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:475) at org.alfresco.util.transaction.SpringAwareUserTransaction.commit(SpringAwareUserTransaction.java:482) at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:486) at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecute(RepositoryContainer.java:587) at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecuteAs(RepositoryContainer.java:656) at org.alfresco.repo.web.scripts.RepositoryContainer.executeScriptInternal(RepositoryContainer.java:428) at org.alfresco.repo.web.scripts.RepositoryContainer.executeScript(RepositoryContainer.java:308) at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:399) at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:210) at org.springframework.extensions.webscripts.servlet.WebScriptServlet.service(WebScriptServlet.java:132) at javax.servlet.http.HttpServlet.service(HttpServlet.java:731) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241) at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.alfresco.module.aosmodule.service.ContextRootFilter.doFilter(ContextRootFilter.java:93) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(GlobalLocalizationFilter.java:68) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:218) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:110) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:506) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:962) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:445) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1115) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:637) at org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor.run(AprEndpoint.java:2486) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748)
But when I run the same failing command in a terminal, it works fine and generates a good-quality PDF as expected.
/usr/bin/pdfsandwich -resolution 600 -rgb -verbose -lang spa+eng+fra /home/administrator/alfresco-dms/tomcat/temp/Alfresco/OCRTransformWorker_source_8261423176645436134.pdf -o /home/administrator/alfresco-dms/tomcat/temp/Alfresco/OCRTransformWorker_source_8261423176645436134_ocr.pdf
If I remove the -rgb option from alfresco-global.properties, OCR works, but -resolution has no effect either. I am wondering why this happens. I need to know why the extra commands I added to alfresco-global.properties are not working.
Any other way to achieve this is also welcome. Please help me.
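One way to narrow this down is to rerun the same command outside Alfresco, but with an environment closer to the one Tomcat runs in. The err output above shows tesseract picking up libtiff.so.5 from /home/administrator/alfresco-dms/common/lib, which suggests that the library search path seen by Alfresco differs from the one in a normal terminal. The Python sketch below is only a diagnostic aid under stated assumptions: it assumes Alfresco's start scripts put common/lib on LD_LIBRARY_PATH and common/bin on PATH, and the helper names (build_command, alfresco_like_env, run) are hypothetical, not part of Alfresco or the OCR add-on.

```python
import os
import shlex
import subprocess


def build_command(ocr_command, extra_commands, source, prefix, target):
    """Assemble argv in the order shown in the error message:
    <ocr.command> <ocr.extra.commands> <source> <prefix> <target>."""
    return [ocr_command, *shlex.split(extra_commands), source, prefix, target]


def alfresco_like_env(base_env, alfresco_home):
    """Return a copy of base_env with Alfresco's bundled directories first.

    ASSUMPTION: the installer layout has common/lib and common/bin under
    alfresco_home and the start scripts prepend them; verify on your system.
    """
    env = dict(base_env)
    lib_dir = os.path.join(alfresco_home, "common", "lib")
    bin_dir = os.path.join(alfresco_home, "common", "bin")
    env["LD_LIBRARY_PATH"] = os.pathsep.join(
        p for p in (lib_dir, base_env.get("LD_LIBRARY_PATH", "")) if p
    )
    env["PATH"] = os.pathsep.join(
        p for p in (bin_dir, base_env.get("PATH", "")) if p
    )
    return env


def run(argv, env):
    """Run the command with the given environment and return
    (exit code, stdout, stderr) so two runs can be compared."""
    result = subprocess.run(argv, env=env, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr
```

To use it, build the argv from the values in alfresco-global.properties, then call run once with dict(os.environ) and once with alfresco_like_env(os.environ, "/home/administrator/alfresco-dms"). If only the second run exits with code 2, the difference is likely in the bundled libraries or the bundled convert binary rather than in the OCR add-on's properties.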
Thanks.
According to the pdfsandwich documentation, -rgb needs to be used with caution.
http://www.tobias-elze.de/pdfsandwich/
-rgb use RGB color space for images (default: black and white); use with care: causes problems with some color spaces
Since you mentioned that it works fine when run from the command line, something must be wrong with the Alfresco configuration. I found a similar issue on GitHub; maybe you will find some help there.
https://github.com/keensoft/alfresco-simple-ocr/issues/47
Thanks for the reply. I added the properties below to my alfresco-global.properties file.
img.root=/usr
img.dyn=${img.root}/lib
img.exe=${img.root}/bin/convert
img.gslib=/usr/share/ghostscript/9.26/lib
But the same thing happened. OCR runs through Alfresco, but the -resolution 600 option has no effect, and there are no errors either. Can you please help me further to solve this problem?