
Not able to run custom transformer (pdf_to_doc) through soffice

Question asked by hiten.rastogi on Nov 27, 2017
Latest reply on Nov 30, 2017 by afaust

Hi All,

 

Here is the link to my project: GitHub - rastoh/ev-pdf-to-doc

 

Please see service-context_bak.xml, which contains the beans related to soffice, at the path ev-pdf-to-doc/ev-repo-pdftodoc/src/main/resources/alfresco/module/ev-repo-pdftodoc/context

 

I am trying to convert PDF to DOC using a custom transformer that calls soffice, but it does not work. The same custom transformer runs successfully with AbiWord, but the AbiWord output is not as good as what soffice produces when run manually.

 

I call the transformer through an action in the Share UI. With soffice the transformation does take place, but the transformed file is generated inside my project directory instead of the specified target directory. Below is the debug log I get when running through soffice.

 

2017-11-28 19:13:55,140 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Executing transformation **********

2017-11-28 19:13:55,141 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Got the content data for transformation **********

2017-11-28 19:13:55,141 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Trying transformation for the first time **********

2017-11-28 19:13:55,142 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Calling pdfToDocTransformWorker transform method for transformation **********

2017-11-28 19:13:55,142 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] source Path = /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.pdf
2017-11-28 19:13:55,143 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] target Path = /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.doc
2017-11-28 19:13:56,551 DEBUG [org.alfresco.util.exec.RuntimeExec] [http-bio-8080-exec-15] Execution result:
os: Linux
command: /usr/bin/soffice --infilter=writer_pdf_import --headless --convert-to doc /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.pdf --print-to-file /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.doc
succeeded: true
exit code: 0
out: convert /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.pdf -> /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_5703572020147422740.doc using filter : MS Word 97

err: Error: source file could not be loaded

2017-11-28 19:13:56,551 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] EXIT VALUE: 0
2017-11-28 19:13:56,551 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] STDOUT: convert /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.pdf -> /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_5703572020147422740.doc using filter : MS Word 97

2017-11-28 19:13:56,551 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] STDERR: Error: source file could not be loaded

2017-11-28 19:13:56,552 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Trying transformation for the second time **********

2017-11-28 19:13:56,553 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Calling pdfToDocTransformWorker transform method for transformation **********

2017-11-28 19:13:56,553 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] source Path = /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.pdf
2017-11-28 19:13:56,553 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] target Path = /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.doc // expected target path
2017-11-28 19:13:57,894 DEBUG [org.alfresco.util.exec.RuntimeExec] [http-bio-8080-exec-15] Execution result:
os: Linux
command: /usr/bin/soffice --infilter=writer_pdf_import --headless --convert-to doc /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.pdf --print-to-file /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.doc
succeeded: true
exit code: 0
out: convert /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.pdf -> /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_3942032914925752895.doc using filter : MS Word 97  // transformer runs and create the doc file

err: Error: source file could not be loaded // but as the file is created at wrong path the transformer is not able to load the file hence error

2017-11-28 19:13:57,895 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] EXIT VALUE: 0
2017-11-28 19:13:57,895 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] STDOUT: convert /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.pdf -> /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_3942032914925752895.doc using filter : MS Word 97

2017-11-28 19:13:57,895 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] STDERR: Error: source file could not be loaded

2017-11-28 19:13:57,895 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Content data is null **********

 

 

 

In the log above, the expected target path for the file is /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.doc, but the file is actually generated at /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_5703572020147422740.doc
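
One thing that may be worth checking: as far as I understand the soffice command-line help, --convert-to writes the result into the current working directory unless --outdir is given, and --print-to-file belongs to batch printing rather than conversion. That would match the output landing in the project directory, which is presumably the working directory of the Tomcat process. Below is a minimal sketch, under that assumption, of how the argument list could be built with --outdir. The class and method names (SofficeCommandSketch, buildCommand, expectedOutput) are hypothetical and are not part of my project or of Alfresco.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class SofficeCommandSketch {

    // Builds the soffice argument list. Assumption: --outdir controls where
    // --convert-to writes its output; without it the working directory is used.
    static List<String> buildCommand(String sofficePath, File source, File outDir) {
        List<String> cmd = new ArrayList<>();
        cmd.add(sofficePath);
        cmd.add("--headless");
        cmd.add("--infilter=writer_pdf_import");
        cmd.add("--convert-to");
        cmd.add("doc");
        cmd.add("--outdir");
        cmd.add(outDir.getPath());
        cmd.add(source.getPath());
        return cmd;
    }

    // soffice names the converted file after the source file, with the new
    // extension, so the transformer should look for it here afterwards.
    static File expectedOutput(File source, File outDir) {
        String name = source.getName();
        int dot = name.lastIndexOf('.');
        String base = dot >= 0 ? name.substring(0, dot) : name;
        return new File(outDir, base + ".doc");
    }

    public static void main(String[] args) {
        File source = new File("/tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.pdf");
        File outDir = source.getParentFile();
        // Print the command and the path where the .doc should appear.
        System.out.println(String.join(" ", buildCommand("/usr/bin/soffice", source, outDir)));
        System.out.println(expectedOutput(source, outDir).getPath());
    }
}
```

With the source file from the log, the output directory would be /tmp/Alfresco, so the converted file should end up next to the source under the same base name. The sketch only builds the arguments; it does not run soffice.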

 

The above is what I am able to figure out. Any help will be greatly appreciated.

 

Thanks

Hiten Rastogi
