Hi All,
Here is the link to my project on GitHub: rastoh/ev-pdf-to-doc
Please look at service-context_bak.xml for the beans related to soffice, at the path ev-pdf-to-doc/ev-repo-pdftodoc/src/main/resources/alfresco/module/ev-repo-pdftodoc/context
I am trying to convert PDF to DOC using a custom transformer that calls soffice, but I cannot get it to work. The same custom transformer runs successfully with abiword, but its output is not as good as what soffice produces when I run it manually.
I call the transformer through an action in the Share UI. With soffice the transformation does take place, but the transformed file is not generated in the specified target directory; it ends up inside my project directory instead. Below is the output that I get when running through soffice.
2017-11-28 19:13:55,140 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Executing transformation **********
2017-11-28 19:13:55,141 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Got the content data for transformation **********
2017-11-28 19:13:55,141 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Trying transformation for the first time **********
2017-11-28 19:13:55,142 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Calling pdfToDocTransformWorker transform method for transformation **********
2017-11-28 19:13:55,142 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] source Path = /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.pdf
2017-11-28 19:13:55,143 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] target Path = /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.doc
2017-11-28 19:13:56,551 DEBUG [org.alfresco.util.exec.RuntimeExec] [http-bio-8080-exec-15] Execution result:
os: Linux
command: /usr/bin/soffice --infilter=writer_pdf_import --headless --convert-to doc /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.pdf --print-to-file /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.doc
succeeded: true
exit code: 0
out: convert /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.pdf -> /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_5703572020147422740.doc using filter : MS Word 97
err: Error: source file could not be loaded
2017-11-28 19:13:56,551 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] EXIT VALUE: 0
2017-11-28 19:13:56,551 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] STDOUT: convert /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.pdf -> /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_5703572020147422740.doc using filter : MS Word 97
2017-11-28 19:13:56,551 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] STDERR: Error: source file could not be loaded
2017-11-28 19:13:56,552 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Trying transformation for the second time **********
2017-11-28 19:13:56,553 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Calling pdfToDocTransformWorker transform method for transformation **********
2017-11-28 19:13:56,553 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] source Path = /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.pdf
2017-11-28 19:13:56,553 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] target Path = /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.doc // expected target path
2017-11-28 19:13:57,894 DEBUG [org.alfresco.util.exec.RuntimeExec] [http-bio-8080-exec-15] Execution result:
os: Linux
command: /usr/bin/soffice --infilter=writer_pdf_import --headless --convert-to doc /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.pdf --print-to-file /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.doc
succeeded: true
exit code: 0
out: convert /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.pdf -> /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_3942032914925752895.doc using filter : MS Word 97 // the transformer runs and creates the doc file
err: Error: source file could not be loaded // but since the file is created at the wrong path, the transformer cannot load it, hence the error
2017-11-28 19:13:57,895 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] EXIT VALUE: 0
2017-11-28 19:13:57,895 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] STDOUT: convert /tmp/Alfresco/PDFToDOCTransformWorker_source_3942032914925752895.pdf -> /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_3942032914925752895.doc using filter : MS Word 97
2017-11-28 19:13:57,895 DEBUG [com.eisenvault.service.transform.PDFToDOCTransformWorker] [http-bio-8080-exec-15] STDERR: Error: source file could not be loaded
2017-11-28 19:13:57,895 DEBUG [com.eisenvault.service.transform.PDFToDOCContentTransformer] [http-bio-8080-exec-15] **********
Content data is null **********
In the output above, the expected target path for the file is /tmp/Alfresco/PDFToDOCTransformWorker_source_5703572020147422740.doc, but the file is actually generated at /home/hitenrastogi/Documents/Projects/R_and_D_Projects/PDF_to_DOC/ev-repo-pdftodoc/PDFToDOCTransformWorker_source_5703572020147422740.doc
That is as much as I have been able to figure out. Any help will be greatly appreciated.
Thanks
Hiten Rastogi
There is no way around it: the transformed file must be located at the path specified via the API. If the transformer is not generating the result at the correct path, you need to implement a thin wrapper that moves / renames the result to the expected file path. Since you already have a custom implementation class (PDFToDOCContentTransformer), it should be trivial to add the necessary logic.
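A minimal sketch of such a move / rename step, using only the JDK. The class and method names (TransformResultMover, producedPath, moveToExpectedTarget) are hypothetical and not part of Alfresco or the linked project. It also assumes that the converter names its output after the source file with the new extension, which is what the log output in this thread shows.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Alfresco or the linked project.
public class TransformResultMover {

    // Assumption: the converter writes "<source base name>.<extension>"
    // into outputDir, as seen in the log output above.
    public static Path producedPath(Path outputDir, Path source, String extension) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return outputDir.resolve(base + "." + extension);
    }

    // Moves the file the converter actually wrote to the path the caller expects.
    public static Path moveToExpectedTarget(Path produced, Path expectedTarget) throws IOException {
        if (!Files.isRegularFile(produced)) {
            throw new IOException("Converter output not found at " + produced);
        }
        Path parent = expectedTarget.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // REPLACE_EXISTING overwrites a target file that already exists
        // (for example an empty temp file created in advance).
        return Files.move(produced, expectedTarget, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

In the transform worker this would be called right after the external command returns and before the result is read back, with outputDir set to whatever directory the converter really wrote to.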
Thanks Axel for your input. I would like to know two things, if possible:
1. Why does soffice put the generated file in a different location? When I run the command manually, the file is generated in the same location as the source. Is there something in Alfresco that forces the output into a different directory?
2. As I have only just started working with transformers, I can't fully grasp how to create the wrapper you suggest. Any pointer or example would be much appreciated.
Thanks
Hiten Rastogi
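On question 1, one likely explanation, offered as a guess rather than a confirmed diagnosis: soffice --convert-to writes its result into the current working directory of the process unless --outdir is given. When the command is launched from the repository, that working directory is wherever the server was started, which would match the project directory in the "out:" lines above. The sketch below builds the argument list with --outdir instead of --print-to-file. The class name SofficeCommand is hypothetical, and how the arguments are wired into the RuntimeExec bean in service-context_bak.xml is not shown here.

```java
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that only assembles the command line; it does not run it.
public class SofficeCommand {

    public static List<String> build(String sofficeExe, Path source, Path outputDir) {
        return Arrays.asList(
                sofficeExe,
                "--headless",
                "--infilter=writer_pdf_import",
                "--convert-to", "doc",
                // Without --outdir, soffice writes into the working directory.
                "--outdir", outputDir.toString(),
                source.toString());
    }
}
```

With --outdir pointing at /tmp/Alfresco, the output would be named after the source file with a .doc extension, so a rename to the exact target path may still be needed if the expected file name differs.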