I have been using Alfresco for a long time. I scanned around 100,000 documents and uploaded them into Alfresco. I have now run into a problem: the documents cannot be searched by their content. If I scan a document with a scanner that has built-in OCR, its content is searchable. Not all of my scanners have OCR, though, so I need to integrate an OCR module into Alfresco. I tried Tesseract OCR and SimpleOCR, but neither worked.
- If anyone knows, please tell me another way to do this, or the correct way to integrate tesseract-ocr or simple-ocr.
- I also need to convert all the uploaded documents into searchable PDFs.