Not able to index content of large PDFs

Question asked by hiten.rastogi on Jul 6, 2018
Latest reply on Jul 6, 2018 by mehe

Hi All,

We are uploading PDF files of up to 200 MB to our DMS, but their content is not getting indexed.

After searching, we learned that by default only PDF files of up to 10 MB are indexed, so we overrode this property to raise the limit to about 1 GB: content.metadataExtracter.pdf.maxDocumentSizeMB=1000. We then deleted our old indexes and restarted the DMS, but it had no effect.

We then also found that the default timeout for the metadata extractor was 20 milliseconds, so we changed it to about 1 hour: content.metadataExtracter.default.timeoutMs=3625000. There was still no change.
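
For reference, the two overrides described above would look roughly like this as a properties fragment. The property names and values are copied from this post; the file name alfresco-global.properties is only an assumption, since the post does not say where the properties were set.

```properties
# Assumed location: alfresco-global.properties (not stated in the post)

# Raise the PDF size limit from the reported default of 10 MB to 1000 MB
content.metadataExtracter.pdf.maxDocumentSizeMB=1000

# Raise the metadata extractor timeout to 3625000 ms (roughly 1 hour)
content.metadataExtracter.default.timeoutMs=3625000
```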

Please advise what else needs to be done to get the content indexed correctly.

Thanks

Hiten Rastogi
