FINGERPRINT questions in Solr 6

Question asked by dbiggins on Apr 16, 2018
I appreciate the updated info about using the FINGERPRINT function in AFTS queries, but in the process of testing the search, I came up with some questions about how things should work.  Specifically:


  • What is the default overlap percentage if I don't specify it as the second value?  When I run the AFTS query 'FINGERPRINT:52763', where 52763 is the DBID, I get 487 results.  When I supply any overlap percentage, from 'FINGERPRINT:52763_1' through 'FINGERPRINT:52763_99', I get the same single result: the document I am using as the source of the search.
  • I assume that the FINGERPRINT's minhash is generated when the doc (mostly PDFs in our case) is created or updated.  Should I ALWAYS receive at least one row (the source document) for the FINGERPRINT query if the text is Tika-extractable?
  • I have two PDFs that are almost identical but don't show up in each other's FINGERPRINT queries; in fact, both queries return 0 rows.  Does that mean there was a problem extracting the text for the minhash?  If so, how do I query for documents whose minhash is empty?
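
To make the query variants above concrete, here is a minimal Python sketch that builds the same AFTS query strings. The helper name fingerprint_query is hypothetical and not part of any Alfresco API; it only reproduces the 'FINGERPRINT:<DBID>' and 'FINGERPRINT:<DBID>_<overlap>' forms quoted in this post.

```python
from typing import Optional


def fingerprint_query(dbid: int, overlap_percent: Optional[int] = None) -> str:
    """Build an AFTS FINGERPRINT query string (hypothetical helper).

    With no overlap value the query is 'FINGERPRINT:<dbid>'; otherwise the
    overlap percentage is appended after an underscore, as in the post.
    """
    if overlap_percent is None:
        return f"FINGERPRINT:{dbid}"
    return f"FINGERPRINT:{dbid}_{overlap_percent}"


# The variants tested above: no overlap value, then the two ends of the range.
for query in (
    fingerprint_query(52763),
    fingerprint_query(52763, 1),
    fingerprint_query(52763, 99),
):
    print(query)
```

Running each generated string as an AFTS query and comparing the row counts is how the 487-versus-1 difference described above can be reproduced.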


I am using Alfresco Community 5.2 (201707), and Alfresco Search Services 1.1.


Thanks everyone!