Lucene/SOLR Stemming Analyser

Question asked by stevegreenbaum on Jul 29, 2014
Latest reply on Aug 12, 2014 by stevegreenbaum
I am using the Porter stemming analyser for d_content, but I noticed that stop words are not removed from the index for new documents I add. I added the stop word filter to schema.xml underneath the lowercase filter, but the stop words still exist in the index. Is this the correct approach? Is there another way to combine Porter stemming and stop word removal via configuration, or does the Porter class need to be modified?

  <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/> 
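
For illustration, the kind of analyzer chain this describes might look like the sketch below. The field type name `text_stem_stop` is hypothetical, and the sketch assumes that d_content is actually analysed through a field type defined in schema.xml, which may not hold if the analyser comes from the locale specific property files mentioned below.

```xml
<!-- Hypothetical field type; the name "text_stem_stop" is illustrative only -->
<fieldType name="text_stem_stop" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split the input on whitespace -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Lowercase tokens so later filters see a consistent case -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drop stop words before stemming, so stopwords.txt can list unstemmed forms -->
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- Apply Porter stemming to the remaining tokens -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Filters run in the order they are listed, so placing the stop filter ahead of the stemmer means the stop word list is matched against the original lowercase tokens and not their stemmed forms.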

Also, regarding the whitespace tokenizer that is set by default in schema.xml for alfrescodatatype: when is the whitespace tokenizer executed relative to the analysers associated with the Alfresco property data types (e.g., content, text), which are specified in the locale-specific property files?