Solr query for same keyword returning different results?

- March 15, 2014

I am using the text_general field type to search my Solr index, with the configuration below.

  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SnowballPorterFilterFactory"/>
      <filter class="org.apache.solr.analysis.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1" splitOnNumerics="1" preserveOriginal="1" stemEnglishPossessive="1"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <!-- in this example, we will only use synonyms at query time
      <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
      -->
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SnowballPorterFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

I have indexed a lot of content and searched with the keywords PLEASE, Please and please.

The keyword PLEASE gives a very small result set.

q=%22PLEASE%22&q.op=OR&df=text&qt=%2Fselect&sort=CONTENT_NAME+desc&fq=content_source%3Asharepoint&AuthenticatedUserName=Fine

But please and Please give a large result set.

q=%22please%22&q.op=OR&df=text&qt=%2Fselect&sort=CONTENT_NAME+desc&fq=content_source%3Asharepoint&AuthenticatedUserName=Fine

q=%22Please%22&q.op=OR&df=text&qt=%2Fselect&sort=CONTENT_NAME+desc&fq=content_source%3Asharepoint&AuthenticatedUserName=Fine

Even though I am using WordDelimiterFilterFactory, shouldn't Solr treat PLEASE, please and Please as a single keyword?

Any ideas?

You have a fundamental conflict between your tokenizer and your filters. SnowballPorterFilterFactory requires lowercase input to work correctly:

    public final class PorterStemFilter extends TokenFilter

    "Transforms the token stream as per the Porter stemming algorithm. Note: the input to the stemming filter must already be in lower case, so you will need to use LowerCaseFilter or LowerCaseTokenizer farther down the Tokenizer chain in order for this to work properly!"

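This means you should run LowerCaseFilterFactory before the stream reaches SnowballPorterFilterFactory. In your current chain the stemmer sees PLEASE while it is still in upper case, so it is probably not stemmed the same way as please, which would explain the different result counts.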

You are also applying WordDelimiterFilterFactory after stemming, which means the new words it generates will not be stemmed.

The fix is not as simple as putting LowerCaseFilterFactory first. That would fix the issue for SnowballPorterFilterFactory, but it would interfere with WordDelimiterFilterFactory, which in your configuration creates new words on case changes.

I suggest trying the following order:

StandardTokenizerFactory

WordDelimiterFilterFactory

LowerCaseFilterFactory

SynonymFilterFactory

StopFilterFactory

SnowballPorterFilterFactory
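
As a rough sketch, the order above could be written into the field type like this. It reuses the attribute values from the question and is untested; keeping synonyms at query time only and leaving WordDelimiterFilterFactory out of the query analyzer are assumptions carried over from the original configuration, not requirements.

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Split on case changes and delimiters BEFORE lowercasing, while case is still visible -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1" splitOnNumerics="1" preserveOriginal="1" stemEnglishPossessive="1"/>
    <!-- Lowercase before stop words and stemming, so PLEASE, Please and please become the same token -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: no index-time synonyms, as in the question -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- Stem last, on lowercase input -->
    <filter class="solr.SnowballPorterFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SnowballPorterFilterFactory"/>
  </analyzer>
</fieldType>
```

After changing the index analyzer, the content has to be reindexed before the new chain affects search results.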

When you use this many filters it is difficult to get the order right, but I believe this will solve your current issues. As always, I suggest running multiple tests with common words from your document set to see how well the results match your desired output.
