Hello,
I would like to experiment with the stop words in the Lucene version bundled with our Confluence instance. We have Service Desk connected to a knowledge base in Confluence, and the word "How" in our language (Czech) is in the stop word list. That makes no sense to users: they get no results in SD, and in Confluence itself they get the error message "The query could not be parsed", which is not nice at all.
Confluence.atlassian.com and answers.atlassian.com both index the word "How", which makes sense: it leads users to type a more exact query. An error message does not do that, and users read it as a misbehaving application (and I agree with them).
So the question is: can I change the Lucene stop word dictionary somewhere, either for my language or globally?
Thank you.
OK, I have figured it out by myself. Here is a step-by-step guide on how to hack the stop words for each language.
I also removed <home_directory>/index and <home_directory>/journal to force a full reindex on startup.
Enjoy...
Let me point out one thing. I do not know where Atlassian gets these stop word files, but they contain obvious nonsense, at least for the Czech language. Why should words like "article", "today" or "write" be marked as stop words? Take "how to create an article": 4 of 5 words are stop words. Why is that?
The funnier part is that the list even contains non-existent words like re, neg, aj, pta, etc. Those do not exist in the Czech language.
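To make the "how to create an article" example concrete, here is a minimal sketch of stop word filtering in Python. It is not Lucene's or Confluence's actual code. The stop word set is hypothetical: it uses English stand-ins, and which four of the five words are on the list is my assumption. The function name remove_stop_words is made up for illustration.

```python
# Hypothetical stop word set (English stand-ins for the Czech entries).
# Which four words of the example query are listed is an assumption.
STOP_WORDS = {"how", "to", "an", "article"}


def remove_stop_words(query, stop_words=STOP_WORDS):
    """Lowercase the query, split it on whitespace, and drop stop words."""
    return [term for term in query.lower().split() if term not in stop_words]


# Only one of the five words survives the filter.
print(remove_stop_words("how to create an article"))  # ['create']

# A query made up only of stop words leaves nothing to search for.
print(remove_stop_words("how"))  # []
```

With a list like this, a query consisting only of "how" has no terms left after filtering, which would fit the empty results described in the question.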