Monday, April 22, 2013

Stemming in SharePoint 2013 Search

Stemming expands a search to words that share the same stem. Stemming in SharePoint 2013 is enabled by default, and although it can be disabled, there is no longer a UI option for it in the Search Results Web Part. In addition, the EnableStemming property in the object model is deprecated.

To disable stemming you need to:
  1. Export a Search Results Web Part
  2. Open the file in a text editor
  3. Set the EnableStemming property to false ("EnableStemming":false)
  4. Save the file
  5. Import the web part back onto the search results page
  6. Remove the existing web part from the search results page
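
If you have several exported web parts to change, steps 2 through 4 can be scripted. The following is only a rough sketch in Python, not an official tool. It assumes the exported file is text in which the setting appears literally as "EnableStemming":true, as in step 3; the function names and the file name in the usage comment are my own placeholders.

```python
import re
from pathlib import Path
from typing import Tuple

# Matches the "EnableStemming" key and its boolean value as plain text.
# Assumption: the exported file stores the setting in this literal form.
_STEMMING = re.compile(r'("EnableStemming"\s*:\s*)(true|false)')


def disable_stemming(webpart_text: str) -> Tuple[str, int]:
    """Return the text with EnableStemming set to false, plus the match count."""
    return _STEMMING.subn(r"\g<1>false", webpart_text)


def patch_file(path: str) -> int:
    """Rewrite an exported web part file in place; return how many settings matched."""
    source = Path(path)
    text, count = disable_stemming(source.read_text(encoding="utf-8"))
    if count:
        source.write_text(text, encoding="utf-8")
    return count


# Hypothetical usage (the file name is a placeholder):
# patch_file("SearchResults.webpart")
```

A return value of 0 from patch_file means the setting was not found in the form assumed here, so check the file by hand before importing it.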

The classic examples of stemming are “run” and “ran”, or “writes” and “wrote”, but that is not what I am seeing. In SharePoint 2013 the behavior is mostly limited to singular and plural forms of a word. For example, “party” also returns results for “parties” (and vice versa). “Search” returns hits on “search” and “searches” but not “searcher”. I do not see “wrote” hits when I search for “write”, nor do I see “ran” results when I search for “run”.

Since third-party stemming tools are no longer supported (and thus the previous registry entries are no longer used), the workaround is to add any additional word forms as synonyms in a thesaurus file, which can be loaded into SharePoint via PowerShell commands.
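
As an illustration of the thesaurus workaround, here is a small Python sketch that builds such a file from groups of related word forms. It assumes the SharePoint 2013 thesaurus format of a CSV file with a Key,Synonym,Language header, and it assumes synonyms only expand from Key to Synonym, so each pair is written in both directions. The function names are my own, and the word groups are just the examples from this post.

```python
import csv
import io
from typing import Iterable, List, Optional, Sequence


def thesaurus_rows(groups: Iterable[Sequence[str]],
                   language: Optional[str] = None) -> List[List[str]]:
    """Expand each group of word forms into Key -> Synonym rows, both directions."""
    rows = []
    for group in groups:
        for key in group:
            for synonym in group:
                if key != synonym:
                    row = [key, synonym]
                    if language:
                        row.append(language)  # optional third column
                    rows.append(row)
    return rows


def thesaurus_csv(groups: Iterable[Sequence[str]],
                  language: Optional[str] = None) -> str:
    """Return the thesaurus file contents as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Key", "Synonym", "Language"])
    writer.writerows(thesaurus_rows(groups, language))
    return buffer.getvalue()


# Word forms taken from the examples in this post.
print(thesaurus_csv([["write", "wrote"], ["run", "ran"]]))
```

Save the output as a UTF-8 file and load it with the thesaurus import cmdlet (Import-SPEnterpriseSearchThesaurus, as far as I know); verify the exact parameters against the documentation for your farm.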

I have crawled 100GB of content so I should have a great sampling of words. Has anyone seen anything different with stemming?






7 comments:

  1. Hi Steve, I've been following your blog and it has been very helpful. Thanks a lot.
    I have a small problem (I hope). When I send a query for an exact phrase match, the result summary does not highlight the phrase; instead it highlights the noise words in their plural and singular forms. The document includes the exact match, but the summary does not show it. Do you have any idea what it could be? Thanks in advance.

    Replies
    1. Can you give me an example of your query?

    2. Yes, the scenario is: SharePoint has two document libraries with HTML and PDF document types. When I perform a query for "The International Human Rights", some of the results are highlighted with the exact phrase, but others are not. When I open those documents, the phrase is in them.

      Thanks a lot.

    3. Hmm. Not exactly sure. I can investigate when I get back from vacation.

  2. Hi Steve -

    Can SharePoint search be equivalent to the dtSearch tool?

  3. How about Elbonia and Elbonian?

    Replies
    1. I do not think stemming handles that case but the search engine may.
