Improving search with stemming and truncation
We have noticed that stemming in Vega Discover seems to work mainly for English. For example, ‘history’ also retrieves ‘histories’, but the Swedish word ‘historia’ does not retrieve its plural ‘historier’. Since we are a Swedish library, it is important that stemming also works effectively for Swedish, not only for English.
Because stemming can be challenging and is sometimes unreliable, truncation is a vital complement. Truncation gives users more control and helps capture the inflected forms and compound variations of a word, which is especially important in Swedish. For example, searching with a wildcard (e.g. ‘histor*’) would also retrieve inflected forms and compounds that begin with the same stem, giving broader results. We would therefore like to propose:
- Improved stemming support for Swedish (and for other non-English languages).
- Reliable and user-friendly truncation functionality, so that users can compensate where stemming does not fully cover their needs.
These changes would significantly improve search precision and recall, and would help ensure that Vega Discover is equally effective across different language contexts.
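
To illustrate the truncation behaviour we have in mind, here is a minimal sketch in Python. It is only an illustration of the idea, not a description of how Vega Discover works internally; the function names and the word list are our own assumptions, made up for this example.

```python
import re

def truncation_to_regex(term: str) -> re.Pattern:
    """Turn a search term containing '*' wildcards into a case-insensitive regex.

    Each '*' stands for zero or more word characters (including å, ä, ö).
    """
    parts = [re.escape(part) for part in term.split("*")]
    return re.compile(r"\w*".join(parts), re.IGNORECASE)

def matches(term: str, words: list[str]) -> list[str]:
    """Return the words that match the truncated term as a whole word."""
    pattern = truncation_to_regex(term)
    return [word for word in words if pattern.fullmatch(word)]

# Hypothetical index terms, mixing Swedish and English forms.
words = ["historia", "historier", "historien", "historiebok",
         "history", "histories", "hysteri"]

print(matches("histor*", words))   # every word starting with 'histor'
print(matches("historia", words))  # without a wildcard: exact form only
```

In this sketch, ‘histor*’ matches every word in the list that begins with ‘histor’, including the Swedish inflections and the compound ‘historiebok’, while the untruncated ‘historia’ matches only itself. That is the gap we would like stemming to close automatically and truncation to cover where stemming falls short.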
