Typo in query leads to "no results" #1038
Comments
Hi @tadjik1, we currently don't support spelling error detection; it's a complex thing to get correct while also ensuring that query performance is not affected. Other services return messages such as 'Showing results for Reichstag. No results found for Reichstog.' We haven't done any research into this area and likely won't have a solution in the short term. Off the top of my head, there are two approaches for handling spelling mistakes:

index-time permutation generation: in this scenario, the original name is put through an algorithm which generates logical error cases within a certain threshold, based on mental errors (such as vowel substitution) and typing errors (such as pressing an adjacent key). This is a well-studied domain and many existing algorithms are available to produce these tokens. The issue with this approach is that the total index can expand to 20x its size, meaning that a planet-wide index could expand from 1 billion to 20 billion entries, resulting in severely degraded search-time performance and much higher disk / RAM requirements. Additionally, there may be a negative effect on search quality, and some care would need to be taken for things like [...]

search-time permutation generation: in this scenario, we take the search term and check if it exists in the index. If the term fails to match, we could run the same algorithm to generate a list of 'fallback terms'. It would then be possible to iterate through those fallback terms until a match was found. This approach would not increase the index size but would result in a slower response for queries with spelling errors. This is arguably better because only those queries containing a spelling error would be slower. The result would be similar to the fallback message I posted above: 'Showing results for Reichstag. No results found for Reichstog.'

I doubt the core team will get a chance to look at this any time soon; there are a lot of edge cases to consider at planet scale and across multiple languages. The good news is that the 'search-time permutation generation' option can be handled by a client library (even in the browser) or added to the [...] If you are interested in doing some work in this area, I would be happy to have a discussion with you about how it might work and how we could get a PR merged into master. |
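For illustration, here is a minimal sketch of the search-time fallback described above. It is not Pelias code: the in-memory `INDEX` set, `edits1` and `search_with_fallback` are hypothetical names, and the variant generator is a generic edit-distance-1 generator (deletions, transpositions, substitutions, insertions) rather than the vowel-substitution / adjacent-key error model mentioned in the comment.

```python
import string

# Hypothetical stand-in for the real index: a small set of known names.
INDEX = {"reichstag", "berlin", "brandenburger tor"}


def edits1(term, alphabet=string.ascii_lowercase):
    """Return every string exactly one simple edit away from `term`."""
    splits = [(term[:i], term[i:]) for i in range(len(term) + 1)]
    deletes = {left + right[1:] for left, right in splits if right}
    transposes = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    }
    replaces = {
        left + ch + right[1:] for left, right in splits if right for ch in alphabet
    }
    inserts = {left + ch + right for left, right in splits for ch in alphabet}
    # Drop the original term, which some substitutions reproduce.
    return (deletes | transposes | replaces | inserts) - {term}


def search_with_fallback(term, index=INDEX):
    """Try the term as typed; only on a miss, try the generated fallback terms.

    Returns (matched_term, was_corrected), or (None, False) if nothing matches.
    """
    term = term.lower()
    if term in index:
        return term, False
    # Sorted only to make the iteration order deterministic.
    for candidate in sorted(edits1(term)):
        if candidate in index:
            return candidate, True
    return None, False


if __name__ == "__main__":
    matched, corrected = search_with_fallback("Reichstog")
    if corrected:
        print(f"Showing results for {matched}. No results found for reichstog.")
```

Correctly spelled queries take the fast path and never pay for variant generation, which is the trade-off argued for above. Against a real index each fallback term would cost a lookup, so the candidate list would likely need to be capped or ranked.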
@missinglink yes, sure. I would like to help with this feature. |
I don't, sorry. Maybe something like this? Also worth a read: https://codeascraft.com/2017/05/01/modeling-spelling-correction-for-search-at-etsy |
Also worth having a look at the Elasticsearch docs. I'm guessing there must be something in place to handle spelling mistakes, and it might be easy enough to enable it? |
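One Elasticsearch feature that fits this guess is the `fuzziness` parameter of the `match` query, which tolerates a bounded edit distance at query time. The sketch below only builds a request body; the helper name `fuzzy_match_query` and the field name `name.default` are assumptions for illustration, and nothing here is confirmed by the thread as the option eventually chosen.

```python
import json


def fuzzy_match_query(text, field="name.default", fuzziness="AUTO"):
    """Build an Elasticsearch `match` query body that tolerates small typos.

    `field` is a hypothetical field name; substitute the one the index uses.
    "AUTO" lets Elasticsearch pick the allowed edit distance from term length.
    """
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    "fuzziness": fuzziness,
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the search endpoint.
    print(json.dumps(fuzzy_match_query("Reichstog"), indent=2))
```

Fuzzy matching widens the set of candidate terms for every query, so its effect on performance and result ranking would need to be measured before enabling it by default.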
@missinglink seems that option [...] I will prepare a PR to pelias/query |
Closing this one as a duplicate of pelias/pelias#785 |
Hi there,
We've run into a problem: pelias-api returns an empty list of results even when the query contains a small typo. It's pretty easy to reproduce. It seems that such small typos shouldn't affect the result list.