Thread:Talk:Search/Zero Width Joiner and Zero Width Non Joiner/reply (3)

Sorry for the super duper late reply, but, here goes:

I can use normalization plus case folding to flatten all four of these examples into "the same" word from search's perspective. Specifically, that means NFKC normalization with case folding tacked on the end.
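To make the flattening above concrete, here is a minimal Python sketch. It is not the analyzer code that search actually runs. The function name `flatten` and the sample strings are hypothetical. It also assumes that the zero width joiner (U+200D) and zero width non-joiner (U+200C) should be dropped along the way, which Python's standard library does not do on its own, so the sketch strips them explicitly.

```python
import unicodedata

# Assumption: only ZWNJ (U+200C) and ZWJ (U+200D) are removed here. A full
# NFKC-plus-case-folding implementation may drop other ignorable characters too.
ZERO_WIDTH = {0x200C: None, 0x200D: None}


def flatten(term: str) -> str:
    """Approximate 'NFKC with case folding tacked on the end' for one term."""
    # Compatibility-normalize first, then fold case.
    folded = unicodedata.normalize("NFKC", term).casefold()
    # Drop the zero width (non-)joiners so they cannot split or distinguish words.
    stripped = folded.translate(ZERO_WIDTH)
    # Re-normalize in case the steps above left a non-normalized sequence behind.
    return unicodedata.normalize("NFKC", stripped)


# Hypothetical variants of one word: plain, with ZWNJ, with ZWJ, and full-width.
variants = ["ZeroWidth", "Zero\u200cWidth", "ZERO\u200dWIDTH", "ＺｅｒｏＷｉｄｔｈ"]
flattened = {flatten(v) for v in variants}
```

With these inputs every variant collapses to a single form, so `flattened` ends up holding just one string. That is the sense in which the variants become "the same" word to search.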

Now some choices:

1. Do I do this on both of the analyzers that we use for text, or just the less exact one? If I only do the less exact one, then words that match without normalization will bubble above those that match only with normalization. Also, by default, "quoting" a word will not find its normalized forms. I'm leaning towards adding the normalization to both analyzers for this reason.
2. Should I add this to all languages, most languages, just languages for which I don't have a good default, or just languages that ask for it? Note that I'm actually waiting on a change upstream to enable me to add things to "all" or "most" languages.
3. Other stuff?