This is the weekly update for the week starting 2018-06-04
- After lots of talk about stemmers getting committed and plugins getting deployed, the Slovak-language wikis have finally been *reindexed*, and stemming is now happening on the Slovak wikis!
Search—Time Machine Edition
A few things from May that got missed:
- Trey wrote up some potential applications of natural language processing (NLP) to on-wiki search. We're still going through them to pick out a couple that we'll turn into projects, probably next quarter. Right now, spelling correction and entity extraction are high on the list, but more questions, comments, and suggestions are welcome.
- Erik pulled 90 days' worth of regular expression (regex) searches across all wikis, and Trey did a quick survey of the most common patterns. There are a lot more regex searches than we thought (5.6 million in 90 days!), and three apparently automated processes (bots, apps, or tools of some kind) are responsible for more than 90% of them.