Wikimedia Discovery/Meetings/Search retrospective 2017-01-25
Note that "mad" and "sad" don't have to mean literally angry or saddened. They can be used in a playful way as well.
The previous retrospective was a Team Health Check: https://www.mediawiki.org/wiki/Wikimedia_Discovery/Meetings/Search_team_health_check_2016-11-30
Action items from the previous retro:
- Dan: work with apps team on full-text searching thing -- Done (passed along info)
- Dan: helping slow moving UI work move faster -- Seems to be going faster now, although not because of specific action by Dan
What has happened (since 2016-10-27)?
(Mostly pulled from Discovery/Status updates)
- Inter-wiki search progress (http://sistersearch.wmflabs.org/ )
- Closed many backlog tasks based on earlier work
- EPIC: Review current ElasticSearch configuration, and use relevance lab to run tests to optimise the configuration to improve search result relevance
- Implement a new fulltext query
- Image search by file size and file type
- BM25 is now enabled on the ten wikis with largest traffic.
- Determined goals for the upcoming quarter
- A workshop was held in Germany on advanced search syntax on Wikipedia
- Load tests for cross-project searching were completed successfully.
- We've put together a draft proposal for how to deal with the interaction of all the possible additional search options
- Upgraded to Java 8
- The time needed to restart our Elasticsearch clusters is improving (T145065)
- Dev Summit/All Hands
- Secondary search results are now possible over the API!
- Finalized the second BM25 testing analysis
- Finished writing up, summarizing, and recommending extensive changes to TextCat for language identification
- Refactoring and cleanup, including moving phan to Jenkins
- Guillaume is investigating I/O performance of Elasticsearch servers
- New Elasticsearch and WDQS servers racked and (almost) configured
- Katie asked us about how we manage quality
- Created a new search/Learn to Rank (LTR) plugin (not prod-ready)
- Various investigations of using dynamic Bayesian networks for estimating relevance
What has made you mad?
- Timeout issues with insource (regex search) in production. Still not fixed.
- Google Hangouts being stupid and slow +1 (oh yes)
- (Google hangout is SOO much better than the proprietary solution I was used to)
- My untameable hair
- My webcam not working after OS update (mine too - had to get a new one from tech)
- Security patch breaking production; something is missing in the process
- Maybe we should nudge a bit to see if the process could be improved
What has made you sad?
- Realizing there is so much I do not understand about disk IO
- Mikhail was unable to attend today's retro
- Yuri leaving Discovery (no longer staff; remaining as a volunteer)
- Discussion on quality died too soon (was it wrapped up or just left hanging?)
- This was from a question from Katie. Seems to have just been left hanging. Maybe we are happy?
- Main focus of Katie may have been regarding other teams
- Results of initial learning to rank experiments were only promising for popular queries (because it was trained on popular queries)
- Kerfuffle over Interactive Team (and timing of announcement + vacation)
- sticking my nose back in GC tuning (I already know this is going to eat time like crazy)
- BM25 being delayed by technical difficulties (rightly so, but still disappointed)
- Elasticsearch major upgrades still require full cluster restart
- That's the ES plan, so not something we can necessarily fix
- Somewhat weak attendance (by everyone else, not Discovery people) at the Discovery quarterly review... hard to interpret whether this is confidence in our work or indifference to it
- Reading was more full
- Probably doesn't even mean anything, honestly
What has made you glad?
- Seeing people in person at the All Hands! +1 +1+1+1+1+1
- A lot of good planning and designing stuff done @ dev summit & all hands
- Eating In-n-Out burger during the All Hands! +1 (note: In-n-Out != White Castle) (so true!)
- Starting to understand *some* things about disk IO
- Seeing the progress on the Labs instance for the front end/back end of the new sister search +1
- Talking with other PMs (outside of Discovery) about how to improve discovery of articles written in languages other than English
- The excellent conversation and documentation around solving problems and researching solutions (a.k.a. Trey's notes) +that!+1+1+1
- Knowing it is possible to restart an Elasticsearch node in < 2 minutes
- Big improvements in TextCat performance (on test data, but still) +1
- ES 2 -> 5 doesn't seem to be as big a change as 1 -> 2
- Quarterly review went well and the audience seemed excited about the new search stuff coming out in Q3 +1
- Getting more involved with the (entire) Search team again (Deb) +1
- Dan's untamable hair
What else is on your mind?
- Do we have a longer-term maintenance plan for things the Interactive team has been doing (maps, graphoid, etc.)?
- Will Discovery continue to make sense if we have just 2 projects, one of which only has Stas working on it?
- We still have portal and analysis as well
- We're not aware of any thoughts or discussions about Discovery going away
- Wasn't aware of stuff happening with the interactive team until the announcement
- Doesn't directly affect my work
- Says something about Discovery that one team is being disbanded and that isn't affecting the other teams. Low cohesion.
- Not unique to Discovery. Even worse in reading, since they have a micro-vertical. Same in editing.
- Being remote leads to hearing less gossip (not that gossip is a good thing)
- These teams (search and interactive) were pretty siloed. Maybe we should look for ways for members to support other teams.
- Remoteness is not actually a huge factor in these kinds of things, so this is a broader communication issue
- People in the office might have noticed other people looking unhappy
- Seemed like it happened so quickly
- It had been brewing for months
- Some of it had been documented
- Communication is bidirectional. The interactive team didn't communicate out as much as they could have.
- We used to have all-Discovery retrospectives every month
- We stopped due to scheduling problems, retros getting longer, and more conversations not relevant to most people in the room
- Maybe all-Discovery retros would have shared knowledge of some of the issues with the interactive team
- If more of us had known about problems, how might we have helped?
- We don't have a lot of people moving between teams or asking for help from other teams (which has pros and cons)
- Do we have a plan for long-term maintenance of the work the interactive team was doing?
- Plans are being made; discussions are active
- We chose an imperfect early announcement rather than a later perfect one.
Action items:
- Stas: Look into (talking about) improving how security patches are handled
- Kevin: Consider scheduling a work-centric version of the unmeeting
- or maybe a hangout that's always on? or on for a period of time to talk about stuff?
- Maybe every 4th unmeeting?
- Maybe allow non-tech work conversations in unmeetings?