- The Lift Wing server cluster continues to be configured. Hopefully more good news on that soon.
- The team is starting to formalize a process for AI model governance.
- Our Google Summer of Code candidates are beginning to submit applications for review. The GSoC fellows will work on retraining ORES models on Lift Wing.
About this board
A liveblog and forum about machine learning at Wikimedia. Create new topics to ask questions or post updates.
Weekly Team Update 2021-04-13
Weekly Team Update 2021-04-06
- We've achieved "Hello World" from the Lift Wing cluster. This means we are in the final push to get that cluster up and running. Kudos to the ML team's SREs and advice from volunteers.
- We've started to formalize our AI model governance work, putting together a plan for the coming year.
- We are applying to Google Summer of Code for a fellow to help retrain some models currently on ORES for Lift Wing.
Updated Team Homepage
The updated team homepage is now live. This is just the start of a much larger expansion of the public information about our infrastructure, projects, and models. The goal of the effort is to lower the barriers to collaboration and increase the transparency of the team and its work.
Weekly Team Update 2021-03-22
- Kubernetes cluster work continues. The team is working on establishing the Istio and Knative services as the next layers in the stack. These are critical components for KFServing. Progress will be slower for the rest of March due to staffing events.
- Google Cloud Compute credits! As part of our migration to Kubeflow, the Machine Learning team has been approved for some Google Cloud Compute credits to run a development Kubeflow instance for learning and experimentation by ourselves and the community. This will unblock some folks while Kubeflow is being deployed. There is no plan to use Google Cloud Compute in production.
- Annual planning is now in full swing. The team is discussing priorities and narrowing them down to a set we think is ambitious but achievable in the next financial year. More on that soon.
Weekly Team Update 2021-03-16
- Lift Wing work continues. The worker nodes are now set up and we are working on a "Hello World" using the Lift Wing cluster. This continues our steady progress.
- We're starting conversations on AI model governance both within the Foundation and the community.
- We are exploring how best to migrate the functionality of the models on ORES into Lift Wing, whether through a direct migration or a retraining using the original training data.
Followup on suggestion for Recent Changes log
Hi everyone. A little over a year ago I shared an idea for giving Recent Changes patrollers the option to see changes that had probably been submitted at times when very few reverts took place: https://en.wikipedia.org/wiki/Wikipedia_talk:STiki/Archive_27#Feature_request_by_Clayoquot Are there any updates on this? I'm wondering if it would help if I try to move it forward through community feature request channels, such as by starting a discussion on the English Wikipedia Village Pump.
Hey Clayoquot! Unfortunately I don't have any updates for you! At least not that I know about. I would recommend Village Pump and I'll keep track of it from my end! Sorry I couldn't help more!
Hi. No worries, thanks for getting back to me so quickly. I've put a proposal here: https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical) . Looking forward to the discussion.
Experimenting With Livestreamed Public Office Hours
One of the things we've been working on is lowering the barriers to interacting with the Machine Learning team. As part of that, we've been experimenting with weekly live office hours. During an impromptu 30-minute test stream, around 30 folks showed up and discussed our infrastructure. In the future we will post times and have a regular public cadence.
There is a question of whether Twitch or YouTube is the better platform. Let us know what you think below.
I would prefer to see this content be streamed and also available on YouTube because it's a site that our firewall allows us access to versus Twitch.
Awesome, thanks. Yeah, I think YouTube is probably the best place, especially because Twitch deletes the videos after 14 days.
I think YouTube works well for now. If we find other (or multiple) places that the community prefers, we can multi-stream using something like OBS Studio.
Model Inventory and Reporting
We are currently in the early stages of discussing AI/ML model governance on the Machine Learning team. An important concept in this space is model reporting: providing a single point of reference for discovering all ML models in production. Previously, there was the ores-support-checklist; however, it required manual maintenance and was often out of date. We need an automated solution that provides both a single view of all models in production (a model registry) and a view containing detailed, up-to-date information about each model.
Our first step in addressing this issue was to take a current inventory of all models and gather data about what they do, the target language, and when they were last trained/tested/deployed. We produced a public CSV file with this data, which is available in this ticket: https://phabricator.wikimedia.org/T275709
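As a rough illustration of how an inventory like this could be queried, here is a minimal Python sketch. The column names (`model`, `language`, `last_trained`) and the sample rows are assumptions made up for the example; the actual CSV attached to the ticket may be laid out differently.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows with made-up dates; the real CSV's columns may differ.
SAMPLE_CSV = """model,language,last_trained
damaging,en,2020-01-15
goodfaith,en,2020-01-15
damaging,de,2019-06-02
articlequality,fr,2018-11-20
"""

def load_inventory(text):
    """Parse inventory CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def models_per_language(rows):
    """Count how many models target each language."""
    return Counter(row["language"] for row in rows)

def stale_models(rows, cutoff):
    """Return rows last trained before the ISO date `cutoff`."""
    # ISO 8601 dates compare correctly as plain strings.
    return [row for row in rows if row["last_trained"] < cutoff]

rows = load_inventory(SAMPLE_CSV)
print(models_per_language(rows))
print([(r["model"], r["language"]) for r in stale_models(rows, "2019-12-31")])
```

A query like `stale_models` is the kind of check that was hard to keep current with a manually maintained checklist.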
Going forward, we plan to experiment with the idea of using wiki pages as living documentation for each of our models. There is some prior art in this area. One interesting approach we are looking at is called model cards (Mitchell et al. 2018). We have a ticket open for exploring these ideas a bit further: https://phabricator.wikimedia.org/T276398
How are you motivating your teams to keep these model cards or any related metadata up to date?!!
We think it is going to be a mix of manual and automated. Imagine a Wikipedia (technically MediaWiki.org) page for each model in production. Some of the page is populated with manually written text, while other parts are live-updated from the latest training of the model (AUC curves, etc.).
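To make the manual-plus-automated idea concrete, here is a minimal Python sketch that merges hand-written text with metrics from a training run into wikitext. The page layout, the `render_model_card` helper, and the metric names are all hypothetical; this is not an existing Wikimedia template or tool, and the values are made up.

```python
from string import Template

# Hypothetical page layout; not an actual Wikimedia template.
CARD_TEMPLATE = Template("""== $name ==
$description

=== Latest training run (auto-generated) ===
* Trained on: $trained_on
* ROC AUC: $roc_auc
""")

def render_model_card(name, description, metrics):
    """Combine hand-written text with metrics from the latest training run."""
    return CARD_TEMPLATE.substitute(
        name=name,
        description=description,
        trained_on=metrics["trained_on"],
        roc_auc=f"{metrics['roc_auc']:.3f}",
    )

page = render_model_card(
    name="enwiki damaging",
    description="Manually written summary of what the model predicts.",
    metrics={"trained_on": "2021-03-01", "roc_auc": 0.91234},  # made-up values
)
print(page)
```

In a setup like this, only the auto-generated section would be rewritten after each retraining, so the manually written description stays under editors' control.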
Feedback On This Talk Page Wanted!
The Machine Learning team is trying something new and using this talk page as a forum / blog for interaction with the community. We'd love to hear people's feedback on this format, especially in comparison to, say, a mailing list or even IRC.
Awesome! Thanks for the help!
Machine Learning Team 18 Month Roadmap
Here is the slide deck for the Machine Learning team's 18-month roadmap. I've made a few changes to allow for greater accessibility. We would love to hear your thoughts and questions.