Topic on Talk:Machine Learning

Machine Learning Weekly Update Nov 30, 2022

CAlbon (WMF)
  • NLLB-200 deployment
    • Major progress continues on getting the NLLB-200 deployment live and running for the Content Translation Tool. I am confident we will make the January 1st deadline.
  • API Gateway
    • Tobias and Hugh made major progress over the last two days and are working on a patch that will allow the API Gateway to be used with Lift Wing. Specifically, once the patch is tested and rolled out, Lift Wing will effectively be silently soft-launched on the API Gateway, making over 100 machine learning models available to everyone. The patch should be pushed to production within a few days.
    • After the patch is released I will start publishing tutorials on getting started with Lift Wing, and will ask folks both inside WMF and in the community to start experimenting to help find bad user experiences and technical bugs.
  • Add-A-Link
    • Steady progress on the Add-A-Link models. Kevin continues to train and deploy new models while evaluating their performance.
  • Model Cards
    • We are holding weekly standup meetings on model cards as we start to make them. We should have the first model cards published within the next two weeks.
  • Lift Wing
    • The current focus of the Lift Wing work is on model performance and the k8s 1.23 upgrade.
    • Model Performance: Some of the larger models we are working on with the Research team are ~4GB when loaded, which is pushing prediction times over ten seconds. This is obviously too slow for any real-world use case, so we are exploring a range of strategies to find the best improvement, including breaking the large models into smaller ones, optimizing the structure of the models, and increasing the number of pods.
    • K8s 1.23 upgrade: Luca and Yannis are working through a list of tasks as part of the 1.23 upgrade. We are making solid progress, but it is a major task.
  • DSE Cluster
    • Work has now started on tackling what we have called the “Kerbarrier”, which is the fact that Kubernetes and Hadoop use very different security models. Kubernetes uses a certificate-based approach, while Hadoop uses a symmetric-key cryptographic approach (called Kerberos). Building a way to bridge the gap between these two approaches, so that nodes on the clusters can access HDFS, is a major challenge we have known we would need to solve eventually, and it is one of the reasons for starting the DSE Cluster experiment.