My experience with ORES was not encouraging. I had applied for ORES to be initiated on sw:wp. At the preliminary step of creating a blacklist, the algorithm identified predominantly English words rather than unacceptable Swahili ones. And then, I had too many other (more?) worthwhile projects on my hands (and no other enthusiastic cooperators to go through stuff manually) ... Despite all the rhetoric about supporting small, local, indigenous, vernacular, minority, under-represented and/or endangered languages, internet technology is still largely anglocentric :(
Topic on Talk:JADE
ORES needs to support more languages. +1 for Swahili support.
Hi Baba Tabita. I'm sorry you seem to be having a bad time with ORES. I remember reaching out to you to ask for help with alternative means of getting blacklists for ORES to work with. We've had to do that for a few languages. As far as I know, we've been waiting on that. FWIW, our BWDS system picked up English words for Swahili because many of the damaging edits in Swahili wiki add English-language content. So that's not really a bug but rather a limitation in our process for auto-detecting badwords using wiki edits. It does, however, tell us that an English-language dictionary would be useful for damage detection in Swahili wiki.
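To illustrate the limitation described above, here is a minimal sketch of one way badword candidates could be mined from wiki edits. It is not the actual BWDS implementation: the function names `tokenize` and `candidate_badwords`, the smoothed log-ratio scoring, and the toy edits are all assumptions made for illustration. It ranks words by how much more often they appear in text added by reverted edits than in text added by kept edits.

```python
import math
from collections import Counter


def tokenize(text):
    """Very rough tokenizer: lowercase and split on whitespace."""
    return [word.lower() for word in text.split()]


def candidate_badwords(reverted_added, kept_added, top_n=5, smoothing=1.0):
    """Rank words that are over-represented in reverted additions.

    Hypothetical scoring: log of the smoothed relative frequency in
    reverted text divided by the smoothed relative frequency in kept text.
    """
    reverted = Counter(w for text in reverted_added for w in tokenize(text))
    kept = Counter(w for text in kept_added for w in tokenize(text))
    reverted_total = sum(reverted.values())
    kept_total = sum(kept.values())
    vocab_size = len(set(reverted) | set(kept))

    scores = {}
    for word, count in reverted.items():
        p_reverted = (count + smoothing) / (reverted_total + smoothing * vocab_size)
        p_kept = (kept[word] + smoothing) / (kept_total + smoothing * vocab_size)
        scores[word] = math.log(p_reverted / p_kept)

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Toy data (invented): the vandalism is in English, the good edits in Swahili.
reverted_added = ["you are stupid", "stupid page lol", "this is stupid lol"]
kept_added = [
    "Nairobi ni mji mkuu wa Kenya",
    "Kiswahili ni lugha ya Kibantu",
    "mji huu ni mkubwa",
]

result = candidate_badwords(reverted_added, kept_added, top_n=3)
print(result)
```

Under these assumptions every candidate comes from the English vandalism, including harmless words such as "are", because any word that appears only in reverted edits looks suspicious to a purely frequency-based method.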
Regardless, I'm not sure that you could accuse us of not matching our rhetoric: we work with anyone who will work with us. I believe we have been quick to respond to your questions and to make suggestions about next steps. Also, FWIW, we're not focused on small, local languages. We're focused on supporting growing communities -- small language or not! See m:Community Engagement/Defining Emerging Communities.
In the end, there are only two of us staffing a team that is doing something Wikimedia has never done before, so I hope you'll understand that we need your help in order to support your wiki. I have lots of other wikis and operational concerns to track, and I put in a lot of volunteer time to make this experiment in community AI resources work at all.
Currently, we have support for 34 languages -- at least 11 of which are not heavily used in the Western world -- so I think we're doing OK in most other instances.
Thanks for the timely reply!
And sorry if I came across as accusing or complaining; neither was intended. I think you *are* doing a great job. It's just that circumstances are so much more favourable for European languages than, say, for African ones. I'm not a tech guy, so that adds to my frustration at not being able to do what I would like for Swahili localization in the little time available to me. Definitely not your fault! Still, just saying ...
Best wishes, and please keep up the good work!
Thanks Baba Tabita. Maybe I could look up some supposed bad word lists to help get us started. I'll ping on the ticket.