ORES/zh

ORES (/ɔɹz/, Objective Revision Evaluation Service) is a web service and API that provides machine learning as a service for Wikimedia projects. It is maintained by the Scoring Platform team. The system is designed to help automate critical wiki-work, for example vandalism detection and removal. Currently, the two general types of scores that ORES generates are in the context of "edit quality" and "article quality".

ORES is a back-end service and does not directly provide a way to make use of the scores. If you'd like to use ORES scores, check our list of tools that use ORES scores. If ORES doesn't support your wiki yet, see our instructions for requesting support.

Looking for answers to your questions about ORES? Check out the ORES FAQ.

Edit quality
One of the most critical concerns about Wikimedia's open projects is the review of potentially damaging contributions ("edits"). There's also the need to identify good-faith contributors (who may be inadvertently causing damage) and offer them support. These models are intended to make the work of filtering through the Special:RecentChanges feed easier. We offer two levels of support for edit quality prediction models: basic and advanced.

Basic support
Assuming that most damaging edits will be reverted and that constructive edits will not be, we can build a model from a wiki's history of edits (and reverted edits). This model is easy to set up, but it suffers from the problem that many edits are reverted for reasons other than damage and vandalism. To mitigate that, we create a model based on bad words.


 * – predicts whether an edit will eventually be reverted
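As a rough illustration of the idea above, the sketch below turns a revert history into labeled training rows with a single bad-word feature. It is not the ORES implementation: the word list, the function names, and the sample edits are all hypothetical, and a real model would use many more features.

```python
import re

# Hypothetical word list; a real list would be curated per language.
BAD_WORDS = {"stupid", "idiot"}


def bad_word_count(added_text):
    """Count tokens in the added text that appear in BAD_WORDS."""
    tokens = re.findall(r"\w+", added_text.lower())
    return sum(1 for token in tokens if token in BAD_WORDS)


def build_training_rows(edits):
    """Turn (added_text, was_reverted) pairs into (features, label) rows."""
    return [
        ({"bad_words": bad_word_count(text)}, was_reverted)
        for text, was_reverted in edits
    ]


# Made-up edit history: the label is simply "was this edit reverted?".
history = [
    ("Added a citation to the 2010 census.", False),
    ("this page is STUPID and the author is an idiot", True),
]
rows = build_training_rows(history)
```

Because the label is only "was reverted", edits reverted for reasons unrelated to damage end up labeled the same way as vandalism, which is the weakness of basic support described above.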

Advanced support
Rather than relying on assumptions, we can ask editors to train ORES on which edits are in fact damaging and which edits look like they were saved in good faith. This requires additional work on the part of volunteers in the community, but it affords a more accurate and nuanced prediction of the quality of an edit. Many tools will only function when advanced support is available for a target wiki.


 * – predicts whether or not an edit causes damage
 * – predicts whether an edit was saved in good faith

Article quality
The quality of Wikipedia articles is a core concern for Wikipedia. New pages must be reviewed and curated to ensure that spam, vandalism, and attack articles do not remain in the wiki. For articles that survive the initial curation, some Wikipedians periodically evaluate the quality of articles, but this is highly labor intensive and the assessments are often out of date.

Curation support
The faster that seriously problematic articles and drafts are removed, the better. Curating new page creations can be a lot of work. Like the problem of counter-vandalism in edits, machine predictions can help curators focus on the most problematic new pages first. Based on comments left by admins when they delete pages (see the logging table), we can train a model to predict which pages will need quick deletion. See en:WP:CSD for a list of quick deletion reasons for English Wikipedia. For the English model, we used G3 "vandalism", G10 "attack", and G11 "spam". For Chinese Wikipedia, see 维基百科:快速删除方针; for the Chinese model, we use G3 "pure vandalism" (which also covers personal attacks), G11 "advertising or promotion", and G12 "unsourced negative biographies of living persons".


 * – predicts whether an article will need to be speedily deleted (spam, vandalism, attack, or OK)

Assessment scale support
For articles that survive the initial curation, some of the large Wikipedias periodically evaluate the quality of articles using a scale that roughly corresponds to the English Wikipedia 1.0 assessment rating scale ("articlequality"). Having these assessments is very useful because it helps us gauge our progress and identify missed opportunities (e.g., popular articles that are low quality). However, keeping these assessments up to date is challenging, so coverage is inconsistent. This is where the machine learning model comes in handy. By training a model to replicate the article quality assessments that humans perform, we can automatically assess every article and every revision with a computer. This model has been used to help WikiProjects triage re-assessment work and to explore the editing dynamics that lead to article quality improvements.

The articlequality model bases its predictions on structural characteristics of the article. For example: How many sections are there? Is there an infobox? How many references? Do the references use a cite template? The articlequality model doesn't evaluate the quality of the writing or whether there's a tone problem (e.g. a point of view being pushed). However, many of the structural characteristics of articles seem to correlate strongly with good writing and tone, so the models work very well in practice.


 * – predicts the (Wikipedia 1.0-like) assessment class of an article or draft
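To make the notion of structural characteristics concrete, here is a minimal sketch that counts sections, infoboxes, references, and cite templates in raw wikitext using regular expressions. The function name, the regular expressions, and the sample text are illustrative assumptions; the actual features used by the model are not described here and are likely more extensive.

```python
import re


def structural_features(wikitext):
    """Count a few structural signals in raw wikitext (illustrative only)."""
    return {
        # Headings look like "== Title ==" (two or more "=") on their own line.
        "sections": len(
            re.findall(r"^==+[^=].*?==+[ \t]*$", wikitext, flags=re.MULTILINE)
        ),
        # Any template whose name starts with "infobox".
        "has_infobox": bool(
            re.search(r"\{\{\s*infobox", wikitext, flags=re.IGNORECASE)
        ),
        # Opening <ref> tags, with or without attributes.
        "references": len(re.findall(r"<ref[\s>]", wikitext, flags=re.IGNORECASE)),
        # Templates such as {{cite book ...}} or {{cite web ...}}.
        "cite_templates": len(
            re.findall(r"\{\{\s*cite\s", wikitext, flags=re.IGNORECASE)
        ),
    }


# Made-up sample article text.
sample = """{{Infobox person|name=Ada Lovelace}}
Ada was a mathematician.<ref>{{cite book|title=Ada}}</ref>

== Early life ==
She was born in London.<ref name="b">A plain reference</ref>

== Work ==
=== Notes ===
"""
features = structural_features(sample)
```

Note that nothing in these features looks at the prose itself, which matches the limitation described above: writing quality and tone are not measured directly.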

Topic routing


ORES' article topic model applies an intuitive top-down taxonomy to any article in Wikipedia, even new article drafts. This topic routing is useful for curating new articles, building work lists, forming new WikiProjects, and analyzing coverage gaps.

ORES topic models are trained using word embeddings of the actual content. For each language, a language-specific embedding is learned and applied natively. Since this modeling strategy depends on the topic of the article, topic predictions may differ between languages depending on the topics present in the text of the article.

Curation support


The biggest difficulty with reviewing new articles is finding someone familiar with the subject matter to judge notability, relevance, and accuracy. Our model is designed to route newly created articles based on their apparent topical nature to interested reviewers. The model is trained and tested against the first revision of articles and is thus suitable to use on new article drafts.


 * – predicts the topic of a new article draft

Topic interest mapping


The topical relatedness of articles is an important concept for the organization of work in Wikipedia. Topical working groups have become a common strategy for managing content production and patrolling in Wikipedia. Yet a high-level hierarchy is not available or query-able for many reasons. The result is that anyone looking to organize around a topic or make a work-list has to do substantial manual work to identify the relevant articles. With our model, these queries can be done automatically.


 * – predicts the topic of an article

Support table
The ORES support table reports the status of ORES support for each wiki and the models available. If you don't see your wiki listed, or support for the model you'd like to use, you can request support.

API usage
ORES offers a RESTful API service for dynamically retrieving scoring information about revisions. '''See https://ores.wikimedia.org for more information on how to use the API.'''

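If you're querying the service about a large number of revisions, it's recommended to batch no more than 50 revisions within each request, as described below. Up to 4 parallel requests are accepted. Please do not exceed these limits, or ORES can become unstable. For even larger numbers of queries, you can run ORES locally.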

Example query: http://ores.wmflabs.org/v3/scores/enwiki/?models=draftquality|wp10&revids=34854345|485104318

Example query: https://ores.wikimedia.org/v3/scores/wikidatawiki/421063984/damaging
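The batching advice can be sketched as follows. This is a minimal example that only builds the request URLs, following the pattern of the first example query; the helper names `batches` and `score_url` and the revision ID range are hypothetical, and no network request is made.

```python
BASE_URL = "https://ores.wikimedia.org/v3/scores"
MAX_BATCH = 50  # recommended upper bound on revisions per request


def batches(rev_ids, size=MAX_BATCH):
    """Yield consecutive slices of rev_ids with at most `size` items each."""
    for start in range(0, len(rev_ids), size):
        yield rev_ids[start:start + size]


def score_url(wiki, models, rev_ids):
    """Build a scores URL with pipe-separated model names and revision IDs."""
    model_part = "|".join(models)
    rev_part = "|".join(str(rev_id) for rev_id in rev_ids)
    return f"{BASE_URL}/{wiki}/?models={model_part}&revids={rev_part}"


# Hypothetical revision IDs, split into requests of at most 50 each.
rev_ids = list(range(1000, 1120))
urls = [
    score_url("enwiki", ["draftquality", "wp10"], batch)
    for batch in batches(rev_ids)
]
```

To stay within the limit of 4 parallel requests, these URLs could be fetched with a bounded worker pool, for example `concurrent.futures.ThreadPoolExecutor(max_workers=4)`.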

EventStream usage
The ORES scores are also provided as an EventStream at https://stream.wikimedia.org/v2/stream/revision-score
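Assuming the stream is delivered as server-sent events (one or more `data:` lines followed by a blank line), a consumer could decode events along these lines. The sketch parses an in-memory list of lines rather than opening a connection, and the payload fields in the sample (`database`, `rev_id`) are hypothetical placeholders, not the documented event schema.

```python
import json


def iter_sse_data(lines):
    """Yield the decoded JSON payload of each server-sent event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line terminates the event; multi-line data is joined.
            yield json.loads("\n".join(buffer))
            buffer = []


# Hypothetical sample; real events carry more fields.
sample_stream = [
    ": this is a comment line and is ignored\n",
    "event: message\n",
    'data: {"database": "enwiki", "rev_id": 123}\n',
    "\n",
]
events = list(iter_sse_data(sample_stream))
```

In practice the `lines` argument would be the decoded lines of a streaming HTTP response from the URL above.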

Local usage
To run ORES locally, you can install the ORES package with:

Then you should be able to run it with:

You should see output of