Core Platform Team/Initiative/Image Suggestion API/Epics, User Stories, and Requirements

Image Recommendation API (Proof of Concept)

List Unillustrated Articles and their image suggestions

  • As a developer, when I make a request to the Image Recommendation API,
    • I expect to see a list of unillustrated articles and their image suggestions
      • The list should contain at most 10 images per page requested
      • Of those images, at most 3 should come from the ImageMatching algorithm; the remaining images (7 to 10 when the list is full) should come from MediaSearch
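
The caps above can be sketched as a small merge step. This is only an illustration under stated assumptions: the function name, the use of plain file-name strings, and the duplicate-skipping behaviour are hypothetical and are not part of the requirements.

```python
from typing import List

MAX_IMAGES_PER_PAGE = 10     # cap stated in the requirement above
MAX_FROM_IMAGE_MATCHING = 3  # cap stated in the requirement above


def merge_suggestions(image_matching: List[str], media_search: List[str]) -> List[str]:
    """Combine two ranked lists of image file names under the stated caps.

    Hypothetical helper: both inputs are assumed to be ordered best-first.
    """
    # Take at most 3 results from the ImageMatching algorithm.
    picked = list(image_matching[:MAX_FROM_IMAGE_MATCHING])
    # Fill the remaining slots from MediaSearch, skipping duplicates
    # (de-duplication is an assumption, not a stated requirement).
    for name in media_search:
        if len(picked) >= MAX_IMAGES_PER_PAGE:
            break
        if name not in picked:
            picked.append(name)
    return picked
```

With no ImageMatching results, all 10 slots may be filled from MediaSearch; with 3 or more, MediaSearch contributes at most 7.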


List Image Recommendations for all Wikipedia languages

  • As a developer, when I make a request to the Image Recommendation API with a page title,
    • I expect to be able to make requests for all Wikipedia projects in any language
      • e.g. Arabic, Cebuano, English and Vietnamese Wikipedia
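
One way a client might address a page on a given language edition is sketched below. The path layout and the function name are hypothetical (the requirements do not define a URL scheme); only the four language codes correspond to the examples named above.

```python
from urllib.parse import quote

# Language codes for the four examples above. A real service would load the
# full list of Wikipedia language editions rather than hard-coding a subset.
EXAMPLE_LANGUAGES = {
    "ar": "Arabic",
    "ceb": "Cebuano",
    "en": "English",
    "vi": "Vietnamese",
}


def build_request_path(lang: str, title: str) -> str:
    """Return a HYPOTHETICAL request path for one page on one Wikipedia."""
    if lang not in EXAMPLE_LANGUAGES:
        raise ValueError(f"unsupported language code: {lang!r}")
    # MediaWiki page titles use underscores in place of spaces; everything
    # outside the unreserved URL characters is percent-encoded as UTF-8.
    encoded = quote(title.replace(" ", "_"), safe="")
    return f"/image-suggestions/wikipedia/{lang}/pages/{encoded}"
```

Percent-encoding the title keeps non-Latin titles (for example Arabic or Vietnamese) safe to place in a path segment.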


Provide the Image Source and Confidence Rating of an Image

  • As a developer, when I receive a list of images,
    • I expect to know the source of each recommendation
      • e.g. I see the image recommendation for the Frog page is from "Commons"
    • I expect to know the confidence rating for each image recommended per page requested
      • e.g. I see that the image for "Amazonian Tree Frog.jpg" has a confidence rating of "high"
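
A response item carrying both fields could be modelled roughly as follows. The field names, the payload shape, and the "low" and "medium" ratings are assumptions for illustration; only "Commons" as a source and "high" as a rating appear in the examples above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Confidence(str, Enum):
    # "high" comes from the example above; the other levels are assumed.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class ImageSuggestion:
    image: str              # file name, e.g. "Amazonian Tree Frog.jpg"
    source: str             # where the suggestion came from, e.g. "Commons"
    confidence: Confidence  # rating attached to this suggestion


def to_payload(page: str, suggestions: List[ImageSuggestion]) -> dict:
    """Serialize the suggestions for one page (hypothetical payload shape)."""
    return {
        "page": page,
        "suggestions": [
            {"image": s.image, "source": s.source, "confidence": s.confidence.value}
            for s in suggestions
        ],
    }
```

For the Frog example, `to_payload("Frog", [ImageSuggestion("Amazonian Tree Frog.jpg", "Commons", Confidence.HIGH)])` yields one suggestion whose source is "Commons" and whose confidence is "high".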


Limit the Number of Image Recommendations Per Article Request

  • As a developer, when I provide a parameter to limit the number of image recommendations per page,
    • I expect to get somewhere between 1 and 10 images recommended per page requested
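
The 1 to 10 range above might be enforced with a validation step like the one below. The parameter name `limit`, the default of 10 when it is omitted, and the choice to reject out-of-range values rather than clamp them are all assumptions.

```python
from typing import Optional

MIN_LIMIT = 1   # lower bound from the requirement above
MAX_LIMIT = 10  # upper bound from the requirement above


def parse_limit(raw: Optional[str] = None) -> int:
    """Validate a hypothetical `limit` query parameter."""
    if raw is None:
        # Assumed default: return the maximum when no limit is supplied.
        return MAX_LIMIT
    try:
        value = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"limit must be an integer, got {raw!r}") from None
    if not MIN_LIMIT <= value <= MAX_LIMIT:
        raise ValueError(
            f"limit must be between {MIN_LIMIT} and {MAX_LIMIT}, got {value}"
        )
    return value
```

Rejecting invalid values makes client mistakes visible; clamping to the nearest bound would be an equally plausible design.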

Non-Functional Requirements

  • Authorization/Authentication
  • Performance Metrics
  • API Product Metrics
    • API Usage
    • Unique API Customers
  • Data metrics
    • As a member of the Platform Team, I want the Image Recommendation data pipeline to respect system and data quality SLOs.
    • System
      • Spark sinks (in / out records, CPU usage, memory usage, executor counts)
    • Datasets
      • Summary of population statistics (purpose: identify regressions, population/model drift, anomaly detection)
      • Size and counts of intermediate and final datasets (purpose: identify regressions)
  • ML Metrics
    • Accuracy by
      • Method (ImageMatching algorithm, MediaSearch)
      • Sources (Wikidata, Commons, etc.)
    • Recommendations resulting in
      • Rejections
      • Applied Edits
      • Skips
  • Documentation
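
The ML metrics listed above could be tallied from feedback events along these lines. The event shape, the outcome labels, and in particular the accuracy definition (applied edits divided by applied plus rejected, with skips excluded) are assumptions; the requirements do not define how accuracy is computed.

```python
from collections import Counter, defaultdict

# Outcome labels mirror the three results listed above (hypothetical spelling).
OUTCOMES = ("applied", "rejected", "skipped")


def summarize(events):
    """Count outcomes per method and per source.

    `events` is an iterable of (method, source, outcome) tuples, e.g.
    ("MediaSearch", "Commons", "applied"). This shape is an assumption.
    """
    by_method = defaultdict(Counter)
    by_source = defaultdict(Counter)
    for method, source, outcome in events:
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        by_method[method][outcome] += 1
        by_source[source][outcome] += 1
    return dict(by_method), dict(by_source)


def accuracy(counts):
    """Assumed definition: applied / (applied + rejected); skips excluded."""
    judged = counts["applied"] + counts["rejected"]
    return counts["applied"] / judged if judged else None
```

Returning `None` when nothing has been judged avoids reporting a misleading 0% for a method or source that has only been skipped.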

