Core Platform Team/Initiative/Image Suggestion API/Epics, User Stories, and Requirements

Image Recommendation API (Proof of Concept)
List Unillustrated Articles and Their Image Suggestions


 * As a developer, when I make a request to the Image Recommendation API,
 * I expect to see a list of unillustrated articles and their image suggestions
 * The list should contain at most 10 images per page requested
 * Of those 10 images, at most 3 should come from ImageMatchingAlgorithm and the remaining 7-10 should come from MediaSearch
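One way to read the mix rule above is sketched below in Python. The function name `merge_suggestions`, its parameters, and the de-duplication step are assumptions made for illustration; the requirements only state the caps (at most 10 images in total, at most 3 from ImageMatchingAlgorithm, the rest from MediaSearch).

```python
from typing import List


def merge_suggestions(
    algorithm_images: List[str],
    media_search_images: List[str],
    total_limit: int = 10,
    algorithm_cap: int = 3,
) -> List[str]:
    """Combine suggestions from both methods under the stated caps (hypothetical helper)."""
    # Take at most `algorithm_cap` images from ImageMatchingAlgorithm.
    from_algorithm = list(algorithm_images[:algorithm_cap])
    # Fill the remaining slots (7 to 10 of them) from MediaSearch,
    # skipping images that were already picked. De-duplication is an assumption.
    remaining = total_limit - len(from_algorithm)
    from_media_search = [
        image for image in media_search_images if image not in from_algorithm
    ][:remaining]
    return from_algorithm + from_media_search
```

If MediaSearch returns fewer images than there are open slots, this sketch simply returns a shorter list rather than padding it.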

List Image Recommendations for All Wikipedia Languages


 * As a developer, when I make a request to the Image Recommendation API with a page title,
 * I expect to be able to make requests for all Wikipedia projects in any language
 * e.g. Arabic, Cebuano, English and Vietnamese Wikipedia
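The sketch below shows how a client might address the same request to different language editions. The base URL and path layout are hypothetical placeholders, since this page does not define the endpoint; only the language codes (`ar`, `ceb`, `en`, `vi`) correspond to the Wikipedias named above.

```python
from urllib.parse import quote

# Hypothetical base URL; the real endpoint is not specified on this page.
BASE_URL = "https://api.example.org/image-suggestions/v0"


def build_request_url(language_code: str, page_title: str) -> str:
    """Build a request URL for one Wikipedia language edition (illustrative only)."""
    # Accept codes such as "ar", "ceb", "en", "vi" or hyphenated ones.
    if not language_code or not language_code.replace("-", "").isalnum():
        raise ValueError(f"invalid language code: {language_code!r}")
    # Wikipedia page titles use underscores in place of spaces.
    title = page_title.replace(" ", "_")
    # Percent-encode the title so non-ASCII titles are safe in a URL path.
    return f"{BASE_URL}/wikipedia/{language_code.lower()}/pages/{quote(title, safe='')}"


for code in ("ar", "ceb", "en", "vi"):
    print(build_request_url(code, "Tree frog"))
```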

Provide the Image Source and Confidence Rating of an Image


 * As a developer, when I receive a list of images,
 * I expect to know the source that provided each recommendation
 * e.g. I see the image recommendation for the Frog page is from "Commons"
 * I expect to know the confidence rating for each image recommended per page requested
 * e.g. I see that the image for "Amazonian Tree Frog.jpg" has a confidence rating of "high"
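A response item that carries both pieces of information could look like the sketch below. The field names, the `to_response` helper, and the `"medium"` and `"low"` ratings are assumptions; the story above only mentions a source such as "Commons" and a rating of "high".

```python
from dataclasses import asdict, dataclass
from typing import Dict, List

# Only "high" appears in the story; the other levels are assumed for illustration.
CONFIDENCE_LEVELS = ("low", "medium", "high")


@dataclass
class ImageSuggestion:
    filename: str
    source: str      # where the recommendation came from, e.g. "Commons"
    confidence: str  # confidence rating for this image

    def __post_init__(self) -> None:
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"unknown confidence rating: {self.confidence!r}")


def to_response(page_title: str, suggestions: List[ImageSuggestion]) -> Dict:
    """Serialize the suggestions for one page into a plain dictionary."""
    return {"page": page_title, "suggestions": [asdict(s) for s in suggestions]}


response = to_response(
    "Frog", [ImageSuggestion("Amazonian Tree Frog.jpg", "Commons", "high")]
)
```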

Limit the Number of Image Recommendations per Article Request


 * As a developer, when I provide a parameter to limit the number of image recommendations per page,
 * I expect to receive between 1 and 10 image recommendations per page requested
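A minimal sketch of validating such a limit parameter follows. The name `parse_limit`, the default of 10 when the parameter is omitted, and the choice to reject out-of-range values (rather than clamp them) are all assumptions; the story only fixes the range of 1 to 10.

```python
from typing import Optional


def parse_limit(
    raw_value: Optional[str] = None,
    default: int = 10,
    minimum: int = 1,
    maximum: int = 10,
) -> int:
    """Parse the per-page limit parameter and enforce the 1-10 range."""
    if raw_value is None:
        # Assumed behaviour: fall back to the maximum when no limit is given.
        return default
    try:
        limit = int(raw_value)
    except (TypeError, ValueError):
        raise ValueError(f"limit must be an integer, got {raw_value!r}") from None
    if not minimum <= limit <= maximum:
        raise ValueError(f"limit must be between {minimum} and {maximum}, got {limit}")
    return limit
```

For example, `parse_limit("3")` returns `3`, while `parse_limit("0")` and `parse_limit("11")` raise `ValueError`.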

Non-Functional Requirements

 * Authorization/Authentication
 * Performance Metrics
    * As a member of the Performance Team, I want the Image Recommendation API response time to be less than or equal to 250 ms RTT (not including network latency)
    * Uptime
    * Average and Max Latency
    * Errors Per Minute
 * API Product Metrics
    * API Usage
    * Unique API Customers
 * Data Metrics
    * As a member of the Platform Team, I want the Image Recommendation data pipeline to respect system and data quality SLOs.
    * System
       * Spark sinks (in/out records, CPU usage, memory usage, executor counts)
    * Datasets
       * Summary of population statistics (purpose: identify regressions, population/model drift, anomaly detection)
       * Size and counts of intermediate and final datasets (purpose: identify regressions)
 * ML Metrics
    * Accuracy by
       * Method (ImageMatching Algorithm, MediaSearch)
       * Sources (Wikidata, Commons, etc.)
    * Recommendations resulting in
       * Rejections
       * Applied Edits
       * Skips
 * Documentation
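The latency requirement under Performance Metrics could be checked against sampled response times roughly as sketched below. The function name, the input format (a list of response times in milliseconds), and the decision to report the share of requests within the target are assumptions; the requirement itself only names the 250 ms target and the average and max latency metrics.

```python
from statistics import mean
from typing import Dict, List


def summarize_latency(samples_ms: List[float], slo_ms: float = 250.0) -> Dict[str, float]:
    """Summarize response times and compare them with the 250 ms target."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    # Count the requests that met the target response time.
    within_slo = sum(1 for sample in samples_ms if sample <= slo_ms)
    return {
        "average_ms": mean(samples_ms),
        "max_ms": max(samples_ms),
        "share_within_slo": within_slo / len(samples_ms),
    }


summary = summarize_latency([100, 200, 300, 200])
```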