User talk:CFloyd (WMF)/Feed Markup Documentation

BSitzmann (WMF) (talkcontribs)

As mentioned today on IRC, some wikis don't seem to be using the convention of making the topical/most important article link bold. Example: Hebrew wiki news page. This shows that the Topic article URL markup is important.

Reply to "Topic article URL"
MHolloway (WMF) (talkcontribs)

Overall this looks great!

I'm a little concerned, though, about the notion of an editor-specified article summary for TFA. Currently we get the article summary through RESTBase (which leverages TextExtracts under the hood). Would the RESTBase summary be superseded by the editor-specified summary, if present?

More importantly, our vandalism protection model for the feed relies on "hydrating" feed content via the summary endpoint. (See https://phabricator.wikimedia.org/T151073 for related discussion.) For something like a protected news template (maybe also On This Day, I haven't looked at that code) it's not so much a concern, but it looks like the article summary here would be specified in the article text and could be changed by anyone. With our current setup this would mean that a vandalized summary could stick around in the feed a lot longer than if the summary comes from RESTBase, since (IIUC) RESTBase purges article summaries on every page edit but this doesn't happen for content generated by MCS itself.
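For readers unfamiliar with the "hydrating" step described above, here is a minimal sketch of the idea, not the actual MCS code. It assumes feed items carry at least a `title`, and that summaries are fetched per title from a summary endpoint. The names `summary_url`, `hydrate`, and `fetch_summary` are hypothetical, and the URL pattern is an assumption for illustration. The point is that any summary text stored in the feed item is overwritten by what the summary endpoint returns at serve time.

```python
from typing import Callable, Dict, List
from urllib.parse import quote


def summary_url(domain: str, title: str) -> str:
    """Build a summary URL for a title (URL pattern assumed for illustration)."""
    encoded = quote(title.replace(" ", "_"), safe="")
    return f"https://{domain}/api/rest_v1/page/summary/{encoded}"


def hydrate(
    items: List[Dict[str, str]],
    fetch_summary: Callable[[str], Dict[str, str]],
) -> List[Dict[str, str]]:
    """Replace each item's extract with the one returned by the summary fetcher.

    `fetch_summary` is a hypothetical callable that takes a title and returns
    a dict with an "extract" key, e.g. the parsed JSON of a summary response.
    """
    hydrated = []
    for item in items:
        summary = fetch_summary(item["title"])
        # The freshly fetched extract wins over whatever the feed item carried.
        hydrated.append({**item, "extract": summary.get("extract", "")})
    return hydrated
```

Under this model, an editor-specified summary embedded in the feed content would bypass `fetch_summary` entirely, which is the source of the concern above.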

CFloyd (WMF) (talkcontribs)

I'm not sure whether this makes it or not, but the vandalism aspect is a good point.

Two things come to mind here:

1. The content of the main page is highly protected by community members, so vandalism there is much less frequent than on individual article pages.

2. Since adopting this strategy will mean using the main page as the source of truth for featured articles, I assume we need a way to invalidate the cache for the main page, right? If that's the case, shouldn't we be able to catch any changes to the content of the main page? Meaning, if a community member corrects vandalism, won't our cache of the main page be invalidated?

MHolloway (WMF) (talkcontribs)

It looks like I was mistaken about how the featured article blurb ends up on the main page -- it looks like it's entered manually in the (protected) Featured Article template (e.g., https://en.wikipedia.org/wiki/Wikipedia:Today%27s_featured_article/January_6,_2017). I'd assumed it was somehow pulled in from the page itself. If the template is where the summary markup would go, then yes, I think that would take care of the vandalism concern.

With that problem out of the picture, IMO updating the feed content becomes less urgent in general -- cached content only lives for an hour in RESTBase -- but it's true we may want a cache invalidation mechanism for main pages given how important they'd be in the feed. I don't think there's any page-specific cache invalidation mechanism currently but in principle I'm sure that's something the services team could set up.
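To make the trade-off concrete, here is a rough sketch of a cache with a one-hour lifetime plus an explicit per-page invalidation hook. This is not how RESTBase is implemented. The class `TTLCache` and its methods are hypothetical, and the idea that an edit event would call `invalidate` is an assumption about how such a mechanism might be wired up.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache: entries expire after a TTL or on explicit invalidation."""

    def __init__(
        self,
        ttl_seconds: float = 3600.0,  # one hour, matching the lifetime mentioned above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def put(self, key: str, value: str) -> None:
        """Store a value that expires ttl_seconds from now."""
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[str]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return value

    def invalidate(self, key: str) -> bool:
        """Drop an entry immediately (e.g. on a page edit); True if one was removed."""
        return self._entries.pop(key, None) is not None
```

Without `invalidate`, a corrected main page would only show up in the feed after the entry expires. With it, a page-edit event could drop the entry right away.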

Reply to "Article summary"