User talk:Daniel Kinzler (WMDE)/DependencyEngine

Tgr (WMF) (talkcontribs)

How is the caller of popDirty supposed to interpret the URL? It seems like you are baking in the assumption that the only kind of update being done here is an HTTP cache purge. What if something wants to, e.g., cache structured data from some MCR slot in memcache, and needs that purged on update? IMO it would make more sense to provide a callback instead of a URL. You would end up with something not unlike the current job queue, except that dependency tracking and the generation of dependent jobs would happen outside the queue, which would remove much of the complexity and the job count explosion.
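
To make the callback idea concrete, here is a rough sketch in Python rather than PHP. Only popDirty comes from the proposal; `DirtyQueue`, `register`, `mark_dirty`, `drain` and the resource keys are made-up names for illustration, and a real service would obviously need persistence and some serializable form of callback.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional

# All names here are hypothetical; the proposal itself only names popDirty.
PurgeCallback = Callable[[], None]


class DirtyQueue:
    """Tracks dirty resources and hands back a callback instead of a URL."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, PurgeCallback] = {}
        self._dirty: Deque[str] = deque()

    def register(self, key: str, on_dirty: PurgeCallback) -> None:
        """Associate a resource key with the action that refreshes it."""
        self._callbacks[key] = on_dirty

    def mark_dirty(self, key: str) -> None:
        # Ignore unknown resources and avoid queueing the same key twice.
        if key in self._callbacks and key not in self._dirty:
            self._dirty.append(key)

    def pop_dirty(self) -> Optional[PurgeCallback]:
        """Return the callback for the oldest dirty resource, or None."""
        if not self._dirty:
            return None
        return self._callbacks[self._dirty.popleft()]


def drain(queue: DirtyQueue) -> int:
    """Run every pending callback; returns how many were run."""
    count = 0
    callback = queue.pop_dirty()
    while callback is not None:
        callback()
        count += 1
        callback = queue.pop_dirty()
    return count
```

With this shape the caller never has to interpret anything: an HTTP purge and a memcache purge are both just callbacks registered under different keys.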

Reply to "Resource URL"

Dependency tracking vs. graph traversal

Tgr (WMF) (talkcontribs)

The description of the service is a little confusing. The text says you want a service that knows about dependencies, but the interface expects the caller to provide them. There would presumably be some kind of updater daemon that calls popDirty periodically, but then the dependencies of the popped URLs also need to be updated. Is that the responsibility of the caller (in which case, how does it know what they are?) or of the service (in which case, how does it know when the update is finished and the resource is safe to use?)?

I get the feeling multiple concepts are mixed together here. The responsibility of a dependency tracking service is to track dependencies and provide information about them. It would have interface methods for declaring / removing resources, declaring / removing dependencies between them, maybe adding callbacks for dynamically evaluated dependencies, and fetching the dependencies of a specific resource. (It would probably also track cache expiry times, as that's pretty closely related to the dependency graph.) Actually traversing the graph should be done by a different class (which probably has no reason to be a service). Also, as @Pchelolo says, there is the question of push vs. pull, and there is no reason for the service to know anything about that.
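
Roughly what I have in mind, as an in-memory Python sketch. Every class, method and resource name below is invented for illustration; nothing here is an existing MediaWiki interface, and dynamic dependency callbacks and expiry tracking are left out. The tracker only stores and reports edges, while the traversal lives in a separate function.

```python
from collections import deque
from typing import Dict, List, Set


class DependencyTracker:
    """Stores the dependency graph and answers questions about it. Nothing else."""

    def __init__(self) -> None:
        # Maps each resource to the set of resources it depends on.
        self._depends_on: Dict[str, Set[str]] = {}

    def declare_resource(self, resource: str) -> None:
        self._depends_on.setdefault(resource, set())

    def remove_resource(self, resource: str) -> None:
        self._depends_on.pop(resource, None)
        for deps in self._depends_on.values():
            deps.discard(resource)

    def declare_dependency(self, resource: str, depends_on: str) -> None:
        self.declare_resource(resource)
        self.declare_resource(depends_on)
        self._depends_on[resource].add(depends_on)

    def remove_dependency(self, resource: str, depends_on: str) -> None:
        self._depends_on.get(resource, set()).discard(depends_on)

    def get_dependencies(self, resource: str) -> Set[str]:
        """Resources that `resource` depends on directly."""
        return set(self._depends_on.get(resource, ()))

    def get_dependents(self, resource: str) -> Set[str]:
        """Resources that depend directly on `resource`."""
        return {r for r, deps in self._depends_on.items() if resource in deps}


def affected_by(tracker: DependencyTracker, changed: str) -> List[str]:
    """Breadth-first walk over everything that transitively depends on `changed`.

    Lives outside the tracker on purpose: the tracker knows the graph,
    the traversal decides what to do with it.
    """
    seen = {changed}
    order: List[str] = []
    pending = deque([changed])
    while pending:
        current = pending.popleft()
        # Sorted only to make the visiting order deterministic.
        for dependent in sorted(tracker.get_dependents(current)):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                pending.append(dependent)
    return order
```

Whether the result of `affected_by` is then pushed to workers or pulled by a daemon is a separate decision that neither piece needs to know about.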

Reply to "Dependency tracking vs. graph traversal"
Tgr (WMF) (talkcontribs)

ChangeProp does that to some extent; it's just not very well integrated with MediaWiki.

Pchelolo (talkcontribs)

ChangeProp is actually a slightly different thing.

For tracking the dependency graph, it relies on MediaWiki and fetches the dependencies from the link tables. The role of ChangeProp is more that of an update scheduler than of a dependency tracker.

Also, if I understand the "popDirty" proposal correctly, it implies a pop-based approach, something like what's currently implemented in the Redis-based Job Queue and the job runner service. Recent issues with the job queue show that this type of scheduling is hard to get right: there were cases where unbounded backlog growth on a single project affected other projects, and that wasn't easy to solve even by increasing the number of runners for that particular wiki. ChangeProp uses a push-based approach where updates are processed strictly sequentially in the order they arrive, possibly partitioned by wiki or DB shard.
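
A toy Python illustration of the partitioning point, with made-up names (`PartitionedDispatcher`, `push`, `run_round`). This is not ChangeProp's code, only a sketch of the idea: events are pushed into a per-partition FIFO, each partition is handled strictly in arrival order, and a backlog in one partition does not hold up the others.

```python
from collections import defaultdict, deque
from typing import Any, Callable, DefaultDict, Deque

# Hypothetical handler signature: (partition, event) -> None
Handler = Callable[[str, Any], None]


class PartitionedDispatcher:
    """Pushes events to a handler, sequentially within each partition."""

    def __init__(self, handler: Handler) -> None:
        self._handler = handler
        self._queues: DefaultDict[str, Deque[Any]] = defaultdict(deque)

    def push(self, partition: str, event: Any) -> None:
        """Producers push events; a partition could be a wiki or a DB shard."""
        self._queues[partition].append(event)

    def run_round(self, budget_per_partition: int = 1) -> int:
        """Handle up to `budget_per_partition` events from every partition.

        Order inside a partition is preserved, and every partition gets
        its turn no matter how long another partition's backlog is.
        """
        processed = 0
        for partition, queue in list(self._queues.items()):
            for _ in range(min(budget_per_partition, len(queue))):
                self._handler(partition, queue.popleft())
                processed += 1
        return processed
```

If one wiki pushes thousands of events and another pushes one, the second wiki's event is still handled in the first round, which is the property the shared pop-based queue had trouble providing.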

Tgr (WMF) (talkcontribs)

@Pchelolo I meant that ChangeProp has its own built-in dependency graph in the form of rules.

Reply to "changeprop?"