Change propagation

Change propagation distributes changes between services using the EventBus infrastructure. Its rules subscribe to specific topics in EventBus and execute an action (typically a templated HTTP request or a CDN purge) in response to each event.

Monitoring change propagation
A Grafana dashboard exists to monitor the EventBus and change propagation services. For EventBus, it shows general information about the system's current throughput, response timing, and load. For change propagation, it shows the execution rate and backlog of each rule, with normal processing and retries reported separately.

Rule backlog is the time between the creation of an event and the beginning of its processing. If the backlog grows over time, change propagation can't keep up with the event rate, and either concurrency should be increased or some other action taken. Backlogs can have occasional spikes, but steady backlog growth is a clear indication of a problem.
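The backlog definition above can be sketched in a few lines of Python. This is only an illustration, not how the dashboard computes its metrics: the function names and the "every sample larger than the previous one" growth check are assumptions made for the example, and a real alert would likely use a smoother trend estimate.

```python
from datetime import datetime, timezone


def rule_backlog_seconds(event_created: datetime, processing_started: datetime) -> float:
    """Backlog = time between event creation and the start of its processing."""
    return (processing_started - event_created).total_seconds()


def is_steadily_growing(samples: list[float], min_samples: int = 5) -> bool:
    """Crude (hypothetical) trend check: True only if each backlog sample
    is larger than the one before it. A single spike followed by recovery
    does not count as steady growth."""
    if len(samples) < min_samples:
        return False
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))


created = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
started = datetime(2020, 1, 1, 12, 0, 30, tzinfo=timezone.utc)
backlog = rule_backlog_seconds(created, started)  # 30 seconds between creation and processing
```

With this check, a series such as 1, 50, 2, 2, 3 (one spike) is not flagged, while 1, 2, 4, 8, 16 is.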

Setting up a new rule
The most common task on the change propagation service is setting up a new rule. Currently all rules are static and stored in the config file. On startup, the rules are read from the config and a Kafka consumer is created for each rule, as well as for its corresponding retry topic.

A rule consists of several pieces:
 * The topic property configures which Kafka topic the rule listens to.
 * General rule configuration properties let you configure features such as retries, ignoring errors, and delays.
 * The match and match_not fields limit rule execution to a specific subset of the events in the topic. For example, for a domain-specific action you would add a regex match on the relevant event property.
 * The exec property configures what should be done: you can make a set of HTTP requests, call a change-prop module, do a Varnish purge, or emit a new event as a reaction.

A switch rule is also supported. It mimics the semantics of the switch operator in programming languages, but without fall-through. A switch rule groups together several rules that listen to the same topic and have the same semantics, but have mutually exclusive match parts. For more detailed information about rule configuration, see the docs available in the repository.

Once the new rule configuration is written, follow this process to deploy it:
 * Create a GitHub pull request against the config.example.wikimedia.yaml file in the change-propagation repository. Tips:
   * Use publicly available URIs where possible.
   * Add a unit test if you can.
   * Configure the retry policy and error ignoring.
 * Wait for the services team to review.
 * When the pull request is merged, make the same change in Gerrit for the Puppet config file. The Puppet change may differ from your GitHub PR because it needs to include templating for the service hosts. The reviewers list should include a person from the services team and at least one person from the operations team.
 * After the Puppet change is merged and deployed, ask the services team to restart the change propagation service so that it picks up the new config.