Fundraising tech/Message queues

From MediaWiki.org

This page gives an overview of the message queues used to decouple fundraising subsystems. For a description of the message formats, see "Normalized donation messages". See also the Wikitech article on WMF-specific configuration.

Message Queue

Queues are used to decouple the payments frontend from the CiviCRM server. This is important for several reasons: it allows us to continue accepting donations even if the backend servers are down, it keeps our private database more secure, and it enforces write-only communication from the payments cluster.

The main data flow is over the donations queue. Completed payment transactions are encoded as JSON and sent over the wire, to be consumed by the queue2civicrm Drupal module and recorded in the CiviCRM database.
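
As a rough illustration of this flow, the sketch below encodes a completed payment as JSON, pushes it onto an in-memory stand-in for the donations queue, and decodes it on the consumer side. The field names, the function names and the use of a `deque` are assumptions made for this example only; the real message format is described in "Normalized donation messages", and the real consumer is the queue2civicrm module.

```python
import json
from collections import deque

# In-memory stand-in for the donations queue (the real backend is a message broker).
donations_queue = deque()


def push_donation(queue, payment):
    """Frontend side: serialize the completed payment as JSON and enqueue it."""
    queue.append(json.dumps(payment, sort_keys=True))


def consume_donation(queue):
    """Consumer side: take the oldest message and decode it, or return None if empty."""
    if not queue:
        return None
    return json.loads(queue.popleft())


# Hypothetical field names, chosen only for this example.
push_donation(donations_queue, {
    "gateway": "examplegateway",
    "gateway_txn_id": "12345",
    "gross": "10.00",
    "currency": "USD",
})
message = consume_donation(donations_queue)
print(message["gateway_txn_id"])
```

Because the frontend only ever appends to the queue, it never needs read access to the CRM side, which is the write-only property described above.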

Another important queue is the limbo queue, which is used both as a key-value store and as a FIFO queue. Before we redirect the donor to a payment processor's hosted page or iframe, we record the donor's personal information and push it to the limbo queue, where it is indexed by gateway name and transaction ID. We hold this information temporarily rather than writing it to a database, because we don't want to retain data about people who never become donors. If and when control is returned to the payments server, the PHP session is used to build the key and search for the corresponding limbo message. We then delete the limbo message and merge its contents into the completed donation message sent to the donations queue.
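
The dual key-value and FIFO behaviour can be sketched with an ordered dictionary. This is only a toy model: the `LimboStore` class, the `gateway:txn_id` key format and the message fields are assumptions for illustration, not the actual implementation.

```python
from collections import OrderedDict


class LimboStore:
    """Toy limbo queue: lookup by key, while remembering insertion (FIFO) order."""

    def __init__(self):
        self._messages = OrderedDict()

    @staticmethod
    def make_key(gateway, txn_id):
        # Assumed key format; the text only says messages are indexed
        # by gateway name and transaction ID.
        return f"{gateway}:{txn_id}"

    def set(self, gateway, txn_id, donor_details):
        self._messages[self.make_key(gateway, txn_id)] = dict(donor_details)

    def pop(self, gateway, txn_id):
        """Remove and return the message for this key, or None if absent."""
        return self._messages.pop(self.make_key(gateway, txn_id), None)

    def oldest(self):
        """Peek at the oldest (key, message) pair without removing it."""
        return next(iter(self._messages.items()), None)


def complete_donation(limbo, gateway, txn_id, payment_result):
    """Delete the limbo entry and merge its details into the completed message."""
    donor_details = limbo.pop(gateway, txn_id) or {}
    return {**donor_details, **payment_result,
            "gateway": gateway, "gateway_txn_id": txn_id}


limbo = LimboStore()
limbo.set("globalcollect", "1001", {"email": "donor@example.org"})
completed = complete_donation(limbo, "globalcollect", "1001", {"gross": "5.00"})
```

After `complete_donation` runs, the donor's details exist only in the completed message, so nothing about the donor lingers in limbo.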

However, if control is never returned, limbo queue messages remain in place for a while in case the data becomes useful again. After about 20 minutes, they become eligible for orphan rectification, which is currently applied only to GlobalCollect credit card transactions. We attempt to complete settlement on these orders and, if successful, send the completed message, including the limbo-provided details, to the donations queue. If unsuccessful, the personal information should be purged.
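
The rectification pass might look roughly like the following. It is a sketch under stated assumptions: the `settle` callable stands in for the gateway settlement call, the message layout is invented for the example, and we assume the limbo entry is removed after a successful settlement as well as after a failed one.

```python
import time

ORPHAN_AGE_SECONDS = 20 * 60  # "about 20 minutes", per the text above


def rectify_orphans(limbo_messages, donations, settle, now=None):
    """Process limbo messages that are old enough to count as orphans.

    limbo_messages: dict of key -> {"stored_at": epoch seconds, "details": dict}
    donations: list standing in for the donations queue
    settle: callable(key) -> bool, a placeholder for the gateway settlement call
    """
    now = time.time() if now is None else now
    for key in list(limbo_messages):
        message = limbo_messages[key]
        if now - message["stored_at"] < ORPHAN_AGE_SECONDS:
            continue  # the donor may still come back; leave it alone
        if settle(key):
            # Settlement succeeded: forward the limbo details as a donation.
            donations.append({"key": key, **message["details"]})
        # Settled or not, the personal data leaves limbo (assumed for success).
        del limbo_messages[key]


limbo_messages = {
    "globalcollect:1": {"stored_at": 0, "details": {"email": "a@example.org"}},
    "globalcollect:2": {"stored_at": 0, "details": {"email": "b@example.org"}},
    "globalcollect:3": {"stored_at": 1500, "details": {"email": "c@example.org"}},
}
donations = []
rectify_orphans(limbo_messages, donations,
                settle=lambda key: key.endswith(":1"), now=1800)
```

In this run the first message settles and is forwarded, the second fails settlement and is purged, and the third is too young to be touched.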

At Wikimedia, we are currently using the ActiveMQ (http://activemq.apache.org/) message broker as the queue backend for everything but some limbo and inflight queues. Messages go over the wire using the aging STOMP protocol.
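
For readers unfamiliar with STOMP, a frame is plain text: a command line, `name:value` headers, a blank line, the body, and a terminating NUL byte. The sketch below builds a SEND frame by hand. The destination `/queue/donations` and the helper name are assumptions for illustration, and header-value escaping (required by later versions of the protocol) is omitted.

```python
import json


def build_send_frame(destination, body_text, content_type="application/json"):
    """Build a STOMP SEND frame as bytes (no header escaping, for brevity)."""
    body = body_text.encode("utf-8")
    headers = [
        f"destination:{destination}",
        f"content-type:{content_type}",
        f"content-length:{len(body)}",
    ]
    # Command, headers, blank line, then the body followed by a NUL byte.
    head = "SEND\n" + "\n".join(headers) + "\n\n"
    return head.encode("utf-8") + body + b"\x00"


# Hypothetical destination name; the message body is a JSON-encoded donation.
frame = build_send_frame("/queue/donations", json.dumps({"gross": "10.00"}))
```

In practice a STOMP client library would build and send these frames over a socket; the point here is only to show what travels over the wire.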

Queues

| Queue | Platform | Producers | Consumers | Description | Potential migration |
| donations | activemq | payments, py-audit-(paypal,worldpay), listeners | crm-queue2civi | Primary queue for incoming donations, written to whenever we learn about a successful payment. | Redis |
| donations_recurring | activemq | crm-audit, py-audit | crm-recurring | Information about changes in donor monthly subscriptions. | Redis |
| refund-notifications | activemq | py-audit-(paypal,worldpay) | crm-refund | incoming IPN notifications that the processor has completed a refund | Redis |
| pending | activemq | payments | SmashPig-job-runner (Adyen), crm-queue2civi (Amazon) | Temporary storage used by some IPN listeners and frontends. Messages will either be popped off the queue after a fixed time, to complete settlement steps; or upon incoming notification of a status change; or will expire in a short amount of time. | Redis dumping into a new pending database |
| limbo | payments-memcache | payments (ingenico) | ? | Donation methods which use an iframe (GC, Adyen) will leave a message on the limbo queue before transferring UI flow to the processor. When the processor returns control, we delete the limbo message. NOTE: this is wired to activemq in SmashPig, but memcache in DI. Whassup? | pending |
| limbo | activemq | ? | SmashPig-? | See other limbo queue note--this might be a bug. | deprecate in favor of pending |
| cc-limbo | activemq | none? | | Deprecated? | deprecate in favor of pending |
| globalcollect-cc-limbo | payments-redis | payments | orphan slayer | See the "limbo" queue above. GlobalCollect credit card limbo messages are segregated into their own bucket, to make it easier for the orphan slayer to pull only those messages. Each payments box stores to its own local redis server, and the orphan slayer reads from them round-robin. | pending |
| payments-init | activemq | payments | crm-fredge | One entry for each transaction that leaves our flow control. | Redis |
| payments-antifraud | activemq | payments | crm-fredge | Data on transaction fraud scores | Redis |
| unsubscribe | activemq | unsubscribe page | crm-unsubscribe | The FundraisingEmailUnsubscribe module allows donors to opt out of bulk mailings and sends these requests over a queue, to be consumed by the CRM. | Redis |
| banner-history | activemq | payments | crm-banner_history | Used to capture correlations between contribution tracking ID and banner history logging ID. | Redis |
| job-requests | activemq | SmashPig-adyen-capture | SmashPig-job-runner | SmashPig jobs to be processed. TODO: explain. | TBD. Maybe Redis into a database? |
| inflight | filesystem | SmashPig-listener | TODO: cannot read yet | Homebrew, broken transactional thing | eliminate or use pending |
| contribution_tracking | mysql | payments, crm-queue2civi | crm-queue2civi, analytics | Hybrid write-only log and analytic store. | Stage 2. UUID; Redis; mysql |
| error logging (completion data) | syslog, filesystem | payments | crm-audit | Syslog, but also mined to reconstruct missing transaction data. TODO: standardize line format, use normalized data and not gateway XML. | Stage 2: ? Kafka to something amazing like a file stream per transaction. |
| failed | activemq | SmashPig-? | X | Deprecated, TODO: remove from SmashPig | |
| pending_globalcollect | | | | Deprecated | |
| pending_paypal | | legacy-listener-paypal | | Deprecated. Stops 2016-02-07 | |
| pending_paypal_recurring | activemq | legacy-listener-paypal | ? | TODO: Deprecate. Written to by the legacy PayPal listener, but only 20% of messages are read again. | pending |
| donations-gc-garbage | activemq | SmashPig-listener-ingenico | ? | TODO: Deprecate. | |
| banner-impressions | kafka-filesystem | wmf-varnish | legacy-bannerimpressions-job | TODO: Use Kafka client without filesystem layer | TBD. Kafka into mysql? |

Components

| Component | Operation | Queue | Description |
| donatewiki | get sequence | contribution_tracking | Donatewiki is the first stop where we record contribution tracking data. The donor is given a contribution_tracking ID to associate with their cookies. |
| | push | contribution_tracking | |
| DI-gateway-generic | push | payments-init | Donation collection frontend, which can create successful donation messages. |
| | push | complete | |
| | push | pending | When processing becomes asynchronous or risky, a copy of the transaction is pushed to one of the temporary "pending" queues, usually waiting to integrate some response from a payment provider. |
| | get sequence | contribution_tracking | |
| | push | contribution_tracking | |
| DI-adyen, amazon, astropay | set by id | pending | TODO: Review whether we can just send these messages, with a short expiry or antimessages... |
| | delete by id | pending | |
| DI-ingenico | set by id | globalcollect-cc-limbo, limbo | |
| | delete by id | globalcollect-cc-limbo, limbo | |
| Ingenico orphan rectifier | peek | globalcollect-cc-limbo | A script that attempts to "rectify" the orphaned message by settling at the gateway, for messages older than 5 minutes. |
| | delete by id | globalcollect-cc-limbo | |
| crm-audit, py-audit | push | donations, donations_recurring, refund | Historical sources of payment events. |
| crm-audit | search by id | error_logs | This funky feature is to pull otherwise unconsumed information about failed transactions back into memory, to gather context for incoming notifications. |
| SP-listener-generic | push | inflight | A mini-pending arrangement for transactional processing. |
| | delete by id | inflight | |
| | push | donations, refund | Source of payment events. |
| SP-job-runner-generic | pop | job-requests | Run from the job queue. |
| SP-listener-adyen | push | job-requests | |
| | delete by id | pending | |
| SP-adyen-process capture job | get by id | pending | |
| | push | payments-antifraud | |
| SP-adyen-record capture job | get by id | pending | |
| | push | donations | |
| SP-expiration-job | archive by age | pending - 20 days; limbospecial - 9; verified_damaged - 14; recurring_damaged - 14; refund_damaged - 14; donations_gc_garbage - 1 | AMQ Old Message Consume. TODO: make intrinsic to the messages. |
| legacy-listener-paypal | push | pending_paypal_recurring | Old dirty bare PHP, still in service. Deprecate! |
| q2c-banner history | pop | banner-history | Import banner history log-donation id correlations. |
| q2c-donations | pop | donations | The main consumer is the queue2civi job, which reads from the donations queue and stores in our CRM database. |
| | get by id | pending | Pull in completion message data and merge. |
| | delete by id | pending | Drop completion message once consumed. |
| q2c-fredge | pop | payments-antifraud, payments-init | Import statistics |
| q2c-generic | reject | move * to *-damaged | Certain types of error cause a message to be shunted to a damaged stream, to be analyzed and possibly corrected. |
| q2c-recurring | pop | donations_recurring | Q2C module |
| q2c-refund | pop | refund-notifications | Q2C module to record refund notifications, marking the affected contribution |
| q2c-unsubscribe | pop | unsubscribe | (mostly unused) Decouples the mailing list unsubscribe UI from the |
| DI-recurring-ingenico | push | payments-init | Make monthly charges. TODO: Should we push to the donations queue as well, and even the recurring events stream if we failban? |
| crm-damaged-browser | peek multiple | *-damaged | TODO: write human review policies. |
| | push | * (not damaged) | |
| | delete by id | *-damaged | |
| analytics | query by all indexes | contribution_tracking | TODO: should be decoupled from frontend event log |
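
The q2c-generic "reject" operation and the crm-damaged-browser operations listed above can be pictured with the following sketch. It uses in-memory queues; the `RejectMessage` exception, the function names, the `-damaged` suffix handling and the message fields are all assumptions for illustration, not the modules' real code.

```python
from collections import defaultdict, deque

# Each queue name maps to an in-memory FIFO; real queues live in the broker.
queues = defaultdict(deque)


class RejectMessage(Exception):
    """Raised by a handler when a message cannot be imported."""


def consume_one(queues, name, handler):
    """Pop one message from `name`; on rejection, move it to `<name>-damaged`."""
    if not queues[name]:
        return False
    message = queues[name].popleft()
    try:
        handler(message)
    except RejectMessage as error:
        queues[f"{name}-damaged"].append(
            {"original": message, "error": str(error)})
    return True


def requeue(queues, name, index=0):
    """After human review: push a damaged message back and delete the damaged copy."""
    damaged = queues[f"{name}-damaged"]
    queues[name].append(damaged[index]["original"])
    del damaged[index]


def import_donation(message):
    # Hypothetical validation rule, used only to trigger a rejection.
    if "gross" not in message:
        raise RejectMessage("missing gross")


queues["donations"].extend([{"gross": "5.00"}, {"currency": "USD"}])
while consume_one(queues, "donations", import_donation):
    pass
```

After the loop, the well-formed message has been imported and the malformed one sits in `donations-damaged` along with the reason it was rejected, ready for review and a possible `requeue`.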

Overhaul 2016

See the main article: Fundraising Queue Overhaul.