Analytics/Archive/Logging infrastructure/status

Last update on: 2012-06-03

2012-05-22
We will soon deploy a new version of udp-filter that accepts a variable number of fields. This will allow us to migrate more of the custom C filters to udp-filter. udp-filter can now also filter by HTTP response status, and geocode alongside the IP address.
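
To illustrate what filtering by HTTP response status over a variable number of fields can look like, here is a minimal Python sketch. It is not udp-filter's code (udp-filter is a separate tool with its own options). The space-delimited layout, the `status_field` index and the `TCP_MISS/200` style of the status field are assumptions made for the example.

```python
def filter_by_status(lines, status, status_field=5):
    """Yield log lines whose status field carries the given HTTP status.

    Lines may have any number of fields; only the (assumed) status
    field position has to exist. Extra trailing fields are ignored.
    """
    for line in lines:
        fields = line.rstrip("\n").split(" ")
        if len(fields) <= status_field:
            continue  # too short to carry a status field, skip it
        # Assumed layout: "<cache result>/<HTTP status>", e.g. "TCP_MISS/404".
        if fields[status_field].rsplit("/", 1)[-1] == status:
            yield line
```

Because the function only indexes the one field it needs, appending new fields to the end of a log line does not change its result.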

2012-05-10
We have added a third log collector machine (oxygen) to supplement our current collectors (locke and emery). Andrew is working out a strategy for dealing with errant spaces in nginx logs that throw off our logging scripts. We are also figuring out how to better match wikipedia-zero traffic, and will probably add a custom response header.

2012-05-monthly
Our plan to improve logging sources (Squid, Varnish, nginx, etc.) includes adding more fields, and also allowing us to add arbitrary fields in the future without breaking features. Changing the field formats of the logging sources requires coordination with the Operations team. The format changes have been committed, but not yet deployed. udp-filter has been modified so that it is more flexible, and a few features have been added as well: it can now geocode and anonymize inline, in the same field as the IP address, so that later log parsers don't have to try to detect a new field.
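
The idea of geocoding and anonymizing inline, in the same field as the IP address, can be sketched in Python as follows. This is an illustration only, not the actual implementation: the `|` separator, the `ip_field` index, the truncation policy and the tiny lookup table standing in for a GeoIP database are all hypothetical.

```python
import ipaddress

# Hypothetical lookup table standing in for a real GeoIP database.
COUNTRY_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "XX",
}


def lookup_country(addr):
    """Return a country code for addr, or "--" when it is unknown."""
    for net, code in COUNTRY_BY_NETWORK.items():
        if addr in net:
            return code
    return "--"


def anonymize(addr):
    """Zero the host bits: keep a /24 for IPv4, a /48 for IPv6 (assumed policy)."""
    prefix = 24 if addr.version == 4 else 48
    net = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(net.network_address)


def rewrite_ip_field(line, ip_field=4):
    """Replace the IP field with "<anonymized ip>|<country>" in place.

    The number of fields is unchanged, so downstream parsers that index
    fields by position keep working. Raises ValueError on a bad address.
    """
    fields = line.split(" ")
    addr = ipaddress.ip_address(fields[ip_field])
    fields[ip_field] = f"{anonymize(addr)}|{lookup_country(addr)}"
    return " ".join(fields)
```

Keeping the extra data inside the existing field means the field count stays the same before and after rewriting.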

2012-06-03
During the Berlin Hackathon, a patch was submitted that allows udp-filter to do IPv6 address filtering. We hope to incorporate this soon.
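
As a rough illustration of address filtering that covers IPv6 as well as IPv4, the following Python sketch matches the IP field of each line against a list of CIDR ranges. It does not reflect the submitted patch; the `ip_field` index and the space-delimited layout are assumptions made for the example.

```python
import ipaddress


def filter_by_network(lines, cidrs, ip_field=4):
    """Yield lines whose IP field falls inside any of the given CIDR ranges.

    IPv4 and IPv6 ranges can be mixed; an address never matches a
    network of the other IP version.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    for line in lines:
        fields = line.rstrip("\n").split(" ")
        try:
            addr = ipaddress.ip_address(fields[ip_field])
        except (IndexError, ValueError):
            continue  # missing or malformed address, skip the line
        if any(addr in net for net in networks):
            yield line
```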