Extension talk:StopForumSpam

Global blocks
Very interesting! How does this extension interact with Extension:GlobalBlocking, can data be submitted also upon global blocks? Can one use the extension in "write only" mode, i.e. only to submit data but not to use it? If yes it could be used by stewards/global sysops on Meta-Wiki to contribute to the list even though we initially not use it on Wikimedia projects. --Nemo 09:42, 16 December 2013 (UTC)
 * The main issue is that stopforumspam requires any data to be submitted to have an email, username, and IP address (see bottom of ). So it won't work with GlobalBlocking since there would be no username or email address. Write only mode would be possible, it just requires a config option to disable AbuseFilter integration. Legoktm (talk) 17:34, 16 December 2013 (UTC)
 * Thanks. Hm, I don't get it. Can't GlobalBlocking use some generic email/username/key configured with a global? On Meta, the email could be some service address or the stewards' OTRS queue. --Nemo 17:54, 16 December 2013 (UTC)
 * No, we don't want people to think "stewards@wikimedia.org" is a spam email. I also found, which clearly states that you need a username+email+IP. Legoktm (talk) 21:50, 16 December 2013 (UTC)
 * What do you mean "a spam email"? That forum post clarifies that the address is just one where it's possible to get answers from; settings this to some shared email for the wiki admins/whatever looks to me like something many wikis would like to do. Anyway, it's not so important. --Nemo 08:11, 17 December 2013 (UTC)

Performance
The StopForumSpam blacklist is very long, there are reports of wikis severely slowed down (to the point of becoming unusable) when enabling such huge blacklists, because bots produce a gazillion hits and the cost for each of them was too high. It would be interesting to have some guesstimate on the performance effects/CPU costs of this extension. --Nemo 09:42, 16 December 2013 (UTC)
 * The IP blacklist is loaded into shared memory (e.g. memcached, redis, APC, etc.) and checked against, so the cost of checking a single bot is the cost of fetching an item out from that cache (the IP blacklist will not function if $wgMainCacheType is set to CACHE_NONE). Updating that cache is another story entirely, as it can be quite slow, which is why we made a maintenance script to do it. By default, whenever it detects the cache is out of date, it will load a DeferredUpdate to run at the end of the current request, which in my testing on commodity hardware takes around 5-10 seconds (should be less on faster hardware, and legoktm reported that it took around 50 seconds on his test VM). You can disable this behavior however and only update the blacklist via the updateBlacklist.php maintenance script -- it won't run any faster but at least it offloads that work to not the web request -- running that script via cron is the recommended setup. Another performance impact I can forsee is when a the 'sfs-confidence' variable is used in an AbuseFilter, as this will cause an API hit to the StopForumSpam website. However, the result of that is also cached for a period of one day, so the same bot doing multiple hits on pages with that AbuseFilter variable should only trigger some slowness/extra strain once. -- Skiz zerz  17:16, 16 December 2013 (UTC)