Extension talk:AbuseFilter/Archive 1

Filters have no effect
I'm having serious problems with the AbuseFilter extension: I can't get any filters to do anything at all. No matter what kind of actions I try to filter out, they get through without AbuseFilter doing anything to stop them; it doesn't even log the events in the abuse log. Even when I run a filter in test mode against all the old edits, nothing matches. For example, the "shouting" filter, which is intended to catch anyone typing in BLOCK CAPITAL LETTERS, completely ignores everyone who has been doing it.

It's blind.

The filters I've been trying are the ones Wikipedia itself uses, so they should definitely work. For some reason the extension just doesn't seem to be paying attention to any of the edits, though.

I have two other extensions installed, AntiSpoof and "bad behaviour"... Apparently AntiSpoof is required before you can install AbuseFilter, so I doubt there could be any conflict there. Bad behaviour is working effectively, so effectively that it wouldn't allow me to install AbuseFilter to begin with because it seemed to think the installer was a bot or something? So I had to temporarily disable it in order to get AbuseFilter installed. Bad behaviour is still currently disabled so I don't think there could be any conflict there either.

I'm on MediaWiki 1.16, the latest release. My extensions are all the versions that go with 1.16. The server is Apache on Linux, with PHP 5.2.42 and MySQL 5.0.85.

Has anyone heard of this happening before, or got even the slightest clue what the problem could be? Even the faintest idea or the wildest guess would be helpful, anything to give me something to look into. I've got nothing. Asndb 02:26, 13 September 2010 (UTC)

Possible postgres problem
I am trying to install AbuseFilter on a fresh wiki family, but when I try to access the Special:AbuseFilter page I get this error:

 Warning: pg_query [function.pg-query]: Query failed: ERROR: column "abuse_filter.af_enabled" must appear in the GROUP BY clause or be used in an aggregate function in /mediawiki-data/mediawiki-1.15.3/includes/db/DatabasePostgres.php on line 5

The required tables were created with this command (the errors below appear because I executed it twice, I think):

 php ../mediawiki-extensions/AbuseFilter/install.php --conf LocalSettings.php
 Warning: pg_query: Query failed: ERROR: relation "abuse_filter_af_id_seq" already exists in /var/www/mediawiki-data/mediawiki-1.15.3/includes/db/DatabasePostgres.php on line 580
 A database error has occurred
 Query: CREATE SEQUENCE abuse_filter_af_id_seq
 Function: Database::sourceStream
 Error: 1 ERROR: relation "abuse_filter_af_id_seq" already exists

Is there something missing? Have I forgotten something? nuss0r

Syntax error
I have installed this extension on a localhost Wiki but for some reason it rejects any filter I add and gives the error message: "There is a syntax error in the filter you specified. The output from the parser was:" And it gives no output from the parser. I know the syntax is correct because even simple things like "testing in SUMMARY" are rejected. Is this a bug, or have I done something wrong? 125.238.97.240 11:11, 14 September 2008 (UTC)

Throttling
What does "# editcount - Edit count - hack so that you can detect distinct users." mean? -- Mike.lifeguard | @meta 00:04, 20 February 2009 (UTC)

Some problems
There are a few problems with this extension
 * Actions only visible to the editor should not have public logging : Logging is not something where one level fits all, especially when it comes to public verbal attacks on living persons. As a first approach, access to the log items could be granted at the action level, but this will probably also lead to a lot of problems. At the least, it should be possible to turn off logging for trivial actions that never have any public effect. For example, if a warning is given and the editor then abandons the edit, no logging should happen. If he continues and saves the edit, then the warning can be logged. Note that the logging itself can still release information about the topic described in the edit.
 * Public rules could themselves be a violation of privacy for specific articles : Imagine someone gets some information that s/he assumes could lead to legal action if published. This information is then used to write one or more rules that prohibit said information from reaching a wiki. If the rule is in clear text, it can easily be inverted to recover the information that should be excluded. This happens most typically when the article in question is a biography and the rule is about some kind of rumor or fact, especially when such rumors or facts are illegal to publish. One solution is to use special digests instead of clear text in the rules.
 * Regex patterns are overly simplistic for analyzing real-life vandalism : As long as the vandalism can be described by simple patterns matching single markers, things work out pretty well. If the vandalism consists not of simple words but of phrases, it becomes a lot more troublesome to detect. Some previous work indicates that whole phrases should be analyzed anyhow, and that an approach based on simple words is too simplistic and will lead to a much too high number of false positives.

Probably there are other problems, but those are very close to show stoppers. Jeblad 20:26, 22 February 2009 (UTC)

You have obviously not examined the extension in detail. The last two are obvious non-issues, the second because rules can be hidden from the general public, and the third because the whole point of the extension is that it goes beyond regexes to more complicated boolean logic based on far more context than just a regex match.

The first is something that needs to be debated – it may be worthwhile offering this as a configuration option. Andrew Garrett 03:32, 23 February 2009 (UTC)


 * Limiting public disclosure to a smaller unidentified group is not good enough, i.e. you cannot guarantee that the information will not be disclosed by someone, because you can't identify the persons who have the opportunity to inspect it (in Norway that is the «behandlingsansvarlig», the data controller, for such information). It seems to me that the only viable solution is either to create a limited group of identified persons, as for the OTRS system on Wikimedia, or to make encrypted rules that are identifiable but cannot be read in clear text. Perhaps the current approach is acceptable in some countries, but in those countries where you must identify who has access it can create some real problems. The same goes for logging, as logging and rules are symmetric in this respect.
 * A similar problem is protecting the rules themselves from probing. Imagine a rule guarding an article against inclusion of a rumor about someone being raped, and a reader who wants to verify the rumor, so he inserts some text with the word "rape" in it. If he is blocked, or even just given a warning, then he knows that the rumor is known, and the information is leaked by the action itself. The system must have some capability to obfuscate the actual action and the reason behind it. One such method is to trigger not only on clear-text patterns but on digests of word(s), where the digest algorithm has low entropy. This will make it difficult to reverse engineer the words, and will also give an increased number of false positives. If such a rule is only used for blocking uploads to a single article, that would not pose a problem as long as it does not involve blocking the editor himself.
 * I know that there is a system for boolean logic, but this still does not solve the complexity of parsing natural language. Without such capabilities the error rate will be far too high. Compare the two statements "He is a monkey" vs "It is a monkey". Jeblad 11:41, 23 February 2009 (UTC)
 * Furthermore, "He is a monkey" is just fine in, say, "Curious George" on Wikipedia or other pages about fictional characters. --Damian Yerrick 13:44, 2 July 2011 (UTC)
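
For illustration, the low-entropy digest idea from this thread could be sketched roughly as follows. This is only a minimal Python sketch under stated assumptions, not anything AbuseFilter provides: the helper names (short_digest, build_rule, matches) and the 12-bit digest width are hypothetical. The rule stores truncated hashes instead of clear-text words, so many different words share each digest.

```python
import hashlib
import re

def short_digest(word, bits=12):
    """Return the top `bits` bits (at most 32) of SHA-256 over the lowercased word."""
    raw = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return int.from_bytes(raw[:4], "big") >> (32 - bits)

def build_rule(words, bits=12):
    """Build a rule that stores only digests, never the clear-text words."""
    return {short_digest(w, bits) for w in words}

def matches(text, rule, bits=12):
    """True if any word in the text hashes to a digest stored in the rule."""
    return any(short_digest(w, bits) in rule for w in re.findall(r"\w+", text))

# Hypothetical rule guarding against a single word.
rule = build_rule(["rumor"])
hit = matches("An unfounded Rumor was added", rule)
```

A shorter digest makes the stored rule harder to invert but raises the false-positive rate, which is the trade-off described above. Note that an unsalted digest can still be tested against a dictionary to obtain a set of candidate words.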

Proximity operators
If the solution is to be used for text analysis, then proximity operators should be added. Such operators typically act within some logical text unit such as a paragraph, a sentence or similar, or within a number of words. Jeblad 20:56, 23 February 2009 (UTC)
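
As a rough illustration of what such proximity operators might do, here is a minimal Python sketch. It is not part of AbuseFilter; the function names within_words and within_sentence, and the simple word and sentence splitting, are assumptions made for the example.

```python
import re

def within_words(text, first, second, max_gap):
    """True if `first` and `second` occur at most `max_gap` words apart."""
    words = re.findall(r"\w+", text.lower())
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_gap for i in first_pos for j in second_pos)

def within_sentence(text, first, second):
    """True if both words occur inside the same sentence (naive split on . ! ?)."""
    for sentence in re.split(r"[.!?]+", text):
        words = set(re.findall(r"\w+", sentence.lower()))
        if first.lower() in words and second.lower() in words:
            return True
    return False
```

For example, within_words("He is a monkey", "he", "monkey", 3) holds, while the same check with a gap of 2 does not.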

Beta status
The trials on en:wp: prove that the abuse filter works, could the status be changed to beta?--Ipatrol 00:21, 19 March 2009 (UTC)

Documentation?
Greetings. I am a sysop at en.wikiquote, where there is some interest in employing this extension. Is there self-contained documentation available or forthcoming, or is this tool intended to be used only by persons who are conversant with its run-time environment? ~ Ningauble 16:52, 21 March 2009 (UTC)(talk@wq)

Documentation shortcomings
It says
 * The abuse filter passes various variables by name into the parser.

Those need to be documented. AxelBoldt 17:36, 23 March 2009 (UTC)
 * Yes, please! ~ Ningauble 15:42, 24 March 2009 (UTC)(talk@wq)

Installation is hard
I can't execute command-line scripts because I'm on shared hosting, so I can't run update.php or install.php. Other extensions are very easy to install: only the 'include' is required in LocalSettings.php. This one involves a little more. --Kenny5 02:29, 30 March 2009 (UTC)


 * If you can't run scripts how did you get MediaWiki installed in the first place? 86.149.15.167 15:30, 10 April 2009 (UTC)
 * Um, you can. All you need to do is create a database and a user, and let the installer run its own script after you FTP the files. You don't need command-line access. --Kenny5 16:41, 10 April 2009 (UTC)

What to do in case of "unexpected T_STRING" error
Individuals running maintenance/update.php or ~/YOURDOMAINNAME.com/extensions/AbuseFilter/install.php from the command line may encounter the following error:

 syntax error, unexpected T_STRING, expecting T_OLD_FUNCTION or T_FUNCTION or T_VAR or '}' in ~/maintenance/commandLine.inc on line 13

This error occurs when update.php or install.php is run with PHP 4.

Individuals whose site is hosted by a provider that offers both PHP 4 and PHP 5 should take the following steps:
 * 1) From the command line, enter the command 'whereis php5'.
 * 2) Once you have found the php5 path, list the contents of the php5/bin directory.
 * 3) Once you've determined the name of the PHP executable (either php or php5), type in its entire path to execute install.php.

Below is an example:
 $ whereis php5
 $ ls -la /usr/local/php5/bin
 $ /usr/local/php5/bin/php install.php

DeanPeters 16:15, 15 August 2010 (UTC)

Tags
How do I create stylable tags, so I can have items with a certain tag appear in a different colour or similar? The tags appear in Special:Tags, but how do I make them appear in a different color in recent changes? --Petter 09:07, 5 April 2009 (UTC)

Documentation
Note to werdna: Can you please expand and update the documentation. 04:51, 21 April 2009 (UTC)

Adapt I/O for augmenting templates?
This seems to be a more reasonable language for transclusion templates than the traditional parser expansion (see 2009/06/on-templates-and-programming-languages/).

For that to work, there is a need to take input from template parameter arguments and emit generated wikitext. I believe that this is related to r50997 and/or r51497 but I'm not entirely sure how. Best of luck! 99.22.93.63 18:18, 1 August 2009 (UTC)

How to import Wikipedia abuse filter?
Want to use it in my local wiki.

--Nettroll 12:20, 3 September 2009 (UTC)


 * When you have installed the extension, go to w:en:Special:AbuseFilter, choose a filter (say w:en:Special:AbuseFilter/3), then click "Export this filter to another wiki", copy the text, go to Special:AbuseFilter/import on your wiki, paste the text. --Nemo bis 12:26, 3 September 2009 (UTC)

MediaWiki version
I would like to ask which MediaWiki version is needed for this extension? Thanks. 77.20.39.5 18:36, 13 December 2009 (UTC)
 * It appears that it works for 1.13+, just grab the correct version of the extension for your version of mediawiki (see the dropdown here) -- Skiz zerz  20:04, 13 December 2009 (UTC)

Images & Special characters on same page producing fatal error in AbuseFilter
I have encountered the following fatal error when I try to include both images and special characters on the same page while having Abuse Filter switched on: Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 24 bytes) in $IP/extensions/AbuseFilter/AbuseFilter.i18n.php on line 12320 In the meantime, I am disabling Abuse Filter but would like to re-enable it should this be a simple matter to resolve. Does anyone have any ideas that might help me out? (I replaced the full address of the error above with $IP for security reasons) --Ds093 23:10, 10 February 2010 (UTC)


 * Check your memory limit in LocalSettings. You may want to raise it.

 # If PHP's memory limit is very low, some operations may fail.
 ini_set( 'memory_limit', '64M' );
 * --Subfader+


 * Thank you, Subfader. I actually happened to notice that myself just a few hours ago. I meant to come back here and mention it then, but I had to go out and just got back.--Ds093 23:54, 13 February 2010 (UTC)

Conflicts with TemplateFormEditor
This extension does not work when TemplateFormEditor is installed, because for some reason it does not trigger the abuse filter hooks. Has anyone had the same issue?

--abel406 23:04:19, 02 June 2010 (UTC)

AbuseFilter actions
I'm evaluating Extension:AbuseFilter for use on our internal wiki, and trying to learn how it all works. When configuring a filter, under Actions taken when matched, I see a Flag the edit in the abuse log checkbox, but it's disabled (greyed out) so I can't click on it to clear the checkbox, and I can't figure out how to enable it. Is there any way to enable this option so I can clear the checkbox and prevent posting a triggered event to the Abuse log?

--Obliquemotion 18:18, 11 June 2010 (UTC)


 * It's probably intentional this way, see en:Wikipedia talk:Edit filter/Archive 5. —AlexSm 16:07, 2 August 2010 (UTC)

Only one edit per page
I'd like to know if it is possible to write a filter that limits non-autopatrolled users to one edit per page per, say, five minutes. I can't find any functions to do so, though. Could someone please help me out with this? (To clarify: I want them to be able to edit as much as they please, but to be limited to one edit per specific page. They can make, say, thirty edits as long as they are made to different pages.) 173.168.217.112 01:22, 1 August 2010 (UTC)


 * This is not a place to ask this, and such a filter looks very strange to me, but the answer is: use . —AlexSm 16:07, 2 August 2010 (UTC)
 * How is it not a place to ask? I'm discussing the usage of the extension. Anyways, I appreciate it.173.168.217.112 17:46, 2 August 2010 (UTC)
 * You can't really expect a reply here (my response was more of an exception). You might have better luck asking questions at enwiki AF page. —AlexSm 14:26, 3 August 2010 (UTC)


 * I'd be interested in creating a similar filter, but my attempts to use the throttling options to limit by  haven't worked as expected (or at all). Can anyone show a working example, or give any tips? Thanks! Adamcox82 17:53, 2 August 2010 (UTC)
 * Throttling is tricky indeed. Try to look at existing filters, e.g. get a list of enwiki public filters, look for "throttle" and then look at the code: en:Special:AbuseFilter/80, etc. —AlexSm 14:26, 3 August 2010 (UTC)
 * Throttling fails completely when attempting to use it. Adam and I have attempted the same thing, and the filter has matched the edits, but the throttle refuses to trip. I'd like to know why.Neo of ZW 18:16, 6 August 2010 (UTC)
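
To make the intended behaviour in this thread concrete, here is a minimal Python sketch of a rate limit grouped by user and page. It is a generic sliding-window model, not AbuseFilter's throttle implementation; the class name Throttle and its parameters are hypothetical.

```python
from collections import defaultdict, deque

class Throttle:
    """Sliding-window counter: trips once a key exceeds `limit` hits in `period` seconds."""

    def __init__(self, limit, period):
        self.limit = limit
        self.period = period
        self.hits = defaultdict(deque)  # key -> timestamps of recent hits

    def hit(self, key, now):
        """Record a hit at time `now` (seconds) and return True if the throttle trips."""
        recent = self.hits[key]
        # Drop hits that have fallen out of the time window.
        while recent and now - recent[0] >= self.period:
            recent.popleft()
        recent.append(now)
        return len(recent) > self.limit

# One edit per (user, page) pair per five minutes.
throttle = Throttle(limit=1, period=300)
first = throttle.hit(("Alice", "Page A"), now=0)        # first edit to Page A
second = throttle.hit(("Alice", "Page A"), now=60)      # same page, one minute later
other = throttle.hit(("Alice", "Page B"), now=60)       # different page
later = throttle.hit(("Alice", "Page A"), now=400)      # same page, after the window
```

Grouping by the (user, page) pair rather than by user alone is what lets the same user keep editing other pages. In this model, hits that trip the throttle are still recorded, which is a design choice of the sketch.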

Examples of Filters
After successfully installing the AbuseFilter extension, I wasn't really able to use the examples on the extension's wiki page.

I also noted some others asking for documentation. So I'm going to start a list here of filters to encourage others to collaborate on filters they've implemented.

Content
-- DeanPeters 16:59, 15 August 2010 (UTC)
 * Newly entered, formatted content: "illegal word or phrase" in new_wikitext
 * New title or newly entered content, filtered & unfiltered, "testfilter" in new_wikitext | "testfilter" in new_text | "testfilter" in lcase(article_text)
 * User adds more than 3 external links in a single edit count("http://", added_links) > 3


 * Thanks - just what I needed. I was looking for a way to block all links from new users - the spambots are overrunning the CAPTCHA defences...!


 * The last one should include https:// links as well (just in case - not sure if they're actually common in spam). I tried count("https?://", added_links) > 0 - but it didn't work. (The s? in regex means an optional s, so it should match http and https links, but somehow regex isn't working here...?)


 * So I tried this Boolean match, to find both http and https, and it worked: count("http://", added_links) > 0 | count("https://", added_links) > 0 as a way to  --Chriswaterguy 12:02, 7 December 2011 (UTC)
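
One possible explanation for the behaviour above is that count() looks for a literal substring rather than a regular expression, so the needle "https?://" would only match those nine characters verbatim. The Python sketch below shows the difference. It models added_links as a plain string, which is an assumption, and the helper names are hypothetical. The AbuseFilter rules documentation also describes an rcount function for counting regex matches; whether it is available in a given version should be checked.

```python
import re

# Assumed sample data: one http link and one https link.
added_links = "http://example.com https://example.org"

def count_literal(needle, haystack):
    """Count occurrences of `needle` as a plain substring."""
    return haystack.count(needle)

def count_regex(pattern, haystack):
    """Count non-overlapping matches of `pattern` as a regular expression."""
    return len(re.findall(pattern, haystack))

literal_hits = count_literal("https?://", added_links)   # pattern text treated literally
boolean_hits = count_literal("http://", added_links) + count_literal("https://", added_links)
regex_hits = count_regex(r"https?://", added_links)
```

Here the literal search for "https?://" finds nothing, while the two literal counts combined and the regex count both find the two links.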

Problem with Class 'Html' not found Fatal Error
I have just svn AbuseFilter in my wiki but I get the following Fatal Error at Special:AbuseFilter Fatal error: Class 'Html' not found in [...]/w/extensions/AbuseFilter/Views/AbuseFilterViewList.php on line 91 — Chlewey 19:31, 17 November 2010 (UTC)


 * I changed  in line 91 for  .  Now I get the following code:

 ( |  |  |  |  |  | )



 <abusefilter-list> <abusefilter-list-options> <abusefilter-list-options-deleted-show> <abusefilter-list-options-deleted-hide> <abusefilter-list-options-deleted-only> <abusefilter-list-options-hidedisabled> <abusefilter-list-limit>
 * — Chlewey 20:46, 17 November 2010 (UTC)


 * Trying to import a filter, I got the following error:

Fatal error: Call to undefined function wfobjecttoarray in [...]/w/extensions/AbuseFilter/Views/AbuseFilterViewEdit.php on line 758
 * — Chlewey 20:49, 17 November 2010 (UTC)

Global Abuse Filter
This is in the config's table on the extension page > $wgAbuseFilterIsCentral > "Set this variable to true for the wiki where global AbuseFilters are stored in (if you're using global filters)."

I can't find anything on "GlobalAbuseFilter"

Mlpearc  powwow  01:58, 19 September 2011 (UTC)
 * Preliminary code for a global abusefilter has been developed somewhere... it hasn't been worked on due to some initial opposition to it (from what I've heard), though I'm trying to find out more. Ajraddatz 20:12, 5 February 2012 (UTC)

Apply filters
Hi,

This extension works fine for modifications on my wiki. But I'd like to apply the rules to the changes that were made before I installed AbuseFilter. There is a way to test the filters, but I didn't find an "apply changes" button or anything similar.

Best regards --Schubi87 10:48, 7 November 2011 (UTC)

Xml::hidden
This extension uses the Xml::hidden method, which is deprecated. With MediaWiki 1.18, this extension won't work. The solution is simple: replace all occurrences of Xml::hidden with Html::hidden.

PS: A quick way to do this is to execute this command in a shell, in the AbuseFilter directory:

find ./ -type f -exec sed -i 's/Xml::hidden/Html::hidden/g' {} \;

(It should work for any extension with this problem, though.) --Krusher 12:26, 29 November 2011 (UTC)

Logic tips for non-coders
Some tips for non-coders on how the Boolean works would be good. E.g.:
 * Logic carries over between line breaks.
 * The & (AND) operator is applied before | (OR).

Therefore, if you want to flag an edit that matches user condition A or B, and also matches edit condition C or D, then you need brackets to group the logical statements:

(A | B) & (C | D)

If you leave these out, like this: A | B & C | D ...this can lead to false positives.

E.g. to stop brand new users posting a URL, you might use:

(user_age < 2*3600 | user_editcount < 1) & (count("http://", added_links) > 0 | count("https://", added_links) > 0)

The brackets are important!

Maybe someone wants to check what I've written here, correct as needed, and put it on Extension:AbuseFilter/RulesFormat? --Chriswaterguy 04:10, 23 December 2011 (UTC)
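
To check the bracket tip independently of AbuseFilter, the small Python sketch below enumerates all truth values of A, B, C and D and lists the cases where an unbracketed reading matches although the intended (A | B) & (C | D) does not. Two readings are modelled because the thread does not confirm how the parser actually groups the operators: AND before OR, as stated above, and strict left-to-right evaluation. The function names are hypothetical.

```python
from itertools import product

def intended(a, b, c, d):
    """(A | B) & (C | D)"""
    return (a or b) and (c or d)

def and_first(a, b, c, d):
    """A | B & C | D read with AND binding tighter: A | (B & C) | D"""
    return a or (b and c) or d

def left_to_right(a, b, c, d):
    """A | B & C | D read strictly left to right: ((A | B) & C) | D"""
    return ((a or b) and c) or d

def false_positives(candidate):
    """Input combinations where `candidate` matches but the intended rule does not."""
    return [
        values
        for values in product([False, True], repeat=4)
        if candidate(*values) and not intended(*values)
    ]

and_first_cases = false_positives(and_first)
left_to_right_cases = false_positives(left_to_right)
```

Under either reading the unbracketed form produces matches that the bracketed form does not, for example a new user (A true) who adds no links at all under the AND-first reading.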

Verbose log?
I'd love to have a more verbose form of the log. I don't want to click through to check 100 different blocked edits, but if the first 100 or 500 words of the attempted diff could be displayed in the log, I could more easily scan for false positives.

Thanks for the extension by the way - it's been a fantastic help on Appropedia. --Chriswaterguy 04:21, 23 December 2011 (UTC)

SQLite database error
When I try to install MediaWiki with the AbuseFilter extension, a syntax error occurs when parsing abusefilter.tables.sql (a MySQL dump file) with SQLite and installation of AbuseFilter was skipped. I copied AntiSpoof into the extensions directory because AbuseFilter "requires AntiSpoof" but did not install MediaWiki with AntiSpoof because trying to install AntiSpoof crashes the installer because of an unhandled SQLite syntax error when parsing the MySQL dump file for the AntiSpoof tables. An SQL dump of my fresh wiki is shown below so the developer can experiment with it and attempt to write a working SQLite database dump in reply.

(SQLite dump excised to /SQLite dump so it doesn't bog down this page with horrible load times Dr ishmael (talk) 16:50, 16 March 2012 (UTC))

Additional information: MediaWiki 1.18.1, PHP 5.3.10

Disallow + block
I'm trying to set up a spambot prevention filter that disallows obvious spam edits and blocks the user responsible, giving a warning first. However, the filter only seems to apply the disallow part and only randomly chooses to block the user as well. When I removed the disallow consequence, the next test edit was both disallowed and the account blocked; the test edit after that passed after the warning without consequences. Am I doing something wrong, or is the block consequence buggy? --Sovereign92 (talk) 21:22, 27 February 2012 (UTC)


 * I'm no expert, but I haven't experienced this. Are you able to share your filter code? --Chriswaterguy (talk) 06:02, 9 March 2012 (UTC)

article_namespace != 1|2|3 & !("confirmed" in user_groups) & (contains_any(new_text, "test123", "test321", ))

warn, disallow, block, tag

--Sovereign92 (talk) 17:59, 9 March 2012 (UTC)


 * Try enclosing the <tt>1|2|3</tt> in brackets. I get odd behavior in testing logic of the form <tt>article_namespace != 1|2|3 &</tt>  - but it stops when I change it to the form  <tt>article_namespace != (1|2|3) &</tt>


 * My next thought would have been to add quotation marks, i.e. "(1|2|3)" - but that's probably unneeded. Hope that helps. --Chriswaterguy (talk) 20:23, 9 March 2012 (UTC)
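
A likely reason for the odd behaviour is that 1|2|3 is evaluated as a logical OR before the comparison, so the namespace is compared with a single value rather than with each of the three numbers. The Python sketch below is only an analogy (Python's "or" returns the first truthy operand, and AbuseFilter's parser may differ in detail), and the function names are hypothetical. It contrasts that reading with explicit comparisons, which in a filter could be written as article_namespace != 1 & article_namespace != 2 & article_namespace != 3.

```python
def collapsed_check(namespace):
    """Analogy of `namespace != (1|2|3)`: the OR collapses to one value first."""
    return namespace != (1 or 2 or 3)

def explicit_check(namespace):
    """Compare against every excluded namespace separately."""
    return namespace != 1 and namespace != 2 and namespace != 3

# Namespace 2 should be excluded, but the collapsed form still lets the filter match.
collapsed_for_2 = collapsed_check(2)
explicit_for_2 = explicit_check(2)
explicit_for_0 = explicit_check(0)
```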

user_age in "Examine individual changes"
I found a confusing thing. A spammer registered, then 8 minutes later created a page. So user_age should be approximately 8*60. But when I examine the edit, it gives the user_age as the current age, rather than the age when they made that individual change.

I spent quite a lot of time trying to debug a filter before I realized what was going wrong. Hope this helps someone else. --Chriswaterguy (talk) 19:19, 3 March 2012 (UTC)
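
The difference between the two values can be shown with a little arithmetic. This Python sketch uses made-up timestamps (only the 8-minute gap comes from the report above), and the helper name user_age is chosen to mirror the filter variable rather than any real API.

```python
from datetime import datetime, timedelta, timezone

def user_age(registered, at):
    """Account age in seconds at the moment `at`."""
    return int((at - registered).total_seconds())

# Hypothetical timestamps for illustration.
registered = datetime(2012, 3, 1, 12, 0, tzinfo=timezone.utc)
edit_time = registered + timedelta(minutes=8)
examined_at = datetime(2012, 3, 3, 19, 19, tzinfo=timezone.utc)

age_at_edit = user_age(registered, edit_time)          # what the filter saw when the edit was made
age_when_examined = user_age(registered, examined_at)  # what the examine page reportedly shows
```

A condition such as user_age < 3600 is true for the first value but false for the second, which is why a filter can appear not to match when the edit is examined later.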

A few more options...

 * Bot-related
 * Prevent bots from performing this action (to exempt legitimate bots, add '!"bot" in user_groups') - If the spammer/vandal uses index.php to perform the action, ConfirmEdit is installed, and the ConfirmEdit options in LocalSettings.php are set correctly, tell ConfirmEdit to trigger a CAPTCHA if installed. If the spammer/vandal is using some other PHP program on the server (eg. api.php) to perform the action with MediaWiki, disallow the action which triggers this filter.
 * Warnings and disallow information for bots - gives the information about the triggered filter IDs and unformatted warning messages or disallow information (for example, <abusefilter-warning id="58">Be aware that, if you post your email address publicly, this can increase spam sent to your email address. If you want people to email you, tell them to use Special:EmailUser</abusefilter-warning> if actions taken are "Warn", <abusefilter-error id="113">You have tried to replace an article with nonsense and this has been disallowed.</abusefilter-error> if actions taken are "Warn, Disallow", or <abusefilter-error id="276" /> if actions taken are "Disallow").
 * Autoreverted revision (edit only) - perform the edit but see the revision automatically reverted by "Abuse filter" unless an exempted user group in the filter performs the edit

Comments in code
Is there a way of adding comments in the code? I read that <tt>(?#COMMENT)</tt> works in Python regex (and this is confirmed in this page, which was linked from a guide to PCRE syntax) but I can't make this work in AbuseFilter.

I tried this simple test filter:

user_age < 3600 (?#COMMENT)

and it gives the error "Syntax error detected: Unexpected "T_BRACE" at character 15."

Putting it on a separate line or using & or + didn't work either. Any clues? --Chriswaterguy (talk) 12:00, 9 March 2012 (UTC)


 * The ability to comment the filter code would be a wonderful addition to this extension. I have trouble remembering what a complex filter is actually doing when I look at it even a couple months later.  I know there's a "Notes" box, but that is a poor substitute for in-line comments.  Dr ishmael (talk) 16:52, 16 March 2012 (UTC)
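
One point that may explain the syntax error: (?#COMMENT) is part of regular expression syntax, so it only has meaning inside a pattern string, not in the surrounding filter expression. The Python sketch below shows the construct working inside a pattern with Python's re module. Whether a pattern passed to AbuseFilter's regex operators would accept it in the same way is an assumption that has not been tested here.

```python
import re

# The (?#...) group is ignored by the regex engine, so both patterns match the same text.
commented = r"test(?#digits follow the word)123"
plain = r"test123"

commented_match = re.search(commented, "a test123 edit")
plain_match = re.search(plain, "a test123 edit")
same_span = commented_match.span() == plain_match.span()
```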

How can I use createaccount?
I'd love a way to reduce the creation of accounts by spam bots - is there something I can do when the action is ?

I'd love to have a custom field as a trap for spam bots - either a super-simple CAPTCHA (e.g. enter the word  or   - the latter one hidden by CSS). Then AbuseFilter could check the inputs for those boxes. I assume that would take some hacking of AbuseFilter, but not sure how much.

Is there anything we can do with createaccount without hacking? Thanks --Chriswaterguy (talk) 17:52, 3 April 2012 (UTC)

Database error
I keep getting this message when I try to use this extension: A database query syntax error has occurred. This may indicate a bug in the software. The last attempted database query was:

(SQL query hidden)

from within function "IndexPager::reallyDoQuery (AbuseLogPager)". Database returned error "1146: Table 'my_wiki.abuse_filter_log' doesn't exist (localhost)". What should I do?--Breawycker (talk) 01:39, 6 April 2012 (UTC)
 * You have forgotten to run update.php.--Jasper Deng (talk) 01:40, 6 April 2012 (UTC)

Username whitelist on registration
Due to this spam bot problem, I'm planning to make a whitelist of allowed usernames. However, my code doesn't work as intended and blocks registration altogether.

action = 'createaccount' & accountname != ("Test1 | Test2")

I have tried with and without brackets. --Sovereign92 (talk) 14:43, 6 April 2012 (UTC)
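
A possible cause is that "Test1 | Test2" is a single string, so the comparison asks whether the account name differs from that exact text, which is true for every real name. The Python sketch below contrasts this with a set lookup and with an anchored regex; the function names are hypothetical. In a filter, something like !(accountname rlike "^(Test1|Test2)$") might express the same idea, but that rule is an untested assumption.

```python
import re

WHITELIST = {"Test1", "Test2"}

def blocked_by_original(name):
    """Mirrors the posted rule: compare against one literal string."""
    return name != "Test1 | Test2"

def blocked_by_whitelist(name):
    """Block any name that is not in the whitelist."""
    return name not in WHITELIST

def blocked_by_regex(name):
    """Same check using a regex that must match the whole name."""
    return re.fullmatch(r"Test1|Test2", name) is None
```

With the original comparison, even "Test1" is blocked, while the whitelist and regex versions block only names outside the list.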