Manual talk:$wgSpamRegex
blocks all href links entirely
- Large Example shows on article:
"\<</span>\s*a\s*href|". # This blocks all href links entirely, forcing wiki syntax
in Source:
"\<\s*a\s*href|". # This blocks all href links entirely, forcing wiki syntax
So is this a parser issue? The first version will not work because the "/" in "</span>" is also the regex delimiter, so it ends the regex early. It fails with the error "Unknown modifier 'p'".
--Martin
- Are there other categories which this could/should go into, e.g. security or spam protection?
Sy Ali 17:48, 19 April 2006 (UTC)
- On my MediaWiki, using the "Large Example," spam is getting through the regex for "overflow" by dropping the closing semicolon. So I deleted the semicolon from the regex, and that seems to be working (for now). It might be useful to others to remove it from the article's example, since it's not necessary. I can't do it myself; I tried, but the spam protection used here won't let me save. Latrippi 02:28, 22 July 2006 (UTC)
blocking lots of links
Most wikispam I've encountered has taken the form [http://url/ keyword keyword] [http://url2/ keyword2 keyword2] etc. So, what about using this to block many links in a row? I'm thinking something like...
$wgSpamRegex = '/(\[http:\/\/[a-z0-9\.\/\%_-]+(\s+[a-z0-9\.-]+)+\]\s+){10}/i';
or
$wgSpamRegex = '/(\[http:\/\/[^\s]+\s+[^\[]+\s*?){10}/i';
Comments? (Handy PHP regex tester...)
--Alxndr 03:18, 26 November 2006 (UTC)
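One way to evaluate the two proposed patterns is to run them against synthetic text. The following is a minimal sketch, assuming spam of exactly the form described above; the `$link` sample and all variable names are made up for illustration.

```php
<?php
// Hypothetical sample link in the "[http://url/ keyword keyword]" form described above.
$link = '[http://url/ keyword keyword] ';

// The two patterns proposed above, unchanged.
$patternA = '/(\[http:\/\/[a-z0-9\.\/\%_-]+(\s+[a-z0-9\.-]+)+\]\s+){10}/i';
$patternB = '/(\[http:\/\/[^\s]+\s+[^\[]+\s*?){10}/i';

$tenLinks = str_repeat( $link, 10 );
$nineLinks = str_repeat( $link, 9 );

// preg_match() returns 1 on a match and 0 on no match.
$results = [
	'A, 10 links' => preg_match( $patternA, $tenLinks ),
	'A, 9 links' => preg_match( $patternA, $nineLinks ),
	'A, 10 links, trimmed' => preg_match( $patternA, rtrim( $tenLinks ) ),
	'B, 10 links' => preg_match( $patternB, $tenLinks ),
	'B, 9 links' => preg_match( $patternB, $nineLinks ),
];
print_r( $results );
```

Note that the first pattern requires whitespace after every closing bracket, so a run of exactly ten links that ends the text with no trailing whitespace is not caught by it.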
How does one stop 'MediaWiki:Spamprotectiontext' from telling the spammer which words just got banned, so that they can't simply reword their spam to get past it?
I'd love to know.
--Quatermass 20:43, 9 May 2007 (UTC)
- You can change that message in Special:Allmessages Jonathan3 18:09, 8 September 2007 (UTC)
- I read that you can delete the "$1" on MediaWiki:Spamprotectionmatch in order to achieve that. w:User:JanCK 10:52, 18 November 2007 (UTC)
log
Is there a log that shows how often my MediaWiki denies edits? w:User:JanCK 00:56, 18 November 2007 (UTC)
$wgSpamRegex is not working in my wiki
Maybe someone can help me. I have configured the variable $wgSpamRegex as in Manual:$wgSpamRegex#A Large Example, but when I test the filter with spam words, nothing happens. Is there something else I need to do? The MediaWiki version is 1.13.0. Thx! --88.65.198.156 18:24, 5 October 2008 (UTC)
- You can try Extension:SpamRegex. iAlex 18:35, 5 October 2008 (UTC)
- Thx - now it's working, but the only problem is that I get a PHP warning whenever the SpamRegex filter triggers. Here is the HTML output:
<b>Warning</b>: preg_match() [<a href='function.preg-match'>function.preg-match</a>]: Delimiter must not be alphanumeric or backslash in <b>/../htdocs/includes/EditPage.php</b> on line <b>747</b><br />
How can I get rid of this output? Thx again!
- Is there nobody who has the same problem? --82.113.113.161 15:24, 13 March 2009 (UTC)
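For reference, PHP's preg_match() emits this warning whenever the pattern string it receives starts with a letter, a digit, or a backslash, i.e. when the pattern has no opening delimiter such as "/". The sketch below only illustrates that behaviour; the two pattern strings are made-up examples, and it makes no claim about how Extension:SpamRegex builds its pattern.

```php
<?php
// Hypothetical patterns for illustration only.
$withoutDelimiters = 'casino-online|buy-viagra';
$withDelimiters = '/casino-online|buy-viagra/i';

$text = 'Visit my casino-online site';

// "@" suppresses the warning so that only the return value is inspected.
// Without delimiters, preg_match() fails and returns false.
$bad = @preg_match( $withoutDelimiters, $text );
// With delimiters, the same alternation matches normally.
$good = preg_match( $withDelimiters, $text );

var_dump( $bad );  // bool(false)
var_dump( $good ); // int(1)
```

So a first thing to check is whether every stored expression ends up wrapped in delimiters before it reaches preg_match().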
blocking by number of links
I have tried to add a limit for number of links to 15 as mentioned in the article, but am still able to add articles with more than 15 links. This is my regex in its entirety:
$wgSpamRegex = "/". # The "/" is the opening wrapper
"s-e-x|zoofilia|sexyongpin|grusskarte|geburtstagskarten|animalsex|".
"sex-with|dogsex|adultchat|adultlive|camsex|sexcam|livesex|sexchat|".
"chatsex|onlinesex|adultporn|adultvideo|adultweb.|hardcoresex|hardcoreporn|".
"teenporn|xxxporn|lesbiansex|livegirl|livenude|livesex|livevideo|camgirl|".
"spycam|voyeursex|casino-online|online-casino|kontaktlinsen|cheapest-phone|".
"laser-eye|eye-laser|fuelcellmarket|lasikclinic|cragrats|parishilton|".
"paris-hilton|paris-tape|fuel-dispenser|fueling-dispenser|".
"jinxinghj|telematicsone|telematiksone|a-mortgage|diamondabrasives|".
"reuterbrook|sex-plugin|sex-zone|lazy-stars|eblja|liuhecai|".
"buy-viagra|-cialis|-levitra|boy-and-girl-kissing|". # These match spammy words
"dirare\.com|". # This matches dirare.com a spammer's domain name
"overflow\s*:\s*auto|". # This matches against overflow:auto (regardless of whitespace on either side of the colon)
"height\s*:\s*[0-4]px|". # This matches against height:0px (most CSS hidden spam) (regardless of whitespace on either side of the colon)
"(http:.*){16}|". # ***** Limit total number of external links allowed per page / to 15 DOESN'T WORK!
"display\s*:\s*none". # This matches against display:none (regardless of whitespace on either side of the colon)
"/i"; # The "/" ends the regular expression and the "i" switch which follows makes the test case-insensitive
It does block the other expressions, but I can still save articles with more than 15 links! I don't see what I'm doing wrong. Please help.
- MediaWiki: 1.11.0
- PHP: 5.2.6 (cgi-fcgi)
- MySQL: 5.0.45-community-log
Thanks, Nathanael Bar-Aur L. 17:22, 7 October 2008 (UTC)
- PHP 5.2.x introduced pcre.backtrack_limit with default 100000 (less than 100K). I think that is too low and trips up the regex. See stronk7 at moodle dot org's 13-Sep-2007 comment (Find '13-Sep-2007') at http://us.php.net/manual/en/ref.pcre.php. Try adding the following line to LocalSettings.php:
ini_set( 'pcre.backtrack_limit', '8M' );
- I don't know what 'pcre.backtrack_limit' value is appropriate. 8M works for me and is lifted from paragraph 4 of Wikipedia's Perl Compatible Regular Expressions article intro. Someone who knows more please adjust that and comment. --Rogerhc 17:44, 9 November 2010 (UTC)
- It works for me only with (.|\n)*? in place of .* because this part crosses line ends (\n) and is ungreedy (*?). Written like this, it works for me up to {129} on a long page containing 200 repetitions of "http://xxxxx ". With {130} or higher the server gave this error message: "503 Service Unavailable - The server is temporarily busy, try again later!". Try this:
$wgSpamRegex = "/(http:(.|\n)*?){101}/";
- --Rogerhc 05:03, 10 November 2010 (UTC)
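The difference between the two forms can be reproduced outside the wiki. This is a minimal sketch with the repetition count lowered to 3 and made-up sample text; the third pattern, using the `s` modifier so that the dot also matches newlines, is an alternative not mentioned above and has not been tested on a wiki.

```php
<?php
// Hypothetical page text: three links, each on its own line.
$multiLine = "http://a.example\nhttp://b.example\nhttp://c.example\n";
// The same three links on a single line.
$singleLine = 'http://a.example http://b.example http://c.example';

// Form used in the question above, with the count lowered to 3 for the demo.
$dotOnly = '/(http:.*){3}/';
// Form suggested in the reply above, also lowered to 3.
$dotOrNewline = "/(http:(.|\n)*?){3}/";
// Assumed alternative: the "s" modifier lets "." match newlines too.
$dotAll = '/(http:.*?){3}/s';

// By default "." stops at a newline, so links on separate lines are not counted together.
$sameLineHit = preg_match( $dotOnly, $singleLine );
$multiLineMiss = preg_match( $dotOnly, $multiLine );
$multiLineHit = preg_match( $dotOrNewline, $multiLine );
$dotAllHit = preg_match( $dotAll, $multiLine );

var_dump( $sameLineHit, $multiLineMiss, $multiLineHit, $dotAllHit );
```

If preg_match() returns false rather than 0, the regex engine itself failed (for example by hitting pcre.backtrack_limit), and preg_last_error() reports the reason.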
Not working for me
I simply put the following line in my settings:
$wgSpamRegex = "/suyash jain/i";
but it is not working.
Any help?
Profanity
Hey, anyone got any regex profanity checks out there?
Example blocks legitimate CSS
For example, if I were to type, "overflo:auto; height:" [with "overflow" instead of "overflo", "w" deleted by User:Rogerhc to get this through MediaWiki's current spam filter] I would not be allowed to save this page. Rocket000 08:05, 19 August 2009 (UTC)
- True and noted. And I had to change your comment to get it past MediaWiki's current spam filter. However, legitimate wiki edits probably don't need that particular CSS, and disallowing it helps stop spam. So it is useful on most wikis. --Rogerhc 18:08, 9 November 2010 (UTC)
Blocking all external links, working version:
$wgSpamRegex = "/^http:|^\[[^][]*\]$/";
What the article says doesn't work on my wiki for some reason (v1.11). Instead, this seems to do the job:
$wgSpamRegex = "/http:\/\//";
I wonder if this can cause any issue I'm not aware of? --Nathanael Bar-Aur L. 22:03, 25 September 2009 (UTC)
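One limitation that can be checked directly: the pattern only looks for the literal text "http://", so it is case-sensitive and does not see "https://" links. The sketch below uses made-up sample strings, and `$broader` is a hypothetical variant for comparison, not something taken from the article.

```php
<?php
// Pattern exactly as posted above.
$posted = "/http:\/\//";
// Hypothetical broader variant: optional "s" and case-insensitive.
$broader = '/https?:\/\//i';

// Made-up sample edits.
$samples = [
	'plain' => 'See [http://example.org spam]',
	'secure' => 'See [https://example.org spam]',
	'upper' => 'See [HTTP://example.org spam]',
];

$postedHits = [];
$broaderHits = [];
foreach ( $samples as $name => $text ) {
	$postedHits[$name] = preg_match( $posted, $text );
	$broaderHits[$name] = preg_match( $broader, $text );
}
print_r( $postedHits );
print_r( $broaderHits );
```

Either form also rejects legitimate edits that contain a full URL, which is the intended trade-off of blocking all external links.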
working on this page
I was working on this page (grammar, spelling, etc.) and moved this section to the talk page:
Occasionally spammers have openly discussed their behavior with the people who fight spam, and the people who are victims of it. From these discussions it's clear that they really believe they are not doing anything wrong. We should tell them otherwise. Edit your 'MediaWiki:Spamprotectiontext' page, and write a message to the spammers. It's better if it's your own words. If a spammer visits many different wikis and gets many different messages telling them to quit, who knows - maybe they'll start to think about what they are doing. It's probably better to keep the language reasonably polite. You are attempting to reason with them after all. Also remember your legitimate users might end up getting this message in the case of false positives. Example:
In many cases it's a waste of time, but it would be nice if just a few of these people put their talents to better uses.
I think the last sentence sums it up: "In many cases it's a waste of time." Most spam is from bots. This section seems more like a wishful polemic than instructions on how to use $wgSpamRegex. Errectstapler 04:03, 17 July 2011 (UTC)
My spamregex version
$wgSpamRegex = "/". # The "/" is the opening wrapper
"s-e-x|zoofilia|sexyongpin|grusskarte|geburtstagskarten|animalsex|".
"sex-with|dogsex|adultchat|adultlive|camsex|sexcam|livesex|sexchat|".
"chatsex|onlinesex|adultporn|adultvideo|adultweb.|hardcoresex|hardcoreporn|".
"teenporn|xxxporn|lesbiansex|livegirl|livenude|livesex|livevideo|camgirl|".
"spycam|voyeursex|casino-online|online-casino|kontaktlinsen|cheapest-phone|".
"laser-eye|eye-laser|fuelcellmarket|lasikclinic|cragrats|parishilton|".
"paris-hilton|paris-tape|2large|fuel-dispenser|fueling-dispenser|huojia|<strong>|".
"jinxinghj|telematicsone|telematiksone|a-mortgage|diamondabrasives|".
"reuterbrook|sex-plugin|sex-zone|lazy-stars|eblja|liuhecai|<strong>|==<center>|".
I added <strong> and ==<center> because our wikis are being attacked with spam using these words. Igottheconch 03:17, 28 November 2011 (UTC)