Extension:SpamBlacklist/es
SpamBlacklist | Release status: stable |
---|---|
Implementation | Page action |
Description | Provides a regex-based spam filter |
Author(s) | Tim Starling (talk) |
Latest version | Continuous updates |
Compatibility policy | Release branches |
MediaWiki | 1.31+ |
License | GNU General Public License 2.0 or later |
Download | README |
Translate the SpamBlacklist extension if it is available at translatewiki.net | |
Issues | Open tasks · Report a bug |
The SpamBlacklist extension prevents edits that contain URLs whose domains match regular expression patterns defined in specified files or wiki pages, and prevents registration by users using specified email addresses.
When someone tries to save a page, this extension checks the text against a (potentially very large) list of illegal host names. If there is a match, the extension displays an error message and refuses to save the page.
Installation and configuration
Installation
- Download and extract the files into a directory called "SpamBlacklist" inside your existing extensions/ directory.
- Add the following code to your LocalSettings.php (preferably at the end):
wfLoadExtension( 'SpamBlacklist' );
- Configure the block list to your liking
Done: navigate to Special:Version on your wiki to verify that the extension has been installed correctly.
Setting the block list
The following local pages are always used, whatever additional sources are listed:
- MediaWiki:Spam-blacklist
- MediaWiki:Spam-whitelist
- MediaWiki:Email-blacklist
- MediaWiki:Email-whitelist
The default additional source for a block list of forbidden URLs is the Wikimedia spam block list on Meta-Wiki, at m:Spam block list. By default, the extension uses this list, and reloads it once every 10-15 minutes. For many wikis, using this list will be enough to block most spamming attempts. However, since the Wikimedia block list is used by a diverse group of large wikis with hundreds of thousands of external links, it is comparatively conservative in the links it blocks.
The Wikimedia spam block list can only be edited by administrators; but you can suggest modifications to the block list at m:Talk:Spam blacklist.
You can add other bad URLs on your own wiki. List them in the global variable $wgBlacklistSettings in LocalSettings.php. See examples below.
$wgBlacklistSettings is a two-level array. The top-level key is spam or email. Each takes an array whose values are each a URL, a filename, or a database location.
If you use $wgBlacklistSettings in "LocalSettings.php", the default value of "[[m:Spam blacklist]]" will no longer be used. If you want that block list to be accessed, you will have to add it manually; see examples below.
Specifying a database location allows you to draw the block list from a page on your wiki.
The format of the database location specifier is "DB: [db name] [title]". [db name] should exactly match the value of $wgDBname in LocalSettings.php. You should create the required page name [title] in the default namespace of your wiki. If you do this, it is strongly recommended that you protect the page from general editing. Besides the obvious danger that someone may add a regex that matches everything, please note that an attacker with the ability to input arbitrary regular expressions may be able to generate segfaults in the PCRE library.
Examples
If you want to, for instance, use the English-language Wikipedia's spam block list in addition to the standard Meta-Wiki one, you could add the following to LocalSettings.php, AFTER the wfLoadExtension( 'SpamBlacklist' ); call:
$wgBlacklistSettings = [
    'spam' => [
        'files' => [
            "https://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1",
            "https://en.wikipedia.org/w/index.php?title=MediaWiki:Spam-blacklist&action=raw&sb_ver=1"
        ],
    ],
];
Here is an example of an entirely local set of block lists: the administrator is using the update script to generate a local file called "wikimedia_blacklist" that holds a copy of the Meta-Wiki blacklist, and has an additional block list on the wiki page "My spam block list":
$wgBlacklistSettings = [
    'spam' => [
        'files' => [
            "$IP/extensions/SpamBlacklist/wikimedia_blacklist", // Wikimedia's list
            // database title
            'DB: wikidb My_spam_block_list',
        ],
    ],
];
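The email key can be set in the same array. The fragment below is only a sketch: it assumes that the email key accepts the same files sub-array used by the spam examples, and both the database name wikidb and the page title My_email_block_list are hypothetical placeholders.

```php
// LocalSettings.php, after wfLoadExtension( 'SpamBlacklist' );
$wgBlacklistSettings = [
    'spam' => [
        'files' => [
            // Setting $wgBlacklistSettings drops the default Meta-Wiki source,
            // so it is listed again explicitly here.
            "https://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1",
        ],
    ],
    // Assumption: 'email' takes the same 'files' sub-array as 'spam'.
    'email' => [
        'files' => [
            // Hypothetical local wiki page; "wikidb" must equal $wgDBname.
            'DB: wikidb My_email_block_list',
        ],
    ],
];
```

As with any page named in a database location, the page would need to be protected from general editing.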
Issues
If you encounter issues with the block list, you may want to increase the backtrack limit. However, this can reduce your protection against DoS attacks, because the backtrack limit is a performance limit:
// Bump the Perl Compatible Regular Expressions backtrack memory limit
// (PHP 5.3.x default, 1000K, is too low for SpamBlacklist)
ini_set( 'pcre.backtrack_limit', '8M' );
Safe list
A corresponding safe list can be maintained by editing the MediaWiki:Spam-whitelist page. This is useful if you want to override some entries from another wiki's block list that you are using. Wikimedia wikis, for instance, sometimes use the spam block list for purposes other than combating spam.
It is questionable how effective the Wikimedia spam block lists are at keeping spam off of third-party wikis. Some spam might be targeted only at Wikimedia wikis, or only at third-party wikis, which would make Wikimedia's blacklist of little help to said third-party wikis in those cases. Also, some third-party wikis might prefer that users be allowed to cite sources that are not considered reliable on Wikipedia, or that Wikipedia has considered so ideologically offensive as to warrant blocking. Sometimes what one wiki considers useless spam, another wiki might consider useful.
Users may not always realize that, when a link is rejected as spammy, it does not necessarily mean that the individual wiki they are editing has specifically chosen to ban that URL. Therefore, wiki system administrators may want to edit the system messages at MediaWiki:Spamprotectiontext and/or MediaWiki:Spamprotectionmatch on their wiki to invite users to make suggestions at MediaWiki talk:Spam-whitelist for pages that should be added by a sysop to the safe list. For example, you could put, for MediaWiki:Spamprotectiontext:
- The text you wanted to save was blocked by the spam filter. This is probably caused by a link to a blacklisted external site. {{SITENAME}} maintains [[MediaWiki:Spam-blacklist|its own block list]]; however, most blocking is done by means of [[metawikimedia:Spam-blacklist|Meta-Wiki's block list]], so this block should not necessarily be construed as an indication that {{SITENAME}} made a decision to block this particular text (or URL). If you would like this text (or URL) to be added to [[MediaWiki:Spam-whitelist|the local spam safe list]], so that {{SITENAME}} users will not be blocked from adding it to pages, please make a request at [[MediaWiki talk:Spam-whitelist]]. A [[Project:Sysops|sysop]] will then respond on that page with a decision as to whether it should be listed as safe.
Notes
- This extension examines only new external links added by wiki editors. To check user agents, add Bad Behaviour or Akismet. As the various tools for combating spam on MediaWiki use different methods to spot abuse, the safeguards are best used in combination.
- The Extension:SpamBlacklist/update script is a cron script that can automate updates from shared block lists.
If you are using memcached, you will also have to delete the spam_blacklist_regexes key (for example, using maintenance/mcc.php).
- It is not possible to exempt some users from the spam block list. See bugzilla:34928.
Usage
Block list syntax
If you want to create a custom block list or modify an existing one, here is the syntax:
Everything on a line after a "#" character is ignored (used for comments). Any other string is treated as a regex fragment that is matched only against URLs.
- Notes
- Do not add "http://". It would fail, because the regex is only matched after the "http://" (or "https://") in each URL.
- "www" is not needed either, because the regex matches any subdomain. Writing "www\." explicitly makes it possible to match specific subdomains.
- The anchors (?<=//|\.) and $ match the beginning and the end of the domain name, not the beginning and the end of the URL. The regular anchor ^ is of no use.
- Slashes do not need to be escaped with backslashes. The script does this automatically.
- Example
The following line will block all URLs that contain the string "example.com", except where it is immediately preceded or followed by a letter or a number.
\bexample\.com\b
These are blocked:
- http://www.example.com
- http://www.this-example.com
- http://www.google.de/search?q=example.com
These are not blocked:
- http://www.goodexample.com
- http://www.google.de/search?q=example.commodity
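The word-boundary behaviour in this example can be tried out with a short script. This is an illustration only: it tests the bare fragment against each URL string with preg_match(), whereas the extension wraps fragments in a larger expression, so it is not the exact check the extension performs. The variable names are made up for the sketch.

```php
<?php
// Bare fragment from the example line; '/' delimiters and the
// case-insensitive flag are added here for the test only.
$fragment = '\bexample\.com\b';
$pattern = '/' . $fragment . '/i';

$urls = [
    'http://www.example.com',
    'http://www.this-example.com',
    'http://www.google.de/search?q=example.com',
    'http://www.goodexample.com',
    'http://www.google.de/search?q=example.commodity',
];

$results = [];
foreach ( $urls as $url ) {
    // preg_match() returns 1 when the fragment is found in the URL.
    $results[$url] = preg_match( $pattern, $url ) === 1;
    echo ( $results[$url] ? 'match     ' : 'no match  ' ) . $url . "\n";
}
```

The first three URLs match because "example.com" is bounded by non-word characters (".", "-", "=" or the end of the string); the last two do not, because a letter sits directly before or after the string.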
Performance
The extension creates a single regular expression similar to /https?:\/\/[a-z0-9\-.]*(line 1|line 2|line 3|...)/Si (where all slashes within the lines are escaped automatically).
It stores this in a small "loader" file to avoid loading all the code on every page view.
Page view performance will not be affected even if you're not using a bytecode cache, although using a cache is strongly recommended for any MediaWiki installation.
The regex match itself generally adds an insignificant overhead to page saves (on the order of 100ms in our experience). However, loading the spam file from disk or the database, and constructing the regex, may take a significant amount of time depending on your hardware. If you find that enabling this extension slows down saves excessively, try installing a supported bytecode cache. This extension will cache the constructed regex if such a system is present.
If you're sharing a server and cache with several wikis, you may improve your cache performance by modifying getSharedBlacklists and clearCache in SpamBlacklist_body.php to use $wgSharedUploadDBname (or a specific DB if you do not have a shared upload DB) rather than $wgDBname. Be sure to get all references! The regexes from the separate MediaWiki:Spam-blacklist and MediaWiki:Spam-whitelist pages on each wiki will still be applied.
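How block list lines might be combined into the single expression described at the start of this section can be sketched as follows. buildCombinedRegex() is a hypothetical helper written for this illustration, not the extension's own code; it strips "#" comments, skips empty lines, escapes slashes and joins the fragments with "|".

```php
<?php
/**
 * Hypothetical helper (not part of the extension): build one combined
 * regex from raw block list lines, or return null if no fragment remains.
 */
function buildCombinedRegex( array $lines ): ?string {
    $fragments = [];
    foreach ( $lines as $line ) {
        // Drop everything after '#', then surrounding whitespace.
        $line = trim( preg_replace( '/#.*$/', '', $line ) );
        if ( $line === '' ) {
            continue;
        }
        // Normalise already-escaped slashes, then escape every slash,
        // because '/' is used as the regex delimiter.
        $line = str_replace( '\/', '/', $line );
        $fragments[] = str_replace( '/', '\/', $line );
    }
    if ( $fragments === [] ) {
        return null;
    }
    return '/https?:\/\/[a-z0-9\-.]*(' . implode( '|', $fragments ) . ')/Si';
}

$regex = buildCombinedRegex( [
    '# my local block list',
    '\bexample\.com\b   # comment after a fragment',
    '',
    'casino-',
] );
echo $regex . "\n";
```

Building one alternation means each saved link is tested with a single preg_match() call instead of one call per block list line.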
External block list servers (RBL)
In its standard form, this extension requires that the block list be constructed manually. While regular expression wildcards are permitted, and a block list originated on one wiki may be re-used by many others, there is still some effort required to add new patterns in response to spam or remove patterns which generate false-positives.
Much of this effort may be reduced by supplementing the spam regex with lists of known domains advertised in spam email. The regex will catch common patterns (like "casino-" or "-viagra") while the external block list server will automatically update with names of specific sites being promoted through spam.
In the filter() function in SpamBlacklist_body.php, approximately halfway between the file start and end, are the lines:
# Do the match
wfDebugLog( 'SpamBlacklist', "Checking text against " . count( $blacklists ) .
" regexes: " . implode( ', ', $blacklists ) . "\n" );
Directly above this section (which does the actual regex test on the extracted links), one could add additional code to check the external RBL servers:
# Do RBL checks
$retVal = false;
$wgAreBelongToUs = [ 'l1.apews.org.', 'multi.surbl.org.', 'multi.uribl.com.' ];
foreach ( $addedLinks as $link ) {
    # Extract only the host; avoids an undefined index when the URL has none
    $link_url = parse_url( $link, PHP_URL_HOST );
    if ( $link_url ) {
        foreach ( $wgAreBelongToUs as $base ) {
            # A listed domain resolves under the RBL zone, e.g. example.com.multi.surbl.org.
            $host = "$link_url.$base";
            $ipList = gethostbynamel( $host );
            if ( $ipList ) {
                wfDebug( "RBL match: Hostname $host is {$ipList[0]}, it's spam says $base!\n" );
                $ip = wfGetIP();
                wfDebugLog( 'SpamBlacklistHit', "$ip caught submitting spam: {$link_url} per RBL {$base}\n" );
                $retVal = $link_url . ' (blacklisted by ' . $base . ')';
                wfProfileOut( $fname );
                return $retVal;
            }
        }
    }
}
# if no match found on RBL server, continue normally with regex tests...
This ensures that, if an edit contains URLs from already blocked spam domains, an error is returned to the user indicating which link cannot be saved due to its appearance on an external spam block list. If nothing is found, the remaining regex tests are allowed to run normally, so that any manually-specified 'suspicious pattern' in the URL may be identified and blocked.
Note that the RBL servers list just the base domain names - not the full URL path - so http://example.com/casino-viagra-lottery.html will trigger RBL only if "example.com" itself were blocked by name by the external server. The regex, however, would be able to block on any of the text in the URL and path, from "example" to "lottery" and everything in between. Both approaches carry some risk of false-positives - the regex because of the use of wildcard expressions, and the external RBL as these servers are often created for other purposes - such as control of abusive spam email - and may include domains which are not engaged in forum, wiki, blog or guestbook comment spam per se.
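The point that only the base host name reaches the RBL can be tried without touching DNS. The sketch below is a standalone illustration, not code from the extension: findRblHit() and the fake resolver are hypothetical, and the resolver parameter exists only so the lookup can be replaced in a test; by default it falls back to PHP's gethostbynamel().

```php
<?php
/**
 * Hypothetical helper: return the first RBL zone that lists the host of
 * $url, or null if none does. $resolver takes a host name and returns a
 * list of addresses or false, like gethostbynamel().
 */
function findRblHit( string $url, array $zones, ?callable $resolver = null ): ?string {
    $resolver = $resolver ?? 'gethostbynamel';
    $host = parse_url( $url, PHP_URL_HOST );
    if ( !is_string( $host ) || $host === '' ) {
        return null;
    }
    foreach ( $zones as $zone ) {
        // Only the host name is queried; the path never reaches the RBL.
        if ( $resolver( "$host.$zone" ) ) {
            return $zone;
        }
    }
    return null;
}

// Fake resolver standing in for DNS: pretends only example.com is listed.
$fakeResolver = static function ( string $name ) {
    return $name === 'example.com.multi.surbl.org.' ? [ '127.0.0.2' ] : false;
};

$zones = [ 'multi.surbl.org.' ];
$hit = findRblHit( 'http://example.com/casino-viagra-lottery.html', $zones, $fakeResolver );
$miss = findRblHit( 'http://example.org/casino-viagra-lottery.html', $zones, $fakeResolver );
```

Here $hit holds the matching zone and $miss is null, even though both URLs share the same suspicious path: the path plays no part in the RBL lookup and would have to be caught by the regex instead.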
Other anti-spam tools
There are various helpful manuals on mediawiki.org on combating spam and other vandalism:
- Anti-spam features - includes link to the built-in $wgSpamRegex anti-spam mechanism.
- Combating spam
- combating vandalism
Other anti-spam, anti-vandalism extensions include:
See also
- Compatible block lists (this is a small sample; there are many more)
- Other resources
- Combating spam and combating vandalism.
This extension is being used on one or more Wikimedia projects. This probably means that the extension is stable and works well enough to be used on high-traffic sites. You can look for this extension's name in Wikimedia's CommonSettings.php and InitialiseSettings.php files to see where it is installed. The full list of extensions installed on a particular wiki can be found on that wiki's Special:Version page.