Alternatives to applying the same nofollow/dofollow value to all external links
This is a page for listing the pros and cons of alternatives to applying nofollow to all external links. Presently, nofollow is controlled by the boolean $wgNoFollowLinks. This causes both useful and spammy external links to be treated the same way, which is undesirable if we want to deter spam while encouraging users to add useful external links and helping the wiki improve the pagerank of sites its community wants to attract traffic to. However, the status quo does have the advantage of being simple and requiring little effort on the part of developers and system administrators, compared to what might be required with a more complicated system.
- 1 Alternatives
- 1.1 Make more use of interwiki links
- 1.2 Make more use of the spam whitelist and/or blacklist
- 1.2.1 Let non-sysops modify the whitelist and/or blacklist
- 1.2.2 Develop a FlaggedRevs-like system for reviewing external links for spamminess
- 1.2.3 Implement different levels of whitelisting and blacklisting
- 1.3 Implement "reverse linkrot"
- 1.4 Apply nofollow only to external links added in unpatrolled edits
- 2 See also
- 3 References
Make more use of interwiki links
Nofollow is not applied to interwiki links. Trusted users could be given the right to edit the interwiki table and add non-spammy sites to it. We could also develop tools for managing large numbers of interwiki links more efficiently, e.g. allowing mass additions of interwiki links borrowed from other wikis. See Extension:InterwikiMap for a tool (of limited scalability) that uses the API to do this.
This would not require much, if any, additional programming.
It would become a two-step process to add dofollow external links for URLs not already in the interwiki table: (1) add the URL to the interwiki table and (2) add the link to the article. If users are going to use interwiki prefixes for these links (e.g. foo:bar instead of http://foo.com/bar or a labeled link to it), they will need to check which prefixes are available on the wiki they are editing; this could be a hassle. Also, if only sysops are allowed to add interwiki prefixes (as is the case on most wikis), then fewer prefixes are likely to be added than if users could act on their own initiative. On the other hand, wikis do not usually give non-sysops (whether logged-in, autoconfirmed, or in the Editor group) the right to add, remove, or modify interwikis using Extension:Interwiki as it's presently written. (Why not? Is there a valid reason, or is it just custom or habit?)
Interwiki patterns also do not support generic URLs very well. They are primarily intended for links to other wikis, so MediaWiki currently encodes the value after the "prefix:" in ways that make it unusable for adding new sites generically; for example, see what it does to a google: interwiki link. See bugzilla:57054.
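As a rough illustration of the prefix mechanism, the sketch below expands a prefixed link against a small interwiki table. The table contents and the function name `expand_interwiki` are hypothetical, and the percent-encoding step is only a stand-in for whatever encoding MediaWiki actually applies; it is included to make the problem with generic URLs concrete, not to reproduce the behavior described in bugzilla:57054.

```python
from urllib.parse import quote

# Hypothetical interwiki table: prefix -> URL pattern, with $1 marking
# where the text after the prefix is substituted.
INTERWIKI = {
    "foo": "http://foo.com/$1",
}


def expand_interwiki(link, table=INTERWIKI, encode=True):
    """Return the full URL for 'prefix:target', or None if the prefix is unknown."""
    prefix, sep, target = link.partition(":")
    if not sep or prefix not in table:
        return None
    if encode:
        # Stand-in for the encoding of the target; not MediaWiki's exact rules.
        target = quote(target, safe="/")
    return table[prefix].replace("$1", target)


print(expand_interwiki("foo:bar"))            # http://foo.com/bar
print(expand_interwiki("foo:search?q=wiki"))  # http://foo.com/search%3Fq%3Dwiki
print(expand_interwiki("baz:bar"))            # None
```

A page-title-like target survives the expansion, but a target carrying a query string does not, which is the kind of breakage that makes prefixes awkward for arbitrary sites.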
Make more use of the spam whitelist and/or blacklist
Let non-sysops modify the whitelist and/or blacklist
Non-sysops, perhaps subject to some other screening criterion (e.g. logged-in, autoconfirmed, or in the Editor group), could be allowed to modify the spam whitelist and/or blacklist. Dofollow would be applied to whitelisted (or non-blacklisted) pages. The whitelist could be configured to, e.g., (1) not warn the user when he is saving a page with non-whitelisted URLs; (2) warn him, but not prohibit him from saving; (3) require him to pass a CAPTCHA when saving a page with non-whitelisted links; or (4) prevent him from saving the page until he adds the URL to the whitelist.
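The four configurations listed above could be modeled roughly as follows. This is a minimal sketch, not an existing MediaWiki feature: the names `Mode` and `check_save` are hypothetical, and matching URLs by exact string is a simplification of however a real whitelist would match.

```python
from enum import Enum


class Mode(Enum):
    SILENT = 1    # (1) save without any warning
    WARN = 2      # (2) warn, but allow the save
    CAPTCHA = 3   # (3) require a CAPTCHA before saving
    BLOCK = 4     # (4) refuse the save until the URL is whitelisted


def check_save(urls, whitelist, mode, captcha_passed=False):
    """Return (allowed, message) for a save that contains the given URLs."""
    unlisted = [u for u in urls if u not in whitelist]
    if not unlisted or mode is Mode.SILENT:
        return True, ""
    listing = ", ".join(unlisted)
    if mode is Mode.WARN:
        return True, "Not whitelisted: " + listing
    if mode is Mode.CAPTCHA:
        if captcha_passed:
            return True, ""
        return False, "CAPTCHA required for: " + listing
    return False, "Add to the whitelist before saving: " + listing
```

For example, `check_save(["http://new.example/"], set(), Mode.BLOCK)` refuses the save, while the same call with `Mode.WARN` allows it and returns a warning message.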
Users would be able to act on their own initiative, allowing more URLs to be added to the whitelist than would be the case if the right to modify the whitelist were restricted to sysops.
It would become a two-step process to add dofollow external links for URLs not currently in the whitelist: (1) add the URL to the whitelist and (2) add the link to the article. A user might not know off the bat whether a URL is in the whitelist or not, unless the system were to tell him (as it could be configured to do); it could be a hassle to check. Some users might not bother to whitelist URLs, or they might forget, unless the system were to make it mandatory; however, if it were mandatory, it might deter adding external links at all.
Develop a FlaggedRevs-like system for reviewing external links for spamminess
When a non-whitelisted external link is added, the page will be flagged as having external links that need to be reviewed for possible spamminess. An authorized user (e.g. a sysop) will review the link and, if it's okay, add it to the whitelist. External links added by trusted users, e.g. those in the Editor or autoconfirmed groups, will be automatically whitelisted; if those users are later determined to have been deep-cover spammers, then their additions to the whitelist table can be reviewed like any other kind of contribution. Perhaps instead of using a protected page to store the whitelist, we could create a new table for whitelisted external links that have no interwiki prefix associated with them. This table would include some of the logging data, such as the user_id of whoever added the item.
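The review workflow could be sketched along these lines. Everything here is an assumption made for illustration: the class `LinkReview`, the group names in `TRUSTED_GROUPS`, and the in-memory dictionaries standing in for the proposed whitelist table and its user_id column.

```python
from dataclasses import dataclass, field

TRUSTED_GROUPS = {"sysop", "editor", "autoconfirmed"}  # assumed group names


@dataclass
class LinkReview:
    whitelist: dict = field(default_factory=dict)  # url -> user_id that added or approved it
    pending: dict = field(default_factory=dict)    # url -> set of page titles awaiting review

    def add_link(self, url, page, user_id, groups):
        """Record a newly added link; return the rel value to render for it."""
        if url in self.whitelist:
            return ""
        if TRUSTED_GROUPS & set(groups):
            self.whitelist[url] = user_id  # automatically whitelisted, but logged
            return ""
        self.pending.setdefault(url, set()).add(page)  # flag the page for review
        return "nofollow"

    def approve(self, url, reviewer_id):
        """A reviewer accepts a pending link and moves it to the whitelist."""
        self.pending.pop(url, None)
        self.whitelist[url] = reviewer_id

    def revoke_user(self, user_id):
        """Remove every whitelist entry logged against a user found to be a spammer."""
        removed = [u for u, uid in self.whitelist.items() if uid == user_id]
        for u in removed:
            del self.whitelist[u]
        return removed
```

Storing the user_id with each entry is what makes the later cleanup possible: `revoke_user` can undo a deep-cover spammer's additions without touching anyone else's.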
This would allow users to act on their own initiative while also providing for a means of review and a way of drawing attention to items in need of review. So, it would result in more links being whitelisted than would be the case if we relied on sysops adding everything to the spam whitelist. Incentives for spamming would be diminished by not allowing the pagerank benefits to arise until review has occurred. Labor costs of reviewers would be diminished by letting trusted users whitelist their links.
There would be at least some labor costs of reviewing external links and programming costs of implementation. If a review backlog occurred, then pagerank benefits for good links (especially of a time-sensitive nature, e.g. pertaining to current events) would be lost.
Implement different levels of whitelisting and blacklisting
There could be different levels of whitelisting and blacklisting; e.g. some URLs could be spammy enough that nofollow should be applied, but not so spammy as to be totally banned from being put in an article.
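A three-level scheme of this kind might look like the sketch below. The level names, the example domains, the function `link_policy`, and the choice of "greylist" as the default for unlisted domains are all assumptions for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical per-domain levels; anything unlisted falls back to DEFAULT_LEVEL.
LEVELS = {
    "trusted.example": "whitelist",  # link allowed and followed
    "dubious.example": "greylist",   # link allowed, but nofollow
    "spam.example": "blacklist",     # link may not be saved at all
}
DEFAULT_LEVEL = "greylist"


def link_policy(url, levels=LEVELS):
    """Return (allowed, rel) for an external URL; rel is None when disallowed."""
    host = urlsplit(url).hostname or ""
    level = levels.get(host, DEFAULT_LEVEL)
    if level == "blacklist":
        return False, None
    if level == "whitelist":
        return True, ""
    return True, "nofollow"
```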
It would be helpful if there were sites that fell into this grey area of suspiciousness.
Is it conceivable that there would be some sites that should be greylisted in this manner?
Implement "reverse linkrot"
The automated approval of links would require less labor to administer.
Some spammy links might slip through due to reviewers' inattentiveness. Also, this might require pretty frequent page reparsing as the links reverse-rot.
Apply nofollow only to external links added in unpatrolled edits
The patrolling system has already been developed, so less programming would be required than for developing a totally new kind of system.
Some wikis don't use patrolling. But then again, those are the same wikis that probably wouldn't use a FlaggedRevs-like system either.
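The rule of applying nofollow only to links from unpatrolled edits can be expressed as a small check. This is a sketch under assumed data shapes: the function `rel_for_link` and the revision dictionaries (with "added_urls" and "patrolled" keys) are hypothetical and do not correspond to MediaWiki's actual schema.

```python
def rel_for_link(url, revisions):
    """Return "" if some patrolled revision added the URL, else "nofollow".

    `revisions` is a list of dicts with keys "added_urls" (URLs introduced
    by that edit) and "patrolled" (bool); this shape is assumed.
    """
    for rev in revisions:
        if url in rev["added_urls"] and rev["patrolled"]:
            return ""
    return "nofollow"
```

Under this rule a link starts out as nofollow and loses the attribute once the edit that introduced it has been patrolled.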