About this board
- I was just reading your thread on Link rot talk, and then had a look at the six bugs you posted on the JIRA toolserver site. I don't have an account on JIRA, so I am mentioning this to you here instead. Everything you're doing looks good to me, and based on your entries you have been working on this for over a year. I noticed that a query against WebCite was one of your six items, but you might want to confirm that WebCite can handle the additional load from Wikipedia. I mention that because of a response to an inquiry in section 3:
- "Quite simply put WebCite cannot handle the volume that Wikipedia provides, even the small run of 10-50 PDFs a night by Checklinks seems to be contributing to the problem."
- The date was June 2010, so things might have changed, and you might be doing something different from what that inquiry was about. It sounded like it referred to a nightly batch of links to be archived by WebCite, whereas you are (maybe?) planning on querying WebCite to see whether broken URLs are already archived there. Maybe that isn't nearly as resource intensive. Just a few thoughts I wanted to share. Thanks for what you're doing here! --FeralOink (talk) 10:40, 20 February 2012 (UTC)
For a few months now, the content of my deadlinks-*.dat files has been impossible to understand, e.g.:
(dp0 Vhttp://www.paremia.org/paremia/P13-20.pdf p1 (lp2 (Va beau mentir qui vient de loin p3 F1378644901.783 S'404 Not Found' p4 tp5 as.
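For what it's worth, that output looks like a Python pickle dump in text protocol 0 rather than plain text, which would explain the (dp0, V…, F… and S'…' opcodes: the file is meant to be loaded with pickle, not read in an editor. A minimal sketch of the round trip, assuming the history is a dict mapping each dead URL to a list of (page title, timestamp, error) tuples; that structure is inferred from the sample above, not confirmed against the script:

```python
import pickle

# Hypothetical reconstruction of the data behind the sample above:
# a dict mapping each dead URL to (page title, timestamp, error) tuples.
history = {
    'http://www.paremia.org/paremia/P13-20.pdf': [
        ('a beau mentir qui vient de loin', 1378644901.783, '404 Not Found'),
    ],
}

# Protocol 0 is the old text-based format, which is why the raw .dat
# file shows opcodes like (dp0, V..., F... when opened as plain text.
raw = pickle.dumps(history, protocol=0)

# Loading it back turns the "line noise" into readable Python data again.
decoded = pickle.loads(raw)
print(decoded)
```

So opening a deadlinks-*.dat file in binary mode and calling pickle.load() on it should recover the readable data, even though the file itself looks garbled.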
And as the script didn't modify the pages itself, I've developed the following module, which has already fixed this on all the French wikis, across several thousand pages over six months: w:fr:Utilisateur:JackBot/hyperlynx.py.
For the other wikis we would need to change the template names and parameters, which would also allow translating them into the wiki's language, as I've done from English into French. JackPotte (talk) 13:06, 8 September 2013 (UTC)
No, it doesn't.
Is there an easy way to make it ignore them? I can live with it if not, but a fix would be useful.
Probably not. We do love patches, though; see Manual:Pywikibot/Gerrit for instructions on how to get started, or ask in #pywikipediabot :)
I have asked in the IRC channel, and it would need a complete rewrite to do this. I don't have the time at the moment, so someone else will need to do it; feel free to mark this as resolved.
Can't generate .txt
I ran python weblinkchecker.py -start:! a week ago. It went through our wiki at wikiskripta.eu fine and produced a .dat file. Today I ran python weblinkchecker.py -repeat and no .txt file appeared. Thank you for your advice.