Manual talk:Pywikibot/replace.py

Ignore links
I am trying to replace something in the text, but not if it appears within the links. I am not sure how to do it, as both the old and new forms appear in the links.
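
A minimal sketch of the idea in plain Python, outside the bot: split the text on wiki links and replace only in the pieces between them. The function name `replace_outside_links` and the sample sentence are made up for illustration; this is not how replace.py itself is implemented.

```python
import re

# Matches a whole wiki link such as [[Target]] or [[Target|label]].
# The capturing group makes re.split() keep the links in its result.
WIKILINK = re.compile(r"(\[\[.*?\]\])")


def replace_outside_links(text, old, new):
    """Replace old with new everywhere except inside [[...]] links."""
    parts = WIKILINK.split(text)
    # Even indices are plain text, odd indices are the links themselves.
    for i in range(0, len(parts), 2):
        parts[i] = parts[i].replace(old, new)
    return "".join(parts)


sample = "A colour chart, see [[colour|the colour page]] and [[Primary colour]]."
print(replace_outside_links(sample, "colour", "color"))
```

Only the first "colour" changes here; both links are left exactly as they were.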

diacritics
How can I use diacritics (UTF-8) with this bot? It is very important for Romanian Wikipedia. --Romihaitza 17:26, 6 February 2006 (UTC)
 * I've just converted the article names and it's working. JackPotte 01:32, 16 November 2009 (UTC)

Bug
When loading an entry list via the -file: parameter, if the wikilinks contain accents or any other non-ASCII characters, the bot breaks and dies before making any changes (even if there are thousands of plain-ASCII entries before that one). For example: Talk:Benjamín Urrutia (on en:) makes the bot break because of the accented i. 132.248.81.29 23:48, 10 March 2006 (UTC)
 * forgot to sign drini ☎ 23:49, 10 March 2006 (UTC)

Problem
There is a problem: if I open replace.py, I get the question "Please enter the text that should be replaced". When I type the text and press Enter, I get the second question, "Press enter the new text". I type the new text, press Enter, then press Enter twice more, but the command window disappears immediately. --IL 18:18, 18 March 2006 (UTC)


 * Sounds like you haven't specified the range of pages to check for the misspelling. When you run "python replace.py", try appending something like "-start:!" or "-start:A". 129.21.121.171 06:11, 18 May 2006 (UTC)
 * So in other words, without computer jargon. Type:
 * replace.py -start:A
 * Odessaukrain 22:33, 24 May 2008 (UTC)

Cat-based replace
Category-based replace seems to be invariably giving no results, Getting 0 pages from .... Is this a bug or am I missing something? I've tried with both the category name itself and with the Category: prefix, as well as with spaces in the name and underscores in the name. Help? --88.113.114.226 16:37, 31 May 2006 (UTC)


 * Maybe someone did not add the category to the pages but did list the articles there manually, then it might not work. Give the page where you have the problem please, greetings --birdy geimfyglið (:> )= 19:36, 31 May 2006 (UTC)

I can't seem to get Category-based replace to work either. Here's the command I used that failed: python replace.py -cat:Front_groups "" "" --Sheldon Rampton 04:40, 23 December 2007 (UTC)


 * In my experience, it only works on categories that directly contain pages. The bot won't run on pages in subcategories. --Plasmarelais 18:14, 30 August 2009 (UTC)

Finding pages
This script seems to have a hard time finding pages. It can get pages from Special:Allpages, even edit articles (with edit_article.py), but whenever I try to replace something, I get a face full of "Page X not found". Any help? The Mu 20:11, 18 August 2006 (UTC)


 * Someone reported a similar problem with replace.py for non-WMF sites on #pywikipediabot. Not sure what the problem is, as other tools seem to work.  --Connel MacKenzie 19:00, 10 September 2006 (UTC)


 * Yeah, that's me, I have the same problem. The -ref: and -start:! options all work 'fine', but I always get "Getting xx pages from mywiki:en" followed by a number of "Page Foobar not found" messages. Most other scripts work just fine. I work from Windows XP against MediaWiki 1.6.8. I've also just downloaded the September 5 version of the pywikipedia framework. --GrandiJoos 20:57, 10 September 2006 (UTC)


 * Update: weblinkchecker.py also reports "Article_X does not exist"... --GrandiJoos 08:06, 11 September 2006 (UTC)


 * Fix: in LocalSettings.php add

$wgGroupPermissions['bot']['export'] = true;

 * Thanks to Andre Engels. --GrandiJoos 11:07, 2 October 2006 (UTC)

Problem with utf-8
한국 월드컵 미국 나비

python replace.py -file:articles_list.txt "errror" "error"


 * I made a file and ran my bot...
 * RESULT: error!!
 * Korean Explorer and Wikipedia use UTF-8 (Unicode),
 * cmd does not use UTF-8,
 * and Python does not use UTF-8 either.
 * Help me!
 * What settings should I use? -- WonYong ( Talk / Contrib ) 11:46, 11 September 2006 (UTC)
 * I've just converted the article names and it's working. JackPotte 01:30, 16 November 2009 (UTC)
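
For anyone wondering what "converting the article names" can look like, here is a minimal sketch that re-saves a title list as UTF-8. The function name `convert_to_utf8` is made up, and the default source encoding (cp949, the usual code page of a Korean Windows console) is an assumption; your file may use a different one.

```python
def convert_to_utf8(src_path, dst_path, source_encoding="cp949"):
    """Re-save a text file as UTF-8.

    source_encoding is an assumption: cp949 is common on Korean Windows,
    but the file may have been saved in another encoding.
    """
    # Decode with the encoding the file was actually written in ...
    with open(src_path, "r", encoding=source_encoding) as src:
        text = src.read()
    # ... and write the same characters back out as UTF-8.
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(text)
    return text
```

It could then be called as `convert_to_utf8("articles_list.txt", "articles_list_utf8.txt")`, with the second file passed to -file:.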

Bot problem (korean language)

 * I have a question.
 * I input 스모그 to interwiki.py in WIN XP's cmd.exe console.

C:\pywikipedia>interwiki.py -start:스모그 -autonomous


 * following is result...


 * Checked for running processes. 1 processes currently running, including the current process.
 * NOTE: Number of pages queued is 0, trying to add 60 more.
 * Retrieving Allpages special page for wikipedia:ko from %C2%BD%C2%BA%C2%B8%C3%B0%C2%B1%C3%97, namespace 0
 * Getting 60 pages from wikipedia:ko...


 * and, I input 스모그 to web wikipedia (ko:)
 * thus, connected to link as follows:

http://ko.wikipedia.org/wiki/%EC%8A%A4%EB%AA%A8%EA%B7%B8


 * Why is it different?


 * %EC%8A%A4%EB%AA%A8%EA%B7%B8
 * I input 스모그 to web wikipedia (ko:)
 * %C2%BD%C2%BA%C2%B8%C3%B0%C2%B1%C3%97
 * I input 스모그 to cmd console


 * Is it a bug?
 * and, I input by file method...


 * C:\pywikipedia>
 * C:\pywikipedia>copy con aa.txt
 * 스모그
 * ^Z
 * C:\pywikipedia>interwiki.py -file:aa.txt
 * Checked for running processes. 1 processes currently running, including the current process.
 * NOTE: Number of pages queued is 0, trying to add 60 more.
 * Dump ko (wikipedia) saved
 * Traceback (most recent call last):
 * File "C:\pywikipedia\interwiki.py", line 1467, in ?
 * bot.run
 * File "C:\pywikipedia\interwiki.py", line 1200, in run
 * self.queryStep
 * File "C:\pywikipedia\interwiki.py", line 1174, in queryStep
 * self.oneQuery
 * File "C:\pywikipedia\interwiki.py", line 1132, in oneQuery
 * site = self.selectQuerySite
 * File "C:\pywikipedia\interwiki.py", line 1114, in selectQuerySite
 * self.generateMore(globalvar.maxquerysize - mycount)
 * File "C:\pywikipedia\interwiki.py", line 1050, in generateMore
 * page = self.pageGenerator.next
 * File "C:\pywikipedia\pagegenerators.py", line 162, in __iter__
 * for pageTitle in R.findall(f.read):
 * File "C:\Python24\lib\codecs.py", line 481, in read
 * return self.reader.read(size)
 * File "C:\Python24\lib\codecs.py", line 293, in read
 * newchars, decodedbytes = self.decode(data, self.errors)
 * UnicodeDecodeError: 'utf8' codec can't decode byte 0xbd in position 2: unexpected code byte
 * C:\pywikipedia>


 * The Korean language is broken... :(
 * help~!! :( -- WonYong ( Talk / Contrib ) 11:49, 11 September 2006 (UTC)
 * According to a Korean wiki admin, use it as follows:
 * 1.

C:\pywikipedia>interwiki.py -start:스모그 -autonomous (X)

C:\pywikipedia>interwiki.py -start:%EC%8A%A4%EB%AA%A8%EA%B7%B8 -autonomous (O)


 * So, I use it.
 * 2. and, edit user-config.py file.

console_encoding = 'cp949'
 * 3.

cmd.exe /U
 * 4. cmd font change to truetype korean font
 * RESULT: cmd Output is good. korean font output is not broken. WOW!!


 * But I also want input that is not broken, as follows:

C:\pywikipedia>interwiki.py -start:스모그 -autonomous (O)
 * help!! -- WonYong ( Talk / Contrib ) 11:59, 11 September 2006 (UTC)
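
The two percent-encoded strings quoted above are consistent with one particular mix-up, sketched below in plain Python. The assumption (not confirmed anywhere in this thread) is that the console hands over the title as cp949 bytes, that each byte is then misread as a Latin-1 character, and that the result is encoded as UTF-8 for the URL.

```python
from urllib.parse import quote

title = "스모그"

# What the web browser sends: the title encoded as UTF-8.
browser_form = quote(title.encode("utf-8"))

# Assumed console behaviour: cp949 bytes, each misread as a Latin-1
# character, then encoded as UTF-8 and percent-encoded.
misread = title.encode("cp949").decode("latin-1")
console_form = quote(misread.encode("utf-8"))

print(browser_form)
print(console_form)
```

Under that assumption the first line printed is the form from the ko.wikipedia.org URL and the second is the form from the "Retrieving Allpages" message. It would also fit the 0xbd byte named in the UnicodeDecodeError, since that is the first byte of 스 in cp949.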

regex BUG
I tried several versions of pywikipedia, and any change using a regular expression leads to a crash. Can you fix this huge bug? Thanks

Bug: "\n" in replacement text
I can't put a newline in the replacement text. For example: python replace.py "AbcDef" "Abc\nDef" actually inserts the literal characters "\n" rather than a newline.


 * If you are using a bash command line, you can add a literal newline by simply pressing Enter inside the quotes:

$ python replace.py "AbcDef" "Abc
> Def"

Cheers, John Vandenberg 23:16, 27 September 2007 (UTC)
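
To illustrate what the report describes: when "Abc\nDef" is typed inside double quotes, the program receives a backslash followed by the letter n, not a newline. The sketch below is plain Python, not code from replace.py, and the helper name `expand_newlines` is made up.

```python
def expand_newlines(argument):
    """Turn the two characters backslash + n into one newline character."""
    return argument.replace("\\n", "\n")


typed = r"Abc\nDef"   # what arrives from the shell: backslash and n are separate
converted = expand_newlines(typed)

print(len(typed), len(converted))
print(converted.splitlines())
```

The converted string is one character shorter and splits into two lines, which is the behaviour the original poster wanted.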

Correct command for interwiki purposes
Suppose that we have a page, say w:Door, with 19 links to it. We decide to transwiki it to Wiktionary, and we want all links to w:Door to be directed towards wikt:door. What is the correct command to achieve that, using replace.py? Huji 17:28, 27 August 2007 (UTC)
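
Not a replace.py command, but a sketch of the kind of regular expression such a replacement would need, tried out in plain Python. The pattern, the `retarget` helper and the sample sentence are all illustrative assumptions; the exact link forms on the 19 pages may differ.

```python
import re

# Matches [[Door]], [[door]] and piped forms such as [[Door|front door]],
# but not longer titles such as [[Doorbell]].
DOOR_LINK = re.compile(r"\[\[\s*([Dd]oor)\s*(?:\|([^\]]*))?\]\]")


def retarget(match):
    """Point the link at wikt:door while keeping the visible label."""
    label = match.group(2) or match.group(1)
    return "[[wikt:door|%s]]" % label


sample = "Open the [[door]] or the [[Door|front door]], not the [[Doorbell]]."
print(DOOR_LINK.sub(retarget, sample))
```

The same pattern could in principle be tried with the -regex option on the pages that link to Door, after checking a few results by hand.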

use xml dump
If I use an XML dump with the -xml option, the program should modify not only the online page but also the XML copy, so that if I redo the same operation the script sees that the page has already been modified. --Wiso 15:36, 27 September 2007 (UTC)

Isredirect and putthrottle
2 issues:

I have tried to edit pages using -ref:, and when I use that, it says "Isredirect". When I don't use that, it works.

I have also tried putting -putthrottle: to 12, and it does it in shorter amounts of time.

why are these happening?

How could I replace text which contains two or more lines?
For example I must change the text:

==Subtitle==

with the following:

==Subtitle==

How could I do it? The problem is that I cannot press Enter while typing the replacement text, and I couldn't use  because it does not have the expected effect --A1 20:04, 3 December 2007 (UTC)


 * Hi, I would make it like this, modify replace.py the following way:
 * remove everything between

elif fix == None:
 * and

else: # Perform one of the predefined actions.
 * Then put in between them

replacements.append((u'\=\=Subtitle\=\=', u'==Subtitle==\r\n'))
wikipedia.setAction(u'summaryhere')
 * Name the file other than replace.py (e.g. myreplace.py)
 * Run with myreplace.py -regex option.
 * Best regards, --birdy geimfyglið (:> )= 20:20, 3 December 2007 (UTC)
 * Thank you but it doesn't work. Am I right:

elif fix == None:
    replacements.append((u'\=\=Subtitle\=\=', u'==Subtitle==\r\n'))
    wikipedia.setAction(u'summaryhere')
else:

--A1 21:46, 11 December 2007 (UTC)


 * Worked perfectly for me, did You save the modified.py file as UTF-8? Did You run with -regex option? Best regards, --birdy geimfyglið (:> )=  ∇  04:22, 8 February 2008 (UTC)
 * I think it's easier to use -regex with "==Subtitle==" "==Subtitle==\n " --Plasmarelais 16:49, 29 September 2009 (UTC)
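
A short sketch of why the -regex route can work, assuming the replacement string is handed to Python's re module as a replacement template (an assumption about the bot's internals, not something confirmed here). In such a template, the two characters \n are turned into a newline.

```python
import re

text = "intro ==Subtitle== body"

# The replacement is a raw string, so it contains a backslash and an n;
# re.sub() interprets that pair as a newline when building the result.
result = re.sub(r"==Subtitle==", r"==Subtitle==\n", text)

print(repr(result))
```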

And how to do replacing: ==Subtitle==

with the following:

==Subtitle==

? JAn Dudík 21:29, 4 January 2010 (UTC)
 * It depends: do the subtitles and templates vary? It's much easier if they're always the same. --Plasmarelais 15:52, 18 January 2010 (UTC)

Nothing happens
I need some help with replace.py

I installed Python25 and Pywikipedia on my computer

The purpose: to use it on a MediaWiki installation (in French), version 1.6.7

I configured my user-config.py, my [name_of_the_wiki]_family.py that I put in families directory...

And I launched cmd on Windows in order to use Python

I've already created an account for the bot on the Wiki ( its name is Orthobot)

Then I launch python.exe login.py (and there isn't any problem)

Then I launch replace.py like that :

python.exe replace.py -page:Abheva "sanscrit" "sanskrit"

The bot finds the page, finds the word, asks me if I want to replace it...

And it says the page has been replaced.

But the problem is that, when I look at this page, the replacement hasn't been made

So, I don't understand : everything seems to be ok but It doesn't work

Any idea ?

Urobore 22:39, 7 February 2008 (UTC)


 * Try the following: run replace.py -page:Abheva (without anything else) and type the text when prompted.
 * If You can't get it to work, You will find more people to be able to help You at
 * Best regards, --birdy geimfyglið (:> )=  ∇  22:52, 7 February 2008 (UTC)


 * Thank you but, unfortunately, it didn't work. About the IRC channel, in fact, I come from there... If somebody has another idea... Thanks, anyway ! Urobore 22:58, 7 February 2008 (UTC)


 * Some thoughts: is the page locked? Or semiblocked (and You just registered), do You have the newest version of pywikipedia (easiest is to update via tortoise). Do You have a link to this page please? --birdy geimfyglið (:> )=  ∇  23:04, 7 February 2008 (UTC)


 * No, the page isn't locked or semi-blocked. However, I just found a way to resolve the problem, even if I don't understand why it works. It is a public wiki, but editing is blocked for non-registered users (only registered users can edit a page). Moreover, a non-registered user can't create a new account (which means I create accounts for users who want to participate in the wiki). Concretely, my LocalSettings.php contains:






 * So, I put the second line = true instead of false and it resolved the problem (the page "Abhava" has been edited).


 * It means I'll have to edit my  each time I want to use Pywikipedia. But I don't understand exactly why, because the bot has an account and should be able to edit a page without the permission granted to non-registered users: in the edit history of the page "Abhava", the bot only appears as a non-registered IP address (mine) and not as a user. That suggests a problem with the bot's login (it can't log in to the wiki and so can only edit a page as a non-registered user). So: do you have any idea about this login problem?


 * Thank you, anyway.


 * Urobore 07:10, 8 February 2008 (UTC)

More than one thing to replace
Is there a way to replace two or more strings at the same time? I mean: to replace at the same time "Errror" for "Error" and "Miistake" for "Mistake", without having to run the bot twice in the same category? Thank you!--Xtv 11:47, 18 September 2008 (UTC)


 * Very interesting question. If there was such a feature, it would be very useful for me too. So is there a way? Thanks a lot! --Plasmarelais 23:37, 15 January 2009 (UTC)
 * Yes, it's possible. Just write replace.py and then hit Enter. It will ask you what you want to replace, and you will be able to replace more than one thing in the same operation.--Unai Fdz. de Betoño 12:05, 30 August 2009 (UTC)
 * Or you just define several replacements like

replace.py -start:! "errror" "error" "miistake" "mistake" "wroong" "wrong"
 * As long as you give an even number of arguments (old/new pairs). --Plasmarelais 18:07, 30 August 2009 (UTC)
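
The "even number" rule is easy to picture as pairing up a flat argument list. The sketch below is illustrative plain Python with made-up helper names (`pair_up`, `apply_all`), not the bot's own argument handling.

```python
def pair_up(arguments):
    """Group a flat argument list into (old, new) pairs."""
    if len(arguments) % 2 != 0:
        raise ValueError("need an even number of arguments")
    return list(zip(arguments[0::2], arguments[1::2]))


def apply_all(text, pairs):
    """Apply every (old, new) replacement in order."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


pairs = pair_up(["errror", "error", "miistake", "mistake", "wroong", "wrong"])
print(apply_all("An errror and a miistake went wroong.", pairs))
```

With an odd number of arguments the last "old" text would have no "new" text to go with it, which is why the sketch refuses such a list.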

Working in all namespaces
How to make bot work in all namespaces using "-start"? Because when I type "-start:!" it works only in main namespace. 77.253.22.92 00:00, 20 September 2008 (UTC)


 * Hello, for example -start:Template:! should do the trick for a specific namespace, see also the -namespace: section in Replace.py, best regards, -- birdy geimfyglið  (:> )= 00:04, 20 September 2008 (UTC)


 * But this way bot still works in one specific namespace, and I wanna make him work in all of them. And, if I understand well, -namespace cannot be used with -start, and I have to use -start, as I don't have possibility to make xml dump. 87.205.65.183 23:05, 21 September 2008 (UTC)
 * As far as I know, Pywikipediabot doesn't support that. You have to run the bot for each namespace. Huji 17:17, 22 September 2008 (UTC)
 * Try to use the -namespace:nn several times in one run. --Plasmarelais 18:08, 30 August 2009 (UTC)

Replacing brings new text
Hi! I'm new to working with bots, and I'm facing a problem: every time I replace one word with another, there's always some extra text coming along with the new word:

Von Memory Alpha, einer Wikia-Wiki. How can I avoid that extra text?

Thank you very much for any help! --Plasmarelais 15:30, 13 January 2009 (UTC)


 * I found the answer on MA/en: http://memory-alpha.org/en/wiki/Memory_Alpha:Bots. Anyway thank you! --Plasmarelais 22:23, 13 January 2009 (UTC)

Additional parameters
Replace.py -help shows several options which are not described on Replace.py page.

Are these working options or obsolete ones, e.g. -uncatfiles? --Dnikitin 10:28, 17 February 2009 (UTC)
 * You should assume that the file itself is correct. This page is probably outdated. In any case, -uncatfiles is indeed available. --Erwin(85) 19:45, 17 February 2009 (UTC)


 * The option is a general one not specific to replace.py, it's from pagegenerators.py. -- User:D2

Infobox field samples
It would be nice to have some samples for infobox field updates. -- User:D2

Question
I ran into the following problem when testing the script:

python replace.py -page:Usuario:Ezarate73/pruebas -regex "[aeiou]staba más" '\1'
Getting 1 pages from wikipedia:es...
/home/esteban/pywikipedia1/pywikipedia/weblinkchecker.py:808: SyntaxWarning: name 'day' is assigned to before global declaration
global day
Traceback (most recent call last):
File "replace.py", line 705, in
main
File "replace.py", line 701, in main
bot.run
File "replace.py", line 376, in run
new_text = self.doReplacements(new_text)
File "replace.py", line 344, in doReplacements
allowoverlap=self.allowoverlap)
File "/home/esteban/pywikipedia1/pywikipedia/wikipedia.py", line 3491, in replaceExcept
replacement = replacement[:groupMatch.start] + match.group(groupID) + replacement[groupMatch.end:]
IndexError: no such group

Where's the error? Thanks --Ezarate 03:35, 8 June 2009 (UTC)
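
One likely reading of "no such group": the replacement '\1' refers to the first capturing group, but the pattern "[aeiou]staba más" has no parentheses, so there is no group 1. The sketch below reproduces the same situation with Python's re module directly; the version with parentheses is a guess at what was intended.

```python
import re

text = "estaba más"

# Without parentheses the pattern defines no group, so \1 cannot be resolved.
try:
    re.sub(r"[aeiou]staba más", r"\1", text)
    outcome = "no error"
except (re.error, IndexError) as exc:
    outcome = "failed: %s" % exc

# With a capturing group around the vowel, \1 refers to that vowel.
fixed = re.sub(r"([aeiou])staba más", r"\1", text)

print(outcome)
print(fixed)
```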

Replace Quotationmarks (solved)
Is it possible to replace a string _including_ some quotationmarks?

I want to replace "abc" with abc. (Get rid of the quotation marks...)

Thanks a lot! --91.33.247.68 05:52, 22 September 2009 (UTC) (Felix)
 * Think you may use -regex:

replace.py -regex "\"abc\"" "abc" -start:!
 * If you type \" instead of ", it is recognized as part of the string. --Plasmarelais 07:41, 22 September 2009 (UTC)
 * Thank you very much! --131.234.103.164 11:06, 23 September 2009 (UTC) (Felix)

Exceptinsidetag: what is this?
Could you please explain what the exceptinsidetag parameter means? What is a "tag" in articles? ChVA 08:46, 31 January 2010 (UTC)
 * The parameter makes the bot skip pages that contain certain text inside tags like etc. For example  will skip all pages containing
 * Some text  TEXT text.
 * --Plasmarelais 18:40, 31 January 2010 (UTC)
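
The explanation above speaks of skipping whole pages; a narrower reading of the option name is that only the occurrences inside the tag are left alone. The sketch below illustrates that narrower behaviour in plain Python. It is not taken from replace.py, and the function name, the nowiki tag and the sample text are illustrative assumptions.

```python
import re


def replace_except_inside_tag(text, old, new, tag):
    """Replace old with new, leaving <tag>...</tag> regions untouched."""
    pattern = re.compile(
        r"(<%s\b[^>]*>.*?</%s>)|%s"
        % (re.escape(tag), re.escape(tag), re.escape(old)),
        re.DOTALL,
    )

    def choose(match):
        # Group 1 is a whole tagged region: hand it back unchanged.
        if match.group(1) is not None:
            return match.group(1)
        # Otherwise the match is a bare occurrence of old.
        return new

    return pattern.sub(choose, text)


sample = "TEXT <nowiki>keep TEXT</nowiki> TEXT"
print(replace_except_inside_tag(sample, "TEXT", "text", "nowiki"))
```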

-search parameter not working
I'm trying to start replace.py with these parameters:

@~/Programs/pywikipedia$ python replace.py -family:wikipedia -search:"periodo de tiempo" -regex "([Pp])er([íi])odo de tiempo" "\1eríodo"

But I'm getting this error message:

Traceback (most recent call last):
File "/home/kved/Programs/pywikipedia/pagegenerators.py", line 872, in __iter__
for page in self.wrapped_gen:
File "/home/kved/Programs/pywikipedia/pagegenerators.py", line 804, in DuplicateFilterPageGenerator
for page in generator:
File "/home/kved/Programs/pywikipedia/pagegenerators.py", line 527, in SearchPageGenerator
for page in site.search(query, number=number, namespaces = namespaces):
File "/home/kved/Programs/pywikipedia/wikipedia.py", line 5641, in search
'srsearch': q,
NameError: global name 'q' is not defined
global name 'q' is not defined

I asked about this and when he ran the exact same parameters on his computer, the bot worked!

[drini@bloqueame pywiki]$ python replace.py -family:wikipedia -search:"periodo de tiempo" -regex "([Pp])er([íi])odo de tiempo" "\1eríodo"
Getting 60 pages from wikipedia:es...
No changes were necessary in Cuaresma
No changes were necessary in Mes
No changes were necessary in Período Copernicano
No changes were necessary in Tiempo pascual
No changes were necessary in Prehistoria de América

>>> Abriles <<<
- * Periodo de tiempo correspondiente a un año.
+ * Período correspondiente a un año.

Do you want to accept these changes? ([y]es, [N]o, [e]dit, open in [b]rowser, [a]ll, [q]uit)

I'm running Ubuntu 9.10 with Python 2.6.4 and the latest revision of pywikipedia, downloaded through svn from http://svn.wikimedia.org/svnroot/pywikipedia/trunk/pywikipedia/ Any ideas why the bot isn't running on my computer? KveD (talk) 02:42, 17 March 2010 (UTC)
 * Let me translate here the French help page:
 * The versions of Python 2.3 and before can partially work with Pywikipedia.
 * The versions 2.4, 2.5, 2.6 are completely compatible.
 * Python 3 and later are not compatible.
 * So your bug must come from elsewhere. JackPotte 21:47, 17 March 2010 (UTC)
 * Exactly, the Python version isn't the problem. But I've run out of ideas. KveD (talk) 00:33, 18 March 2010 (UTC)
 * The only ideas I have are:
 * Are you both using the same version of pywikipedia framework?
 * Maybe one of you has made edits to related scripts like pagegenerators.py, wikipedia.py etc.?
 * Problems with codecs? Differences within your family-files?
 * Please excuse me if I'm bothering you with questions you've already found the answers to. --Plasmarelais 17:19, 2 April 2010 (UTC)
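
For what it's worth, the regular expression itself can be checked in isolation. The sketch below feeds the line from the diff above through Python's re module with the same pattern and replacement; it says nothing about the NameError, which comes from the search code rather than from the regex.

```python
import re

line = "* Periodo de tiempo correspondiente a un año."

# \1 keeps the original capital or lowercase p; the rest is rewritten.
fixed = re.sub(r"([Pp])er([íi])odo de tiempo", r"\1eríodo", line)

print(fixed)
```

This gives the same "+" line as in the output from the other machine, so the pattern and replacement behave as intended.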

find Text get it and Replace with it
How can I get a text from Title only the Title and replace another string say to  using -regex
 * I found it faster to develop my own script than to find out whether it's possible with regex. JackPotte 10:07, 19 September 2010 (UTC)