User:Prianka/Pywikibot : Compat to Core Migration/ Progress Report

Weekly Progress Report for the FOSS OPW, Round 9 project. Corresponding Proposal.

How was your landing and your first meeting(s) with your mentors?
It was an important turning point: I developed a more realistic view of open source contribution norms, which had earlier seemed to me a very formal way of working. The initial meetings were essential in motivating me for my project, thanks to the supportive and friendly guidance provided. I have been inspired by my mentors' way of working, their organisation and their approach to tackling tasks, and I hope to follow their example to avoid mistakes I might otherwise make. They have been friendly throughout, especially during the announcement, and have maintained a good balance in their way of working, which I really appreciate.

What is the way of working that you have agreed? (tools in use, communication channels, meetings...)

 * The main tools for communication are Google Hangouts and IRC.
 * As work proceeds, the project Conpherence will become the main platform for discussion on various topics.
 * As discussed, we will use a common online editor so that my mentors can look over the current script porting task at any point of the project.
 * Besides these, e-mail will be another method of communication.

Lessons learned since you applied for this OPW round and since you were accepted

 * I have become familiar with working in Phabricator.
 * Studied and updated the material provided, to gain a deeper knowledge of the project.
 * Created Conpherences on Phabricator to make communication easy with my mentors and with my fellow intern working on Pywikibot.
 * Became familiar with necessary elements of the project such as Git commands, Gerrit and community guidelines.

Project plan and Deliverables expected in the first half of the program.

 * A wider overview of the Deliverables expected in the first half of the program.
 * Start working on the Important Scripts to be ported section of the Workboard.
 * Port the majority of the scripts in the Scripts being investigated section of the Workboard.

Phabricator project and tasks

 * Project: Pywikibot-compat-to-core
 * Tasks : Workboard

Week 1: December 9 - December 15

 * Went through the Pywikibot/Porting_Status and updated it according to the present status of each script.
 * Started work on Review sheet for related scripts.
 * Got familiar with the code structure and development guidelines.

Week 2: December 16 - December 22

 * Started by learning the different methods used for testing the scripts.
 * Spent time understanding the code (pywikibot modules, interrelated functions and their usage) by reading previously ported scripts.
 * Compiled the manual-documentation part, i.e., looked over the resources which need to be included in the final manual for part 2 of the proposal.

Week 3: December 23 - December 29

 * Started proper work on scripts by working on the following bugs:
 * Move match_images.py and move to scripts repository.
 * Port parsefunctioncount.py to core.
 * Port tag_nowcommons.py to core.
 * replacementfile for replace.py.
 * Corresponding status may be found here.

Week 4: December 30 - January 5

 * Continued work on the patches submitted.
 * Started using GitHub and Travis CI together for script testing by:
 * Forking the wikimedia/pywikibot-core repository into my GitHub account.
 * Creating an account on Travis CI.
 * Synchronizing the forked repository and learning another efficient and more reliable method of script testing.
 * Presently working to fix an issue on GitHub caused by some unwanted commits.
 * Learnt about the usage of namespaces and the page.py script in more depth.

Week 5: January 6 - January 12

 * The "Move match_images.py and move to scripts repository" task got merged.
 * Started work on Port patrol.py to core.
 * Continued work on the ongoing scripts.

Week 6: January 17 - January 24

 * Made progress in two other scripts.
 * Port replace.py -replacementfile from compat
 * Porting parser_function_count.py from compat to core/scripts

Week 7: February 5 - February 13
Due to some other engagements at work and elsewhere, I took some days off. I have exams in the upcoming week (16th February, 2015 - 23rd February, 2015), for which I shall take my last break. After this I will resume my work.
 * Made significant progress in scripts.
 * Merging tag_commons.py with nowcommons.py
 * Porting overcat_simple_filter.py from compat to core/scripts
 * Port patrol.py to core
 * Porting ndashredir.py from compat to core/scripts
 * Started working on Port warnfile.py to core branch.
 * The above required familiarity with interwiki.py.
 * Started learning about the MediaWiki API to understand the scripts more easily, and to find the cases (page and category details) needed to test the scripts' functionality.
 * Prepared a blog post containing a progress report and a detailed account of my experience.
 * Started updating the review sheet here by taking reference from here.

Mid-Internship Report
I realize that my project is well behind the planned schedule. There are several reasons for this. One of the primary reasons is that I wrongly estimated the time required for each script. Besides, unavoidable family and work commitments have also slowed my progress.

As such, the only way left for me is to get going as soon as my exams are over. I hope I can do what is needed.

** A major drawback which contributed to my slow pace: I am less familiar with programming in Python.

Week 8 - 9: February 23 - March 4
It has been a long time since I posted about my work status, so here is an update. The week after the exams was confusing and heavily loaded, since I couldn't proceed with most of the scripts because of one blocking error or another. The major issue is that the requirements of each script are in most cases not properly stated. I have therefore compiled the list of issues that consumed this week, because of which I haven't submitted any important patch yet.

This is the list of queries I had:

Task one: Copyright repackaging

Status :
 * I had updated the pywikibot/compat/query file so that it supports the full functionality of copyright, basically by adding new utility functions: CombineParams, ConvToList, ListToParam and ToUtf8.
 * But there is a warning given in this script, i.e., "WARNING: THIS MODULE EXISTS SOLELY TO PROVIDE BACKWARDS-COMPATIBILITY. Do not use in new scripts; use the source to find the appropriate function/method instead." This makes me doubt whether I am proceeding correctly.

Suggested : It is preferred not to expand content that is meant to support backward compatibility with compat. As such, this move was not preferred.

copyright.py -- Output -- which seems to be working well, but without the Google API or Yahoo API I was referring to.
 * I have added the copyright package by making it compatible with the core version.
 * Then I added a pywikibot/scripts/copyright folder with the following files:

copyright_put.py -- Output -- In this I am stuck at line 184 (I am not sure which 'output' or 'pending' file it is referring to). Please suggest how I can create such an 'output' file so that testing can be completed. (I found this point while debugging, and concluded that the program stops here because the condition is not fulfilled.)

copyright_clean.py -- I get output which seems to be working.

__init__.py
 * Additional files generated during testing.
 * Besides, as most of them seem to be working, I shall push these files, namely: core/script/copyright/4 files + core/copyright/exclusion_list.txt.

Query  1 : Do I need to add something else too?

Suggested : I was told which files need to be added.

 Task two :  Missing possibility to retrieve images from a page that were not included through templates       

Status :

As I understand it, this needs a new function, named say linkedpagesfromcontentparsing, in pywikibot/page.py, which should use regex search operations. Is that right? I am not very familiar with regex, which is why I have stopped here. The link shows what I have done so far by looking at the implementation in the compat version. Please tell me whether I am on the right track. (A lot of changes need to be made.)

Suggestion : Follow the PEP 8 guidelines. It is better to have this as an argument such as "content", which defaults to false; when it is true, the content is parsed instead of the links.
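
The suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Pywikibot implementation: the function name image_titles, the api_links parameter and the FILE_LINK pattern are all hypothetical, and a real version would also have to handle localized namespace names and galleries.

```python
import re

# Hypothetical pattern: matches [[File:Name.ext]] or [[Image:Name.ext|...]] links
# written directly in the wikitext and captures the file name.
FILE_LINK = re.compile(r'\[\[\s*(?:File|Image)\s*:\s*([^\]|#]+)', re.IGNORECASE)


def image_titles(wikitext, api_links=(), content=False):
    """Return image titles for a page.

    content=False (default): return the precomputed links, which may
        include images pulled in through templates.
    content=True: parse the raw wikitext, so only images written
        directly on the page are returned.
    """
    if not content:
        return list(api_links)
    titles = []
    for match in FILE_LINK.finditer(wikitext):
        title = match.group(1).strip()
        if title not in titles:  # keep first occurrence, preserve order
            titles.append(title)
    return titles


text = "Intro [[File:Example.jpg|thumb|A caption]] and [[Image:Logo.png]] {{Infobox}}"
print(image_titles(text, api_links=["Example.jpg", "Logo.png", "Flag.svg"], content=True))
# -> ['Example.jpg', 'Logo.png']
```

With content left at its default, the same call returns all three precomputed links, including the one that only a template would supply.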

 Task three: Port warnfile.py 

Status :

Output, but if I manually create the file I get:

>>> ➜  pywikibot-core git:(warnfile)✗ python pwb.py scripts/warnfile.py -lang:'test' family:'test'  interwiki-bot.log
>>> Parsing warnfile...
>>> Fixing... 0 pages

where interwiki-bot.log is the log file generated by running python interwiki.py with the -log parameter.

As you told me to go through the interwiki.py script, I inferred that warnfile.py merely acts as a module which interwiki.py imports (L2441 in interwiki.py) when it gets the parameter -warnfile:filename. It would therefore help if you could give me some examples of existing warnfile files for testing. With a pre-existing example I could test the script properly and then submit the patch.

Suggested: I was advised to use this command to run the script: python pwb.py interwiki -new -family:wiktionary -lang:en -dry -log -ns:14

 Task four:  Porting : splitwarning.py (not listed in phabricator) 

Status:
 * This script is supposed to split the log file (such as interwiki-bot.log), but I am not yet comfortable with regex, and I guess that is what is holding me back.
 * From the different log files I used (locally present in my repo), namely interwiki.log, interwiki-bot.2.log, interwiki-bot.log and makecat-bot.log, it does not seem able to find any warning matching what it expects at L29. Please let me know what warning this is and how I can generate it again, so that I can test this script properly before submitting.
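
To make the idea of splitting a log by regex concrete, here is a minimal sketch. The warning format, the WARNING pattern and the sample lines are all assumptions made up for illustration; the pattern that splitwarning.py really expects is whatever its L29 defines. It also shows why a log without matching lines yields nothing to split.

```python
import re
from collections import defaultdict

# Assumed warning format: "WARNING: <family>: [[<code>:<title>]] ...".
# This is a guess for illustration, not the pattern from splitwarning.py.
WARNING = re.compile(r'WARNING: (?P<family>.+?): \[\[(?P<code>.+?):.*')


def split_warnings(lines):
    """Group matching warning lines by (family, language code)."""
    groups = defaultdict(list)
    for line in lines:
        match = WARNING.match(line)
        if match:  # lines that are not warnings are skipped
            key = (match.group('family'), match.group('code'))
            groups[key].append(line.rstrip('\n'))
    return dict(groups)


# Invented sample lines, not taken from a real interwiki-bot.log.
sample = [
    "WARNING: wikipedia: [[en:Foo]] links to incorrect [[de:Bar]]\n",
    "Getting 60 pages from wikipedia:en...\n",
    "WARNING: wikipedia: [[de:Bar]] does not link to [[en:Foo]]\n",
]
for key, warnings in sorted(split_warnings(sample).items()):
    print(key, len(warnings))
```

A log file that contains only progress messages, like the second sample line, produces an empty result.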

 Task five: Porting : piper.py (not listed in phabricator) 

Status
 * Output -- Initially it was working perfectly, but now it is spilling these messages, I guess because of the new package added for the copyright task. How can I get rid of these messages? It doesn't work anymore.

Query 2: One more problem, which I asked you about earlier. For piper.py, as you said, I moved the message to the scripts/i18n/piper.py file, and it works as expected. What I couldn't understand is this: for some scripts, such as blockpageschecker.py, the i18n script has the same name as the script but a different key in the message, and twtranslate seems to work. But when I tried to change the key of the message (dict) for piper, it didn't work. So finally the message script is named piper.py and the key of the dict is also piper.

Why does this happen?

Suggested:  To checkout how category.py in i18n directory works..
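
A possible explanation for Query 2, offered as an assumption rather than a confirmed answer: if twtranslate picks the message module from the part of the key before the first hyphen, then the key prefix has to match the file name under scripts/i18n. The sketch below only mimics that lookup; MESSAGE_MODULES, lookup and the message text are hypothetical stand-ins, not Pywikibot code.

```python
# Stand-in for the message modules under scripts/i18n/; the contents are invented.
MESSAGE_MODULES = {
    'piper': {'en': {'piper-edit-summary': 'Bot: Applying filter %(filters)s'}},
}


def lookup(twtitle, lang='en'):
    """Find a message, deriving the module name from the key prefix."""
    package = twtitle.split('-')[0]  # 'piper-edit-summary' -> 'piper'
    try:
        messages = MESSAGE_MODULES[package]
    except KeyError:
        raise KeyError('no message module named %r for key %r' % (package, twtitle))
    return messages[lang][twtitle]


print(lookup('piper-edit-summary'))
```

Under this assumption, renaming the key to something like 'filter-edit-summary' would make the lookup search for a module named filter, which would fail unless such a file exists.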

 Task six: Porting standardize_interwiki.py (not listed in phabricator) 

Status:
 * Output -- Seems to be working well.
 * Please check the script once; if I need to add something I will do so, and then I'll submit the patch.

Suggested: It would be good to use pagegen argument handling, so that it supports options such as -cat or -start (use genFactory etc.).

Query 3: One more thing I see is that I get this message in most of the scripts:

>>> WARNING: /home/innovator/pywikibot-core/pywikibot/page.py:4751: UserWarning: Site test:test instantiated using different code "yi"
>>> link._site = pywikibot.Site(lang, source.family.name)

Is this fine, or did I make some mistake?

Remark: It's not a problem with the script. So, this is the query list. Since I now have the solutions, it's easier to proceed.

Week 9: March 4 - March 10

 * An easier-to-understand progress report may be found here, in sheet 1 (PRESENT STATUS) of the review sheet.
 * Still working on it.