Content translation/Deployments/How-to/TPA

This is a how-to document for updating the Template Parameter Alignment (TPA) database in cxserver.

Connect to stat100x
Open,

This opens JupyterHub, which requires your LDAP password to log in.

Starting notebook
Check the Kerberos authentication timeout first. The default is currently set to 48 hours.

Extend it by running kinit:

Running scripts

 * 1) Open a terminal and clone:

 * 2) Update   with the language pairs required to generate template parameter alignments.

 * 3) Run all notebooks in order.

 * 4)   overwrites existing output files when it runs again, so save the produced JSON files (e.g. templates-articles_xx.json and templates-summary_xx.json) in another directory to avoid losing data. For large languages such as en, the saved files can be reused if the process is run again within a few days, which saves time.

 * 5) While running , make sure that the Wikidata partition is up to date.
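
The advice above about saving the produced JSON files can be sketched in Python. This is a minimal sketch and not part of the notebooks: the function name backup_outputs and both directory arguments are hypothetical, and the file name patterns are assumed from the example names given above.

```python
import shutil
from pathlib import Path

# File name patterns assumed from the examples in this document.
PATTERNS = ("templates-articles_*.json", "templates-summary_*.json")


def backup_outputs(work_dir, backup_dir):
    """Copy generated JSON files from work_dir into backup_dir.

    Returns the sorted names of the files that were copied.
    (Hypothetical helper, not part of the TPA notebooks.)
    """
    src = Path(work_dir)
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in PATTERNS:
        for path in sorted(src.glob(pattern)):
            # copy2 also preserves timestamps, which helps when deciding
            # whether a saved file is recent enough to reuse.
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return sorted(copied)
```

For example, calling backup_outputs(".", "tpa-backup") from the directory holding the notebook output would copy the matching files into tpa-backup. Both paths are placeholders.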

Updating database
Run:  from cxserver, pointing it at all the files generated by the process.

This writes an updated templatemapping.db in the same folder. Use the  command (available in the sqlite3-tools package on Linux) to see the differences between the old and new databases.

Copy it to  and submit a patch for review. The database can be opened with the sqlite command to check the number of template parameters updated.

eg:
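
One way to do a similar check without assuming the database schema is the Python sketch below, which counts the rows in every table and compares an old and a new copy of the database. The function names are hypothetical and only the standard sqlite3 module is used. Row counts are only a rough proxy for the number of template parameters updated.

```python
import sqlite3


def table_row_counts(db_path):
    """Return {table_name: row_count} for every user table in the database.

    Note: sqlite3.connect creates an empty file if db_path does not exist,
    so check the path before calling this.
    """
    conn = sqlite3.connect(db_path)
    try:
        # List user tables, skipping SQLite's internal sqlite_* tables.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            )
        ]
        counts = {}
        for name in names:
            # Table names come from sqlite_master; quote them as identifiers.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()


def count_changes(old_db, new_db):
    """Return {table_name: new_count - old_count} across both databases."""
    old = table_row_counts(old_db)
    new = table_row_counts(new_db)
    return {
        name: new.get(name, 0) - old.get(name, 0)
        for name in sorted(set(old) | set(new))
    }
```

For example, count_changes("templatemapping.old.db", "templatemapping.db") would report the per-table difference in row counts. The first file name is a placeholder for wherever the previous copy was kept.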

Useful resources

 * All about the Conda environment: https://wikitech.wikimedia.org/wiki/Data_Engineering/Systems/Conda


 * Issues related to Kerberos access: https://wikitech.wikimedia.org/wiki/SWAP#Access_and_infrastructure


 * Jupyter at Wikitech contains useful information: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Jupyter