User:LBowmaker (WMF)/Airflow job in 5 mins

Pre-requisite knowledge:
 * Your job will be scheduled using a tool called Airflow (see the Airflow documentation to learn more).
 * The process requires you to copy and edit three files: an .hql file containing your Hive query, a Python file that schedules the query, and another Python file that contains tests for your job.
 * You will need some basic knowledge of Git commands and code repositories.

Step 1: Create your query file
TO DO for DE/PA: We should create a new repo for Product Analytics hql files.

Once the new repo exists, run the following commands:

Make a copy of the example .hql file. Make changes as directed in the file. Save your .hql file under the appropriate folder.
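The steps above might look like the following. Note the repository URL and file paths here are placeholders, since the Product Analytics hql repo does not exist yet; substitute the real values once it does:

```shell
# Hypothetical repo URL -- replace with the real Product Analytics hql repo.
git clone https://gitlab.wikimedia.org/repos/product-analytics/hql.git
cd hql

# Work on a feature branch named after your job.
git checkout -b your-new-job

# Copy the example query into the appropriate folder and edit it as directed.
cp example/example_query.hql path/to/your_folder/your_query.hql
```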

Make sure to test your query on a stat box, updating the command below with your inputs:
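As an illustration, a parameterized query can be run on a stat box with the Hive CLI; the parameter names and values below are only examples and should match whatever your query defines:

```shell
# Run the query on a stat box, substituting your own parameter values.
hive -f your_query.hql \
  -d year=2023 \
  -d month=5 \
  -d destination_table=your_db.your_table
```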

If everything works as expected then run:
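A sketch of the commit-and-push step, assuming the branch and file names from above:

```shell
# Stage your query file, commit it, and push your branch.
git add path/to/your_folder/your_query.hql
git commit -m "Add query for your-new-job"
git push origin your-new-job
```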

Now go to your Git repo and click 'New pull request'. Leave the base as master, set the compare branch to your-new-job, then click 'Create pull request'.

Have teammates or Data Engineers review the code; once it is approved, click 'Merge' to merge it into the main branch.

Step 2: Create your scheduling file
Run the following commands:
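For example, clone the airflow-dags repo (the URL matches the merge-request page linked below) and start a feature branch; the branch name is a placeholder:

```shell
# Clone the airflow-dags repo and create a feature branch for your job.
git clone https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags.git
cd airflow-dags
git checkout -b your-new-job
```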

Make a copy of the example Python file here. Make changes as directed in the file. Save your file under the product_analytics/dag folder. Now run:
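A sketch of this step, assuming a hypothetical file name `your_dag.py`:

```shell
# Stage and commit your scheduling file.
git add product_analytics/dag/your_dag.py
git commit -m "Add DAG for your-new-job"
```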

Step 3: Create your test file
Make a copy of the example Python file here. Make changes as directed in the file. Save your file under the tests/product_analytics/your_dag_id folder. Now run:
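A sketch of this step; pytest is an assumption here, so use the repo's documented test runner if it differs:

```shell
# Run the tests for your DAG locally before pushing.
pytest tests/product_analytics/your_dag_id

# Stage and commit your test file, then push your branch to GitLab.
git add tests/product_analytics/your_dag_id
git commit -m "Add tests for your-new-job"
git push origin your-new-job
```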

Now go to: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests

Click 'New merge request', set the source branch to the branch you just pushed, leave the target as main, then continue to create the MR.

Step 4: Create a ticket for DE review
Once the above steps are complete, create a ticket tagged with 'data-engineering' that includes the two links to the merge/pull requests you submitted.

Someone from the DE team will review your changes, and if all looks good, your job will be deployed.