Download a file from MediaWiki (MW) by script
Up front: I'm not a programmer and my skills are very limited, but I try.
I am trying to write a shell script that downloads things from a wiki. It should download the semantic metadata of a page in the File: namespace and also download the file itself. The wiki is closed, so the script has to supply login data and save the cookies. I use img_auth for file access in the wiki.
The script (based on a similar one I found at http://labs.creativecommons.org/2011/04/30/using-wget-to-login-to-mediawiki/ and then modified) does all of that, including saving the semantic data to an XML file using Special:ExportRDF, but I cannot get it to download the file itself. My intention was to first isolate the path to the file from the File: page and then wget it from the wiki. However, img_auth does not let that request through.
Any ideas are welcome. Thanks, Ralf.
#!/bin/bash
MAIN_PAGE="https://PATHTOMYWIKI/index.php"
PAGE_TITLE="Datei:NAMEOFFILE"
MW_LOGIN="MYLOGINNAME"
MW_PASSWD="MYPATHWORD"
# MediaWiki uses a login token, and we must have it for this to work.
# The URL is quoted so that the shell does not treat "?" as a glob character.
WP_LOGIN_TOKEN=$(wget -O - --save-cookies cookies.txt --keep-session-cookies \
"${MAIN_PAGE}?title=Special:UserLogin" \
| grep wpLoginToken | grep -o '[a-z0-9]\{32\}')
# Submit the login to the wiki, reusing and updating the saved cookies.
wget --load-cookies cookies.txt --save-cookies cookies.txt --keep-session-cookies \
--post-data "wpName=${MW_LOGIN}&wpPassword=${MW_PASSWD}\
&wpRemember=1&wpLoginattempt=Log%20in&wpLoginToken=${WP_LOGIN_TOKEN}" \
"${MAIN_PAGE}?title=Special:UserLogin&action=submitlogin&type=login"
# Fetch the semantic data in RDF format.
wget -O "${PAGE_TITLE}.xml" --load-cookies cookies.txt \
"${MAIN_PAGE}?title=Special:ExportRDF/${PAGE_TITLE}"
# Fetch the path to the file from the File: page. THIS IS WHERE I NEED A HINT.
# (-O - writes the page to stdout so that grep can actually read it.)
#wget -O - --load-cookies cookies.txt \
# "${MAIN_PAGE}?title=${PAGE_TITLE}" | grep ANYTHING THAT CONTAINS THE PATH TO FILE
# Fetch the file from the path found above. THIS IS WHERE I NEED A HINT.
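
One possible way to fill in the two missing steps, as a rough sketch only. It assumes that the File: page links to the file with an href that contains "img_auth.php/" (which is what I would expect when the upload path points at img_auth.php) and that the link is relative to the wiki host. The helper name extract_file_path is made up for this illustration, and the sample paths in the comments are invented.

```shell
#!/bin/bash
# Hypothetical helper: read HTML on stdin and print the first link target
# that goes through img_auth.php. Prints nothing if no such link is found.
extract_file_path() {
    grep -o 'href="[^"]*img_auth\.php/[^"]*"' \
        | head -n 1 \
        | sed -e 's/^href="//' -e 's/"$//'
}

# Usage sketch (commented out because it needs the wiki and a valid login;
# MAIN_PAGE, PAGE_TITLE and cookies.txt are the ones from the script above):
#
# FILE_PATH=$(wget -q -O - --load-cookies cookies.txt \
#     "${MAIN_PAGE}?title=${PAGE_TITLE}" | extract_file_path)
#
# Assumption: FILE_PATH is relative to the host, e.g. /w/img_auth.php/a/ab/X.pdf
# wget --load-cookies cookies.txt "https://PATHTOMYWIKI${FILE_PATH}"
```

The important part is that the second wget also gets --load-cookies, because img_auth only serves the file to a logged-in session. If scraping the HTML turns out to be fragile, another option might be to ask api.php for prop=imageinfo with iiprop=url, which should return the file URL directly.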