Manual talk:Parameters to Special:Export

From MediaWiki.org

Links do not seem to actually work

The links do not seem to actually work, instead returning always and only the latest revision of the page in question?

  • I took a look at the source to Special:Export. It forces anything done with a GET to only get the most current version. I've added a couple of boxes to my own version of Special:Export so that the user can set the "limit" and "offset" parameters; I don't know how to change special pages, though. --En.jpgordon 00:20, 19 January 2007 (UTC)

Where is the source? How do I use "my own version"?

  • RFC: when limit=-n, dump the previous n edits, with the default being to start from the current edit.
  • Where is the source? How do I use "my own version"?

Thanks Doug Saintrain 19:39, 26 January 2007 (UTC)

Heya. The source I'm talking about is the MediaWiki source (I got it from the SourceForge project link over there in the resource box.) There's a comment in includes/SpecialExport.php saying // Default to current-only for GET requests, which is where the damage occurs. I imagine it's trying to throttle requests. So I instead made my version by saving the Special:Export page, tweaking it, and running it on my local machine; I only had to adjust one URL to get it to work right.

More fun, though: I wrote a little Python script to loop through and fetch entire article histories, a block of 100 revisions at a time (that being the hardwired limit), catenate them into one long XML, run it through another filter, and then look at them with the History Flow Visualization Application from IBM.[1] Pretty.

Hi, I am trying to do the same thing, 100 at a time and then concatenating them for a full history. Is there any way you could share the fixed export and the Python script? Thanks. Mloubser 11:52, 13 November 2007 (UTC)
We shouldn't need limit=-n, should we? Isn't that what dir and limit should provide? My only problem, though, has been figuring what offset to start with for a backward scan. --En.jpgordon 07:43, 27 January 2007 (UTC)
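
The block-by-block fetch described above could be sketched roughly as follows. This is not the script mentioned in the thread, only a minimal Python 3 illustration. It assumes that offset is a timestamp, that the revisions returned are strictly later than it when dir=asc, and that each request returns at most limit revisions. The endpoint URL, the User-Agent string, and the function names are placeholders chosen for the example.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EXPORT_URL = "https://en.wikipedia.org/w/index.php"  # assumed endpoint


def post_export(params):
    """POST the parameters to Special:Export and return the XML text."""
    data = urllib.parse.urlencode(params).encode("utf-8")
    req = urllib.request.Request(
        EXPORT_URL, data=data,
        headers={"User-Agent": "export-history-sketch/0.1"},  # placeholder
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def revision_timestamps(xml_text):
    """Return the <timestamp> of every <revision> in an export dump."""
    stamps = []
    for elem in ET.fromstring(xml_text).iter():
        # Tags carry a versioned namespace, so compare local names only.
        if elem.tag.rsplit("}", 1)[-1] == "revision":
            for child in elem:
                if child.tag.rsplit("}", 1)[-1] == "timestamp":
                    stamps.append(child.text)
    return stamps


def fetch_history(title, fetch=post_export, limit=100):
    """Collect export chunks, oldest first, until a short chunk arrives."""
    chunks = []
    offset = "1"  # assumed to sort before any real revision timestamp
    while True:
        xml_text = fetch({
            "title": "Special:Export", "pages": title, "action": "submit",
            "dir": "asc", "limit": limit, "offset": offset,
        })
        stamps = revision_timestamps(xml_text)
        if not stamps:
            break
        chunks.append(xml_text)
        if len(stamps) < limit:
            break
        offset = stamps[-1]  # assumed exclusive, so chunks do not overlap
    return chunks
```

Each element of the returned list is one complete export document; merging them into a single XML file is left out. For a backward scan, the same loop with dir=desc would need a starting offset later than the newest revision, which is the open question raised above.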


Thanks for responding.

Mea culpa! I didn't even see "dir". Thanks.

The reason I wanted to look at recent history was to find at which edit a particular vandalism happened to see what got vandalized.

Is there a more straightforward way of looking for a particular word in the history? Thanks, Doug. Saintrain 04:46, 29 January 2007 (UTC)

Y'know, we almost have the tools to do that. The aforementioned history flow tool knows that information; I just don't think there's a way to glean it from it. --En.jpgordon 00:08, 2 February 2007 (UTC)
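
For the narrower question of finding the edit at which a particular word first appeared, one possible approach is to scan an exported history directly. The sketch below assumes the dump was exported oldest first (dir=asc) and that each revision element has id, timestamp, and text children; the function name is made up for this example.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip the XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]


def first_revision_with(xml_text, word):
    """Return (revision id, timestamp) of the first revision in the dump
    whose text contains word, or None if no revision does."""
    for elem in ET.fromstring(xml_text).iter():
        if _local(elem.tag) != "revision":
            continue
        # Only direct children are read, so the contributor's nested <id>
        # cannot be mistaken for the revision id.
        fields = {_local(child.tag): (child.text or "") for child in elem}
        if word in fields.get("text", ""):
            return fields.get("id"), fields.get("timestamp")
    return None
```

This is a plain substring match on the wikitext, so it finds where a word was added but not where something was removed.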

Discussion

Hi, is there a way to get just the total number of edits an article has had over time? Thanks! --The preceding unsigned comment was added by 87.196.51.250 (talk | contribs) 20:55, 20 September 2007. Please sign your posts with ~~~~!

As far as I can remember there is no way to get only this number (but I might be wrong). Anyway, this number can probably be easily calculated using the appropriate parameters to the API. Tizio 10:02, 21 September 2007 (UTC)
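
A rough sketch of counting edits through the API, in Python 3. It assumes the present-day query API (prop=revisions with rvlimit and a "continue" block in the JSON reply), which may differ from what was available when this thread was written. The endpoint URL, the User-Agent string, and the function names are placeholders.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"  # assumed endpoint


def api_get(params):
    """Send one GET request to the API and decode the JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    req = urllib.request.Request(
        url, headers={"User-Agent": "edit-count-sketch/0.1"})  # placeholder
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def count_revisions(title, fetch=api_get):
    """Count revisions of one page by paging through prop=revisions."""
    params = {"action": "query", "prop": "revisions", "titles": title,
              "rvprop": "ids", "rvlimit": "max", "format": "json"}
    total = 0
    while True:
        reply = fetch(params)
        for page in reply.get("query", {}).get("pages", {}).values():
            total += len(page.get("revisions", []))
        if "continue" not in reply:
            return total
        # Carry the continuation values into the next request.
        params = {**params, **reply["continue"]}
```

Only revision ids are requested, so the replies stay small even for long histories.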

Parameters no longer in use?

Neither the links provided in the article nor my own parameters yield the desired results. I can only get the most recent version of the article, regardless of how I set the parameters. I've tried it on several computers running Linux or Windows, and from different IPs. Same problem: the parameters seem to be ignored. --Falcorian 06:59, 14 January 2008 (UTC)

I've had it suggested to use curonly=0, but this also has no effect. --Falcorian
I also found that the links given did not work, nor did any experiments creating my own URLs to get the history. However, submitting the parameters via a Ruby script did work. I don't know enough yet (about HTTP and HTML forms) to understand why this approach worked and the URL approach did not, but here is some code that successfully retrieved the last 5 changes to the page on Patrick Donner and wrote the output to a file:
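require 'net/http'
require 'uri'

# POST the export parameters as form data.
res = Net::HTTP.post_form(URI.parse("http://en.wikipedia.org/w/index.php"),
  {:title => "Special:Export", :pages => 'Patrick_Donner', :action => "submit", :limit => 5, :dir => "desc"})
# Write the returned XML to a file; the block form closes the file automatically.
File.open("donner_output_last_5.txt", "w") { |f| f << res.body }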

Hope this helps. I wish I knew enough to provide a more general solution. Andropod 00:44, 17 January 2008 (UTC)

When you enter the URL in a browser, you are submitting via GET. In the above Ruby script, you are using POST. This seems to be the solution; for example:
curl -d "" 'http://en.wikipedia.org/w/index.php?title=Special:Export&pages=Main_Page&offset=1&limit=5&action=submit'

worked for me. Before updating this page, I'd like to check this with the source code. Tizio 12:46, 21 January 2008 (UTC)
Works for me as well, which is great! Now to crack open the Python... --Falcorian 03:29, 26 January 2008 (UTC)
For future reference, I get an access denied error when I try to use urllib alone in Python to request the page. However, if I use urllib2 (which allows you to set a custom header), then we can trick Wikipedia into thinking we're Firefox and it will return the page as expected. --Falcorian 06:57, 26 January 2008 (UTC)
# Python 2 (urllib2 was merged into urllib.request in Python 3).
import urllib
import urllib2

# Needs to fool Wikipedia so it will give us the file.
headers = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4'}
# Passing data makes urllib2 send a POST request instead of a GET.
params = urllib.urlencode({'title': 'Special:Export', 'pages': 'User:Falcorian', 'action': 'submit', 'limit': 2})
req = urllib2.Request(url='http://en.wikipedia.org/w/index.php', data=params, headers=headers)
f = urllib2.urlopen(req)
print f.read()
This doesn't work for me. It doesn't stop at 2 versions. Neither does dir=desc work. --89.138.43.146 15:15, 25 July 2009 (UTC)
I have tried all of the above with urllib2 and with getwiki.py but it seems to me that the limit parameter has stopped working? Is this the case? --EpicEditor 1:03, 30 September 2009 (UTC)
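
For readers on Python 3, where urllib2 no longer exists, a rough equivalent of the request above might look like the sketch below. It only shows how to build the POST request; whether the server still honours limit is exactly what the replies above call into question, and this sketch does not settle it. The User-Agent string and the function name are placeholders.

```python
import urllib.parse
import urllib.request


def build_export_request(page, limit,
                         url="https://en.wikipedia.org/w/index.php"):
    """Build (but do not send) a POST request to Special:Export."""
    params = urllib.parse.urlencode({
        "title": "Special:Export", "pages": page,
        "action": "submit", "limit": limit,
    })
    headers = {"User-Agent": "export-sketch/0.1 (example)"}  # placeholder
    # Supplying data is what turns the request into a POST.
    return urllib.request.Request(
        url, data=params.encode("utf-8"), headers=headers)


# Sending it requires network access:
# with urllib.request.urlopen(build_export_request("User:Falcorian", 2)) as f:
#     print(f.read().decode("utf-8"))
```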

Other parameters

I found these parameters in the source code:

curonly 
appears to override the other parameters and makes only the current version exported
listauthors 
export list of contributors (?) if $wgExportAllowListContributors is true
wpDownload 
returns result as a file attachment: http://en.wikipedia.org/w/index.php?title=Special:Export&pages=XXXX&wpDownload
templates 
images 
(currently commented out in the source code)

I don't know what listauthors does exactly, maybe it's disabled on wikien. Tizio 15:40, 21 January 2008 (UTC)

Also, the variable $wgExportMaxHistory is relevant here. Tizio 15:43, 21 January 2008 (UTC)

Also missing: "history" has a different meaning when used in a POST request (use default values for dir and offset, $wgExportMaxHistory for limit). Tizio 15:49, 21 January 2008 (UTC)

Recursive downloading

Hi, is there some way (without writing my own script) to recursively download the subcategories inside of the categories? I don't want to download the whole wikipedia database dump to get the 10,000 or so pages I want. Thanks, JDowning 17:32, 13 March 2008 (UTC)
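
No built-in way is mentioned on this page, so for anyone who does end up scripting it, the traversal itself is small. The sketch below is only an illustration: members is a hypothetical callable that returns (title, is_category) pairs for one category, for example backed by the API's list=categorymembers query. It also assumes that the pages parameter of Special:Export accepts one title per line.

```python
def collect_pages(root, members):
    """Walk a category tree breadth-first and return page titles in the
    order first seen, visiting each category only once."""
    seen = {root}
    queue = [root]
    pages = []
    while queue:
        category = queue.pop(0)
        for title, is_category in members(category):
            if is_category:
                # Categories can contain each other, so guard against loops.
                if title not in seen:
                    seen.add(title)
                    queue.append(title)
            elif title not in pages:
                pages.append(title)
    return pages


def pages_parameter(titles):
    """Join titles for the pages parameter (assumed newline-separated)."""
    return "\n".join(titles)
```

The resulting string could then be sent as the pages value of a POST request to Special:Export.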

Export has changed

All the examples (and my script which has worked for months) return only the newest version now. Anyone have ideas? --Falcorian 05:58, 24 September 2008 (UTC)

Disable Special:Export for non-sysop users

Hello!

What's the best way to disable Special:Export for some user rights? Thanks --almaghi 14:40, 27 April 2009 (UTC)

See the main page; you have to change LocalSettings.php. Rumpsenate 16:32, 15 July 2009 (UTC)

Bug report on special export

[2]

This might be a misunderstanding. The description says "if the history parameter is true, then all versions of each page are returned." Rumpsenate 16:13, 15 July 2009 (UTC)

How to

I am trying to export en:Train to the Navajo Wikipedia (nv:Special:Import) as a test, but I am not having any luck. I don’t know much about commands or encoding, and I’m not certain that nv:Special:Import is properly enabled. I typed Train in the Export textbox and pressed export, and it opened a complex-looking page in my Firefox, but I can’t figure out what to do next. Nothing appears in nv:Special:Import. What am I missing? Stephen G. Brown 14:38, 17 September 2009 (UTC)