I don't know what the database dumps contain or how they differ from the database copy available on Toollabs.
The revision table has a field called rev_deleted, which indicates whether the revision has been hidden using revision deletion. If the entire page is deleted, its revisions are moved to the archive table instead. In both cases, the text seems to remain in the text table (provided that the content wasn't deleted before the servers were upgraded to MediaWiki 1.5, around 2005, I think).
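As far as I understand it, rev_deleted is not a simple yes/no flag but a bitfield, where each bit hides a different part of the revision. Here is a rough sketch of how one might decode it. The bit values are what I believe MediaWiki uses for its revision deletion constants, and the function name describe_rev_deleted is just something I made up for illustration, so check both against the documentation before relying on them.

```python
# Bit values as I understand MediaWiki's revision deletion constants.
# Treat these as an assumption and verify against your MediaWiki version.
DELETED_TEXT = 1        # revision text is hidden
DELETED_COMMENT = 2     # edit summary is hidden
DELETED_USER = 4        # user name / IP is hidden
DELETED_RESTRICTED = 8  # hidden from administrators as well (suppression)

_FLAG_NAMES = [
    (DELETED_TEXT, "text hidden"),
    (DELETED_COMMENT, "comment hidden"),
    (DELETED_USER, "user hidden"),
    (DELETED_RESTRICTED, "restricted"),
]


def describe_rev_deleted(rev_deleted):
    """Return a list of labels for the bits set in a rev_deleted value.

    describe_rev_deleted is a hypothetical helper, not part of MediaWiki.
    """
    return [name for bit, name in _FLAG_NAMES if rev_deleted & bit]
```

A value of 0 would then mean that nothing in the revision is hidden, and a value such as 5 would mean that both the text and the user name are hidden.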
I think I tried to look something up in the text table some time ago but failed, although I don't remember exactly what happened. Note that there is no index on the old_text field, so searching by content requires a full table scan: the execution time of a query will be proportional to the number of entries in the table (very slow), as opposed to the logarithm of the number of entries (faster) for an indexed lookup. Also, the documentation says that the text can be difficult to extract (for example, it may be gzipped). I don't know whether WMF stores page content in compressed form or not.
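If the text does turn out to be compressed, something like the following might work for decoding a row once you have fetched it. This is only a sketch under several assumptions: that the old_flags column is a comma-separated list containing values such as gzip, utf-8, external and object; that "gzip" means raw deflate data (what PHP's gzdeflate produces) rather than a real gzip file; and that text without the utf-8 flag is Latin-1. The function name decode_old_text is hypothetical, and I haven't tried this against an actual dump.

```python
import zlib


def decode_old_text(old_text, old_flags):
    """Decode one row of the text table into a Python string.

    old_text  -- raw bytes from the old_text column
    old_flags -- string from the old_flags column, e.g. "utf-8,gzip"

    decode_old_text is a hypothetical helper; the flag names and their
    meanings are assumptions based on my reading of the documentation.
    """
    flags = {f.strip() for f in old_flags.split(",") if f.strip()}

    # "external" would mean old_text is only a pointer to another storage
    # cluster, and "object" a serialized PHP object; neither is handled here.
    if "external" in flags or "object" in flags:
        raise ValueError("unsupported storage flags: %s" % sorted(flags))

    data = old_text
    if "gzip" in flags:
        # Assumption: raw deflate stream without zlib/gzip header,
        # hence the negative window size.
        data = zlib.decompress(data, -15)

    # Assumption: rows without the utf-8 flag use a legacy Latin-1 encoding.
    encoding = "utf-8" if "utf-8" in flags else "latin-1"
    return data.decode(encoding)
```

If the flags include external, the text isn't in the table at all, so a row like that would need a separate lookup that this sketch doesn't attempt.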