Manual talk:Database access

Thanks for starting this page.

ISNULL
How do I manage this? --5.83.136.21 17:33, 25 October 2019 (UTC)
 * ISNULL(t.field,substitute_value) -> MSSQL
 * IFNULL(t.field,substitute_value) -> MySQL
 * COALESCE(t.field,substitute_value) -> PostgreSQL

Not very well described
Where does it say what does "fname" or "options" or "vars" mean?
 * I believe those are described here. Tisane 20:11, 26 February 2010 (UTC)
 * I agree like almost all wikipedia technical and mediawiki articles, this article is terribly written, and difficult to understand for those who are not computer programmers. Errectstapler 14:03, 4 May 2011 (UTC)
 * I added a much needed explanation of how to login to mysql. Errectstapler 18:24, 4 May 2011 (UTC)

The information was reverted, here it is for others:

Errectstapler 19:56, 4 May 2011 (UTC)

SQL Injection
How safe are these functions against SQL Injection? Does the extension-coder have to take care of this themselves?
 * If you feed an array to select, it's safe. If you construct a string, even a string to be passed to ::select, then it's up to you to take care of safety. Gigs 03:07, 30 December 2010 (UTC)

select-statement
Where would I find documentation on the select-statement (no, not sql in general)?

I would assume:

$res = $dbr->select(Tables, Fields returned, WHERE-clause, calling function, options);


 * Tables:array of tables, if more then on, where-clause needs a join Example: array('pagelinks', 'page') or just 'page'
 * Fields returned: array of fields returned Example: 'pl_title'
 * WHERE-clause: array which holds all conditions, will be joined with AND (And how do I OR clauses?) Example: array('page_id = pl_from', 'pl_namespace' => NS_TEMPLATE, 'page_namespace' => $disPageObj->getNamespace, 'page_title' => $disPageObj->getDBkey)
 * calling function: just for the fun of it? Example: 'DisambiguationsPage::getSQL'
 * options: seems to be added after the WHERE clause, like LIMIT or ORDER BY

Full example: $res = $dbr->select(array('pagelinks', 'page'),	 	    'pl_title',	             array('page_id = pl_from', 'pl_namespace' => NS_TEMPLATE, 'page_namespace' => $disPageObj->getNamespace, 'page_title' => $disPageObj->getDBkey),		   'DisambiguationsPage::getSQL' );


 * The WHERE clause is hard coded to use AND and only do equality when you give it an array. An important caveat is that if you give a string instead of an array, you have to addQuotes and do all your sanitation manually.  I have updated the page accordingly. Gigs 03:02, 30 December 2010 (UTC)

Multiple tables with JOIN
How can I do a nested thing like this SELECT Orders.OrderID, Customers.CustomerName, Shippers.ShipperName FROM (   (Orders INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID) INNER JOIN   Shippers ON    Orders.ShipperID = Shippers.ShipperID); in the select(...) style? --5.83.136.21 00:35, 25 October 2019 (UTC)

Solution
I DID IT!!!

The situation was the following: CREATE TABLE `table1` ( `id` int ...,  `title` varchar ...,  ... ); CREATE TABLE `table2` ( `id` int ...,  `value` int ...,  ... ); CREATE TABLE `table3` ( `id` int ...,  `sub_id` int ...,  ... ); CREATE TABLE `table4` ( `sub_id` int ...,  `stuff` varchar ...,  ... ); CREATE TABLE `table5` ( `id` int  `value` int ...,  ... );

I've worked out a SQL statement that does what I want: SELECT table1.id AS `id`, table1.title AS `title`, table2.value AS `value1`, table3.value AS `value2`, table2.value - table3.value AS `value3`, DATE AS `date`, table4.stuff AS `stuff` FROM (       (            (                (                    `table1` INNER JOIN `table2` ON table1.id = table2.id               )            INNER JOIN `table3` ON table1.id = table3.id            ) INNER JOIN `table4` ON table3.sub_id = table4.sub_id )   INNER JOIN `table5` ON table5.id = table1.id AND table5.date = '2018-12-31'    ) WHERE ... ORDER BY   table2.value - table3.value

The result is: ++---+-+-+-+--+---+ ++---+-+-+-+--+---+ My problem was how to insert this SQL statement into the array sets. I figured that out: public static function getQueryInfo {
 * ID | Title | Value 1 | Value 2 | (Value 2 - Value 1) | Date | Stuff |

return [ 'tables' => [ 'table1', 'table2'2a, 'table3'3a, 'table4'4a, 'table5'5a ], 'fields' => [ 'id'     => 'table1.id', 'title'  => 'table1.title', 'value1' => 'table2.value', 'value2' => 'table3.value', 'value3' => 'table2.value - table3.value, 'stuff'  => 'table4.stuff' ],			'conds' => [ 'some_field' => 0, 'another_field' => 0, ],			'join_conds' => [ 'table2'2b => [ 'INNER JOIN', 'table1.id = '. 'table2.id' ],				'table3'3b => [ 'INNER JOIN', 'table1.id = '. 'table2.id' ],				'table4'4b => [ 'INNER JOIN', 'table3.sub_id = '. 'table4.sub_id' ],				'table5'5b => [ 'INNER JOIN', 'table5.id = '. 'table1.id AND '. 'table1.date = \'2001-01-01\'' ]			]		];	}

public static function getQueryInfo( $myDate ) { global $wgDBprefix;
 * 1) The nested JOINS are written one after the other, separated by commas. That's easy, if you know that! ;-)
 * 2) The order of 2a, 3a, 4a, 5a and 2b, 3b, 4b, 5b is decisive. That's hard to know! I went as mad as hell because it didn't work, because I had a different order in the 'tables' array.
 * 3) Do not forget $wgDBprefix. So I finally got that:

return [ 'tables' => [ 'table1', 'table2', 'table3', 'table4', 'table5' ], 'fields' => [ 'id'     => $wgDBprefix. 'table1.id', 'title'  => $wgDBprefix. 'table1.title', 'value1' => $wgDBprefix. 'table2.value', 'value2' => $wgDBprefix. 'table3.value', 'value3' => $wgDBprefix. 'table2.value - '. $wgDBprefix. 'table3.value', 'stuff'  => $wgDBprefix. 'table4.stuff' ],			'conds' => [ 'some_field' => 0, 'another_field' => 0, ],			'join_conds' => [ 'table2' => [ 'INNER JOIN', $wgDBprefix. 'table1.id = '. $wgDBprefix. 'table2.id' ],				'table3' => [ 'INNER JOIN', $wgDBprefix. 'table1.id = '. $wgDBprefix. 'table3.id' ],				'table4' => [ 'INNER JOIN', $wgDBprefix. 'table4.id = '. $wgDBprefix. 'table3.sub_id' ],				'table5' => [ 'INNER JOIN', $wgDBprefix. 'table5.id = '. $wgDBprefix. 'table1.id AND '. $wgDBprefix. 'table5.date = \''. $myDate. '\''				]			]		];	} I hope you will find this helpful. Enjoy! --5.83.136.21 15:23, 25 October 2019 (UTC)

How do I make queries with OR
Is this possible?
 * See code of  in , something like:

--Andreas P. 13:02, 23 January 2013 (UTC)

immediateBegin vs. begin
This document suggests using immediateBegin over begin, as begin 'may not do what you expect'. Same for immediateCommit vs. commit. However, the MW source code specifies that both immediateBegin and begin do the same thing, and that immediateBegin is deprecated in favor of begin. Same for commit. Unless anyone objects, I'm going to change the page to state that begin and commit should be used, and that the immediateX have been deprecated. --Msul01 20:50, 12 December 2008 (UTC)

ORDER BY and GROUP BY
Hi, i want to roder an group my query but it doesn't work. here my source $res = $dbr->select(                                       'flaggedrevs',                                        array( 'fr_page_id', 'fr_user' , 'fr_timestamp' , 'fr_comment' , 'fr_quality' ),                                        ,                                        ,                                        'GROUP BY fr_page_id ORDER BY fr_timestamp DESC'                );

the query it self is ok, but it seems like the order and group option does not work. hre the raw query from the cli

 select fr_timestamp,fr_user,fr_page_id,fr_quality from flaggedrevs group by fr_page_id order by fr_timestamp DESC; 

it works perfectly there. is there another way to query the database? This API seems to be not very efficient. --213.214.18.64 18:06, 20 January 2010 (UTC)


 * Max Semenik 18:34, 20 January 2010 (UTC)
 * thank you --213.214.18.64 16:29, 21 January 2010 (UTC)

CREATE TABLE
There should be a wrapper for CREATE TABLE. (Other than a duplicate of an existing table) Tisane 07:05, 12 March 2010 (UTC)

$res - and then?
There should be more documentation about what to do with the result of a query. I found a pointer to DatabaseType::fetchObject($res), but do I need to use it? What if there will always only be one result row? --Patrick Nagel 04:06, 4 May 2011 (UTC)
 * It is mentioned at the very beginning: foreach( $res as $row ) { ... }. Max Semenik 04:53, 4 May 2011 (UTC)
 * If there is one result row always use selectRow instead --Nischayn22 (talk) 12:19, 16 May 2012 (UTC)

Needed: Structure and better examples/explanation
This whole thing needs better structure and organisation. As it is it seems quite... all over the place, really. If I could be more specific, however, I could probably fix it myself, but unfortunately my sheer lack of comprehension precludes this. That, and bats are eating my legs.

In particular with the database query wrapper functions, much more is also needed to actually explain what's going on. The select example is helpful, but nowhere near enough to give an idea what all can be done with the things or how, nor is it clear if the listed ones are even all of them, considering how the last has the contents commented out? Quite possibly there should be an entire separate page for these, however, as with each parameter there are different formats and options, not all of which parallel what would be expected from the SQL. For instance, would there even be any way to do 'ORDER BY 0,5' given that ORDER BY apparently only takes numbers? But that it only does take numbers is the sort of information a user needs, and that other things take arrays, and what all things take arrays... -— Isarra ༆ 17:01, 22 November 2012 (UTC)
 * Or was that LIMIT? I can't remember. -— Isarra ༆ 17:03, 22 November 2012 (UTC)
 * I agree, the DB access functions could be much better documented (not to mention are a common source of confusion when people first start developing MediaWiki).
 * In regards to LIMIT. Different DB's handle how offsets are specified differently. In the MW db access functions we use a separate (fake) option called OFFSET to specify this. There is a (very complex) example in intersection extension. There are few examples of offset in MediaWiki, as generally using an offset paramter (instead of doing offset in the where cluase by saying something along the line AND SOME_FIELD > SOME_NUMBER) is rather inefficient. For example, the code would look like:

$dbr = wfGetDB( DB_SLAVE ); $res = $dbr->select(       'category',                                   // $table        array( 'cat_title', 'cat_pages' ),            // $vars (columns of the table)        'cat_pages > 0',                              // $conds        __METHOD__,                                   // $fname = 'Database::select',        array( 'ORDER BY' => 'cat_title ASC', 'LIMIT' => 10, 'OFFSET' => 5 )       ); Which would produce something like: SELECT cat_title, cat_pages FROM category WHERE cat_pages > 0 ORDER BY cat_title ASC LIMIT 5, 10; (And yes, I recognize there is absolutely no way to figure that out unless somebody tells you :S) Bawolff (talk) 22:21, 22 November 2012 (UTC)
 * Aye, that's exactly my point. But you having put out this example, there's now that much more for folks to go off... hopefully someone will eventually do a more thorough introduction, but this is still something, so thank you. -— Isarra ༆ 04:49, 23 November 2012 (UTC)

Distinct
is there a way to add DISTINCT to select? --Hiong3-eng5 (talk) 07:29, 5 August 2013 (UTC)


 * never mind, I got it. use as ...

$cond[] = 'DISTINCT'; --Hiong3-eng5 (talk) 08:20, 5 August 2013 (UTC)

Choosing whether to use the master vs. slave
The manual page says to use the master for "read queries that need to have absolutely newest information". Does anyone have any tips on what kinds of situations require we be that particular about having the newest info, and what kinds of situations we can afford to have some lag in? Is the master to be read from pretty much whenever you plan on writing new data to the database based on something you read from it? Leucosticte (talk) 22:57, 13 December 2013 (UTC)

How can I work with two tables with the same column name?
I try to solve the following problem:

Table 1: 'page'

page_id

page_title

page_namepsace

...

Table 2: 'page_ext'

page_id

page_date

...

The SQL string:

SELECT page.page_id as page_id, page_ext.page_id as page_id2 page.page_title as page_title, FROM page, page_ext WHERE page.page_id=page_ext.page_id'

How can I do it with the array syntax?

--Mchansy (talk) 13:09, 27 January 2015 (UTC)

This might help:

array( 'a' => 'user' )

This includes the user table in the query, with the alias "a" available for use in field names (e.g. a.user_name).

Reference: https://doc.wikimedia.org/mediawiki-core/master/php/classDatabaseBase.html

How to select count from database
Hello,

I will like to select the most number of page_id in database by using COUNT. But I am not sure of the syntax used. Any advise on this? What I have tried

private function getMostLikesArticle {			$dbr = wfGetDB( DB_SLAVE ); $res = $dbr->select(			$this->tableName,                               			array(COUNT(page_id),'page_id', 'user_id'),			array('page_id ' => $this->page->getId),					//'user_id > 0',                             			__METHOD__                                   				); $output = ''; foreach( $res as $row ) { $output .= 'Page ID '. $row->page_id. ' has '. $row->user_id. " entries.\n" ; echo "output is ". $output. ' ';		} }

Passing database options upon creation of tables
Is there a method to specify to the MW code that specific options must be passed to the RDBMS upon creation of new tables, not only at the moment of installation, but at any later point as well? For example an options string to be amended to the CREATE TABLE statement.

My specific use-case is that I would like to implement MariaDB's native table encryption which could only be effected either globally (in all tables in the instance) or on individual tables. There's no method I could find that effects this on the level of certain database scehmas but not others.

I want to encrypt some mediawiki instances but not other applications whose tables are managed by the same instance. Virtualisation is infeasible in terms of disk ustilisation for my use-case.

--Ahmad Gharbeia أحمد غربية (talk) 11:56, 10 April 2016 (UTC)


 * Maybe you can use $wgDBTableOptions --Ciencia Al Poder (talk) 21:02, 10 April 2016 (UTC)
 * These are "options to use during installation or update". Besides, this option accepts only a value from a set of predefined ones, as far as I understood. --Ahmad Gharbeia أحمد غربية (talk)

Lock contention on MW 1.31. causes "Wikimedia\\Rdbms\\Database::begin: Implicit transaction already active"
Is the following pattern still working for MW 1.31?

$dbw = wfGetDB( DB_MASTER ); $dbw->begin; /* Do queries */ $dbw->commit;

I see "Wikimedia\\Rdbms\\Database::begin: Implicit transaction already active" since MW 1.31 for https://github.com/gesinn-it/IDProvider/blob/6bb622f3e5d23ab87a7e37311b7a432795d51b8c/IDProvider.functions.php#L90

--Planetenxin (talk) 08:39, 7 August 2018 (UTC)
 * @User:Planetenxin, it did not work for me either. I've updated the doc with a functioning example. Tinss (talk) 04:17, 14 December 2018 (UTC)

Performing and SQL JOIN in MW 1.31.
I'm having real issues understanding how to perform an SQL JOIN using the select function. I don't understand the documentation description and examples. Can anyone help? I've put more details on StackOverflow.

Thanks.

Recommendation against still relevant?
Ever since the initial commit of by (which was eventually turned into this wiki page), the text has advised against using   and other locking reads, claiming that they are poorly implemented in InnoDB and will cause regular deadlock errors. However, we now use  a lot more in MediaWiki core than back in 2006 (and so do plenty of Wikimedia deployed extensions, according to code search); also, more recently  added tips to Database optimization which note that   may  be needed to avoid race conditions.

Does anyone know more details about the InnoDB problems with locking reads, and/or whether they’re still relevant? The current state of this documentation is confusing in my opinion, and I suspect that the part on this page (cautioning against them) might simply be outdated. (The  setting mentioned in the initial   commit was removed in 1.25.0, though  doesn’t explain why, so I don’t know if it was no longer necessary to avoid locking reads or if that happened for some other reason.) --Lucas Werkmeister (WMDE) (talk) 12:22, 10 May 2019 (UTC)

Article incomplete on query result/error condition
What I'm missing here is any kind of error handling. For Example, if my "$res = $dbr->select" fails, do I have to catch it? Or does $res tell me anything about it?

I think this is an important point for anyone coming here for help. --ChTreczoks (talk) 09:57, 11 August 2019 (UTC)