Manual talk:Database access

From MediaWiki.org
Jump to navigation Jump to search

Thanks for starting this page.

ISNULL[edit]

  • ISNULL(t.field,substitute_value) -> MSSQL
  • IFNULL(t.field,substitute_value) -> MySQL
  • COALESCE(t.field,substitute_value) -> PostgreSQL

How do I manage this? --5.83.136.21 17:33, 25 October 2019 (UTC)

COALESCE is standard and should work in all RDBMS --Ciencia Al Poder (talk) 15:18, 26 October 2019 (UTC)

Not very well described[edit]

Where does it say what does "fname" or "options" or "vars" mean?

I believe those are described here. Tisane 20:11, 26 February 2010 (UTC)
I agree like almost all wikipedia technical and mediawiki articles, this article is terribly written, and difficult to understand for those who are not computer programmers. Errectstapler 14:03, 4 May 2011 (UTC)
I added a much needed explanation of how to login to mysql.[1] Errectstapler 18:24, 4 May 2011 (UTC)

The information was reverted, here it is for others:

==Logging in to MySQL==

In localsettings.php will be your MySQL password and username, for example:

## Database settings
$wgDBtype           = "mysql";
$wgDBserver         = "localhost";
$wgDBname           =  yourwebsite
$wgDBuser           =  yourwebsite
$wgDBpassword       = LongPassword  

With PuTTY (a telnet/ssh client), login by entering the following:

mysql -u <$wgDBuser> -p --database=<$wgDBname>

Replacing <$wgDBuser> and <$wgDBname> with the localsettings.php information. You will then be prompted for your password (wgDBpassword).

You will then see the mysql> prompt.

Errectstapler 19:56, 4 May 2011 (UTC)

SQL Injection[edit]

How safe are these functions against SQL Injection? Does the extension-coder have to take care of this themselves? — Preceding unsigned comment added by 194.176.105.40 (talkcontribs)

If you feed an array to select, it's safe. If you construct a string, even a string to be passed to ::select, then it's up to you to take care of safety. Gigs 03:07, 30 December 2010 (UTC)

select-statement[edit]

Where would I find documentation on the select-statement (no, not sql in general)?

I would assume:

$res = $dbr->select(Tables, Fields returned, WHERE-clause, calling function, options);

Tables
array of tables, if more then on, where-clause needs a join
Example: array('pagelinks', 'page') or just 'page'
Fields returned
array of fields returned
Example: 'pl_title'
WHERE-clause
array which holds all conditions, will be joined with AND (And how do I OR clauses?)
Example: array('page_id = pl_from', 'pl_namespace' => NS_TEMPLATE, 'page_namespace' => $disPageObj->getNamespace(), 'page_title' => $disPageObj->getDBkey())
calling function
just for the fun of it?
Example: 'DisambiguationsPage::getSQL'
options
seems to be added after the WHERE clause, like LIMIT or ORDER BY

Full example:

$res = $dbr->select(array('pagelinks', 'page'),
	  	    'pl_title',
	             array('page_id = pl_from', 'pl_namespace' => NS_TEMPLATE,
                     'page_namespace' => $disPageObj->getNamespace(), 'page_title' => $disPageObj->getDBkey()),
		    'DisambiguationsPage::getSQL' );

— Preceding unsigned comment added by 85.177.206.125 (talkcontribs)

The WHERE clause is hard coded to use AND and only do equality when you give it an array. An important caveat is that if you give a string instead of an array, you have to addQuotes and do all your sanitation manually. I have updated the page accordingly. Gigs 03:02, 30 December 2010 (UTC)

Multiple tables with JOIN[edit]

How can I do a nested thing like this

SELECT
    Orders.OrderID,
    Customers.CustomerName,
    Shippers.ShipperName
FROM (
    (Orders
INNER JOIN
    Customers
ON
    Orders.CustomerID = Customers.CustomerID)
INNER JOIN
    Shippers
ON
    Orders.ShipperID = Shippers.ShipperID);

in the select(...) style? --5.83.136.21 00:35, 25 October 2019 (UTC)

Solution[edit]

I DID IT!!!

The situation was the following:

CREATE TABLE `table1` (
  `id` int ...,
  `title` varchar ...,
  ...
);
CREATE TABLE `table2` (
  `id` int ...,
  `value` int ...,
  ...
);
CREATE TABLE `table3` (
  `id` int ...,
  `sub_id` int ...,
  ...
);
CREATE TABLE `table4` (
  `sub_id` int ...,
  `stuff` varchar ...,
  ...
);
CREATE TABLE `table5` (
  `id` int
  `value` int ...,
  ...
);

I've worked out a SQL statement that does what I want:

SELECT
    table1.id AS `id`,
    table1.title AS `title`,
    table2.value AS `value1`,
    table3.value AS `value2`,
    table2.value - table3.value AS `value3`,
    DATE AS `date`,
    table4.stuff AS `stuff`
FROM
    (
        (
            (
                (
                    `table1`
                INNER JOIN `table2` ON table1.id = table2.id
                )
            INNER JOIN `table3` ON table1.id = table3.id
            )
        INNER JOIN `table4` ON table3.sub_id = table4.sub_id
        )
    INNER JOIN `table5` ON table5.id = table1.id AND table5.date = '2018-12-31'
    )
WHERE
    ...
ORDER BY
    table2.value - table3.value

The result is:

+----+-------+---------+---------+---------------------+------+-------+
| ID | Title | Value 1 | Value 2 | (Value 2 - Value 1) | Date | Stuff |
+----+-------+---------+---------+---------------------+------+-------+

My problem was how to insert this SQL statement into the array sets. I figured that out:

	public static function getQueryInfo() {

		return [
			'tables' => [ 'table1', 'table2'<sup>2a</sup>, 'table3'<sup>3a</sup>, 'table4'<sup>4a</sup>, 'table5'<sup>5a</sup> ],
			'fields' => [
				'id'      => 'table1.id',
				'title'   => 'table1.title',
				'value1'  => 'table2.value',
				'value2'  => 'table3.value',
				'value3'  => 'table2.value - table3.value,
				'stuff'   => 'table4.stuff'
			],
			'conds' => [
				'some_field' => 0,
				'another_field' => 0,
			],
			'join_conds' => [
				'table2'<sup>2b</sup> => [
					'INNER JOIN',
					'table1.id = ' .
					'table2.id'
				],
				'table3'<sup>3b</sup> => [
					'INNER JOIN',
					'table1.id = ' .
					'table2.id'
				],
				'table4'<sup>4b</sup> => [
					'INNER JOIN',
					'table3.sub_id = ' .
					'table4.sub_id'
				],
				'table5'<sup>5b</sup> => [
					'INNER JOIN',
					'table5.id = ' .
					'table1.id AND ' .
					'table1.date = \'2001-01-01\''
				]
			]
		];
	}

  1. The nested JOINS are written one after the other, separated by commas. That's easy, if you know that! ;-)
  2. The order of 2a, 3a, 4a, 5a and 2b, 3b, 4b, 5b is decisive. That's hard to know! I went as mad as hell because it didn't work, because I had a different order in the 'tables' array.
  3. Do not forget $wgDBprefix. So I finally got that:
	public static function getQueryInfo( $myDate ) {
		global $wgDBprefix;

		return [
			'tables' => [ 'table1', 'table2', 'table3', 'table4', 'table5' ],
			'fields' => [
				'id'      => $wgDBprefix . 'table1.id',
				'title'   => $wgDBprefix . 'table1.title',
				'value1'  => $wgDBprefix . 'table2.value',
				'value2'  => $wgDBprefix . 'table3.value',
				'value3'  => $wgDBprefix . 'table2.value - ' . $wgDBprefix . 'table3.value',
				'stuff'   => $wgDBprefix . 'table4.stuff'
			],
			'conds' => [
				'some_field' => 0,
				'another_field' => 0,
			],
			'join_conds' => [
				'table2' => [
					'INNER JOIN',
					$wgDBprefix . 'table1.id = ' .
					$wgDBprefix . 'table2.id'
				],
				'table3' => [
					'INNER JOIN',
					$wgDBprefix . 'table1.id = ' .
					$wgDBprefix . 'table3.id'
				],
				'table4' => [
					'INNER JOIN',
					$wgDBprefix . 'table4.id = ' .
					$wgDBprefix . 'table3.sub_id'
				],
				'table5' => [
					'INNER JOIN',
					$wgDBprefix . 'table5.id = ' .
					$wgDBprefix . 'table1.id AND ' .
					$wgDBprefix . 'table5.date = \'' . $myDate . '\''
				]
			]
		];
	}

I hope you will find this helpful. Enjoy! --5.83.136.21 15:23, 25 October 2019 (UTC)

How do I make queries with OR[edit]

Is this possible?

See code of DataBase::makeList in wikipath/includes/db/Database.php, something like:
$dbReadOnly = wfGetDB( DB_SLAVE );
$dbReadOnly->select(
 /* table */ 'table',
 /* fields*/ array('tablefield'),
 /* conds */ $dbReadOnly->makeList($arrayOfVariables, LIST_OR),
 /* fname */ __METHOD__
 /* options */
);

--Andreas P. Icon External Link E-Mail.png 13:02, 23 January 2013 (UTC)

immediateBegin() vs. begin()[edit]

This document suggests using immediateBegin() over begin(), as begin() 'may not do what you expect'. Same for immediateCommit() vs. commit(). However, the MW source code specifies that both immediateBegin() and begin() do the same thing, and that immediateBegin() is deprecated in favor of begin(). Same for commit(). Unless anyone objects, I'm going to change the page to state that begin() and commit() should be used, and that the immediateX() have been deprecated. --Msul01 20:50, 12 December 2008 (UTC)

ORDER BY and GROUP BY[edit]

Hi, i want to roder an group my query but it doesn't work. here my source

                $res = $dbr->select(
                                        'flaggedrevs',
                                        array( 'fr_page_id' , 'fr_user' , 'fr_timestamp' , 'fr_comment' , 'fr_quality' ),
                                        '',
                                        '',
                                        'GROUP BY fr_page_id ORDER BY fr_timestamp DESC'
                );

the query it self is ok, but it seems like the order and group option does not work. hre the raw query from the cli

select fr_timestamp,fr_user,fr_page_id,fr_quality from flaggedrevs group by fr_page_id order by fr_timestamp DESC;

it works perfectly there. is there another way to query the database? This API seems to be not very efficient. --213.214.18.64 18:06, 20 January 2010 (UTC)

   $res = $dbr->select( 'flaggedrevs', 
      array( 'fr_page_id' , 'fr_user' , 'fr_timestamp' , 'fr_comment' , 'fr_quality' ),  
      '', '', 
      array( 'GROUP BY' => 'fr_page_id' , 'ORDER BY' => 'fr_timestamp DESC' ) );
Max Semenik 18:34, 20 January 2010 (UTC)
thank you --213.214.18.64 16:29, 21 January 2010 (UTC)

CREATE TABLE[edit]

There should be a wrapper for CREATE TABLE. (Other than a duplicate of an existing table) Tisane 07:05, 12 March 2010 (UTC)

$res - and then?[edit]

There should be more documentation about what to do with the result of a query. I found a pointer to DatabaseType::fetchObject($res), but do I need to use it? What if there will always only be one result row? --Patrick Nagel 04:06, 4 May 2011 (UTC)

It is mentioned at the very beginning: foreach( $res as $row ) { ... }. Max Semenik 04:53, 4 May 2011 (UTC)
If there is one result row always use selectRow instead --Nischayn22 (talk) 12:19, 16 May 2012 (UTC)

Needed: Structure and better examples/explanation[edit]

This whole thing needs better structure and organisation. As it is it seems quite... all over the place, really. If I could be more specific, however, I could probably fix it myself, but unfortunately my sheer lack of comprehension precludes this. That, and bats are eating my legs.

In particular with the database query wrapper functions, much more is also needed to actually explain what's going on. The select example is helpful, but nowhere near enough to give an idea what all can be done with the things or how, nor is it clear if the listed ones are even all of them, considering how the last has the contents commented out? Quite possibly there should be an entire separate page for these, however, as with each parameter there are different formats and options, not all of which parallel what would be expected from the SQL. For instance, would there even be any way to do 'ORDER BY 0,5' given that ORDER BY apparently only takes numbers? But that it only does take numbers is the sort of information a user needs, and that other things take arrays, and what all things take arrays... -— Isarra 17:01, 22 November 2012 (UTC)

Or was that LIMIT? I can't remember. -— Isarra 17:03, 22 November 2012 (UTC)
I agree, the DB access functions could be much better documented (not to mention are a common source of confusion when people first start developing MediaWiki).
In regards to LIMIT. Different DB's handle how offsets are specified differently. In the MW db access functions we use a separate (fake) option called OFFSET to specify this. There is a (very complex) example in intersection extension. There are few examples of offset in MediaWiki, as generally using an offset paramter (instead of doing offset in the where cluase by saying something along the line AND SOME_FIELD > SOME_NUMBER) is rather inefficient. For example, the code would look like:
$dbr = wfGetDB( DB_SLAVE );
$res = $dbr->select(
        'category',                                   // $table
        array( 'cat_title', 'cat_pages' ),            // $vars (columns of the table)
        'cat_pages > 0',                              // $conds
        __METHOD__,                                   // $fname = 'Database::select',
        array(
              'ORDER BY' => 'cat_title ASC',
              'LIMIT' => 10,
              'OFFSET' => 5
        )        
);

Which would produce something like:

SELECT cat_title, cat_pages FROM category WHERE cat_pages > 0 ORDER BY cat_title ASC LIMIT 5, 10;

(And yes, I recognize there is absolutely no way to figure that out unless somebody tells you :S) Bawolff (talk) 22:21, 22 November 2012 (UTC)

Aye, that's exactly my point. But you having put out this example, there's now that much more for folks to go off... hopefully someone will eventually do a more thorough introduction, but this is still something, so thank you. -— Isarra 04:49, 23 November 2012 (UTC)

Distinct[edit]

is there a way to add DISTINCT to select? --Hiong3-eng5 (talk) 07:29, 5 August 2013 (UTC)

never mind, I got it. use as ...
$cond[] = 'DISTINCT';

--Hiong3-eng5 (talk) 08:20, 5 August 2013 (UTC)

Choosing whether to use the master vs. slave[edit]

The manual page says to use the master for "read queries that need to have absolutely newest information". Does anyone have any tips on what kinds of situations require we be that particular about having the newest info, and what kinds of situations we can afford to have some lag in? Is the master to be read from pretty much whenever you plan on writing new data to the database based on something you read from it? Leucosticte (talk) 22:57, 13 December 2013 (UTC)

How can I work with two tables with the same column name?[edit]

I try to solve the following problem:

Table 1: 'page'
page_id
page_title
page_namepsace
...

Table 2: 'page_ext'
page_id
page_date
...

The SQL string:

SELECT 
   page.page_id as page_id, 
   page_ext.page_id as page_id2 
   page.page_title as page_title, 
FROM page, page_ext
WHERE page.page_id=page_ext.page_id'

How can I do it with the array syntax?
--Mchansy (talk) 13:09, 27 January 2015 (UTC)

This might help:

array( 'a' => 'user' )

This includes the user table in the query, with the alias "a" available for use in field names (e.g. a.user_name).

Reference: https://doc.wikimedia.org/mediawiki-core/master/php/classDatabaseBase.html

How to select count from database[edit]

Hello,

I will like to select the most number of page_id in database by using COUNT. But I am not sure of the syntax used. Any advise on this? What I have tried

private function getMostLikesArticle()
	{	
		$dbr = wfGetDB( DB_SLAVE );
		$res = $dbr->select(
			$this->tableName(),                                
			array(COUNT(page_id),'page_id', 'user_id'),
			array('page_id ' => $this->page->getId()),		
			//'user_id > 0',                             
			__METHOD__                                   		
		);        
		
		$output = '';
		foreach( $res as $row ) {
				$output .= 'Page ID ' . $row->page_id . ' has ' . $row->user_id . " entries.\n" ;
				echo "output is " . $output . '<br/>';
		}
}

Passing database options upon creation of tables[edit]

Is there a method to specify to the MW code that specific options must be passed to the RDBMS upon creation of new tables, not only at the moment of installation, but at any later point as well? For example an options string to be amended to the CREATE TABLE statement.

My specific use-case is that I would like to implement MariaDB's native table encryption which could only be effected either globally (in all tables in the instance) or on individual tables. There's no method I could find that effects this on the level of certain database scehmas but not others.

I want to encrypt some mediawiki instances but not other applications whose tables are managed by the same instance. Virtualisation is infeasible in terms of disk ustilisation for my use-case.

--Ahmad Gharbeia أحمد غربية (talk) 11:56, 10 April 2016 (UTC)

Maybe you can use $wgDBTableOptions --Ciencia Al Poder (talk) 21:02, 10 April 2016 (UTC)
These are "options to use during installation or update". Besides, this option accepts only a value from a set of predefined ones, as far as I understood. --Ahmad Gharbeia أحمد غربية (talk)

Lock contention on MW 1.31. causes "Wikimedia\\Rdbms\\Database::begin: Implicit transaction already active"[edit]

Is the following pattern still working for MW 1.31?

$dbw = wfGetDB( DB_MASTER );
$dbw->begin();
/* Do queries */
$dbw->commit();

I see "Wikimedia\\Rdbms\\Database::begin: Implicit transaction already active" since MW 1.31 for https://github.com/gesinn-it/IDProvider/blob/6bb622f3e5d23ab87a7e37311b7a432795d51b8c/IDProvider.functions.php#L90

--Planetenxin (talk) 08:39, 7 August 2018 (UTC)

@User:Planetenxin, it did not work for me either. I've updated the doc with a functioning example. Tinss (talk) 04:17, 14 December 2018 (UTC)

Performing and SQL JOIN in MW 1.31.[edit]

I'm having real issues understanding how to perform an SQL JOIN using the select function. I don't understand the documentation description and examples. Can anyone help? I've put more details on StackOverflow.

Thanks.

Recommendation against FOR UPDATE still relevant?[edit]

Ever since the initial commit of docs/database.txt by Tim Starling (which was eventually turned into this wiki page), the text has advised against using FOR UPDATE and other locking reads, claiming that they are poorly implemented in InnoDB and will cause regular deadlock errors. However, we now use FOR UPDATE a lot more in MediaWiki core than back in 2006 (and so do plenty of Wikimedia deployed extensions, according to code search); also, more recently Aaron Schulz added tips to Database optimization which note that FOR UPDATE may be needed to avoid race conditions.

Does anyone know more details about the InnoDB problems with locking reads, and/or whether they’re still relevant? The current state of this documentation is confusing in my opinion, and I suspect that the part on this page (cautioning against them) might simply be outdated. (The $wgAntiLockFlags setting mentioned in the initial docs/database.txt commit was removed in 1.25.0, though Gerrit change 164115 doesn’t explain why, so I don’t know if it was no longer necessary to avoid locking reads or if that happened for some other reason.) --Lucas Werkmeister (WMDE) (talk) 12:22, 10 May 2019 (UTC)

Article incomplete on query result/error condition[edit]

What I'm missing here is any kind of error handling. For Example, if my "$res = $dbr->select()" fails, do I have to catch it? Or does $res tell me anything about it?

I think this is an important point for anyone coming here for help. --ChTreczoks (talk) 09:57, 11 August 2019 (UTC)

Too Much Political Correctness[edit]

I understand why you guys introduced DB_REPLICA instead of DB_SLAVE. But was it really really necessary to remove the old constant altogether in MediaWiki 1.34 thus breaking existing extensions? There is no such thing as backward compability any more? Alex Mashin (talk) 15:50, 30 January 2020 (UTC)

The change git #0653e575 includes a very clear commit message and a very clear update to RELEASE-NOTES: "The constant DB_SLAVE, deprecated in 1.28, has been removed. Use DB_REPLICA.". We followed our published deprecation policy. --BDavis (WMF) (talk) 22:08, 30 January 2020 (UTC)