User:Catrope/Extension review/CollabWatchlist

Base revision: r95194

CollabWatchlist.php

 * Config variables should be clearly separated near the top of the file, before any setup code. They should also have clearer documentation: the comment for  is very vague, and two others don't have a doc comment at all
 * Fixed in r99867 User:Flohack
 * The  functions should be static functions in a   class, so as to avoid putting functions in the global namespace
 * Fixed in r99867 User:Flohack
 * The defined constants ( and friends) should also be in a class
 * Fixed in r99867 User:Flohack
 * Support for MW 1.16 and below in  is pointless because you're using ResourceLoader, which is in MW 1.17+ only
 * Fixed in r99867 User:Flohack
 * The noctid SQL should only be run if you detect an old schema. Is it safe to assume that you're the only one who ever had that old schema?
 * How would I detect an old schema? Is it ok to run SQL commands within that function?!? User:Flohack

CollabWatchlistTest.php

 * isn't really a unit test. All it seems to do is check that  doesn't throw any exceptions, fatals or warnings, and let the user visually inspect the tree. A real unit test has assertions, and tests only a small piece of code (e.g. one function)
 * It was not intended to be a real unit test. I have added a markTestSkipped call in the setUp method in r99867. User:Flohack

sql directory

 * Please combine all tables into a single SQL file
 * But $updater->addExtensionUpdate does not seem to support adding multiple tables in one sql file. I have seen several extensions, only doing one addExtensionUpdate call with one addTable, when in fact the sql file contains multiple tables. I'd consider that bad practice, as the code suggests it creates only one table, when in fact it creates several. Do you agree on that? User:Flohack
 * Why does the  table use   as its field prefix?   would make more sense
 * I added an sql file which renames that in r99867. See related question above, regarding detecting an old schema. User:Flohack
 * Put  outside of the table creation for SQLite compatibility. See the coding conventions page
 * Fixed in r99867 User:Flohack
 * Use field prefixes consistently. In particular, don't use the name  for a field that is a foreign key to , because the entire point of using field prefixes is to avoid having to use syntax like   in queries to disambiguate your field names
 * ENUMs are evil, don't use them in new code. It's much easier to define numerical constants in PHP (e.g. ) and put those in an integer field in your DB table
 * Fixed in r99867 User:Flohack
 * Why does  have an auto-incremented primary key field? Can't you just drop that and add a unique index on   ?
 * The reason for the primary key is that one user can have several roles (rlu_type) within one list.
 * Same for  and
 * Removed the primary keys in r99867 User:Flohack
 * Comment on line 3 of  seems to have been cut off
 * See related question above, regarding detecting an old schema. User:Flohack
 * What is the  table for? It's not well documented (nor are any of the other tables, BTW, but those were easier to figure out by just reading the field names and the comments)
 * Fixed in r99867 User:Flohack

SpecialCollabWatchlist.php

 * Yeah, the grandparent constructor thing is nasty but it's the way things work with special pages, unfortunately
 * This file duplicates lots of code from , or to be more precise an older version of that file. I'm not gonna bother reviewing most of this file until that duplication is cleaned up.
 * queries are evil because their running time is linear in the number of rows they need to scan, and this number is unbounded ( doesn't work the way you expect it to in this case)
 * There are a few things wrong with the tag filter query
 * The logic of how  triggers   /   seems backwards
 * Subqueries are not supported by MySQL 4.0, and we still have four 4.0 machines in our DB cluster. I'm not sure whether it would make sense for this subquery to be rewritten to a join, performance-wise, I'd have to ask. You're right that regexing tag_summary is a dead end: it's not just impractical, it's also a performance nightmare
 * The query is built with raw SQL, which you shouldn't be doing. Even with a subquery, you can use  to build your subquery, then inject it into the main query
 * Instead of using  you can just use
 * Don't use
 * In, you're using  , which is a loose comparison between an array and an integer. That's just evil
 * None of your queries are limited or paged. While this is a bug in the existing watchlist code (no paging so 20,000-page watchlists cause OOMs and timeouts and such), that doesn't mean it's OK to write new code that has the same problem
 * Don't use  if you've already demonstrated you know how to use
 * is probably unnecessary if you clean up the previous point, but it's also unnecessary because you can use

CollabWatchlistEditor.php

 * Instead of having a big switch-case statement with nine cases, break them out into separate functions
 * Don't use  to check for null
 * Code like  will cause a fatal error if an invalid title string is fed to  . Check the return value of   for null before calling methods on it
 * Don't use  it's really, really, really evil. Add a static function to your class instead
 * What's up with redirecting to a wiki page upon a permission error? That's very unconventional. There are built-in facilities like  and friends, use them
 * Throwing a permissions error or redirecting away upon an invalid token is really bad UX: a user who submits their form after their session expires wants to see a descriptive error message ( message in core), plus the form as they filled it out so they can resubmit the form right away. Destroying their form data is unacceptable
 * Assignments inside if conditions such as on lines 72 and 76 are discouraged in the coding conventions
 * calls, but this is not documented and is generally a bad idea
 * Again, do not use COUNT(*), especially if you only care about whether the count is >= 1. In that case, use LIMIT 1 in your query and check
 * is guaranteed to return an array, you don't have to check for  afterwards
 * In, the variables   and   seem to be used interchangeably in a very confusing way
 * In  you can use   to split a string on a pipe character instead of rolling your own using string manipulation
 * In, if you transform a string into a Title, cache that so you don't end up doing it twice
 * I believe you would need to i18n the parentheses in this function somehow but I'm not sure how that's supposed to be done
 * does not escape  and  . This is probably an XSS vulnerability
 * Four functions in a row whose sole purpose is to execute a COUNT(*) query. Don't use COUNT(*).
 * In  you don't need to check that there were >0 rows before you foreach over the result. Looping over an empty result is perfectly safe. Likewise, checking for   is unnecessary.   will either return a ResultWrapper object or bail out with an exception, you're not gonna get it to return false unless you manually disable DB error exceptions in your code
 * If your namespace and title come directly from the database, you can use  instead of
 * We have wrapper functions for INSERT SELECT and DELETE. You shouldn't build raw SQL there, or use prepared statements (no one does that in MW, I don't know why we even support that)
 * Line 885: do not use, use   instead. The former has nasty side effects if an   query parameter is set
 * Line 1010: timestamps that you're inserting into the database must be run through  . To obtain the current timestamp, just call it without any parameters. You cannot insert ISO 8601 timestamps into the database, that is incorrect even for MySQL (which uses TS_MW)
 * Line 1172, line 1202, line 1232:  is not escaped
 * Line 1210:  is not escaped
 * Line 1303: improper uses of  and   on the same line
 * Line 1317:  is not escaped

CollabWatchistChangesList.php

 * has a typo in its name
 * isn't used consistently. It's used on line 62, but on line 84  is used instead
 * inserts an undo link? How is that related to the feature set of this extension at all?
 * Line 213: don't use inline JavaScript, and don't use . Use jQuery instead, it's so much nicer
 * Line 254: don't use onFoo attributes, use jQuery
 * In : don't mess with   here, you really shouldn't need to. If your code breaks if you don't use it, you're doing something very wrong

CategoryTreeManip.php

 * Line 37: if the cache duration is gonna be a fixed constant, make it a constant like the cache keys
 * Line 66, 74, 102, 136: indentation is off. The coding conventions recommend you put single-statement if blocks in braces too
 * Line 94: don't use isset to check for null
 * Line 208, 290: so you have a variable that doesn't exist the first time you enter the loop, and you create it halfway through the first iteration. This is confusing and easily avoided by just initializing it to something like an empty array before you start the loop
 * Line 259: loose comparison between an array and an empty string is not a good idea
 * The query in  looks wrong, I don't see how it limits itself to child categories rather than category members. With the new DB schema you don't need to join against page and use   to get child categories anymore, you can just use  . Also, it seems the cp table is useless in this query, you can take category titles and WHERE them straight into
 * Line 292: this should probably be a strong comparison. Strings behave strangely, with things like  returning true
 * Line 341: you cannot use queries like  without a condition on , because:
 * It will return pages like  as well as ,  ,   etc etc (I'm assuming you only want categories here, so add
 * It will melt the database because the query is unindexed and the enwiki page table has over 20 million rows