User:Catrope/Extension review/Translate

Base revision:

At least initially, this is mostly a quick security/performance review, so I'll mostly be focusing on sensitive areas and may skip or just glance over things that seem less important. If that's the case, it'll be noted. Of course I may still note things that are less important just because I happen to notice them.

✅ _autoload.php

 * The MediaWiki core classes (line 53-54) should just be in phase3/includes/AutoLoader.php . They'll do no harm there, and an extension autoloading core files is very weird
 * Added to core in 98528, kept in _autoload for BC. – Nikerabbit 15:49, 30 September 2011 (UTC)

✅ check-blacklist.php
This is just data, skipping. Looks like some of this is TWN-specific, though.
 * Filed as 31281. Will be dealt with later by adding it in the YAML configuration. Should not be in the way for now. – Nikerabbit 16:08, 30 September 2011 (UTC)

FFS.php

 * ✅ BAD: The file format used in SimpleFFS doesn't seem to account for the fact that  is a valid character in message keys, and it seems that if I were to create MediaWiki:Foo=bar and put 'baz' in it (which works, I tried on my localhost), the writer would write something like   and the reader would read that as  . Of course, this is ambiguous: if I create MediaWiki:Foo and put 'bar=baz' in it, the writer would also write  . It looks like JavaFFS has the same problem.
 * Filed as 31300 - not going to be used in WMF and hasn't caused problems yet.
 * ✅ Line 457 (parse authors list): why are you skipping the first 6 characters? You should add a comment saying what the number 6 is for, specifically
 * Line 508-513 (concatenate separated strings): this doesn't account for single quotes
 * Why don't you just use real JSON here instead of fragile hacks?
 * I have to ask robert if there is some reason to do it this way. – Nikerabbit 19:07, 1 October 2011 (UTC)
 * Also, not in a WMF code path.
 * Didn't review YamlFFS, RubyYamlFFS other than verifying that they don't use the SPYC function that Tim showed to have security problems relating to filesystem access
 * ✅ BAD: Line 1074: Use wfShellExec so memory limits and such are set, and escape $filename. Looks like it doesn't need shellarg escaping, but does seem to need single and double quote escaping. Right now this looks vulnerable to shell command injection AND Python code injection
 * Needed shellarg escaping too.
 * ✅ Line 1075: don't use json_decode, use FormatJson::decode
 * Didn't review the rest of PythonSingleFFS too much, didn't look scary

✅ FirstSteps.i18n.php
i18n, not scary

✅ Groups.php
Skimmed this, doesn't look scary

✅ MessageChecks.php

 * BAD: Line 397: error code and message from libxml are put into HTML unescaped
 * It does go through the wikitext parser.
 * Skimmed over this file, didn't see anything else scary

✅ MessageCollection.php
DB queries look good. Skimmed over the rest, saw nothing scary.

MessageGroups.php

 * Line 313: I feel this should check the validity of $code before returning a filename that will probably be used
 * ✅ Line 736: It's, not
 * Fixed in 98592.
 * Looks good, thanks.
 * ✅ BAD: Line 752: You can't ORDER BY trs_order if that field isn't in any index, that'll almost certainly filesort. To support this query, there needs to be an index on (trs_page, trs_order). You could repurpose the existing and redudant (because it's an initial substring of another index, namely the primary key) for this.
 * ✅ Line 920: I think this GROUP BY page_id will cause a filesort, but I don't have the means to check that. At any rate, it seems unnecessary because I don't see how this query could possibly return the same page twice. Could you run  on TWN with and without the GROUP BY and see what you get?
 * Line 944: Whoa, there's autoload listings in the YAML file? What's going on here?
 * We agreed on IRC that this is a bit evil but okay. – Nikerabbit 22:32, 1 October 2011 (UTC)

Message.php
This all looks very simple and harmless.

PageTranslation.i18n.php
i18n, harmless

RcFilter.php
TODO