User:Catrope/Extension review/Translate

From MediaWiki.org
Jump to: navigation, search

Base revision: r98505

At least initially, this is mostly a quick security/performance review, so I'll mostly be focusing on sensitive areas and may skip or just glance over things that seem less important. If that's the case, it'll be noted. Of course I may still note things that are less important just because I happen to notice them.

YesY Done _autoload.php[edit]

  • The MediaWiki core classes (line 53-54) should just be in phase3/includes/AutoLoader.php . They'll do no harm there, and an extension autoloading core files is very weird
    Added to core in rev:98528, kept in _autoload for BC. – Nikerabbit 15:49, 30 September 2011 (UTC)

YesY Done check-blacklist.php[edit]

This is just data, skipping. Looks like some of this is TWN-specific, though.

Filed as bugzilla:31281. Will be dealt with later by adding it in the YAML configuration. Should not be in the way for now. – Nikerabbit 16:08, 30 September 2011 (UTC)

FFS.php[edit]

  • YesY Postponed BAD: The file format used in SimpleFFS doesn't seem to account for the fact that = is a valid character in message keys, and it seems that if I were to create MediaWiki:Foo=bar and put 'baz' in it (which works, I tried on my localhost), the writer would write something like foo=bar=baz and the reader would read that as array( 'foo', 'baz=bar' ). Of course, this is ambiguous: if I create MediaWiki:Foo and put 'bar=baz' in it, the writer would also write foo=bar=baz. It looks like JavaFFS has the same problem.
    Filed as bugzilla:31300 - not going to be used in WMF and hasn't caused problems yet.
  • YesY R98620 Line 457 (parse authors list): why are you skipping the first 6 characters? You should add a comment saying what the number 6 is for, specifically
  • Line 508-513 (concatenate separated strings): this doesn't account for single quotes
  • Why don't you just use real JSON here instead of fragile hacks?
    I have to ask robert if there is some reason to do it this way. – Nikerabbit 19:07, 1 October 2011 (UTC)
    Also, not in a WMF code path.
  • Didn't review YamlFFS, RubyYamlFFS other than verifying that they don't use the SPYC function that Tim showed to have security problems relating to filesystem access
  • YesY R98623 BAD: Line 1074: Use wfShellExec() so memory limits and such are set, and escape $filename. Looks like it doesn't need shellarg escaping, but does seem to need single and double quote escaping. Right now this looks vulnerable to shell command injection AND Python code injection
    Needed shellarg escaping too.
  • YesY R98625 Line 1075: don't use json_decode(), use FormatJson::decode()
  • Didn't review the rest of PythonSingleFFS too much, didn't look scary

YesY Done FirstSteps.i18n.php[edit]

i18n, not scary

YesY Done Groups.php[edit]

Skimmed this, doesn't look scary

YesY Done MessageChecks.php[edit]

  • BAD: Line 397: error code and message from libxml are put into HTML unescaped
    It does go through the wikitext parser.
  • Skimmed over this file, didn't see anything else scary

YesY Done MessageCollection.php[edit]

DB queries look good. Skimmed over the rest, saw nothing scary.

MessageGroups.php[edit]

  • Line 313: I feel this should check the validity of $code before returning a filename that will probably be used
  • YesY Done Line 736: It's $wgLanguageCode, not $wgLanguagecode
    Fixed in rev:98592.
    Looks good, thanks.
  • YesY R98629 BAD: Line 752: You can't ORDER BY trs_order if that field isn't in any index, that'll almost certainly filesort. To support this query, there needs to be an index on (trs_page, trs_order). You could repurpose the existing and redudant (because it's an initial substring of another index, namely the primary key) for this.
  • YesY R98626 Line 920: I think this GROUP BY page_id will cause a filesort, but I don't have the means to check that. At any rate, it seems unnecessary because I don't see how this query could possibly return the same page twice. Could you run EXPLAIN SELECT * FROM page, revtag WHERE page_id=rt_page AND rt_type='tp:mark' GROUP BY page_id; on TWN with and without the GROUP BY and see what you get?
  • Line 944: Whoa, there's autoload listings in the YAML file? What's going on here?
    We agreed on IRC that this is a bit evil but okay. – Nikerabbit 22:32, 1 October 2011 (UTC)

Message.php[edit]

This all looks very simple and harmless.

PageTranslation.i18n.php[edit]

i18n, harmless

RcFilter.php[edit]

TODO