Extension:AbuseFilter/Conditions

Essay: The condition limiter is a somewhat ad hoc tool for preventing performance problems. To the extent that you want to worry about performance, execution times are generally a better measure to think about. Also, the per-filter reporting of condition numbers is unreliable and should not be considered accurate in any way, so don't rely on those numbers when identifying problems. (Unfortunately, the per-filter time numbers are also somewhat broken.)

The condition limit is (more or less) tracking the number of boolean operands + the number of function calls + the number of function parameters + the number of parenthetical conditions entered. However, it is also smart enough to bypass functions and parenthetical groups if their value doesn't matter. For example, in the expression A & B, the details of B are only evaluated if A is true. For that reason, it is beneficial to performance to put simple limiting conditions, e.g. checks for article namespace, in front of more complex expressions. Also, parentheses are usually your friend even though entering them can count against you. Lastly, I should note that function calls are cached, so they only add to the condition count the first time a specific function result is asked for.
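As a rough illustration of the two ideas above (bypassing the right-hand side of A & B, and caching function results), here is a minimal Python sketch. It is an analogy only, not AbuseFilter code: Python's `and` stands in for `&`, `lru_cache` stands in for the function cache, and `cheap_check`, `expensive_check`, and the sample edits are hypothetical names and data.

```python
from functools import lru_cache

# Counters showing how often each check actually runs.
calls = {"cheap": 0, "expensive": 0}

def cheap_check(namespace: int) -> bool:
    """Simple limiting condition, e.g. a namespace test."""
    calls["cheap"] += 1
    return namespace == 6

@lru_cache(maxsize=None)
def expensive_check(text: str) -> bool:
    """Stand-in for a costly function call; repeated inputs hit the cache."""
    calls["expensive"] += 1
    return text.count("{{") > 2

def filter_matches(namespace: int, text: str) -> bool:
    # `and` short-circuits: expensive_check is skipped when cheap_check fails.
    return cheap_check(namespace) and expensive_check(text)

# Hypothetical edits as (namespace, text) pairs.
edits = [
    (0, "{{a}}{{b}}{{c}}"),
    (0, "plain"),
    (6, "{{a}}{{b}}{{c}}"),
    (6, "{{a}}{{b}}{{c}}"),
]
results = [filter_matches(ns, text) for ns, text in edits]
print(results)  # [False, False, True, True]
print(calls)    # {'cheap': 4, 'expensive': 1}
```

The cheap check runs for every edit, while the expensive one runs only once: it is skipped for the two namespace-0 edits and served from the cache for the repeated namespace-6 edit.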

General counting
From Extension:AbuseFilter/Rules format:

Example 1
For a practical example, consider filter 59:

article_namespace == 6 & !("autoconfirmed" in user_groups) & !(user_name in article_recent_contributors) & rcount ("\{\{.*\}\}", removed_lines) > rcount ("\{\{.*\}\}", added_lines)

This can be simplified as:


 * A & !(B) & !(C) & rcount( D, E ) > rcount( F, G )

Let's consider the branching chart:


 * A is true: new boolean operand, +1 condition
   * B is true: new boolean operand, and enter paren, +2 conditions
     * A & !(B) is false, enter bypass mode
     * C is true / false: new boolean operand, skip paren, +1 condition
     * rcount expressions: new boolean operand, skip functions, +1 condition
     * Total: 5 conditions
   * B is false: new boolean operand, and enter paren, +2 conditions
     * C is true: new boolean operand, and enter paren, +2 conditions
       * A & !(B) & !(C) is false, enter bypass mode
       * rcount expressions: new boolean operand, skip functions, +1 condition
       * Total: 6 conditions
     * C is false: new boolean operand, and enter paren, +2 conditions
       * rcount expressions: new boolean operand, evaluate D, E, F, and G, and evaluate rcount( D, E ) and rcount( F, G ), +7 conditions
       * Total: 12 conditions
 * A is false: new boolean operand, +1 condition
   * A is false, enter bypass mode
   * B is true / false: new boolean operand, skip paren, +1 condition
   * C is true / false: new boolean operand, skip paren, +1 condition
   * rcount expressions: new boolean operand, skip functions, +1 condition
   * Total: 4 conditions

So, that filter runs from 4 conditions if the first operation is false to 12 conditions if every operation must be evaluated.
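The tallies in the chart above can be reproduced with a small Python sketch. This is a toy model of the chart, not the AbuseFilter parser: it assumes each term costs its charted amount when evaluated and 1 condition once bypass mode has started. The names `STAGES`, `BYPASS_COST`, and `count_conditions` are hypothetical.

```python
# Toy model of the Example 1 chart; not the AbuseFilter parser itself.
# Each stage is (term, conditions charged when the term is evaluated).
STAGES = [("A", 1), ("!(B)", 2), ("!(C)", 2), ("rcount > rcount", 7)]
BYPASS_COST = 1  # a bypassed term still counts as one boolean operand

def count_conditions(outcomes):
    """Sum conditions for the flat chain A & !(B) & !(C) & rcount > rcount.

    outcomes[i] is the truth value of term i (for example, False for
    !(B) when B is true). After the first false term, every later term
    is bypassed and charged BYPASS_COST.
    """
    total = 0
    bypass = False
    for (_term, cost), passed in zip(STAGES, outcomes):
        if bypass:
            total += BYPASS_COST
        else:
            total += cost
            if not passed:
                bypass = True
    return total

# One entry per leaf of the branching chart.
paths = {
    "A false": [False, True, True, True],
    "A true, B true": [True, False, True, True],
    "A true, B false, C true": [True, True, False, True],
    "A true, B false, C false": [True, True, True, True],
}
totals = {label: count_conditions(o) for label, o in paths.items()}
for label, total in totals.items():
    print(f"{label}: {total} conditions")
```

Running it gives 4, 5, 6, and 12 conditions for the four paths, matching the chart totals.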

Example 2
Now consider an alternative construction that adds explicit parentheses for groups and removes the excess parentheses around the "in" operations:

article_namespace == 6 & (  ! "autoconfirmed" in user_groups &   ( ! user_name in article_recent_contributors & rcount ("\{\{.*\}\}", removed_lines) > rcount ("\{\{.*\}\}", added_lines) ) )

This can be simplified as:


 * A & ( ! B & ( ! C & rcount( D, E ) > rcount( F, G ) ) )

Let's consider the branching chart:


 * A is true: new boolean operand, +1 condition
   * B is true: new boolean operand, and enter paren, +2 conditions
     * A & ! B is false, enter bypass mode
     * C is true / false: new boolean operand, skip paren, +1 condition
     * Total: 4 conditions
   * B is false: new boolean operand, and enter paren, +2 conditions
     * C is true: new boolean operand, and enter paren, +2 conditions
       * A & ! B & ! C is false, enter bypass mode
       * rcount expressions: new boolean operand, skip functions, +1 condition
       * Total: 6 conditions
     * C is false: new boolean operand, and enter paren, +2 conditions
       * rcount expressions: new boolean operand, evaluate D, E, F, and G, and evaluate rcount( D, E ) and rcount( F, G ), +7 conditions
       * Total: 12 conditions
 * A is false: new boolean operand, +1 condition
   * A is false, enter bypass mode
   * B is true / false: new boolean operand, skip paren, +1 condition
   * Total: 2 conditions

So, that filter runs from 2 conditions if the first operation is false to 12 conditions if every operation must be evaluated. If the initial condition is rarely true, as article_namespace == 6 probably is, then the modified filter will consume only 2 conditions in most runs, compared to 4 conditions in the example without explicit parentheses. Stacking easy-to-evaluate but hard-to-match conditions at the front of a filter will generally improve run times and reduce condition usage. In most cases, explicit parentheses also help the edit filter parser determine branching more efficiently, which reduces both condition counts and run times.
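To put a rough number on "most runs", the path totals from the two charts can be weighted by how often each path is taken. The Python sketch below does that under loudly stated assumptions: the probabilities are invented for illustration (they are not measured from any wiki), A, B, and C are treated as independent, and `expected_conditions` is a hypothetical helper.

```python
# Path totals copied from the two branching charts.
FLAT = {"a_false": 4, "b_true": 5, "c_true": 6, "all": 12}    # Example 1
NESTED = {"a_false": 2, "b_true": 4, "c_true": 6, "all": 12}  # Example 2

def expected_conditions(totals, p_a, p_b, p_c):
    """Average conditions per run, assuming A, B, and C are independent.

    p_a, p_b, and p_c are the (assumed) chances that A, B, and C are true.
    """
    return (
        (1 - p_a) * totals["a_false"]                 # A false
        + p_a * p_b * totals["b_true"]                # A true, B true
        + p_a * (1 - p_b) * p_c * totals["c_true"]    # A true, B false, C true
        + p_a * (1 - p_b) * (1 - p_c) * totals["all"] # everything evaluated
    )

# Made-up rates: 2% of edits in namespace 6, 80% by autoconfirmed
# users, 50% by recent contributors.
flat = expected_conditions(FLAT, p_a=0.02, p_b=0.8, p_c=0.5)
nested = expected_conditions(NESTED, p_a=0.02, p_b=0.8, p_c=0.5)
print(f"flat: {flat:.3f}, nested: {nested:.3f}")  # flat: 4.036, nested: 2.060
```

With these assumed rates the average stays close to the "A is false" total in both cases, which is why the cost of that first branch dominates overall condition usage.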