Visual diffs/2008 project

User:Guyvdb

=Abstract=
The goal of the project is to extend and improve the current diff page so that it renders the changes instead of showing the source diff. The resulting "parsed inline diff" would be very easy to use. The algorithm that will achieve this is based on the HTML diffing algorithm used in DaisyCMS.

=Deliverables=
 * A word-based inline source diff
   * Implement the LCS algorithm - 7/8
   * Tokenize the wiki text and diff it - 7/10
   * Create a nice layout and integrate with MediaWiki - 7/15

 * A visual diff implementation in PHP
   * A parser that creates syntax trees suitable for diffing.
     * Reuse the preprocessor tree?
   * Translate the Daisy Diff algorithm to PHP - 7/22
   * Optimize the algorithm to support custom diff granularity and more generic node types. - 7/27
   * Integrate with MediaWiki - 8/5

 * An implementation optimized for speed and memory use, written (partly) in C++ and cached - 8/15

 * A visualization and integration suitable for MediaWiki and its users - 8/31

 * Documentation

 * Frequent progress updates

I have most of September free as a buffer in case I don't make the deadlines.

=Code=
The visual_diff branch contains the development code.

=Notes=

==The LCS algorithm==
The current implementation is an amalgam of different algorithms, heuristics and optimizations. It is difficult to understand and probably suboptimal. Splitting the input into chunks does not seem to guarantee a maximal common subsequence. I will implement the LCS algorithm as described by Myers. By incorporating features introduced in the Eclipse implementation, I can set an upper bound on the execution time, which DaisyDiff needs.
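
As a rough illustration of the idea, here is a minimal sketch of Myers' greedy forward search with a cap on the number of edits. It is written in Python rather than PHP purely for brevity. The names `lcs_length` and `max_edits` are hypothetical, and the cutoff shown (give up once the edit count exceeds a fixed limit) is only an assumption about how a bound could work; it does not reproduce what the Eclipse implementation actually does.

```python
def lcs_length(a, b, max_edits=None):
    """Length of the longest common subsequence of sequences a and b.

    Uses Myers' greedy search over edit counts d = 0, 1, 2, ...
    Returns None if more than max_edits insertions/deletions would be
    needed (hypothetical cutoff that bounds the running time).
    """
    n, m = len(a), len(b)
    limit = n + m if max_edits is None else min(max_edits, n + m)

    # furthest[k] is the largest x reached so far on diagonal k = x - y.
    furthest = {1: 0}

    for d in range(limit + 1):
        for k in range(-d, d + 1, 2):
            # Choose whether to arrive from diagonal k + 1 (consume an
            # element of b) or from diagonal k - 1 (consume an element of a).
            if k == -d or (k != d and furthest[k - 1] < furthest[k + 1]):
                x = furthest[k + 1]
            else:
                x = furthest[k - 1] + 1
            y = x - k

            # Follow the "snake": matching elements cost no edits.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            furthest[k] = x

            if x >= n and y >= m:
                # d edits were needed, so n + m - d elements were matched,
                # counted once in a and once in b.
                return (n + m - d) // 2

    return None  # edit budget exhausted before both ends were reached


old_words = "the quick brown fox".split()
new_words = "the slow brown fox".split()
print(lcs_length(old_words, new_words))               # 3
print(lcs_length(old_words, new_words, max_edits=1))  # None
```

The inputs can be any sequences of comparable tokens, so the same routine would work on characters, words or lines. When the budget is exhausted, a caller would have to fall back to something cheaper, for example treating the whole range as changed.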