User:OrenBochman/Bugs

=Bug Smashing=

list of bugs

 * https://bugzilla.wikimedia.org/buglist.cgi?query_format=advanced&list_id=69059&component=Lucene%20Search&resolution=---&resolution=LATER&resolution=DUPLICATE&product=MediaWiki%20extensions

patches

 * https://bugzilla.wikimedia.org/buglist.cgi?keywords=patch%2C%20need-review%2C%20&query_format=advanced&keywords_type=allwords&list_id=69057&component=Lucene%20Search&component=Search&resolution=---&resolution=LATER&resolution=DUPLICATE&product=MediaWiki&product=MediaWiki%20extensions

Analysis
Here are some possible approaches to implementing this feature.


 * Option 1:
 ** storing the page's raw source in a   with unexpanded source, and
 ** querying with a   and  .
 ** it will double the index size, a WFTU per wiki.
 ** it will require a UI change.
 ** it will require its own ranking.
 * Option 2:
 ** indexing and storing the page's parsed source in a  
 ** and querying with a   to search the source.
 ** it would increase the index by a factor of a WFTU.
 ** it will require a UI change.
 ** it will require its own ranking.
 * Option 3:
 ** indexing the page's parsed source in a flat  
 ** querying using a  , which would provide markup search capability.
 ** it would increase the index by log(WFTU) (this is a guess).
 ** it will require a UI change.
 ** it will require its own ranking.
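All three options keep the page source in its own field next to the normal text. A toy sketch of that idea, assuming a hypothetical in-memory `TinyIndex` class and hypothetical field names `text` and `source` (none of this is the Lucene API):

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index with named fields (hypothetical, not Lucene)."""

    def __init__(self):
        # field name -> term -> set of page ids
        self.postings = defaultdict(lambda: defaultdict(set))

    def add(self, page_id, field, content):
        # Lowercase and split on word characters; a real analyzer does far more.
        for term in re.findall(r"\w+", content.lower()):
            self.postings[field][term].add(page_id)

    def search(self, field, term):
        # Only the requested field is consulted.
        return sorted(self.postings[field].get(term.lower(), set()))

    def size(self, field):
        # Number of (term, page) postings held for one field.
        return sum(len(ids) for ids in self.postings[field].values())


index = TinyIndex()
raw_source = "{{Infobox|name=Lucene}} Lucene is a search library."
rendered_text = "Lucene is a search library."
index.add(1, "source", raw_source)
index.add(1, "text", rendered_text)
```

Querying `source` for `infobox` finds page 1, while the same query on `text` finds nothing. The extra postings in the `source` field are the index growth the options above try to estimate.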

Option 1 will likely be inefficient. To index wiki code effectively, a (Java) parser for wiki code would be required. The requirement is a parser that can process and tag:
 * templates
 * template parameters
 * magic words
 * parser functions
 * extensions
 * comments
 * nowiki
 * includeonly
 * noinclude
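A minimal sketch of such tagging, written in Python for brevity rather than Java and built on regular expressions instead of a real parser. The tag names and the `tag_wikitext` helper are hypothetical; nested templates, extension tags and variables such as `{{PAGENAME}}` are not handled (the latter are tagged as plain templates).

```python
import re

# Order matters: comments and nowiki come first so their contents are not
# re-tagged, and the more specific brace patterns precede the generic template.
TOKEN_PATTERNS = [
    ("comment", r"<!--.*?-->"),
    ("nowiki", r"<nowiki>.*?</nowiki>"),
    ("includeonly", r"<includeonly>.*?</includeonly>"),
    ("noinclude", r"<noinclude>.*?</noinclude>"),
    ("template_parameter", r"\{\{\{[^{}]*\}\}\}"),
    ("parser_function", r"\{\{\s*#[^{}]*\}\}"),
    ("template", r"\{\{[^{}]*\}\}"),
    ("magic_word", r"__[A-Z]+__"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS),
    re.DOTALL,
)


def tag_wikitext(source):
    """Return (tag, text) pairs for each construct found, in document order."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(source)]


sample = (
    "<!-- note -->{{Infobox|a=1}} {{{title}}} "
    "{{#if: x | y}} __TOC__ <nowiki>{{raw}}</nowiki>"
)
tokens = tag_wikitext(sample)
```

Each tagged token could then be indexed under its tag, which is what would let a query distinguish a template call from the same word in running text.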


 * I have been doing some work on writing a preprocessor, but the work is far from over; it could be completed to do this task.

Ranking & User Interface

 * If the source search feature functions as a standalone application, its ranking will need just a little tweaking.
 * If it must be integrated with general search, it will require a more significant effort involving:
 ** specification.
 ** design.
 ** implementation.

Specifying this behaviour
Question: will users be able to search for   or only for implemented ones?
 * use case 1: regular users don't want to see templates in their search results.
 * use case 2: editors may be interested in searching - why?
 * use case 3: template developers and admins may be interested in searching - why?
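One way to pin down use case 1 is a per-role default namespace filter. A minimal sketch, assuming hypothetical role names and a hypothetical hit format (a dict with `title` and `namespace` keys); only the namespace numbers 0 (main) and 10 (Template) come from MediaWiki itself.

```python
NS_MAIN = 0       # MediaWiki main (article) namespace
NS_TEMPLATE = 10  # MediaWiki Template namespace

# Hypothetical roles mapped to the namespaces searched by default.
DEFAULT_NAMESPACES = {
    "reader": {NS_MAIN},
    "editor": {NS_MAIN, NS_TEMPLATE},
    "template_dev": {NS_TEMPLATE},
}


def filter_hits(hits, role):
    """Keep only the hits whose namespace is searched by default for this role."""
    allowed = DEFAULT_NAMESPACES[role]
    return [hit for hit in hits if hit["namespace"] in allowed]


hits = [
    {"title": "Lucene", "namespace": NS_MAIN},
    {"title": "Infobox", "namespace": NS_TEMPLATE},
]
reader_hits = filter_hits(hits, "reader")
```

The defaults for editors and template developers are placeholders until the "why" in use cases 2 and 3 is answered.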

multithreading
http://phplens.com/phpeverywhere/?q=node/view/254

missing pages
debugging page id of a missing main page

debugging page id of a missing category page
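A minimal sketch of the lookup both cases need, assuming the `page` table columns `page_id`, `page_namespace` and `page_title` from the MediaWiki schema. An in-memory SQLite table stands in for the real database, and `find_page_id` is a hypothetical helper. Titles are stored with underscores and without the namespace prefix; the main namespace is 0 and Category is 14.

```python
import sqlite3

NS_MAIN = 0
NS_CATEGORY = 14


def find_page_id(conn, namespace, title):
    """Return the page_id for a title in a namespace, or None if the page is missing."""
    row = conn.execute(
        "SELECT page_id FROM page WHERE page_namespace = ? AND page_title = ?",
        (namespace, title.replace(" ", "_")),
    ).fetchone()
    return row[0] if row else None


# Stand-in for the wiki database: only the columns used by the lookup.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page ("
    "page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT)"
)
conn.execute("INSERT INTO page VALUES (1, 0, 'Main_Page')")

main_page_id = find_page_id(conn, NS_MAIN, "Main Page")
missing_category_id = find_page_id(conn, NS_CATEGORY, "Search")
```

A `None` result means the row is absent from `page`. A page that has an id here but never shows up in search points at the indexing side instead.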

SQL schema
https://secure.wikimedia.org/wikipedia/mediawiki/wiki/File:MediaWiki_database_schema_1-17_%28r82044%29.png

=References=