Thread:Extension talk:Lucene-search/Index step slows to an unusable speed on full Wikipedia dump

Hi,

I am trying to run the 'build' step from the instructions on a full dump of the English Wikipedia site.

It runs at a reasonable rate until it reaches what appears to be a spell-correction step. That step starts at ~50,000 terms/second but slows down; I killed it after about a week, by which point it was running at ~600 terms/second and was only about halfway through, at "mo...".

Are there configuration settings I should change to run the build step against such a large corpus?

Thanks, Barry