User talk:Legoktm/GSoC 2013

There's one part of this project that maybe wasn't clear in the project description, so it isn't addressed in your plan. The format of the *fulls* would have to be changed too, and a script should be written to produce the old XML format from the new fulls. The reason is that we already get (mostly) only new information from the database servers, but reading the old XML files, uncompressing them, running integrity checks to make sure the content is probably good, and copying it into the new full takes several days for en wp, even with 27 processes running. The current format needs to die a horrible death and be replaced with something better. Can you incorporate this in your proposal? -- ArielGlenn (talk) 07:28, 24 April 2013 (UTC)