Extension:UniversalLanguageSelector/Fonts for Chinese wikis

UniversalLanguageSelector Fonts for Chinese wikis

 * Bugzilla report: Bug 31791
 * Announcement: Proposal announcement at the wikitech-l mailing list.

Name and contact information

 * Name: Aaron Xiao
 * Email: xiaoxiangquan@gmail.com
 * IRC or IM networks/handle(s): aaron_xiao
 * Location: Beijing, China
 * Typical working hours: (UTC+8:00) waking hours are 9:00 to 23:00; typical working hours are 10:00 to 18:00.

Synopsis
Chinese uses more than 80,000 characters, of which 70,217 are included in Unicode 5.0. However, only about 3,500 of them are used in daily life. Fonts covering the rarely used characters are seldom installed on readers' systems. Even Chinese users rely heavily on GBK fonts, which contain only about 20,000 characters. Readers are therefore bound to run into tofu (missing-glyph boxes), which is when the webfonts service is triggered.

However, including every character in the font file makes it huge. Instead, we can tailor the font file for each page based on the characters actually used on that page. Once finished, this feature can be applied to other languages facing the same problem, such as Japanese.
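The per-page tailoring idea can be sketched as follows. This is only an illustration under stated assumptions, not the proposed implementation: the function names `page_codepoints` and `to_unicode_ranges` are hypothetical, and the sketch assumes the page text is already available as a plain string. It collects the distinct non-ASCII characters on a page and collapses them into compact code point ranges that could then be handed to a font subsetting tool.

```python
def page_codepoints(text):
    """Return the sorted, distinct non-ASCII code points used in the page text.

    ASCII is skipped on the assumption that system fonts already cover it.
    """
    return sorted({ord(ch) for ch in text if ord(ch) > 0x7F and not ch.isspace()})


def to_unicode_ranges(codepoints):
    """Collapse sorted code points into 'U+XXXX' or 'U+XXXX-YYYY' strings."""
    ranges = []
    start = prev = None
    for cp in codepoints:
        if start is None:
            start = prev = cp          # open the first run
        elif cp == prev + 1:
            prev = cp                  # extend the current run
        else:
            ranges.append((start, prev))
            start = prev = cp          # close the run and open a new one
    if start is not None:
        ranges.append((start, prev))
    return [
        f"U+{a:04X}" if a == b else f"U+{a:04X}-{b:04X}"
        for a, b in ranges
    ]


if __name__ == "__main__":
    sample = "维基 wiki 维"  # hypothetical page content
    print(to_unicode_ranges(page_codepoints(sample)))
```

Because the set of characters depends only on the page text, the tailored font could in principle be cached per page revision; that caching strategy is an assumption and is not part of the proposal itself.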

As of writing, there is no free font that is good enough and also includes all Chinese characters in Unicode. Since the "wiki" concept itself encourages collaborative content creation, it would be nice to invite users to create a glyph whenever the system encounters a character for which it has no data.


 * Possible mentors: User:Liangent User:DChan_(WMF)

Deliverables
Tailor the font file according to the characters used on each page. During the SoC event, I may only be able to finish the Chinese tailor as an experiment. If it works well, we can extend it to other languages or even turn it into a universal feature.
 * Chinese Font Tailor

When tofu occurs, encourage the user to contribute the missing glyph.
 * Glyph Collector
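As a rough sketch of how the Glyph Collector might decide which characters to ask users about, the snippet below compares a page's characters with the set of code points a font covers. The function name `missing_characters` and the `covered` set are hypothetical; how the coverage set is obtained from a real font is left open here.

```python
def missing_characters(text, covered):
    """Return characters in `text` that the font does not cover.

    `covered` is a set of integer code points the font has glyphs for
    (assumed to be extracted from the font by some other means).
    Characters are returned once each, in order of first appearance.
    """
    seen = set()
    missing = []
    for ch in text:
        if ch.isspace() or ch in seen:
            continue
        seen.add(ch)
        if ord(ch) not in covered:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    covered = {ord("维"), ord("基")}       # hypothetical font coverage
    page = "维基\U00020000维\U00020000"     # U+20000 is not covered
    for ch in missing_characters(page, covered):
        print(f"No glyph for U+{ord(ch):04X}; invite a contribution")
```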

Design docs, development docs, and user docs: anything we think is useful.
 * Documents


 * Long Term Support [after the event]

First, I will keep pushing the work forward until it is released, just as I have done before for other open source projects.

I love i18n projects, so I would like to stay with this topic. For example, I would be more than pleased to be a GSoC mentor in the future :)

Knowledge Preparation
Frankly speaking, I have never touched the technical side of MediaWiki before, so it will take some time to get my hands dirty. I have started reading the docs about ULS and the HOWTOs for developers. All of the following will be done before the coding start date.
 * Get to the details of the implementation
 * Keep on discussing with mentors and the community
 * Finally give more accurate approaches and schedule

Building the Workflow
I expect to graduate around July 1 and have to spend some time on graduation affairs, so less work will be done during the first half of the event. In that period I can build the development environment and write more documentation, covering both design and implementation details. It is also a good time for us to plan further. In short, I will build the workflow so that I can focus on the features afterwards.
 * Set up a MediaWiki development environment
 * Find suitable fonts to use (we deal only with Chinese at first), such as:
   * Free Chinese Fonts suggested by Ubuntu CN
   * WenQuanYi for Chinese
   * Hanazono for Japanese
   * efont for Japanese
   * unfonts for Korean
 * Find a suitable glyph editor with which users can contribute glyphs
 * Documentation on design and implementation
 * Do some experimental coding, of course

Chinese Font Tailor
Finish the tailor and integrate it into ULS, so that people can try the feature described above. As I can work full time from July onwards, this will be done before August.

Glyph Collector
I think this is a simple feature without many technical barriers, so only one or two weeks are assigned to it. I can finish its main features before the pencils-down date.

Follow-up (Maybe after the event)
More testing, a beta release, and fixes for reported bugs. Make it stable and more efficient, then push it forward until it is merged into trunk.

About you
M.S. in Computer Science, Peking University, Beijing, China
 * Education completed or in progress:

I searched the organizations list with the keyword "i18n", as it is one of my working fields. I am always trying to introduce great open source projects to Chinese users, as well as to other non-English users.
 * How did you hear about this program?

Through the end of June, I must spend some time on my thesis and graduation affairs. After that I can work full time in July, August and September. (I know the event schedule. I'll finish the main features on time and then keep working to push the project towards release, even after the GSoC event.)
 * Will you have any other time commitments, such as school work, another job, planned vacation, etc., during the duration of the program?

As a male applicant, I am applying only to GSoC.
 * We advise all candidates eligible to Google Summer of Code and FOSS Outreach Program for Women to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?

Past experience

 * Please describe your experience with any other FOSS projects as a user and as a contributor:


 * Please describe any relevant projects that you have worked on previously and what knowledge you gained from working on them (include links):

I'm interested in i18n, game development and mobile app projects, but this year I have applied only for this project. I'd like to work on other i18n-related projects for MediaWiki if the feature in this proposal has low priority.
 * What project(s) are you interested in (these can be in the same or different organizations)?