User:Language portal/Language coverage matrix/GSoC 2013
This is a proposal for a GSoC project.
Name: Harsh Kothari
Project title: Language Coverage Matrix dashboard
Timezone: UTC+5:30 (IST - India)
Typical working hours: 12:00 PM to 6:00 PM (IST) and 10:00 PM to 4:00 AM (IST)
IRC or IM networks/handle(s): harshkothari (freenode)
I live in Ahmedabad, a metropolitan city with a 24/7 power supply and a reliable, uninterrupted Internet connection, so nothing will hamper my working online.
This web-based dashboard will also help other products and communities by presenting numerous search results and visualization graphs for the same data.
Document of Matrix Data
Bug on Bugzilla
Thread on Mailing List
- A Python script - to save all current data into the MySQL database. Currently all the data is in a spreadsheet, so I will create a Python script that saves the spreadsheet data into the database.
- A form (HTML + jQuery) - to enter new data into the database manually. Any new entry can be added easily through this form. Only an admin can do this.
- The form will contain a search field as well as a button.
- To update an existing language, the admin searches for it in the search field; after a language is selected, a form is generated automatically with its existing entries filled in.
- To add a new entry, the admin clicks the button named “Add new Language”, and a form is generated automatically with all possible fields.
- Clicking the save button saves the entered data into the database and also updates the appropriate pages.
- This uses jQuery + HTML + Ajax.
- PHP - for all the integration work. Since this is a web-based application, all integration will be done in PHP: session management, scripting for Ajax calls and responses, and the database connection.
- Dashboard - the dashboard will be filled according to the query, i.e. the filtering or searching process. It will be dynamic, smooth and very responsive, and it will show the results in an attractive UI that is easy to understand. This will use jQuery + Ajax + PHP + MySQL.
- UI design - a dashboard with a professional look and feel and all the facilities mentioned. This will use CSS.
- Optimized search facility with autocomplete. Users can search for a region, language, language code, etc.; this will be implemented with jQuery and Ajax. I will also try to implement one more search method, “language processing search”, which lets users phrase a query in plain words and get the data they need very quickly. Use cases are listed below.
- Searching “How many languages support grammar rules” will give the answer, e.g. 44.
- “what is jquery.ime” -> gives the definition of jquery.ime
- “languages that support jquery.ime” -> gives the list of languages that support jquery.ime, and the dashboard is filled with the appropriate data.
- This will be implemented with jQuery + PHP + Ajax.
- Creation of a bot - for automatic updates. The bot will run on a fixed schedule. If there are changes in the spreadsheet that are not yet in the database, it will update the database. It will also integrate directly with Names.php, langdb.yaml in jquery.uls, the extra languages supported in translatewiki.net and Incubator, etc. This is a very useful feature, since it keeps all the details up to date.
- MySQL database - to save all the data. Tables in the database:
- Table 1: Details of all languages (will also include details of Incubator languages by adding one more column)
- Table 2: IME gap details
- Table 3: Admin details
- I will create / update tables as required.
- Create APIs for other websites that need the data this project holds, so that they can fetch it for their own projects, research, analysis and so on.
- Homepage -> All languages or regions.
- [ Clicking on ] Region -> All languages of that region
- [ Clicking on ] Language -> All the details of that particular language (example - key maps, web fonts, translation, language selector, i18n support for gender, plurals, grammar rules) + Visualization Graph of the language.
- Dropdown menu for Search / Checkbox for Filtering the Search
- Search field with autocomplete / suggestion facility.
- [ Select from ] Dropdown / [ Select from ] Checkbox -> Dashboard will be filled automatically as per the query. (For example, if the user wishes to see the list of languages that have grammar rules, a single click on ‘grammar rules’ will show all the languages that have grammar rules.)
- Error handling use case: [ Select from ] Dropdown / [ Select from ] Checkbox -> If no result is found for the query, the dashboard will show "No Result Found" and keep showing the data that was displayed before the select/filter step.
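The filtering and error-handling use cases above can be sketched roughly as follows. This is only a minimal Python illustration of the intended behaviour (the proposal itself plans jQuery + Ajax + PHP + MySQL); the sample data and the names `LANGUAGES`, `filter_languages` and `apply_filter` are hypothetical and not part of the project.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical sample data: language code -> set of supported features.
# Real data would come from the database, not from a literal like this.
LANGUAGES: Dict[str, Set[str]] = {
    "gu": {"grammar rules", "web fonts", "key maps"},
    "he": {"grammar rules", "plurals"},
    "en": {"plurals"},
}

NO_RESULT = "No Result Found"


def filter_languages(languages: Dict[str, Set[str]],
                     selected: Iterable[str]) -> List[str]:
    """Return the sorted codes of languages that support every selected feature."""
    wanted = set(selected)
    return sorted(code for code, features in languages.items()
                  if wanted <= features)


def apply_filter(languages: Dict[str, Set[str]],
                 selected: Iterable[str],
                 previous: List[str]) -> Tuple[List[str], str]:
    """Return (rows to display, message).

    When nothing matches, keep the rows that were shown before the filter
    was applied and report "No Result Found", as in the error-handling case.
    """
    matches = filter_languages(languages, selected)
    if not matches:
        return previous, NO_RESULT
    return matches, ""
```

In this sketch, ticking the ‘grammar rules’ checkbox corresponds to `apply_filter(LANGUAGES, ["grammar rules"], shown)`, where `shown` is whatever list the dashboard was displaying before.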
Some features that would be directly useful to MediaWiki developers and Wikimedia site maintainers
- Direct integration with existing lists of languages: Names.php, langdb.yaml in jquery.uls, extra languages supported in translatewiki.net and incubator, etc.
- Integration with a matrix of existing or planned Wikimedia projects, so it would be clear from the matrix - is there a project in this language? Are the language tools extensions installed in this project? Is there an incubator project in this language?
- Understanding variants: does this language support variants in any way?
If time permits
- I would add integration with other knowledge bases about languages, such as Ethnologue, CLDR and others, that would provide information such as number of speakers, literacy levels, language contact, etc. This way it would be possible to see, in a way that is slightly more structured than what we have now, how well our projects are covering the different languages of the world.
- I will create a MediaWiki extension for matrix support + visualization graph + filtering and searching facility.
- I will create a browser support matrix for TUX and other products, e.g. https://bugzilla.wikimedia.org/show_bug.cgi?id=45602
|May 27 - June 16||Planning and exploring new ideas and technologies for the basic implementation; creating the database and a Python script to store the current data in the database|
|June 17 - June 30||Basic dashboard integration with PHP, database connection and a simple design with a login system and manual entry facility.|
|July 29 - August 10||Better UI and design work|
|August 24 - August 31||Support for all browsers and smooth searching and filtering. Improvements to the visualization graph, the UI and all other functionality, and integration of all the above-mentioned features.|
|September 1 - September 7||Create a bot to automate the task and update the database on a schedule.|
|September 8 - September 15||Create APIs for the LCM data so that it can be used outside the scope of this web tool.|
|September 16 - September 23||Final touches and documentation|
I am Harsh Kothari, a final-year engineering student at L.D. College of Engineering. I am from the Gujarati Wikipedia community and have been a MediaWiki contributor for almost 8 months now. I have developed the MediaWiki extension TwitterCards. I am a promoter of the first MediaWiki group of India. I have localized and ported different gadgets to Gujarati Wikipedia as well as other Indic wikis, e.g. HotCat, Reference Tooltip and PopUps.
The proposed project is the Language Coverage Matrix dashboard, which is an internationalization project. It will develop a web-based dashboard that includes all the details of the languages supported by Wikimedia. This project will be of great help to the Language Engineering team of Wikimedia: the tool will help them analyze the details of various features for each individual language.
In my opinion, IRC is the best means of communication, so I am available on IRC all the time on channels such as #mediawiki, #mediawiki-i18n, #wikimedia-dev and #wikimedia-lab. I am an active participant in the discussions on different mailing lists such as wikitech-l, mediawiki-india and mediawiki-i18n. I would appreciate it if all discussions related to my project were carried out on the above-mentioned mailing lists and the wiki page.
I own a blog where I will post all progress updates on my proposed project. I will also push all my ongoing work on this project to Github. Since my project aims at implementing new features, I will also take regular feedback from the community on my interface designs through testing and prototypes, and also through the mailing list.
Past Open Source Experience
I am involved in many open source activities in Ahmedabad. I am an active member of Google Developer Group Ahmedabad. I have created the MediaWiki extension TwitterCards. I am also a minor contributor to the MediaWiki extension EtherEditor, jquery.uls and jquery.ime. All my code is open source and is uploaded to Github. I have also worked on a library to get metadata from parsed raw description text. I have been invited as a delegate to share my knowledge of open source at various open source events. I was a speaker at the MediaWiki Gadget Kitchen workshops held at GNUnify and Avenir.
I really want to thank my mentors Runa Bhattacharjee and Alolita Sharma for guiding me throughout, and very special thanks to Amir Aharoni for his valuable input on this proposal. Last but not least, special thanks to Sumana and Quim for polishing my proposal as well as for their valuable feedback.
LCM-dashboard Repo on Github
Any other info
- Auto complete search example - http://thegtuguide.com/harshkothari/autocomplete
- Project Experience -
I am working on the Language Coverage Matrix Dashboard for the Language Engineering team of the Wikimedia Foundation.
- In the community bonding period I first planned which new technologies I could use in my project.
- After that I created the database schema.
- Currently all the data is in a spreadsheet, so I created a Python script that transfers all the data from the spreadsheet to the database automatically.
- The week after, I started working on the new language entry system and the database connection, and completed them.
- I recently completed the language search system with suggestions.
- When a user searches for a language and selects it, all the data for that particular language is shown.
- I am currently working on letting the admin change the details of a particular language on the same page, if he/she wants.
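The spreadsheet-to-database transfer mentioned above could look roughly like the sketch below. It is not the project's actual script: it assumes the spreadsheet has been exported as CSV, uses Python's built-in `sqlite3` as a stand-in for MySQL so that the example is self-contained, and the table and column names (`languages`, `code`, `name`, `region`) are hypothetical.

```python
import csv
import io
import sqlite3


def import_matrix(csv_text: str, conn: sqlite3.Connection) -> int:
    """Load CSV rows into a 'languages' table and return the number of rows stored.

    Assumptions (not from the proposal): the CSV has 'code', 'name' and
    'region' columns, and SQLite stands in for the MySQL database.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS languages ("
        "code TEXT PRIMARY KEY, name TEXT NOT NULL, region TEXT)"
    )

    def cell(row: dict, key: str) -> str:
        # Missing or empty cells are normalised to an empty string.
        return (row.get(key) or "").strip()

    rows = [
        (cell(r, "code"), cell(r, "name"), cell(r, "region"))
        for r in csv.DictReader(io.StringIO(csv_text))
        if cell(r, "code")  # skip rows without a language code
    ]
    # 'with conn' wraps the inserts in one transaction and commits on success.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO languages (code, name, region) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)
```

Because the language code is the primary key and rows are written with `INSERT OR REPLACE`, running the import again updates existing languages instead of duplicating them, which is also the behaviour a scheduled update bot would need.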
Link to my progress report: Project Updates
- Created and set up the basics on Wikimedia Labs
- Made minor changes to the language search system
- Created an in-place editing facility for any language detail under admin privileges
- Created a filter facility
- New UI designs, containing:
- Database schema changed
- New design implemented
- jQuery-based new alternative REST architecture created
- Language to font mapping
- Language to input method mapping
- Redefined the search implementation
- Created a login system for admin + session management
- Designed a new UI on the langfilter.php page
- Developed a pie chart visualization
- Developed an API for language details
- Developed an example of API usage
- Created a preview system for the new language entry system as well as the direct editing feature
- Developed an API console so that third-party users can use this data on their websites
- Redesigned the entire UI as per the proposed scheme
- Small corrections in functionality
- Fixed various bugs
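As an illustration of what the language detail API mentioned above might return to a third-party site, here is a minimal Python sketch. It does not reproduce the project's actual API (which is PHP-based): the function name `language_detail_response`, the field names and the sample record are all hypothetical.

```python
import json

# Hypothetical in-memory stand-in for the language details table.
# The values are sample data for illustration only.
LANGUAGE_DETAILS = {
    "gu": {"name": "Gujarati", "web_fonts": True, "input_methods": True},
}


def language_detail_response(code: str) -> str:
    """Return a JSON document describing one language, or a JSON error object."""
    detail = LANGUAGE_DETAILS.get(code)
    if detail is None:
        # Unknown codes get an explicit error payload instead of an empty body.
        return json.dumps({"error": "unknown language", "code": code},
                          sort_keys=True)
    return json.dumps({"code": code, **detail}, sort_keys=True)
```

A consumer would parse the returned string with any JSON library and check for the `error` key before using the other fields.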
Link to my progress report: Project Updates
Project Progress : See here