Improving Parsoid tracing & debugging tools

From mediawiki.org

Project: Improving Parsoid Tracing & Debugging Tools

This project is an effort to streamline and enhance the Parsoid tracing and debugging tools. Objectives include…:

  • …eliminating duplicated or otherwise redundant code, …
  • …streamlining command-line options to make them easier to use, …
  • …making tracing/debugging output more readable and informative, …
  • …adding tracing & debugging for transformations that currently aren't supported, …
  • …and using a logging library such as bunyan to log and monitor the performance of production code.

Name & Contact Information

Name: Maria Pacana

Email: maria.pacana@gmail.com

IRC or IM networks/handle(s): mariapacana

Location: New York, NY (through 1/9/14); San Francisco, CA (1/10/14 onwards)

Typical working hours: 10 AM - 6 PM (in NY); 8 AM - 4 PM (in CA)

Deliverables

(See Subramanya Sastry's original description here.)

Project timeline: December 10, 2013 - March 10, 2014 (13 weeks)

12/10/13 - 12/31/13 (3 weeks):

  • Get to know the existing codebase; reach out to mentors & the Parsoid community
  • Remove duplicated code and outdated tracing/debugging options
  • Update the Parsoid/Debugging page to make it more accessible to new contributors

1/1/14 - 1/21/14 (3 weeks):

  • Improve the readability and usefulness of tracing output
    • Propose new output templates that are better structured and easier to understand
    • Make sure that tracing output contains enough information to debug problems
    • Make it possible to disambiguate between tracing output for different instances of a class
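
One possible way to tell apart trace lines from different instances of the same class is to give each instance a small numeric id and prefix every line with it. The sketch below is illustrative only: `makeTracer`, the `ClassName#id` prefix format, and the `sink` parameter are hypothetical names chosen for this example, not existing Parsoid code.

```javascript
// Hypothetical helper (not Parsoid code): creates one tracer per instance.
const counters = new Map(); // class name -> number of tracers created so far

function makeTracer(className, sink = console.log) {
  const id = (counters.get(className) || 0) + 1;
  counters.set(className, id);
  const prefix = `${className}#${id}`;
  // Each call emits one line: "<prefix> | <stage> | <JSON payload>"
  return function trace(stage, payload) {
    sink(`${prefix} | ${stage} | ${JSON.stringify(payload)}`);
  };
}

// Usage: two tracers for the same class name, collecting lines in an array.
const lines = [];
const traceA = makeTracer('QuoteTransformer', (line) => lines.push(line));
const traceB = makeTracer('QuoteTransformer', (line) => lines.push(line));
traceA('onQuote', { value: "''" });
traceB('onQuote', { value: "'''" });
```

With this layout, the two instances would show up as `QuoteTransformer#1` and `QuoteTransformer#2`, so interleaved output could be filtered per instance.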

1/22/14 - 2/15/14 (4 weeks):

  • Make tracing & debugging setup and configuration consistent across the codebase
  • Add trace output for transformations that currently lack it (e.g., QuoteTransformer, which converts wikitext-quote tokens to HTML <i> and <b> tags)

2/16/14 - 3/3/14 (2 weeks):

  • Incorporate bunyan or a similar library into production-level logging and performance monitoring

3/4/14 - 3/10/14 (1 week):

  • Thoroughly test and wrap up previous contributions
  • Update Parsoid documentation

About Me

I am a current student at Hacker School. I have a degree in electrical engineering from Yale University (2003), as well as an MBA from McGill University (2008). Previously, I spent five years working as a financial analyst and another three years as a Japanese-to-English translator. Being a financial analyst taught me how to present information in a meaningful way and being a translator taught me attention to detail. I returned to engineering because I love making things.

This April, I quit my job to attend Dev Bootcamp, where I learned Ruby, JavaScript, jQuery, Rails, and Sinatra. At Hacker School, I'm continuing to build web applications in Sinatra and Node while exploring topics such as data abstraction, algorithms, functional programming, and concurrency with fellow batchmates.

Participation

I plan to be on the #mediawiki-parsoid channel during my working hours, from 8 AM to 4 PM PST. I will primarily ask for help on #mediawiki-parsoid, but will also email my mentors or the wikitech-l list if I need additional help or no one is present on IRC. Every week, I'll touch base with my mentors via email to set goals and report on my progress. I also plan to write a weekly blog post to keep the community updated.

Availability

(From OPW Application): Will you have any other time commitments, such as school work, another job, planned vacation, etc., between December 10, 2013 and March 10, 2014?

Aside from federal holidays (Christmas, New Year’s, etc.), I have no other significant commitments.

Past Open Source Experience

My microtask for Parsoid has just been merged in. This patch improves the setting of tracing flags during testing. For example, selser (the “selective serializer”) is an additional enhancement to Parsoid’s existing html-to-wikitext serializer. When the “selser” flag is enabled during testing, the “html2wt” flag should automatically be enabled. Previously, developers were required to use both flags during testing, but my patch makes it possible to just use the “selser” flag.
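
The idea of one flag implying another can be sketched roughly as follows. This is a minimal illustration, not the merged patch: `IMPLIED_FLAGS` and `expandFlags` are hypothetical names, and the only implication taken from the description above is that "selser" enables "html2wt".

```javascript
// Hypothetical table of implied flags. Only selser -> html2wt comes from
// the description above; everything else here is an assumption.
const IMPLIED_FLAGS = new Map([
  ['selser', ['html2wt']],
]);

// Returns the requested flags plus every flag they imply (transitively),
// as a sorted array without duplicates.
function expandFlags(flags) {
  const enabled = new Set(flags);
  const pending = [...flags];
  while (pending.length > 0) {
    const flag = pending.pop();
    for (const implied of IMPLIED_FLAGS.get(flag) || []) {
      if (!enabled.has(implied)) {
        enabled.add(implied);
        pending.push(implied);
      }
    }
  }
  return [...enabled].sort();
}

// Usage: requesting only "selser" also turns on "html2wt".
const expanded = expandFlags(['selser']);
```

Keeping the implications in a table rather than in scattered conditionals would make it straightforward to add further dependent flags later.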

I'm otherwise new to the open source community, although nearly all the code I've ever written is available on my GitHub profile. I believe in Wikimedia's mission of making accessible the sum of all human knowledge, and I think that Parsoid will be a vital tool for enabling more individuals to contribute to Wikimedia projects.