User:SBailey (WMF)

Role at Wikimedia Foundation: Hopes for Wikimedia platform futures: Areas of professional interest: Experience: Current hobbies: Pets: Residence:
 * Parsoid and extensions developer
 * help gather requirements, design and create a way to allow editors to author interactive models and simulations in the mediawiki platform
 * empowering non-programmers to create programs
 * programming language design
 * electronics and embedded systems
 * mechatronics, robotics and automation
 * agriculture automation systems for aquaponics and other farming techniques
 * cross-platform, distributed computing, functional array programming language and platform
 * spreadsheet cell engine internals developer (Lotus 123 and Borland Quattro Pro)
 * math education platform and geometry manipulation learning environment for IBM
 * X86 CPU test generating and cosimulation control C interpreter language for Transmeta
 * graphics primitive libraries
 * embedded systems development of hardware, firmware and cloud control processes
 * Farm and garden construction and operation including automation
 * Alternative energy systems including photo voltaic solar, solar thermal capture, ground source heat storage and battery inverter systems
 * various PC and playstation video games
 * Chocolate Labrador Retriever (Bodhi) - born 4/2018 - adopted 6/2018 from AKC breeder
 * Staffordshire Terrier/Pit Bull - born 11/2021 - rescued 6/2022 from Rescue Ranch
 * modest home on 40 acres in Fort Jones California, 26 acres of which is currently a wheat field, the rest sparse oaks on sloping hill side

--

Important considerations for wikimedia development:

For wikimedia maintenance scripts which need to operate on whole tables, the use of batching is essential and a recommended method is the keyset pagination method, not the LIMIT OFFSET method.

The LIMIT OFFSET approach slows query down when a size of the data is very large because it requires the database read through the records until it reaches the offset #, which cannot use an index or just skip ahead a certain number of records as deleted records will not count as part of the offset. The approach which can use an index to aid in batching efficiently is called Keyset pagination. It requires a unique id in your query, which you can use as a bookmark to point to the last row of the previous page. The next page is fetched using the last bookmark which uses the indexed key field as the start point. For instance: Which when rendered using the wikimedia database abstraction class is: If the result set above returns the last row with linter_id as 12345, you can use it to fetch the next page which will then use the $lastElement value of 12345: This code section will live inside an enclosing loop to process each batch and probably sleep a second between batches and call $lbFactory->waitForReplication; to allow the table updates to be propagated to the database replicas.