User:DanielRenfro/Biological Wikis

Abstract

 * Biological wikis are here to stay.
 * However, there has yet to be an in-depth investigation of the advantages and disadvantages of the wiki methodology.

Introduction

 * The inherent complexity and ever-increasing production rate of biological data make structured storage difficult. The canonical relational model devised by Edgar Codd in 1970 requires an understanding of the types of data and their relationships before any data are entered [1]. This type of model is poorly suited to information that constantly changes. Despite this, relational databases have become the main data storage mechanism in biology; examples include Chado [2], ACeDB [3, 4], PathwayTools [5], and ArkDB [6]. In the past few years there has been increasing interest in wiki-based biological databases. Numerous so-called "bio-wikis" have appeared on the internet [refs!] and in the literature [7-10], and have even led to meetings devoted to the topic [11]. Built on the idea of distributed, collaborative effort by the community, the wiki approach has gained popularity mainly in the area of biocuration [12, 13].
 * Some have integrated wikis into their projects with varying success
 * Biocuration, annotation
 * Health Information Management [14-16]
 * Some have called for changes to the existing archival databases; this might be too much.
 * There are qualities that wikis offer, along with difficulties to overcome.
 * Resistance to updating GenBank. [17]
 * Quality is as good as a referential encyclopedia. [18]
 * Wikis are complex systems built upon collaboration between many individuals and have been likened to self-organizing systems [19]. For any emergent system, self-organization relies on multiple independent entities interacting with limited local knowledge. In any wiki a critical number of users is necessary for the wiki to become self-sustaining via stigmergy...
 * One reason why the wiki paradigm does not work well in academia could be that wikis are self-organizing systems. A successful wiki is an emergent entity that is not governed by any overarching blueprint or master engineer. Small wikis, in addition to having to overcome the user-base size problem, struggle to grow within their defined boundaries. (Is it correct to make a page about ___ on this wiki?) Wikipedia avoids this problem by being all-encompassing. What the wiki becomes cannot be dictated by a single member; it depends on the limited contributions of individual members changing their local environment (the set of pages that interest them). Mark Elliott borrows the term "stigmergy" from the study of eusocial ants to describe Wikipedia's self-organizing nature [20, 21].
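The Introduction's point that a relational model requires the schema to be known before data entry can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the table name `gene`, its columns, and the helper `try_insert_with_phenotype` are hypothetical and chosen only for illustration, not taken from any of the databases cited above.

```python
import sqlite3

# In-memory database; the schema must be declared before any row is stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene ("
    " id INTEGER PRIMARY KEY,"
    " symbol TEXT NOT NULL,"
    " organism TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO gene (symbol, organism) VALUES (?, ?)",
    ("lacZ", "Escherichia coli"),
)


def try_insert_with_phenotype(connection):
    """Try to store an attribute (phenotype) that may not be in the schema."""
    try:
        connection.execute(
            "INSERT INTO gene (symbol, organism, phenotype) VALUES (?, ?, ?)",
            ("lacY", "Escherichia coli", "lactose transport defect"),
        )
        return "accepted"
    except sqlite3.OperationalError as exc:
        # SQLite refuses columns that were not declared in advance.
        return f"rejected: {exc}"


before = try_insert_with_phenotype(conn)

# The schema has to be migrated before the new kind of data can be stored.
conn.execute("ALTER TABLE gene ADD COLUMN phenotype TEXT")
after = try_insert_with_phenotype(conn)

print(before)
print(after)
```

On a wiki page, by contrast, a contributor could simply add a sentence or a new section about the phenotype without any schema change, at the cost of the structure that makes the relational version queryable.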

Benefits of Wiki

 * 1) features that do not have to be custom-written
 * 2) edit history, auditing, undo/rollback
 * 3) user/group permissions
 * 4) permissions are not set at the page level
 * 5) revision comparison
 * 6) create pages/content on-the-fly
 * 7) ease of use
 * 8) no prior knowledge required
 * 9) scalable to users of all levels of knowledge (of wiki markup)
 * 10) more recently, WYSIWYG editors
 * 11) ability to capture "notebook-level" (narrative) information
 * 12) experts can add knowledge of their field
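To make the edit history, rollback, and revision comparison items above concrete, here is a minimal sketch of what a wiki engine provides without custom work. It is not how MediaWiki implements these features; the `Page` and `Revision` classes are hypothetical, and the comparison relies on Python's standard difflib module.

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class Revision:
    text: str
    author: str
    comment: str


@dataclass
class Page:
    title: str
    revisions: list = field(default_factory=list)

    def edit(self, text, author, comment=""):
        """Store a new revision; earlier revisions are never overwritten."""
        self.revisions.append(Revision(text, author, comment))

    def diff(self, old, new):
        """Return a unified diff between two revisions, given by index."""
        a = self.revisions[old].text.splitlines()
        b = self.revisions[new].text.splitlines()
        return list(
            difflib.unified_diff(a, b, f"rev {old}", f"rev {new}", lineterm="")
        )

    def rollback(self, to, author):
        """Restore an old revision by saving its text as a new revision."""
        self.edit(self.revisions[to].text, author, f"rollback to rev {to}")


page = Page("lacZ")
page.edit("lacZ encodes beta-galactosidase.", "alice", "create page")
page.edit(
    "lacZ encodes beta-galactosidase.\nIt is part of the lac operon.",
    "bob",
    "add operon context",
)
for line in page.diff(0, 1):
    print(line)
```

Because a rollback is recorded as one more revision, the audit trail stays complete: who undid what, and when, remains visible in the history.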

Considerations
The popularity of the online encyclopedia Wikipedia has drawn many biologists and bioinformatics researchers into the wiki arena. What makes the wiki model work for Wikipedia, a large community of users, may be its largest drawback for biology. Most biological wikis appear to suffer most from a lack of community participation.

The Gene Wiki has overcome this issue by teaming up with Wikipedia in order to leverage its large user base to annotate gene and protein function [12, 22]. The partnership solves problems such as web visibility, but at a price: Wikipedia is idiosyncratic about the types of articles that can be made. A certain level of 'notability' is required for an article, which excludes the majority of biological entities [23]. In addition, Wikipedia maintains a strict policy against original research [24], which means all information in Wikipedia must be previously published and citable.

Specific to Wiki

 * 1) wiki's strength is in capturing narrative data, not tabular or ordered data
 * 2) lack of data-consistency checking
 * 3) highly manual
 * 4) custom software still needs to be written
 * 5) software in academia is often of poor quality [ref:]
 * 6) smaller community
 * 7) MediaWiki (the most popular wiki software) is written in PHP (few biological software initiatives use PHP [ref:BioPerl, BioJava, etc.])
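The lack of data-consistency checking listed above is one place where custom software still needs to be written. The following is a minimal sketch of such a check, assuming gene pages carry template-style `| name = value` lines. The field names (`symbol`, `start`, `end`) and both functions are hypothetical and not part of MediaWiki or any existing bio-wiki.

```python
import re

# Matches template-style parameter lines such as "| start = 366".
FIELD = re.compile(r"^\|\s*(\w+)\s*=\s*(.*?)\s*$")


def parse_fields(wikitext):
    """Collect "| name = value" lines from a block of wiki text."""
    fields = {}
    for line in wikitext.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def check_gene_record(fields):
    """Return a list of consistency problems; an empty list means none found."""
    problems = []
    for name in ("symbol", "start", "end"):
        if not fields.get(name):
            problems.append(f"missing field: {name}")
    for name in ("start", "end"):
        value = fields.get(name, "")
        if value and not value.isdecimal():
            problems.append(f"{name} is not a whole number: {value!r}")
    start, end = fields.get("start", ""), fields.get("end", "")
    if start.isdecimal() and end.isdecimal() and int(start) > int(end):
        problems.append("start is greater than end")
    return problems


page_text = """{{Gene
| symbol = lacZ
| start = 366
| end = 120
}}"""
print(check_gene_record(parse_fields(page_text)))
```

A relational database would enforce rules like these through column types and constraints at write time; on a wiki they have to run as a separate script or extension after the edit has already been saved.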

In general

 * 1) software in academia lacks robustness
 * 2) little to no documentation
 * 3) written for a specific machine architecture, language, library, etc.
 * 4) unobtainable source
 * 5) not much invested in collaboration in academia
 * 6) lack of willingness
 * 7) * inherent to the system of publish-or-perish based on competition
 * 8) * lack of knowledge of said technology
 * 9) * lack of time
 * 10) * lack of funding
 * 11) few standards in web connectivity between databases
 * 12) combined with #2 above = not implemented
 * 13) collaboration incentives lacking
 * 14) no immediate rewards for contribution
 * 15) tenuous long-term rewards
 * 16) biological data seems to be subcritical in its atomic state (the pieces alone tell you little); its importance becomes apparent only in higher-order derivations
 * 17) established publication system is resistant to recognizing or validating wikis as an accepted source
 * 18) * due to problems associated with anonymous editing (& Wikipedia)

Discussion and Future Directions

 * community participation is the most often cited rate-limiting factor
 * structure of the wiki (standards need to be decided upon)

Resources

 * Biological wikis wiki – http://biodatabase.org/index.php/Main_Page
 * Google group about bio-wikis
 * Meetings
 * NAR database issue