Phlogiston/Data Model


Updated 27 Sep 2017. Incomplete but probably not wrong or out of date.


Ancestor: A task that is tagged in Phabricator with one of the ancestor-qualifying tags. Tagging is a performance optimization: it avoids having to calculate parent-child relationships for all tasks.

Category: A grouping of tasks within Phlogiston.

Phlogiston Scope: A set of tasks that are analyzed as a group, under the assumption that they are the same body of tasks that one team of people works on. A Phlogiston scope usually contains all tasks from multiple Phabricator projects. The word "scope" was chosen, instead of "project", to avoid confusion with Phabricator projects.

Phabricator Project: A tag for grouping and labeling tasks in Phabricator.

Source: A synonym for Phlogiston Scope.

Status: The Phabricator status field.

Data Model during Phlogiston execution

The following tables are updated in this order:


These tables hold the Phabricator data imported from a dump file. They are wiped and replaced with each load, and are used for reconstruction. They are probably not referenced after reconstruction is complete (this has not been confirmed). They do not reflect the concept of a "scope".

  1. phabricator_project
  2. phabricator_column
  3. maniphest_task
  4. maniphest_blocked_phid
  5. maniphest_transaction
  6. maniphest_blocked


These tables hold a historical reconstruction of Phabricator data back to the beginning of available data; they represent a denormalization of the transaction data into an easier-to-query data set. They are partitioned by scope, so that a wipe of one scope does not affect any other scope. These tables are very time-consuming to create, and so are typically updated incrementally with each nightly dump.

  1. task_on_date. Each row is one task for one day.
  2. category. Each row is one category.
  3. maniphest_edge. Each row is the membership of one task in one project. This table is not partitioned by scope (for optimization reasons).
  4. phab_parent_category_edge. Each row is the inclusion of one task in one ancestry.
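
The denormalization into task_on_date can be pictured as replaying each task's transactions day by day. The following is a minimal sketch of that idea, not Phlogiston's actual reconstruction code: the function name, the `(task_id, change_date, new_status)` tuple shape, and the decision to track only status are all assumptions made for illustration.

```python
from datetime import date, timedelta


def reconstruct_task_on_date(transactions, start, end):
    """Replay status transactions into one row per task per day.

    `transactions` is an iterable of (task_id, change_date, new_status)
    tuples. This shape is hypothetical and is not the real schema.
    """
    # Group changes by task, in chronological order.
    by_task = {}
    for task_id, change_date, new_status in sorted(
        transactions, key=lambda t: (t[0], t[1])
    ):
        by_task.setdefault(task_id, []).append((change_date, new_status))

    rows = []
    for task_id, changes in by_task.items():
        # A task has no rows before its first transaction.
        day = max(start, changes[0][0])
        idx = 0
        status = None
        while day <= end:
            # Apply every change dated on or before this day.
            while idx < len(changes) and changes[idx][0] <= day:
                status = changes[idx][1]
                idx += 1
            rows.append((task_id, day, status))
            day += timedelta(days=1)
    return rows


# Hypothetical sample data: two tasks over a three-day window.
sample = [
    (1, date(2017, 9, 1), "open"),
    (1, date(2017, 9, 3), "resolved"),
    (2, date(2017, 9, 2), "open"),
]
rows = reconstruct_task_on_date(sample, date(2017, 9, 1), date(2017, 9, 3))
```

Because every day of every task becomes a row, the output grows with both task count and history length, which is consistent with these tables being expensive to build and therefore updated incrementally.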


These tables hold data necessary to generate reports. They are partitioned by scope, and wiped by partition at the beginning of each report.

  1. task_on_date_agg
  2. task_on_date_recategorized
  3. recently_closed
  4. recently_closed_task
  5. maintenance_week
  6. maintenance_delta
  7. velocity
  8. open_backlog_size
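
Wiping a reporting table "by partition" can be pictured as deleting only the rows that belong to one scope. The sketch below illustrates that pattern with Python's built-in sqlite3 module as a stand-in database. The `scope` column name, the `week` and `points` columns, the `wipe_scope` helper, and the sample scope names are assumptions for illustration and are not taken from the Phlogiston schema.

```python
import sqlite3

# Tables this sketch is willing to wipe; a subset of the reporting tables.
REPORT_TABLES = {"velocity", "open_backlog_size"}


def wipe_scope(conn, table, scope):
    """Delete only the rows of one scope from a reporting table."""
    # Table names cannot be bound as SQL parameters, so restrict them
    # to a known set before interpolating.
    if table not in REPORT_TABLES:
        raise ValueError(f"unexpected table: {table}")
    conn.execute(f"DELETE FROM {table} WHERE scope = ?", (scope,))


conn = sqlite3.connect(":memory:")
# Hypothetical column layout for the velocity table.
conn.execute("CREATE TABLE velocity (scope TEXT, week TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO velocity VALUES (?, ?, ?)",
    [
        ("scope_a", "2017-09-18", 12),
        ("scope_a", "2017-09-25", 9),
        ("scope_b", "2017-09-25", 20),
    ],
)

# Wiping scope_a at the start of its report leaves scope_b untouched.
wipe_scope(conn, "velocity", "scope_a")
remaining = conn.execute(
    "SELECT scope, week, points FROM velocity ORDER BY scope"
).fetchall()
```

The point of the pattern is isolation: regenerating the report for one scope never disturbs the rows another scope's report depends on.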