Topic on User talk:Brooke Vibber/Compacting the revision table round 2

Halfak (WMF) (talk | contribs)

So, I saw that Brian VIBBER had mentioned elsewhere the idea of having the logging table reference the newly proposed comments table. It seems that this might strain the metaphors here a little bit and duplicate some fields. I wanted to throw out an alternative idea.

Currently, I think we'd have something like this:

  • revision
    • rev_comment_id (comment.comment_id)
    • rev_user_entry (user_entry.ue_id)
    • rev_timestamp
    • ...
  • logging
    • log_comment_id (comment.comment_id)
    • log_user_entry (user_entry.ue_id)
    • log_timestamp
    • ...
  • user_entry
    • ue_id
    • ...
  • comment
    • comment_id
    • ...

Alternatively, we could follow a like-with-like policy and move the attributes common to revision and logging into an event table, like this:

  • event
    • event_id
    • event_type ("log" or "rev" or "new" or whatever)
    • event_comment (comment.comment_id)
    • event_user_entry (user_entry.ue_id)
    • event_timestamp
    • ...
  • revision
    • rev_event
    • ...
  • logging
    • log_event
    • ...
  • comment
    • comment_id
    • ...
  • user_entry
    • ue_id
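
To make the layout above concrete, here is a minimal sketch using Python's sqlite3 module. It is only an illustration, not the proposed MediaWiki schema: the column types are guesses, and rev_id, log_id, log_type, comment_text and ue_name are placeholder names standing in for the "..." entries.

```python
import sqlite3

# Columns not named in the post (rev_id, log_id, log_type, comment_text,
# ue_name) are placeholders for the "..." entries.
SCHEMA = """
CREATE TABLE comment    (comment_id INTEGER PRIMARY KEY, comment_text TEXT);
CREATE TABLE user_entry (ue_id INTEGER PRIMARY KEY, ue_name TEXT);
CREATE TABLE event (
    event_id         INTEGER PRIMARY KEY,
    event_type       TEXT NOT NULL,   -- "log", "rev", "new", ...
    event_comment    INTEGER REFERENCES comment (comment_id),
    event_user_entry INTEGER REFERENCES user_entry (ue_id),
    event_timestamp  TEXT NOT NULL
);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY,
                       rev_event INTEGER REFERENCES event (event_id));
CREATE TABLE logging  (log_id INTEGER PRIMARY KEY, log_type TEXT,
                       log_event INTEGER REFERENCES event (event_id));
"""

def build_demo():
    """Create an in-memory database holding one edit and one log entry."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO comment VALUES (?, ?)",
                   [(1, "fix typo"), (2, "spam")])
    db.executemany("INSERT INTO user_entry VALUES (?, ?)",
                   [(1, "Alice"), (2, "Bob")])
    db.executemany("INSERT INTO event VALUES (?, ?, ?, ?, ?)",
                   [(1, "rev", 1, 1, "20170101000000"),
                    (2, "log", 2, 2, "20170102000000")])
    db.execute("INSERT INTO revision VALUES (10, 1)")
    db.execute("INSERT INTO logging VALUES (20, 'delete', 2)")
    return db

def timeline(db):
    """List every event, oldest first, without touching revision or logging."""
    return db.execute("""
        SELECT event_timestamp, event_type, ue_name, comment_text
        FROM event
        JOIN user_entry ON ue_id = event_user_entry
        JOIN comment ON comment_id = event_comment
        ORDER BY event_timestamp
    """).fetchall()
```

Calling timeline(build_demo()) returns the edit and the log entry in one ordered list, which is the kind of query the shared table would make simple.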

Anomie (talk | contribs)

On the downside, that event table would be taller than revision. It looks like you'd drop one timestamp, one tinyint (xxx_deleted), one int (xxx_page), and two bigints (xxx_comment and xxx_user_entry) from each of revision and logging in exchange for adding xxx_event (int or bigint?), plus the possibility of reusing the event row when an edit creates both a log entry and a dummy revision.
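
As a rough back-of-the-envelope version of that trade, the sketch below adds up the column widths. The sizes are assumptions: 14 bytes for a timestamp stored as binary(14), and the usual MySQL widths of 1, 4 and 8 bytes for tinyint, int and bigint. Row overhead and indexes are ignored, and the function names are made up for illustration.

```python
# Assumed storage sizes in bytes; the timestamp is taken to be binary(14).
SIZES = {"timestamp": 14, "tinyint": 1, "int": 4, "bigint": 8}

# Columns that would leave revision/logging: timestamp, xxx_deleted,
# xxx_page, xxx_comment, xxx_user_entry.
MOVED = ["timestamp", "tinyint", "int", "bigint", "bigint"]

def bytes_removed_per_row(event_ref_type):
    """Net bytes dropped from one revision or logging row.

    event_ref_type is the assumed type of the new xxx_event column.
    """
    return sum(SIZES[t] for t in MOVED) - SIZES[event_ref_type]

def event_rows(n_revisions, n_log_entries, n_shared=0):
    """Rows in event: one per revision and per log entry, minus shared rows."""
    return n_revisions + n_log_entries - n_shared
```

Under these assumptions a row shrinks by 31 bytes if xxx_event is an int and 27 if it is a bigint, but the moved columns reappear in event, which holds a row for every revision and log entry unless some are shared.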

Glancing at the indexes, we might have to lose the indexes on logging that combine log_type with user and/or timestamp, since those columns would move to event and an index can't span two tables. That would make it hard for a query to fetch logs of a particular log_type ordered by timestamp (e.g. Special:Log) without filesorting.
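
The effect can be reproduced in miniature with SQLite, which reports its rough equivalent of a filesort as "USE TEMP B-TREE FOR ORDER BY". This is a sketch under assumptions, not a MySQL benchmark: the table names logging_now and logging_ev and the index names are hypothetical, and event is assumed to have no index that helps here.

```python
import sqlite3

def plan(db, sql):
    """Return the detail strings of SQLite's query plan for `sql`."""
    return [row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + sql)]

def needs_sort(db, sql):
    # SQLite reports an explicit sort step as "USE TEMP B-TREE FOR ORDER BY".
    return any("TEMP B-TREE" in step for step in plan(db, sql))

db = sqlite3.connect(":memory:")
db.executescript("""
-- Current layout: type and timestamp live in the same table.
CREATE TABLE logging_now (log_id INTEGER PRIMARY KEY,
                          log_type TEXT, log_timestamp TEXT);
CREATE INDEX type_time ON logging_now (log_type, log_timestamp);

-- Event layout: timestamp moves to event, log_type stays in logging.
CREATE TABLE event (event_id INTEGER PRIMARY KEY, event_timestamp TEXT);
CREATE TABLE logging_ev (log_id INTEGER PRIMARY KEY,
                         log_type TEXT, log_event INTEGER);
CREATE INDEX type_only ON logging_ev (log_type);
""")

NOW = """SELECT log_id FROM logging_now
         WHERE log_type = 'delete' ORDER BY log_timestamp DESC"""
EV = """SELECT log_id FROM logging_ev JOIN event ON event_id = log_event
        WHERE log_type = 'delete' ORDER BY event_timestamp DESC"""

sort_now = needs_sort(db, NOW)  # combined index supplies the order
sort_ev = needs_sort(db, EV)    # order column is in the other table
```

Adding an index on event_timestamp alone would let the database walk events in order, but it would then have to skip every event that isn't of the wanted log_type, so neither option matches the combined index.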

Reply to "Event table?"