Flow/Functional Specifications/Search and Filter

From MediaWiki.org

This document describes a set of functional requirements regarding local and site search for Flow.

This document should not be taken as a final descriptor for any one specific release of the software, though it does include recommendations regarding inclusion in the "Minimum Viable Product".


The following nomenclature is used in this document to avoid confusion. This nomenclature is development-facing, not user-facing, and terms may change when visible to the user.

  • Board - a collection of Subscriptions. There may be only one Board per page. You can view any page's Board at Special:Flow/Some_page_name.
  • Flow-enabled - a wiki can specify which pages and namespaces should display a Flow board. For example, several projects' Talk pages, or all User_talk pages.
  • Header - content at the top of a Board (introductory text and such). There are/will be limitations on what can go in the header.
  • Subscription - the "connective tissue" between a Board and a Topic.
  • Topic - a "workflow instance". In the discussion space, this is a single discussion. This document concerns itself primarily with discussion workflows; however it should be noted that workflow instances can take many forms. Topics may be subscribed to by multiple Boards.
  • Summary - a Topic can be summarized. Other workflows might be more elaborate, e.g. "Closed. Answer #7 to this question was accepted by the originator on 2013-10-15".
  • Post - an atomic reply, comment, or object whose parent is a Topic.
    • Comment - synonymous with Post.
    • Reply - a child Post of another Post. In the October prototype, there is only one level of replies and these are called "Tangents".
    • Branch - a series of Posts that are all ultimately children of a single Post.
  • MVP - an acronym standing for "Minimum Viable Product".


Basics

Searching in Flow can be broken down into two basic types:

  • Board Search - these searches happen only on the contextual Board (or Feed). They do not search outside of that area.
  • Site Search - these searches happen across all Boards (but not Feeds, as Feeds are already conglomerations from multiple Boards).

Board Search and Filter

It is perhaps best to think of a Board search as applying a set of "filters" to the content. Board searching shall operate on the following:

  • Topic title
  • Topic tags
  • Post authors
  • Post content
  • Scratchpad content

Control Behavior

The search control bar shall dock itself at the top of the user's viewport (unless JavaScript is disabled). It shall always be fully open.

Filter Keywords

By default, Board searches will run against all areas. However, the results can be filtered with finer grains by applying specific keywords:

  • Title - text searches title only
  • Author - only Post author user names
  • Tag - only Topic tags
  • Content - only the content of Posts
  • Scratchpad - only contents of scratchpads (term may change)

Filter keywords are applied cumulatively. The text belonging to a filter keyword ends at a split character ("," or ";") or at the next filter keyword.

Token Behavior

By default, search tokens are to be split by whitespace except when:

  • Two or more tokens are surrounded by single or double quote characters
  • Two or more tokens are prefixed with a filter keyword (e.g., "Author:"), until the next lexical token break ("," or ";") or the next filter keyword

Tokens are case-insensitive ("Foo" is the same as "foo" and "fOo").
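
The tokenizing rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Flow implementation: the function name tokenize, the (scope, text) tuple shape, and the choice to lowercase everything are all hypothetical.

```python
import re

# Filter keywords named in this document (matched case-insensitively).
KEYWORDS = ("title", "author", "tag", "content", "scratchpad")

# Lexical pieces, tried in this order: quoted phrase, split character,
# filter keyword prefix, bare word.
_LEX = re.compile(
    r"(?P<quoted>\"[^\"]*\"|'[^']*')"
    r"|(?P<sep>[,;])"
    r"|(?P<kw>(?:" + "|".join(KEYWORDS) + r"):)"
    r"|(?P<word>[^\s,;\"']+)",
    re.IGNORECASE,
)


def tokenize(query):
    """Return a list of (scope, text) pairs; scope is None for unscoped tokens."""
    tokens = []
    scope = None   # active filter keyword, if any
    pending = []   # words collected under the active keyword

    def flush():
        # Close the current keyword group and emit it as one token.
        nonlocal scope, pending
        if scope is not None and pending:
            tokens.append((scope, " ".join(pending)))
        scope, pending = None, []

    for match in _LEX.finditer(query):
        kind = match.lastgroup
        text = match.group().lower()   # tokens are case-insensitive
        if kind == "sep":
            flush()
        elif kind == "kw":
            flush()
            scope = text[:-1]          # drop the trailing ":"
        else:
            if kind == "quoted":
                text = text[1:-1]      # strip the surrounding quotes
            if not text:
                continue
            if scope is not None:
                pending.append(text)
            else:
                tokens.append((None, text))
    flush()
    return tokens
```

With this sketch, tokenize("Author:Some User; baz") yields one token scoped to "author" with the text "some user", followed by the unscoped token "baz".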

Intersection

There are two general types of token intersection:

  • Cumulative intersection (equivalent to "OR" tokens) - each token is matched independently, and a hit on any one token is enough
  • Composite intersection (equivalent to "AND" tokens) - every token must have a hit

By default, search intersection will be cumulative. That is, tokens will be treated as "OR" rather than "AND". Nested intersection (e.g., "This AND that OR (Foo AND Bar)") will not be supported.

Including composite intersection ("AND") will likely be beyond the scope of the MVP (in fact, writing a high-performance search system that allows nested intersections is likely beyond the scope of the Foundation's resources or desires).

Accordingly, we'll stay with cumulative intersection.
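
The difference between the two intersection types can be shown with a small Python sketch. The function names are hypothetical, and plain substring matching is assumed purely for illustration.

```python
def matches_cumulative(tokens, text):
    """OR semantics: a hit on any single token is enough."""
    haystack = text.lower()
    return any(token.lower() in haystack for token in tokens)


def matches_composite(tokens, text):
    """AND semantics: every token must have a hit."""
    haystack = text.lower()
    return all(token.lower() in haystack for token in tokens)
```

For the tokens "foo" and "bar" and the text "Only Foo here", the cumulative check succeeds while the composite check fails. Only the cumulative form is proposed in this document.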

Stop Words

Each language will have to supply its own set of "stop words". Stop words will be removed from any search before it is performed (to reduce complexity). Common stop words (in English) include:

  • Articles (a, the)
  • Non-specific pronouns (he, she, it, we, they)
  • Prepositions (in, on)
  • Non-used control tokens (or, and)
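
A minimal Python sketch of stop-word removal follows. The STOP_WORDS table and the function name are hypothetical, and the English set contains only the examples listed above rather than a complete list. The sketch assumes that a multi-word phrase token is kept whole even if it contains stop words.

```python
# Per-language stop-word sets; only the English examples from this
# document are included here.
STOP_WORDS = {
    "en": frozenset({
        "a", "the",                       # articles
        "he", "she", "it", "we", "they",  # non-specific pronouns
        "in", "on",                       # prepositions
        "or", "and",                      # non-used control tokens
    }),
}


def remove_stop_words(tokens, language="en"):
    """Drop tokens that are stop words in the given language."""
    stop = STOP_WORDS.get(language, frozenset())
    return [token for token in tokens if token.lower() not in stop]
```

Languages without a registered set fall through unchanged, which matches the requirement that each language supplies its own words.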

Search Examples

These are non-exhaustive examples, meant as illustration only.

Foo
returns all Topics where the text "Foo" is in the Topic title, a tag, an author name, or is included within a Post or scratchpad.
Foo Bar
returns all Topics where the text "Foo" or the text "Bar" is in the Topic title, a tag, an author name, or is included within a Post or scratchpad.
"Foo Bar"
returns all Topics where the exact text "Foo Bar" is in the Topic title, a tag, an author name, or is included within a Post or scratchpad.
Author:Jorm
returns all Topics in which User:Jorm has posted. Does not return Topics where Jorm has only been mentioned.
Author:Jorm Author:Werdna
returns all Topics in which either User:Jorm or User:Werdna has posted. Does not return Topics where Jorm or Werdna has only been mentioned.
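
The examples above can be reproduced with a small Python model. Everything here is a sketch under assumptions: the Post and Topic classes, the (scope, text) token shape, and the use of case-insensitive substring matching (including for author names) are hypothetical and are not taken from the Flow code base.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    content: str


@dataclass
class Topic:
    title: str
    tags: list = field(default_factory=list)
    posts: list = field(default_factory=list)
    scratchpad: str = ""


def fields_of(topic):
    """Map each filter keyword to the values it searches on a Topic."""
    return {
        "title": [topic.title],
        "tag": list(topic.tags),
        "author": [post.author for post in topic.posts],
        "content": [post.content for post in topic.posts],
        "scratchpad": [topic.scratchpad],
    }


def topic_matches(topic, tokens):
    """Cumulative (OR) match of (scope, text) tokens against one Topic."""
    fields = fields_of(topic)
    for scope, text in tokens:
        if scope is None:
            # Unscoped tokens run against all areas.
            candidates = [v for values in fields.values() for v in values]
        else:
            candidates = fields[scope]
        if any(text.lower() in value.lower() for value in candidates):
            return True
    return False


def search(board, tokens):
    """Return the Topics on a Board (a list of Topics) that match."""
    return [topic for topic in board if topic_matches(topic, tokens)]
```

With a Board holding one Topic that Jorm posted in and another that only mentions Jorm in its title, the token ("author", "jorm") returns only the first, while the unscoped token (None, "jorm") returns both.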

Results Display

The display of search and filter results should adapt to the size of the result set.

When the result set is large (say, more than 5 Topics or 20 Posts in total), Topics should always be displayed as collapsed.
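
As a sketch, the collapse rule reduces to a two-threshold check. The constant and function names are hypothetical, and the numbers are the tentative ones suggested above ("say"), not fixed requirements.

```python
# Tentative thresholds taken from the text above.
MAX_EXPANDED_TOPICS = 5
MAX_EXPANDED_POSTS = 20


def should_collapse(topic_count, post_count):
    """True when the result set is large enough that Topics display collapsed."""
    return topic_count > MAX_EXPANDED_TOPICS or post_count > MAX_EXPANDED_POSTS
```

Either threshold alone triggers collapsing, so 3 Topics containing 25 Posts would still be shown collapsed.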

Search result entries should, when collapsed, indicate in some way which element matched the filter (for example, the title, a tag, an author name, or Post content).

TODO: mockup

Search Highlighting

In results, search terms should be highlighted. Typically this will be a yellow background field for the term, though the color may change depending upon the background color of the element.
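
One possible way to produce the highlighting, sketched in Python: wrap each case-insensitive hit in an HTML mark element, which browsers typically render with a yellow background by default. The function name is hypothetical, and rendering through mark is an assumption; the document does not prescribe a markup.

```python
import html
import re


def highlight(text, terms):
    """Return HTML-escaped text with each term occurrence wrapped in <mark>."""
    terms = [term for term in terms if term]
    if not terms:
        return html.escape(text)
    # Try longer terms first so "foobar" wins over "foo" at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)

    pieces = []
    last = 0
    for match in pattern.finditer(text):
        pieces.append(html.escape(text[last:match.start()]))
        pieces.append("<mark>" + html.escape(match.group()) + "</mark>")
        last = match.end()
    pieces.append(html.escape(text[last:]))
    return "".join(pieces)
```

Escaping is applied to the text around and inside each hit so that user-supplied content cannot inject markup into the results page.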

Keyboard Navigation

When search filters are active, certain keyboard controls should become relevant:

  • N - jumps to next instance of a highlighted term
  • P - jumps to previous instance of a highlighted term
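
The jump logic behind these two keys can be sketched as index arithmetic over the list of highlighted hits. The function names are hypothetical, and wrapping around at either end is an assumption; the document does not say what happens after the last hit.

```python
def next_hit(current, total):
    """Index of the hit to jump to when N is pressed (None if there are no hits)."""
    if total == 0:
        return None
    if current is None:        # nothing selected yet: start at the first hit
        return 0
    return (current + 1) % total


def previous_hit(current, total):
    """Index of the hit to jump to when P is pressed (None if there are no hits)."""
    if total == 0:
        return None
    if current is None:        # nothing selected yet: start at the last hit
        return total - 1
    return (current - 1) % total
```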

Site Search

Site searching within the Flow space will occur from the "master" search control. This search should operate upon:

  • Topic title
  • Topic tags
  • Post content
  • Post authors

Advanced Filters

Site search within Flow should allow for the following advanced options:

  • Return results from all wikis (true global search)
  • Return results in specific languages (Flow knows the language of the wiki a Topic is housed on)

Limitations

It is unlikely that the site search system will allow for complicated or natural language filtering unless we can configure the site search to "hook out" to the native Flow search.

Site search results will also likely be restricted to the local wiki (mostly for sanity's sake), though it is eminently possible to remove this constraint (indeed, it is likely more difficult to constrain the search to the local wiki only).

Site search results will likely have to display as closed links that lead to a "single Topic" Flow Board, since the site search "bed" is not Flow-enabled (something for the future?).

General Complexities

Since content is returned in a lazy-loaded, infinite-scroll format, the browser's built-in "Ctrl-F" search function will be broken or severely limited. See "Keyboard Navigation," above.