Extension:CirrusSearch

From mediawiki.org
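MediaWiki extensions manual
CirrusSearch
Release status: stable
Implementation: Search, API, Hook
Description: Uses Elasticsearch to enhance MediaWiki's search.
Author(s): Nik Everett, Chad Horohoe, Erik Bernhardson
Latest version: Continuous updates
Compatibility policy: Snapshots are released along with MediaWiki. The master branch is not backward compatible.
Composer: mediawiki/cirrussearch
Parameters: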
  • $wgCirrusSearchDeduplicateInQuery
  • $wgCirrusSearchLanguageWeight
  • $wgCirrusSearchAutomationCIDRs
  • $wgCirrusSearchUseIcuFolding
  • $wgCirrusSearchStemmedWeight
  • $wgCirrusSearchQueryStringMaxDeterminizedStates
  • $wgCirrusSearchCrossClusterSearch
  • $wgCirrusSearchExtraIndexSettings
  • $wgCirrusSearchAutomationUserAgentRegex
  • $wgCirrusSearchTalkNamespaceWeight
  • $wgCirrusSearchPrefixWeights
  • $wgCirrusSearchMustTrackTotalHits
  • $wgCirrusSearchPrefixSearchRescoreProfile
  • $wgCirrusSearchLanguageToWikiMap
  • $wgCirrusSearchCompletionSuggesterUseDefaultSort
  • $wgCirrusSearchExtraFieldsInSearchResults
  • $wgCirrusSearchMoreLikeThisMaxQueryTermsLimit
  • $wgCirrusSearchUseIcuTokenizer
  • $wgCirrusSearchCompletionBannedPageIds
  • $wgCirrusSearchOptimizeIndexForExperimentalHighlighter
  • $wgCirrusSearchRescoreProfiles
  • $wgCirrusSearchPhraseRescoreBoost
  • $wgCirrusSearchInterwikiProv
  • $wgCirrusSearchMoreLikeThisAllowedFields
  • $wgCirrusSearchQueryStringMaxWildcards
  • $wgCirrusSearchElasticQuirks
  • $wgCirrusSearchMaxFileTextLength
  • $wgCirrusSearchFallbackProfiles
  • $wgCirrusSearchMoreLikeThisTTL
  • $wgCirrusSearchAllowLeadingWildcard
  • $wgCirrusSearchInterwikiPrefixOverrides
  • $wgCirrusSearchMaintenanceTimeout
  • $wgCirrusSearchReplicas
  • $wgCirrusSearchPhraseSlop
  • $wgCirrusSearchBoostOpening
  • $wgCirrusSearchWriteBackoffExponent
  • $wgCirrusSearchUserTesting
  • $wgCirrusSearchDefaultNamespaceWeight
  • $wgCirrusSearchUseCompletionSuggester
  • $wgCirrusSearchPhraseSuggestReverseField
  • $wgCirrusSearchFallbackProfile
  • $wgCirrusSearchFragmentSize
  • $wgCirrusSearchUnlinkedArticlesToUpdate
  • $wgCirrusSearchCustomPageFields
  • $wgCirrusSearchClientSideUpdateTimeout
  • $wgCirrusSearchIgnoreOnWikiBoostTemplates
  • $wgCirrusSearchRegexMaxDeterminizedStates
  • $wgCirrusSearchInterwikiHTTPConnectTimeout
  • $wgCirrusSearchExtraIndexes
  • $wgCirrusSearchCategoryDepth
  • $wgCirrusSearchMergeSettings
  • $wgCirrusSearchClusters
  • $wgCirrusSearchCrossProjectShowMultimedia
  • $wgCirrusSearchBannedPlugins
  • $wgCirrusSearchMoreLikeThisConfig
  • $wgCirrusSearchClusterOverrides
  • $wgCirrusSearchAlternateIndices
  • $wgCirrusSearchCrossProjectBlockScorerProfiles
  • $wgCirrusSearchEnableIncomingLinkCounting
  • $wgCirrusSearchNearMatchWeight
  • $wgCirrusSearchReplicaGroup
  • $wgCirrusSearchFeedbackLink
  • $wgCirrusSearchTextcatConfig
  • $wgCirrusSearchNumCrossProjectSearchResults
  • $wgCirrusSearchLanguageDetectors
  • $wgCirrusSearchUpdateShardTimeout
  • $wgCirrusSearchEnableCrossProjectSearch
  • $wgCirrusSearchSecondTryProfiles
  • $wgCirrusSearchFullTextQueryBuilderProfiles
  • $wgCirrusSearchCompletionDefaultScore
  • $wgCirrusSearchWriteClusters
  • $wgCirrusSearchCompletionSuggesterHardLimit
  • $wgCirrusSearchRecycleCompletionSuggesterIndex
  • $wgCirrusSearchFullTextQueryBuilderProfile
  • $wgCirrusSearchTextcatModel
  • $wgCirrusSearchCompletionUseSecondTryProfile
  • $wgCirrusSearchStreamingUpdaterUsername
  • $wgCirrusSearchLogElasticRequests
  • $wgCirrusSearchConnectionAttempts
  • $wgCirrusSearchCompletionSuggesterUseAltIndexId
  • $wgCirrusSearchWikiToNameMap
  • $wgCirrusSearchMaxFullTextQueryLength
  • $wgCirrusSearchLogElasticRequestsSecret
  • $wgCirrusSearchManagedClusters
  • $wgCirrusLanguageLanguageKeywordExtraFields
  • $wgCirrusSearchCompletionSettings
  • $wgCirrusSearchNaturalTitleSort
  • $wgCirrusSearchDeduplicateInMemory
  • $wgCirrusSearchUseEventBusBridge
  • $wgCirrusSearchEnableRegex
  • $wgCirrusSearchClientSideSearchTimeout
  • $wgCirrusSearchIndexFieldsToCleanup
  • $wgCirrusSearchDeduplicateAnalysis
  • $wgCirrusSearchCategoryMax
  • $wgCirrusSearchExtraBackendLatency
  • $wgCirrusSearchNamespaceMappings
  • $wgCirrusSearchNamespaceResolutionMethod
  • $wgCirrusSearchPreferRecentUnspecifiedDecayPortion
  • $wgCirrusSearchIndexWeightedTagsPrefixMap
  • $wgCirrusSearchDocumentSizeLimiterProfiles
  • $wgCirrusSearchSearchShardTimeout
  • $wgCirrusSearchWeightedTags
  • $wgCirrusSearchRefreshInterval
  • $wgCirrusSearchSimilarityProfiles
  • $wgCirrusSearchCategoryEndpoint
  • $wgCirrusSearchMasterTimeout
  • $wgCirrusSearchPoolCounterKey
  • $wgCirrusSearchCompletionProfiles
  • $wgCirrusSearchMaxShardsPerNode
  • $wgCirrusSearchPrivateClusters
  • $wgCirrusSearchEnableArchive
  • $wgCirrusSearchUpdateDelay
  • $wgCirrusSearchInterwikiThreshold
  • $wgCirrusSearchIndexDeletes
  • $wgCirrusSearchDocumentSizeLimiterProfile
  • $wgCirrusSearchFiletypeAliases
  • $wgCirrusSearchDevelOptions
  • $wgCirrusSearchPrefixSearchStartsWithAnyWord
  • $wgCirrusSearchUpdateConflictRetryCount
  • $wgCirrusSearchInterwikiHTTPTimeout
  • $wgCirrusSearchFetchConfigFromApi
  • $wgCirrusSearchBoostTemplates
  • $wgCirrusSearchExtraIndexBoostTemplates
  • $wgCirrusSearchCompletionSuggesterSubphrases
  • $wgCirrusSearchPrefixIds
  • $wgCirrusSearchIndexedRedirects
  • $wgCirrusSearchMoreLikeThisFields
  • $wgCirrusSearchIndexAllocation
  • $wgCirrusSearchDefaultSemanticProfile
  • $wgCirrusSearchSanityCheck
  • $wgCirrusSearchStripQuestionMarks
  • $wgCirrusSearchNamespaceWeights
  • $wgCirrusSearchCrossProjectOrder
  • $wgCirrusSearchPhraseSuggestBuildVariant
  • $wgCirrusSearchIndexBaseName
  • $wgCirrusSearchMoreAccurateScoringMode
  • $wgCirrusSearchMaxPhraseTokens
  • $wgCirrusSearchCrossProjectSearchBlockList
  • $wgCirrusSearchPhraseSuggestUseOpeningText
  • $wgCirrusSearchCategoriesClientCacheTTL
  • $wgCirrusSearchMaxIncategoryOptions
  • $wgCirrusSearchEnableEventBusWeightedTags
  • $wgCirrusSearchWikimediaExtraPlugin
  • $wgCirrusSearchRescoreFunctionChains
  • $wgCirrusSearchLinkedArticlesToUpdate
  • $wgCirrusSearchRescoreProfile
  • $wgCirrusSearchPreferRecentDefaultHalfLife
  • $wgCirrusSearchDisableUpdate
  • $wgCirrusSearchFunctionRescoreWindowSize
  • $wgCirrusSearchActiveTest
  • $wgCirrusSearchPreferRecentDefaultDecayPortion
  • $wgCirrusSearchUseExperimentalHighlighter
  • $wgCirrusSearchCrossProjectProfiles
  • $wgCirrusSearchDefaultCluster
  • $wgCirrusSearchEnableAltLanguage
  • $wgCirrusSearchInterleaveConfig
  • $wgCirrusSearchPhraseRescoreWindowSize
  • $wgCirrusSearchSlowSearch
  • $wgCirrusSearchEnablePhraseSuggest
  • $wgCirrusSearchClientSideConnectTimeout
  • $wgCirrusSearchPhraseSuggestUseText
  • $wgCirrusSearchPhraseSuggestProfiles
  • $wgCirrusSearchSimilarityProfile
  • $wgCirrusSearchInterwikiSources
  • $wgCirrusSearchWeights
  • $wgCirrusSearchICUNormalizationUnicodeSetFilter
  • $wgCirrusSearchICUFoldingUnicodeSetFilter
  • $wgCirrusSearchShardCount
License: GNU General Public License 2.0 or later
Download
README
Translate the CirrusSearch extension at translatewiki.net
Vagrant role: cirrussearch
Issues: Open tasks · Report a bug

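The CirrusSearch extension uses Elasticsearch to enhance MediaWiki's search.

Elasticsearch is separate third-party software that you must install to use this extension. It is a database system that provides search and indexing; the text of your wiki pages is indexed in it to give faster and better search results. MediaWiki and Elasticsearch communicate through a web service.

See also the help page on using this extension.

Goals

  • Remove the local dependencies that make the extension difficult to install. The only dependencies are pure PHP, MediaWiki extensions, and Elasticsearch itself.
  • Provide a near-real-time search index for wiki pages that other MediaWiki extensions can extend.
  • Provide all the query options that MWSearch offers users, and more.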

Dependencies

PHP and cURL

In addition to MediaWiki's standard PHP requirements, CirrusSearch requires PHP to be compiled with cURL support.

Elasticsearch or OpenSearch

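New Elasticsearch versions change how the web service works and can break compatibility. You must install an Elasticsearch version that is compatible with the MediaWiki version you are running:

MediaWiki 1.39+ requires Elasticsearch 7.10.2 (6.8.23+ may work using the compatibility layer). For compatibility with earlier MediaWiki versions, see this revision of the page.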

MediaWiki 1.44+ is compatible with OpenSearch 1.3.

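Elasticsearch versions earlier than 6.8 are not compatible with PHP 8+.

Note that a Java installation such as OpenJDK is also required. It is best to use the official Elasticsearch Docker image or a self-hosted installation. Managed offerings such as Amazon OpenSearch (formerly Amazon Elasticsearch) can work but may need additional configuration, depending on their specifics. For example, Amazon OpenSearch only listens for Elasticsearch API requests over HTTPS on port 443 (that is, it does not expose the default Elasticsearch port 9200), so a TLS-enabled proxy such as Nginx can be used to let CirrusSearch communicate with an Amazon OpenSearch cluster.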

Elastica
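  • Elastica is a PHP library that lets CirrusSearch talk to Elasticsearch. Install Elastica by following the instructions below.

Other
  • Because the CirrusSearch extension does its work through jobs, it is recommended to set up the job queue in Redis to prevent messages such as Notice: unserialize(): Error at offset 64870 of 65535 bytes in JobQueueDB.php and follow-on errors such as Unsupported operand types. See T157759.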

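Installation

Although the instructions below say that Composer only needs to be run when installing from Git, you may still need to run it so that all required PHP dependencies are installed.

  • Download the files and extract the Elastica folder into the extensions/ directory.
    Developers and code contributors should instead install the extension from Git, using: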
    cd extensions/
    git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/Elastica
    
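  • Only when installing from Git, run Composer to install the PHP dependencies by issuing composer install --no-dev in the extension directory. (See T173141 for potential complications.)
  • Add the following code at the bottom of your LocalSettings.php file: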
    wfLoadExtension( 'Elastica' );
    
  • Done. Navigate to Special:Version on your wiki to verify that the extension is successfully installed.

CirrusSearch

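  • Download the files and extract the CirrusSearch folder into the extensions/ directory.
    Developers and code contributors should instead install the extension from Git, using: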
    cd extensions/
    git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/CirrusSearch
    
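  • Only when installing from Git, run Composer to install the PHP dependencies by issuing composer install --no-dev in the extension directory. (See T173141 for potential complications.)
  • Add the following code at the bottom of your LocalSettings.php file: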
    wfLoadExtension( 'CirrusSearch' );
    
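  • Now follow the setup instructions in the CirrusSearch README. Note that not all of the information there may apply to your version of the extension, particularly regarding the supported Elasticsearch versions.
  • Configure as required.
  • Done. Navigate to Special:Version on your wiki to verify that the extension is successfully installed.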
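
The setup steps that the README describes can be sketched as a LocalSettings.php fragment. This is only an illustration of the usual bootstrap order: `$wgDisableSearchUpdate` and `$wgSearchType` are MediaWiki settings that are not mentioned elsewhere on this page, and the exact sequence should be checked against the README shipped with your version of the extension.

```php
// Fragment for LocalSettings.php; a sketch of the bootstrap order, not a verified recipe.
wfLoadExtension( 'Elastica' );
wfLoadExtension( 'CirrusSearch' );

// Step 1 (assumed from the README): stop MediaWiki from sending search updates
// while the empty index is being created with UpdateSearchIndexConfig.php.
$wgDisableSearchUpdate = true;

// Step 2: once the index exists, remove the line above and populate the
// index with ForceSearchIndex.php.

// Step 3: when indexing has finished, uncomment the next line to make
// CirrusSearch the wiki's search backend.
// $wgSearchType = 'CirrusSearch';
```

Switching `$wgSearchType` last means the wiki only starts serving results from an index that has already been populated.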

Enabling regex queries

This is an optional step. You will need to install the search-extra plugin for this. Do so by following these steps:

  1. execute the following command:
    /usr/share/elasticsearch/bin/elasticsearch-plugin install org.wikimedia.search:extra:7.10.2-wmf12
    
  2. add the following line to your LocalSettings.php file:
    $wgCirrusSearchWikimediaExtraPlugin[ 'regex' ] = [ 'build', 'use', 'max_inspect' => 10000 ];
    
  3. restart Elasticsearch with the following command:
    systemctl restart elasticsearch
    
  4. recreate the search index by executing the following commands:
    1. php path/to/extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --startOver
      
    2. php path/to/extensions/CirrusSearch/maintenance/ForceSearchIndex.php
      


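Upgrading

Follow the upgrade instructions in the CirrusSearch UPGRADE file.

Configuration

The configuration parameters of CirrusSearch are documented in the "settings.txt" file. See also the documentation on CirrusSearch configuration profiles.

If the MySQL database name contains uppercase characters, for example "MyWikiDatabaseName", Elasticsearch cannot create the index for CirrusSearch, because Elasticsearch index names must be lowercase. To work around this, CirrusSearch provides the $wgCirrusSearchIndexBaseName configuration parameter, which you must set yourself, for example: $wgCirrusSearchIndexBaseName = 'mywikidatabasename';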
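
Instead of hard-coding the name, it can be derived from the database name. The following is a minimal sketch, assuming `$wgDBname` has already been set earlier in LocalSettings.php; it is an illustration, not a recommendation taken from the CirrusSearch documentation.

```php
// Fragment for LocalSettings.php. Assumes $wgDBname is already defined above this line.
// Lowercase the database name so that the index base name contains no uppercase characters.
$wgCirrusSearchIndexBaseName = strtolower( $wgDBname );
```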

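Hooks

The CirrusSearch extension defines a number of hooks that other extensions can use to extend the core schema and modify documents. The following hooks are available: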

API

CirrusSearch features are available in API queries. Searching happens via the normal search API, action=query&list=search; you can use CirrusSearch-specific features, such as the morelike: special prefix, to find pages related to Marie Curie and radium:

api.php?action=query&list=search&srsearch=morelike:Marie_Curie%7Cradium&srlimit=10&srprop=size&formatversion=2
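
The same request can be assembled programmatically. This is a small sketch in plain PHP: the endpoint `https://wiki.example.org/w/api.php` is a hypothetical placeholder, and the parameters are exactly those of the URL above.

```php
<?php
// Hypothetical wiki endpoint; replace with the api.php URL of your own wiki.
$endpoint = 'https://wiki.example.org/w/api.php';

$params = [
    'action' => 'query',
    'list' => 'search',
    // morelike: takes one or more page titles separated by "|".
    'srsearch' => 'morelike:' . implode( '|', [ 'Marie_Curie', 'radium' ] ),
    'srlimit' => 10,
    'srprop' => 'size',
    'formatversion' => 2,
];

// http_build_query() percent-encodes reserved characters such as "|" (as %7C).
$url = $endpoint . '?' . http_build_query( $params, '', '&' );

echo $url, PHP_EOL;
```

Building the query string from an array avoids encoding the title separator by hand.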

Custom APIs and parameters are provided for querying CirrusSearch configuration and debug information:

See also

General links
Debugging

Local development

The Elasticsearch service can be run using the cirrussearch role in MediaWiki-Vagrant.

For Docker, you can use a command like docker run -d --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:6.8.2, adjusting the image tag to an Elasticsearch version compatible with your MediaWiki release. Then follow the installation and configuration directions. If your web host also runs in a container, make sure it is on the same Docker network as the Elasticsearch container, and use elasticsearch (the container name) as the hostname in LocalSettings.php. This setup does not include the WMF plugins but can be sufficient for basic testing.
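
For the containerised setup, the hostname reference in LocalSettings.php might look like the fragment below. It is a sketch under two assumptions: the Elasticsearch container is named `elasticsearch`, as in the docker run command, and the cluster key `default` matches the value of `$wgCirrusSearchDefaultCluster` in your installation.

```php
// Fragment for LocalSettings.php. Assumes the Elasticsearch container is named
// "elasticsearch" and shares a Docker network with the web container.
$wgCirrusSearchClusters = [
    // "default" is assumed to be the cluster selected by $wgCirrusSearchDefaultCluster.
    'default' => [ 'elasticsearch' ],
];
```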