MediaWiki Extension CirrusSearch and Elasticsearch Setup: Difference between revisions

From WikiMLT
Spas (talk | contribs)
Spas (talk | contribs)
 
(42 intermediate revisions by the same user not shown)
Line 1: Line 1:
<noinclude><!--[[Category:DevOps_and_SRE|?]]-->{{ContentArticleHeader/MediaWiki|toc=off}}{{ContentArticleHeader/DevOps_and_SRE|toc=off}}{{ContentArticleHeader/Linux_Server}}</noinclude>
<noinclude><!--[[Category:DevOps_and_SRE|?]]-->{{ContentArticleHeader/MediaWiki|toc=off}}{{ContentArticleHeader/DevOps_and_SRE|toc=off}}{{ContentArticleHeader/Linux_Server}}</noinclude>
This is a short guide on how to set up [https://www.elastic.co/guide/en/elasticsearch/reference/6.8/important-settings.html Elasticsearch] for use with the MediaWiki extension [[mw:Extension:CirrusSearch|CirrusSearch]]. You should choose an appropriate Elasticsearch version [[mw:Extension:CirrusSearch#Elasticsearch|depending]] on your MediaWiki version. Currently I'm using MediaWiki 1.38, for which Elasticsearch 6.8.23+ is recommended. This version runs well on <code>openjdk-11</code>, which is the default Java version on Ubuntu Server 22.04.
This is a short guide on how to set up [https://www.elastic.co/guide/en/elasticsearch/reference/6.8/important-settings.html Elasticsearch] for use with the '''MediaWiki extension [[mw:Extension:CirrusSearch|CirrusSearch]]''', which communicates with the service through the [[mw:Extension:Elastica|Elastica]] extension. You should choose an appropriate Elasticsearch version [[mw:Extension:CirrusSearch#Elasticsearch|depending]] on your MediaWiki version. Currently I'm using MediaWiki 1.39, for which [https://www.elastic.co/downloads/past-releases/elasticsearch-7-10-2 Elasticsearch 7.10.2] is recommended. This version runs well on <code>openjdk-11</code>, which is the default Java version on Ubuntu Server 22.04.
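
The version pairing described above can be captured in a small shell helper, for example in a provisioning script. This is only a sketch: the function name <code>es_version_for_mw</code> is hypothetical, and it covers just the two pairings stated on this page (MediaWiki 1.38 with Elasticsearch 6.8.23 as the minimum, MediaWiki 1.39 with Elasticsearch 7.10.2). For any other release, consult the CirrusSearch documentation.

```shell
# Hypothetical helper: print the Elasticsearch version that this page
# pairs with a given MediaWiki release (major.minor).
# Only the two pairings mentioned on this page are covered; for 1.38 the
# printed value is the minimum recommended version (6.8.23+).
es_version_for_mw() {
    case "$1" in
        1.38) echo "6.8.23" ;;
        1.39) echo "7.10.2" ;;
        *)    echo "unknown" >&2; return 1 ;;
    esac
}
```

For example, <code>es_version_for_mw 1.39</code> prints <code>7.10.2</code>, while an unlisted release returns a non-zero exit status.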


== Setup Java and Javac ==
Elasticsearch and the Elastica extension are also required by some other MediaWiki extensions, such as [[mw:Help:Extension:Translate/Translation memories#ElasticSearch%20backend|Translate]], where Elasticsearch serves as the translation memory backend. It is also used by Nextcloud's [https://github.com/nextcloud/fulltextsearch/wiki Full text search] application, among others.
 
'''''See also [[MediaWiki Job Queue]].'''''
 
== Java Setup ==
On Ubuntu Server the default <code>jdk</code> and <code>jre</code> packages can be installed by the following command.<syntaxhighlight lang="shell" line="1">
On Ubuntu Server the default <code>jdk</code> and <code>jre</code> packages can be installed by the following command.<syntaxhighlight lang="shell" line="1">
sudo apt install -y apt-transport-https default-jdk default-jre
sudo apt install -y apt-transport-https default-jdk default-jre
Line 54: Line 58:
{{collapse/end}}
{{collapse/end}}


== Elasticsearch ==
== Elasticsearch Setup ==


There are several ways to [https://www.elastic.co/guide/en/elasticsearch/reference/6.8/important-settings.html install Elasticsearch]: via Docker, via the Apt repository, via .deb or .rpm packages, etc. I prefer to download the .deb package manually and install it. As I said before, for MediaWiki 1.38 we need version 6.8.23+.
=== Installation ===
There are several ways to [https://www.elastic.co/guide/en/elasticsearch/reference/6.8/important-settings.html install Elasticsearch]: via Docker, via the Apt repository, via .deb or .rpm packages, etc. I prefer to download the .deb package manually and install it. As noted above, for MediaWiki 1.39+ we need Elasticsearch version 7.10.2.
{{collapse/begin}}
{{collapse/begin}}
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
cd ~/Downloads
cd ~/Downloads
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.23.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.2-amd64.deb
sudo apt install ./elasticsearch-6.8.23.deb
sudo apt install ./elasticsearch-7.10.2-amd64.deb
</syntaxhighlight>
</syntaxhighlight>
{{collapse/div|#Versions}}
{{collapse/div|#Versions}}
<syntaxhighlight lang="shell" line="1" class="mlw-shell-gray">
<syntaxhighlight lang="shell" line="1" class="mlw-shell-gray mlw-collapsed-first-element">
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.23.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
</syntaxhighlight>
</syntaxhighlight>
{{collapse/end}}
{{collapse/end}}
The installed <code>Elasticsearch 6.5.4</code> works with <code>openjdk-11</code>, which is the default version in Ubuntu 20.04. If stability problems arise, it will be necessary to switch to <code>Elasticsearch 5.6.16</code>, which works with <code>openjdk-8</code>. For details, [[Mw:Topic:Vo4jribwur9xzm9z|'''see here''']].
After installing the package the Elasticsearch service must be enabled and started.<syntaxhighlight lang="shell" line="1">
sudo systemctl enable --now elasticsearch.service  # enable and start the service
systemctl status elasticsearch.service              # check the status of the service
systemctl cat elasticsearch.service                # check the current service's configuration
</syntaxhighlight>
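
In scripts it can be useful to confirm that the node actually answers HTTP requests before continuing, for example before running indexing jobs. The following is a minimal sketch, assuming <code>curl</code> is installed and Elasticsearch listens on <code>http://127.0.0.1:9200</code>. The function names <code>es_probe</code> and <code>wait_for_es</code> are hypothetical and not part of Elasticsearch or MediaWiki.

```shell
# Hypothetical probe: succeeds when the given URL answers with an HTTP
# success status within 2 seconds. Assumes curl is installed.
es_probe() {
    curl --silent --fail --max-time 2 "$1" >/dev/null 2>&1
}

# Hypothetical helper: poll the URL until it answers or the tries run out.
# Usage: wait_for_es [url] [tries] [delay_seconds]
# Note: plain POSIX sh has no local variables, so url/tries/delay/i are global.
wait_for_es() {
    url="${1:-http://127.0.0.1:9200}"
    tries="${2:-30}"
    delay="${3:-1}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        if es_probe "$url"; then
            return 0    # the node answered
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1            # gave up after the configured number of tries
}
```

A possible call is <code>wait_for_es 'http://127.0.0.1:9200' 60 1 || echo 'Elasticsearch did not come up' >&2</code>. Keeping the probe in its own function makes it easy to swap in a different check.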
 
=== Check ===
You can check whether the service works properly as follows.
{{collapse/begin}}
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
cd ~/Downloads
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.23.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
sudo apt install ./elasticsearch-6.5.4.deb
</syntaxhighlight><syntaxhighlight lang="bash">
# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
</syntaxhighlight><syntaxhighlight lang="shell" line="1">
sudo systemctl enable elasticsearch.service
sudo systemctl start elasticsearch.service
sudo systemctl status elasticsearch.service
</syntaxhighlight>Reducing the allowed memory. Even this seems like a lot to me; on [https://bg.trivictoria.org bg.trivictoria.org] it is <code>128m</code>. If there are many simultaneous requests, it may exceed the available memory and crash. On the other hand, if it is too small, it also crashes.<syntaxhighlight lang="shell" line="1">
sudo nano /etc/elasticsearch/jvm.options
</syntaxhighlight><syntaxhighlight lang="bash">
#-Xms1g
#-Xmx1g
-Xms512m
-Xmx512m
</syntaxhighlight>Add directives for automatic restart to the systemd unit.<syntaxhighlight lang="shell" line="1">
sudo cp /lib/systemd/system/elasticsearch.service ~/Downloads/elasticsearch.service.default
sudo nano /lib/systemd/system/elasticsearch.service
</syntaxhighlight><syntaxhighlight lang="bash">
[Service]
# At the end of the section
Restart=always
RestartSec=3
</syntaxhighlight>Apply the changes.<syntaxhighlight lang="shell" line="1">
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch.service
sudo systemctl status elasticsearch.service
</syntaxhighlight>Check.<syntaxhighlight lang="shell" line="1">
curl 'http://127.0.0.1:9200'
curl 'http://127.0.0.1:9200'
</syntaxhighlight><syntaxhighlight lang="json">
</syntaxhighlight>
{{collapse/div|#Output}}
<syntaxhighlight lang="json" class="mlw-collapsed-first-element">
{
{
   "name" : "HFrziWt",
   "name" : "metalevel.tech",
   "cluster_name" : "elasticsearch",
   "cluster_name" : "elasticsearch",
   "cluster_uuid" : "5qMVv8CHT3q2vv1sd8hLOw",
   "cluster_uuid" : "znG-mCHAQU6L3oVR9UIthg",
   "version" : {
   "version" : {
     "number" : "6.5.4",
     "number" : "7.10.2",
     "build_flavor" : "default",
     "build_flavor" : "default",
     "build_type" : "deb",
     "build_type" : "deb",
     "build_hash" : "d2ef93d",
     "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
     "build_date" : "2018-12-17T21:17:40.758843Z",
     "build_date" : "2021-01-13T00:42:12.435326Z",
     "build_snapshot" : false,
     "build_snapshot" : false,
     "lucene_version" : "7.5.0",
     "lucene_version" : "8.7.0",
     "minimum_wire_compatibility_version" : "5.6.0",
     "minimum_wire_compatibility_version" : "6.8.0",
     "minimum_index_compatibility_version" : "5.0.0"
     "minimum_index_compatibility_version" : "6.0.0-beta1"
   },
   },
   "tagline" : "You Know, for Search"
   "tagline" : "You Know, for Search"
}
}
</syntaxhighlight>To start regular indexing of the wiki's content, according to the configuration made in <code>/var/www/*/LocalSettings.php</code> and the documentation of [[Mw:Extension:CirrusSearch|mw:Extension:CirrusSearch]], you need to perform an initial indexing, run the jobs it creates, regenerate the content index, and empty the job queue again. The maintenance scripts described in the MediaWiki section can be used for this purpose.<syntaxhighlight lang="shell" line="1">
</syntaxhighlight>
mw-maintenance-elasticsearch-index.sh
{{collapse/end}}
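
To compare the running version with the expected one in a script, the <code>version.number</code> field can be extracted from the response of the root endpoint. The sketch below assumes the pretty-printed, one-key-per-line layout shown in the output above and uses only <code>sed</code> and <code>head</code>; a JSON-aware tool would be more robust. The function name <code>es_version_from_json</code> is hypothetical.

```shell
# Hypothetical helper: read the JSON answer of the root endpoint on stdin
# and print the value of the first "number" key (version.number).
# It relies on the pretty-printed layout with one key per line.
es_version_from_json() {
    sed -n 's/^[[:space:]]*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p' | head -n 1
}
```

A possible call is <code>curl -s 'http://127.0.0.1:9200' | es_version_from_json</code>, whose result can then be compared with the version required by the installed MediaWiki release.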
mw-maintenance-runJobs.sh cli
 
mw-maintenance-rebuildAll.sh
More detailed information can be obtained with the following command.
mw-maintenance-runJobs.sh cli
 
</syntaxhighlight>In addition, the script <code>elasticsearch-watch.sh</code> was developed; a <code>crontab</code> job uses it to perform a periodic check and, if necessary, a restart. The script sends an email to <code>vectoria@altclavis.com</code> if an event occurs.<syntaxhighlight lang="shell" line="1">
{{collapse/begin}}
<syntaxhighlight lang="shell" line="1">
curl -XGET 'http://localhost:9200/_nodes?pretty'
</syntaxhighlight>
{{collapse/div|#Output}}
<syntaxhighlight lang="json" class="mlw-collapsed-first-element mlw-pre-max-height-320">
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elasticsearch",
  "nodes" : {
    "BNfFzNWMTF20Xd5nlcwt6w" : {
      "name" : "metalevel.tech",
      "transport_address" : "127.0.0.1:9300",
      "host" : "127.0.0.1",
      "ip" : "127.0.0.1",
      "version" : "7.10.2",
      "build_flavor" : "default",
      "build_type" : "deb",
      "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
      "total_indexing_buffer" : 214748364,
      "roles" : [
        "data",
        "data_cold",
        "data_content",
        "data_hot",
        "data_warm",
        "ingest",
        "master",
        "ml",
        "remote_cluster_client",
        "transform"
      ],
      "attributes" : {
        "ml.machine_memory" : "25112887296",
        "xpack.installed" : "true",
        "transform.node" : "true",
        "ml.max_open_jobs" : "20"
      },
      "settings" : {
        "client" : {
          "type" : "node"
        },
        "cluster" : {
          "name" : "elasticsearch",
          "election" : {
            "strategy" : "supports_voting_only"
          }
        },
        "http" : {
          "type" : "security4",
          "type.default" : "netty4"
        },
        "node" : {
          "attr" : {
            "transform" : {
              "node" : "true"
            },
            "xpack" : {
              "installed" : "true"
            },
            "ml" : {
              "machine_memory" : "25112887296",
              "max_open_jobs" : "20"
            }
          },
          "name" : "metalevel.tech",
          "pidfile" : "/var/run/elasticsearch/elasticsearch.pid"
        },
        "path" : {
          "data" : [
            "/var/lib/elasticsearch"
          ],
          "logs" : "/var/log/elasticsearch",
          "home" : "/usr/share/elasticsearch"
        },
        "transport" : {
          "type" : "security4",
          "features" : {
            "x-pack" : "true"
          },
          "type.default" : "netty4"
        }
      },
      "os" : {
        "refresh_interval_in_millis" : 1000,
        "name" : "Linux",
        "pretty_name" : "Ubuntu 22.04.2 LTS",
        "arch" : "amd64",
        "version" : "5.15.0-67-generic",
        "available_processors" : 16,
        "allocated_processors" : 16
      },
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 422075,
        "mlockall" : false
      },
      "jvm" : {
        "pid" : 422075,
        "version" : "15.0.1",
        "vm_name" : "OpenJDK 64-Bit Server VM",
        "vm_version" : "15.0.1+9",
        "vm_vendor" : "AdoptOpenJDK",
        "bundled_jdk" : true,
        "using_bundled_jdk" : true,
        "start_time_in_millis" : 1677873247146,
        "mem" : {
          "heap_init_in_bytes" : 2147483648,
          "heap_max_in_bytes" : 2147483648,
          "non_heap_init_in_bytes" : 7667712,
          "non_heap_max_in_bytes" : 0,
          "direct_max_in_bytes" : 0
        },
        "gc_collectors" : [
          "G1 Young Generation",
          "G1 Old Generation"
        ],
        "memory_pools" : [
          "CodeHeap 'non-nmethods'",
          "Metaspace",
          "CodeHeap 'profiled nmethods'",
          "Compressed Class Space",
          "G1 Eden Space",
          "G1 Old Gen",
          "G1 Survivor Space",
          "CodeHeap 'non-profiled nmethods'"
        ],
        "using_compressed_ordinary_object_pointers" : "true",
        "input_arguments" : [
          "-Xshare:auto",
          "-Des.networkaddress.cache.ttl=60",
          "-Des.networkaddress.cache.negative.ttl=10",
          "-XX:+AlwaysPreTouch",
          "-Xss1m",
          "-Djava.awt.headless=true",
          "-Dfile.encoding=UTF-8",
          "-Djna.nosys=true",
          "-XX:-OmitStackTraceInFastThrow",
          "-XX:+ShowCodeDetailsInExceptionMessages",
          "-Dio.netty.noUnsafe=true",
          "-Dio.netty.noKeySetOptimization=true",
          "-Dio.netty.recycler.maxCapacityPerThread=0",
          "-Dio.netty.allocator.numDirectArenas=0",
          "-Dlog4j.shutdownHookEnabled=false",
          "-Dlog4j2.disable.jmx=true",
          "-Djava.locale.providers=SPI,COMPAT",
          "-Xms2g",
          "-Xmx2g",
          "-XX:+UseG1GC",
          "-XX:G1ReservePercent=25",
          "-XX:InitiatingHeapOccupancyPercent=30",
          "-Des.networkaddress.cache.ttl=60",
          "-Des.networkaddress.cache.negative.ttl=10",
          "-XX:+AlwaysPreTouch",
          "-Xss1m",
          "-Djava.awt.headless=true",
          "-Dfile.encoding=UTF-8",
          "-Djna.nosys=true",
          "-XX:-OmitStackTraceInFastThrow",
          "-XX:+ShowCodeDetailsInExceptionMessages",
          "-Dio.netty.noUnsafe=true",
          "-Dio.netty.noKeySetOptimization=true",
          "-Dio.netty.recycler.maxCapacityPerThread=0",
          "-Dlog4j.shutdownHookEnabled=false",
          "-Dlog4j2.disable.jmx=true",
          "-Dlog4j2.formatMsgNoLookups=true",
          "-Djava.io.tmpdir=/tmp/elasticsearch-5908758710646640467",
          "-XX:+HeapDumpOnOutOfMemoryError",
          "-XX:HeapDumpPath=/var/lib/elasticsearch",
          "-XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log",
          "-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m",
          "-Djava.locale.providers=COMPAT",
          "-XX:UseAVX=0",
          "-XX:MaxDirectMemorySize=1073741824",
          "-Des.path.home=/usr/share/elasticsearch",
          "-Des.path.conf=/etc/elasticsearch",
          "-Des.distribution.flavor=default",
          "-Des.distribution.type=deb",
          "-Des.bundled_jdk=true"
        ]
      },
      "thread_pool" : {
        "force_merge" : {
          "type" : "fixed",
          "size" : 1,
          "queue_size" : -1
        },
        "ml_datafeed" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 512,
          "keep_alive" : "1m",
          "queue_size" : -1
        },
        "searchable_snapshots_cache_fetch_async" : {
          "type" : "scaling",
          "core" : 0,
          "max" : 32,
          "keep_alive" : "30s",
          "queue_size" : -1
        },
        "fetch_shard_started" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 32,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "listener" : {
          "type" : "fixed",
          "size" : 8,
          "queue_size" : -1
        },
        "rollup_indexing" : {
          "type" : "fixed",
          "size" : 4,
          "queue_size" : 4
        },
        "search" : {
          "type" : "fixed_auto_queue_size",
          "size" : 25,
          "queue_size" : 1000
        },
        "security-crypto" : {
          "type" : "fixed",
          "size" : 8,
          "queue_size" : 1000
        },
        "ccr" : {
          "type" : "fixed",
          "size" : 32,
          "queue_size" : 100
        },
        "flush" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 5,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "fetch_shard_store" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 32,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "ml_utility" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 2048,
          "keep_alive" : "10m",
          "queue_size" : -1
        },
        "get" : {
          "type" : "fixed",
          "size" : 16,
          "queue_size" : 1000
        },
        "system_read" : {
          "type" : "fixed",
          "size" : 5,
          "queue_size" : 2000
        },
        "transform_indexing" : {
          "type" : "fixed",
          "size" : 4,
          "queue_size" : 4
        },
        "write" : {
          "type" : "fixed",
          "size" : 16,
          "queue_size" : 10000
        },
        "watcher" : {
          "type" : "fixed",
          "size" : 50,
          "queue_size" : 1000
        },
        "security-token-key" : {
          "type" : "fixed",
          "size" : 1,
          "queue_size" : 1000
        },
        "refresh" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 8,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "system_write" : {
          "type" : "fixed",
          "size" : 5,
          "queue_size" : 1000
        },
        "generic" : {
          "type" : "scaling",
          "core" : 4,
          "max" : 128,
          "keep_alive" : "30s",
          "queue_size" : -1
        },
        "warmer" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 5,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "management" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 5,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "analyze" : {
          "type" : "fixed",
          "size" : 1,
          "queue_size" : 16
        },
        "searchable_snapshots_cache_prewarming" : {
          "type" : "scaling",
          "core" : 0,
          "max" : 32,
          "keep_alive" : "30s",
          "queue_size" : -1
        },
        "ml_job_comms" : {
          "type" : "scaling",
          "core" : 4,
          "max" : 2048,
          "keep_alive" : "1m",
          "queue_size" : -1
        },
        "snapshot" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 5,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "search_throttled" : {
          "type" : "fixed_auto_queue_size",
          "size" : 1,
          "queue_size" : 100
        }
      },
      "transport" : {
        "bound_address" : [
          "[::1]:9300",
          "127.0.0.1:9300"
        ],
        "publish_address" : "127.0.0.1:9300",
        "profiles" : { }
      },
      "http" : {
        "bound_address" : [
          "[::1]:9200",
          "127.0.0.1:9200"
        ],
        "publish_address" : "127.0.0.1:9200",
        "max_content_length_in_bytes" : 104857600
      },
      "plugins" : [ ],
      "modules" : [
        {
          "name" : "aggs-matrix-stats",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Adds aggregations whose input are a list of numeric fields and output includes a matrix.",
          "classname" : "org.elasticsearch.search.aggregations.matrix.MatrixAggregationPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "analysis-common",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Adds \"built in\" analyzers to Elasticsearch.",
          "classname" : "org.elasticsearch.analysis.common.CommonAnalysisPlugin",
          "extended_plugins" : [
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "constant-keyword",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for the constant-keyword field type, which is a specialization of keyword for the case when all documents have the same value.",
          "classname" : "org.elasticsearch.xpack.constantkeyword.ConstantKeywordMapperPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "flattened",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for the flattened field type, which allows JSON objects to be flattened into a single field.",
          "classname" : "org.elasticsearch.xpack.flattened.FlattenedMapperPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "frozen-indices",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for the frozen indices functionality",
          "classname" : "org.elasticsearch.xpack.frozen.FrozenIndices",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "ingest-common",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for ingest processors that do not require additional security permissions or have large dependencies and resources",
          "classname" : "org.elasticsearch.ingest.common.IngestCommonPlugin",
          "extended_plugins" : [
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "ingest-geoip",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Ingest processor that uses looksup geo data based on ip adresses using the Maxmind geo database",
          "classname" : "org.elasticsearch.ingest.geoip.IngestGeoIpPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "ingest-user-agent",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Ingest processor that extracts information from a user agent",
          "classname" : "org.elasticsearch.ingest.useragent.IngestUserAgentPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "kibana",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Plugin exposing APIs for Kibana system indices",
          "classname" : "org.elasticsearch.kibana.KibanaPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "lang-expression",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Lucene expressions integration for Elasticsearch",
          "classname" : "org.elasticsearch.script.expression.ExpressionPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "lang-mustache",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Mustache scripting integration for Elasticsearch",
          "classname" : "org.elasticsearch.script.mustache.MustachePlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "lang-painless",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "An easy, safe and fast scripting language for Elasticsearch",
          "classname" : "org.elasticsearch.painless.PainlessPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "mapper-extras",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Adds advanced field mappers",
          "classname" : "org.elasticsearch.index.mapper.MapperExtrasPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "mapper-version",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for a field type to store sofware versions",
          "classname" : "org.elasticsearch.xpack.versionfield.VersionFieldPlugin",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "parent-join",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "This module adds the support parent-child queries and aggregations",
          "classname" : "org.elasticsearch.join.ParentJoinPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "percolator",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Percolator module adds capability to index queries and query these queries by specifying documents",
          "classname" : "org.elasticsearch.percolator.PercolatorPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "rank-eval",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "The Rank Eval module adds APIs to evaluate ranking quality.",
          "classname" : "org.elasticsearch.index.rankeval.RankEvalPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "reindex",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "The Reindex module adds APIs to reindex from one index to another or update documents in place.",
          "classname" : "org.elasticsearch.index.reindex.ReindexPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "repositories-metering-api",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Repositories metering API",
          "classname" : "org.elasticsearch.xpack.repositories.metering.RepositoriesMeteringPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "repository-url",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for URL repository",
          "classname" : "org.elasticsearch.plugin.repository.url.URLRepositoryPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "search-business-rules",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for applying business rules to search result rankings",
          "classname" : "org.elasticsearch.xpack.searchbusinessrules.SearchBusinessRules",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "searchable-snapshots",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for the searchable snapshots functionality",
          "classname" : "org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshots",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "spatial",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for Basic Spatial features",
          "classname" : "org.elasticsearch.xpack.spatial.SpatialPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "systemd",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Integrates Elasticsearch with systemd",
          "classname" : "org.elasticsearch.systemd.SystemdPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "transform",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin to transform data",
          "classname" : "org.elasticsearch.xpack.transform.Transform",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "transport-netty4",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Netty 4 based transport implementation",
          "classname" : "org.elasticsearch.transport.Netty4Plugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "unsigned-long",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for the unsigned long field type",
          "classname" : "org.elasticsearch.xpack.unsignedlong.UnsignedLongMapperPlugin",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "vectors",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for working with vectors",
          "classname" : "org.elasticsearch.xpack.vectors.Vectors",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "wildcard",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for a keyword field type with efficient wildcard search",
          "classname" : "org.elasticsearch.xpack.wildcard.Wildcard",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-analytics",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Analytics",
          "classname" : "org.elasticsearch.xpack.analytics.AnalyticsPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-async",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A module which handles common async operations",
          "classname" : "org.elasticsearch.xpack.async.AsyncResultsIndexPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-async-search",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A module which allows to track the progress of a search asynchronously.",
          "classname" : "org.elasticsearch.xpack.search.AsyncSearch",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-autoscaling",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Autoscaling",
          "classname" : "org.elasticsearch.xpack.autoscaling.Autoscaling",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-ccr",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - CCR",
          "classname" : "org.elasticsearch.xpack.ccr.Ccr",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-core",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Core",
          "classname" : "org.elasticsearch.xpack.core.XPackPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-data-streams",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Data Streams",
          "classname" : "org.elasticsearch.xpack.datastreams.DataStreamsPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-deprecation",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Deprecation",
          "classname" : "org.elasticsearch.xpack.deprecation.Deprecation",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-enrich",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Enrich",
          "classname" : "org.elasticsearch.xpack.enrich.EnrichPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-eql",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "The Elasticsearch plugin that powers EQL for Elasticsearch",
          "classname" : "org.elasticsearch.xpack.eql.plugin.EqlPlugin",
          "extended_plugins" : [
            "x-pack-ql",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-graph",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Graph",
          "classname" : "org.elasticsearch.xpack.graph.Graph",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-identity-provider",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Identity Provider",
          "classname" : "org.elasticsearch.xpack.idp.IdentityProviderPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-ilm",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Index Lifecycle Management",
          "classname" : "org.elasticsearch.xpack.ilm.IndexLifecycle",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-logstash",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Logstash",
          "classname" : "org.elasticsearch.xpack.logstash.Logstash",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-ml",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Machine Learning",
          "classname" : "org.elasticsearch.xpack.ml.MachineLearning",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : true
        },
        {
          "name" : "x-pack-monitoring",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Monitoring",
          "classname" : "org.elasticsearch.xpack.monitoring.Monitoring",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-ql",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch infrastructure plugin for EQL and SQL for Elasticsearch",
          "classname" : "org.elasticsearch.xpack.ql.plugin.QlPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-rollup",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Rollup",
          "classname" : "org.elasticsearch.xpack.rollup.Rollup",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-security",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Security",
          "classname" : "org.elasticsearch.xpack.security.Security",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-sql",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "The Elasticsearch plugin that powers SQL for Elasticsearch",
          "classname" : "org.elasticsearch.xpack.sql.plugin.SqlPlugin",
          "extended_plugins" : [
            "x-pack-ql",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-stack",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Stack",
          "classname" : "org.elasticsearch.xpack.stack.StackPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-voting-only-node",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Voting-only node",
          "classname" : "org.elasticsearch.cluster.coordination.VotingOnlyNodePlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-watcher",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Watcher",
          "classname" : "org.elasticsearch.xpack.watcher.Watcher",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : false
        }
      ],
      "ingest" : {
        "processors" : [
          {
            "type" : "append"
          },
          {
            "type" : "bytes"
          },
          {
            "type" : "circle"
          },
          {
            "type" : "convert"
          },
          {
            "type" : "csv"
          },
          {
            "type" : "date"
          },
          {
            "type" : "date_index_name"
          },
          {
            "type" : "dissect"
          },
          {
            "type" : "dot_expander"
          },
          {
            "type" : "drop"
          },
          {
            "type" : "enrich"
          },
          {
            "type" : "fail"
          },
          {
            "type" : "foreach"
          },
          {
            "type" : "geoip"
          },
          {
            "type" : "grok"
          },
          {
            "type" : "gsub"
          },
          {
            "type" : "html_strip"
          },
          {
            "type" : "inference"
          },
          {
            "type" : "join"
          },
          {
            "type" : "json"
          },
          {
            "type" : "kv"
          },
          {
            "type" : "lowercase"
          },
          {
            "type" : "pipeline"
          },
          {
            "type" : "remove"
          },
          {
            "type" : "rename"
          },
          {
            "type" : "script"
          },
          {
            "type" : "set"
          },
          {
            "type" : "set_security_user"
          },
          {
            "type" : "sort"
          },
          {
            "type" : "split"
          },
          {
            "type" : "trim"
          },
          {
            "type" : "uppercase"
          },
          {
            "type" : "urldecode"
          },
          {
            "type" : "user_agent"
          }
        ]
      },
      "aggregations" : {
        "adjacency_matrix" : {
          "types" : [
            "other"
          ]
        },
        "auto_date_histogram" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "avg" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "boxplot" : {
          "types" : [
            "histogram",
            "numeric"
          ]
        },
        "cardinality" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "geopoint",
            "geoshape",
            "ip",
            "numeric",
            "range"
          ]
        },
        "children" : {
          "types" : [
            "other"
          ]
        },
        "composite" : {
          "types" : [
            "other"
          ]
        },
        "date_histogram" : {
          "types" : [
            "boolean",
            "date",
            "numeric",
            "range"
          ]
        },
        "date_range" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "diversified_sampler" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "numeric"
          ]
        },
        "extended_stats" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "filter" : {
          "types" : [
            "other"
          ]
        },
        "filters" : {
          "types" : [
            "other"
          ]
        },
        "geo_bounds" : {
          "types" : [
            "geopoint",
            "geoshape"
          ]
        },
        "geo_centroid" : {
          "types" : [
            "geopoint",
            "geoshape"
          ]
        },
        "geo_distance" : {
          "types" : [
            "geopoint"
          ]
        },
        "geohash_grid" : {
          "types" : [
            "geopoint",
            "geoshape"
          ]
        },
        "geotile_grid" : {
          "types" : [
            "geopoint",
            "geoshape"
          ]
        },
        "global" : {
          "types" : [
            "other"
          ]
        },
        "histogram" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric",
            "range"
          ]
        },
        "ip_range" : {
          "types" : [
            "ip"
          ]
        },
        "matrix_stats" : {
          "types" : [
            "other"
          ]
        },
        "max" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "median_absolute_deviation" : {
          "types" : [
            "numeric"
          ]
        },
        "min" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "missing" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "geopoint",
            "ip",
            "numeric",
            "range"
          ]
        },
        "nested" : {
          "types" : [
            "other"
          ]
        },
        "parent" : {
          "types" : [
            "other"
          ]
        },
        "percentile_ranks" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "percentiles" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "range" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "rare_terms" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "ip",
            "numeric"
          ]
        },
        "rate" : {
          "types" : [
            "boolean",
            "numeric"
          ]
        },
        "reverse_nested" : {
          "types" : [
            "other"
          ]
        },
        "sampler" : {
          "types" : [
            "other"
          ]
        },
        "scripted_metric" : {
          "types" : [
            "other"
          ]
        },
        "significant_terms" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "ip",
            "numeric"
          ]
        },
        "significant_text" : {
          "types" : [
            "other"
          ]
        },
        "stats" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "string_stats" : {
          "types" : [
            "bytes"
          ]
        },
        "sum" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "t_test" : {
          "types" : [
            "numeric"
          ]
        },
        "terms" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "ip",
            "numeric"
          ]
        },
        "top_hits" : {
          "types" : [
            "other"
          ]
        },
        "top_metrics" : {
          "types" : [
            "other"
          ]
        },
        "value_count" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "geopoint",
            "geoshape",
            "histogram",
            "ip",
            "numeric",
            "range"
          ]
        },
        "variable_width_histogram" : {
          "types" : [
            "numeric"
          ]
        },
        "weighted_avg" : {
          "types" : [
            "numeric"
          ]
        }
      }
    }
  }
}
</syntaxhighlight>
{{collapse/end}}
 
=== Tweaks ===
 
Elasticsearch can use a huge amount of RAM. However, in my tests on small instances it works even with a heap of only <code>128m</code>. The main configuration files are located in the directory <code>/etc/elasticsearch/</code>. You can adjust the amount of RAM in use by editing the relevant lines in the file <code>jvm.options</code>. Note that the <code>Xms</code> and <code>Xmx</code> values should be equal.<syntaxhighlight lang="shell" line="1">
sudo nano /etc/elasticsearch/jvm.options
</syntaxhighlight><syntaxhighlight lang="bash">
# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
 
#-Xms512m
#-Xmx512m
-Xms4g
-Xmx4g
</syntaxhighlight>Add a <code>Restart=always</code> directive to the Elasticsearch systemd unit.<syntaxhighlight lang="shell" line="1">
sudo systemctl edit elasticsearch.service
</syntaxhighlight><syntaxhighlight lang="bash">
[Service]
# SZS/MLT Tweak
Restart=always
RestartSec=3
</syntaxhighlight>To apply the changes use the following commands.<syntaxhighlight lang="shell" line="1">
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch.service
systemctl status elasticsearch.service
systemctl cat elasticsearch.service
</syntaxhighlight>
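
The heap settings described above can be double-checked with a small helper. This is only a minimal sketch: the function name <code>check_heap_options</code> is hypothetical (it is not part of Elasticsearch), and it assumes the active options are written as uncommented <code>-Xms…</code> and <code>-Xmx…</code> lines, as in the example above.
<syntaxhighlight lang="bash" line="1">
#!/bin/bash

# Hypothetical helper: read the last uncommented -Xms and -Xmx lines
# of a jvm.options file and report whether the two values match.
check_heap_options() {
    local file="$1" xms xmx
    xms="$(grep -E '^-Xms' "$file" | tail -n 1)"
    xmx="$(grep -E '^-Xmx' "$file" | tail -n 1)"
    xms="${xms#-Xms}"   # strip the option name, keep the size (e.g. 4g)
    xmx="${xmx#-Xmx}"
    if [[ -n "$xms" && "$xms" == "$xmx" ]]; then
        echo "OK: Xms=${xms} Xmx=${xmx}"
    else
        echo "MISMATCH: Xms=${xms:-unset} Xmx=${xmx:-unset}"
        return 1
    fi
}

# Usage (reading the file may require sudo, depending on its permissions):
# check_heap_options /etc/elasticsearch/jvm.options
</syntaxhighlight>
The helper only inspects the file; the service still has to be restarted for a changed value to take effect.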
 
== MediaWiki Setup ==
The main purpose of this guide is to show how to set up Elasticsearch for use by the MediaWiki extension [[mw:Extension:CirrusSearch|'''CirrusSearch''']], and this section describes how to do that. In addition, the extension [[mw:Extension:AdvancedSearch|AdvancedSearch]] will be installed and configured.
 
If you have installed the extension [[mw:Extension:PdfHandler|PdfHandler]] (or some other file handling extension), CirrusSearch will show results from the content of the files; the configuration below shows how to boost these results. How to configure the extension Translate to use Elasticsearch is described in the MediaWiki documentation, in the article [[mw:Help:Extension:Translate/Translation memories#ElasticSearch%20backend|Translation memories]].
 
=== Install the Extensions Bundle ===
First of all, you need to install the extensions within the MediaWiki document root. The following example uses the [[mw:Download from Git|Download from Git]] approach.
 
<syntaxhighlight lang="shell" line="1" class="code-continue mlw-shell-gray">
IP="/var/www/wiki.example.com" # The DocumentRoot directory of the wiki
OWNER="www-data"              # The user that owns the $IP directory
BRANCH="REL1_39"              # The MediaWiki's branch in use
</syntaxhighlight>
<syntaxhighlight lang="shell" line="1">
cd "$IP/extensions"
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/AdvancedSearch --branch ${BRANCH}
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/Elastica --branch ${BRANCH}
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/CirrusSearch --branch ${BRANCH}
sudo chown -R ${OWNER}:${OWNER} AdvancedSearch/ Elastica/ CirrusSearch/
for ext in Elastica CirrusSearch; do (cd "$ext" && sudo -u ${OWNER} composer install --no-dev); done
</syntaxhighlight>
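
Before editing <code>LocalSettings.php</code> it can be useful to confirm that the clones landed where MediaWiki expects them. The following is a minimal sketch; <code>check_extensions</code> is a hypothetical helper name, and the only assumption is that each extension ships an <code>extension.json</code> file in its top-level directory.
<syntaxhighlight lang="bash" line="1">
#!/bin/bash

# Hypothetical helper: for every extension name given after the wiki root,
# report whether <root>/extensions/<name>/extension.json exists.
# Returns a non-zero status when at least one extension is missing.
check_extensions() {
    local root="$1" ext missing=0
    shift
    for ext in "$@"; do
        if [[ -f "${root}/extensions/${ext}/extension.json" ]]; then
            echo "found:   ${ext}"
        else
            echo "missing: ${ext}"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, with the $IP variable defined above:
# check_extensions "$IP" AdvancedSearch Elastica CirrusSearch
</syntaxhighlight>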
 
=== LocalSettings.php Configuration ===
Open the configuration file with your favorite editor and add the following lines at a suitable place (the end of the file is a good place). The example below shows the current configuration of this wiki. After the search index is built (next section), CirrusSearch should work even without the advanced setup. More options are described on the [[mw:Extension:CirrusSearch#Configuration|Extension:CirrusSearch]] page, and some undocumented options can be found in its <code>[https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/CirrusSearch/+/HEAD/extension.json extension.json]</code> file.<syntaxhighlight lang="shell" line="1">
sudo nano "$IP/LocalSettings.php"
</syntaxhighlight>
<syntaxhighlight lang="php" line="1" start="750" class="mlw-pre-max-height-320">
## Extension:AdvancedSearch
wfLoadExtension( 'AdvancedSearch' );
$wgAdvancedSearchDeepcatEnabled = false;    // https://www.mediawiki.org/wiki/Topic:Uw036nwsilvb6w3t
$wgAdvancedSearchBetaFeature = false;      // (enable it by default) https://m.mediawiki.org/wiki/Topic:Upflskaswcvrunka
$wgAdvancedSearchHighlighting = true;      // https://www.mediawiki.org/wiki/Manual:Configuration_settings_(alphabetical)
$wgOpenSearchDescriptionLength = 2500;      // https://www.mediawiki.org/wiki/Manual:$wgOpenSearchDescriptionLength
 
## Extension:Elastica
wfLoadExtension( 'Elastica' );
 
## Extension:CirrusSearch
wfLoadExtension( 'CirrusSearch' );
// $wgDisableSearchUpdate = true;
$wgSearchType = 'CirrusSearch';
$wgDebugLogGroups['CirrusSearch'] = "$IP/cache/CirrusSearch.log";
// $wgCirrusSearchIndexBaseName = 'wiki_db_name';      // https://www.mediawiki.org/wiki/Extension:CirrusSearch#Configuration
// $wgCirrusSearchServers = [ '10.120.201.1' ];        // The address of the Elasticsearch server if it is not available at 'localhost'
 
## Extension:CirrusSearch Advanced Setup
$wgCirrusSearchRescoreProfile = 'classic_noboostlinks';
// $wgCirrusSearchFullTextQueryBuilderProfiles = 'perfield_builder';
// $wgCirrusSearchCompletionProfiles = 'normal';
$wgCirrusSearchPhraseSuggestUseText = true;
$wgCirrusSearchCompletionSuggesterHardLimit = 200; // 50
$wgCirrusSearchFragmentSize = 200;
// $wgCirrusExploreSimilarResults = true;
 
// Give more weight to "file_text" in order to show
// results from the content of PDFs. This requires PdfHandler
$wgCirrusSearchWeights = [
    "title" => 20,
    "redirect" => 15,
    "category" => 8,
    "heading" => 5,
    "opening_text" => 3,
    "text" => 5,
    "auxiliary_text" => 15,
    "file_text" => 25
];
 
// https://www.mediawiki.org/wiki/Help:Namespaces#Localisation
$wgCirrusSearchNamespaceWeights = [
    "2" => 0.05,
    "4" => 0.3,
    "6" => 0.2,
    "8" => 0.05,
    "10" => 0.005,
    "12" => 0.2,
    "14" => 0.1
];
</syntaxhighlight>
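
The numeric keys of <code>$wgCirrusSearchNamespaceWeights</code> are namespace IDs. As a quick reference, the sketch below maps the IDs used above to the canonical names of the default MediaWiki namespaces. The function name <code>namespace_name</code> is hypothetical, and a wiki with custom or localized namespaces may display different names.
<syntaxhighlight lang="bash" line="1">
#!/bin/bash

# Hypothetical lookup: canonical name of a default MediaWiki namespace ID.
# Only the IDs that appear in the configuration above are covered.
namespace_name() {
    case "$1" in
        2)  echo "User" ;;
        4)  echo "Project" ;;
        6)  echo "File" ;;
        8)  echo "MediaWiki" ;;
        10) echo "Template" ;;
        12) echo "Help" ;;
        14) echo "Category" ;;
        *)  echo "unknown" ;;
    esac
}

# Print an ID-to-name table for the keys used in the configuration.
for id in 2 4 6 8 10 12 14; do
    printf '%-3s %s\n' "$id" "$(namespace_name "$id")"
done
</syntaxhighlight>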
 
=== Build Search Index ===
How to build and update the CirrusSearch/Elasticsearch index is well described in the documents [https://gerrit.wikimedia.org/g/mediawiki/extensions/CirrusSearch/%2B/HEAD/README README] and [https://gerrit.wikimedia.org/g/mediawiki/extensions/CirrusSearch/%2B/HEAD/UPGRADE UPGRADE], which come with the extension. Here, the important steps for building the index for the first time are extracted and converted into two scripts: one for a single wiki and one for a wiki family.
<syntaxhighlight lang="shell" class="code-continue" line="1">
sudo nano /usr/local/bin/"mlw-cirrussearch-elasticsearch-build-index-single-wiki.sh"
</syntaxhighlight>
<syntaxhighlight lang="bash" line="1" class="mlw-pre-max-height-320">
#!/bin/bash
 
# @author    Spas Z. Spasov <spas.z.spasov@metalevel.tech>
# @copyright 2022 Spas Z. Spasov
# @license  https://www.gnu.org/licenses/gpl-3.0.html GNU General Public License, version 3 (or later)
#
# @source    https://phabricator.wikimedia.org/source/extension-cirrussearch/browse/master/README
# @home      https://wiki.metalevel.tech/wiki/Elasticsearch_and_MediaWiki_CirrusSearch
# @install  Create an executable file with this code as its content:
#            /usr/local/bin/mlw-cirrussearch-elasticsearch-build-index-single-wiki.sh
#
# @desc      Create an Elasticsearch index for a single MediaWiki
 
: ${IP:="/var/www/wiki.example.com"} # The DocumentRoot directory of the wiki
: ${OWNER:="www-data"}              # The user that owns the $IP directory
 
CS_MAINT_DIR="${IP}/extensions/CirrusSearch/maintenance"
 
## STEP 1
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Disable CirrusSearch and Search update" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^$wgSearchType#// $wgSearchType#' "${IP}/LocalSettings.php"
sudo -u "$OWNER" sed -i 's#^// $wgDisableSearchUpdate#$wgDisableSearchUpdate#' "${IP}/LocalSettings.php"
 
printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5
 
## STEP 2
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Generate Elasticsearch Index" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/UpdateSearchIndexConfig.php" --startOver --conf "${IP}/LocalSettings.php"
 
## STEP 3
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Enable Search Update" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^$wgDisableSearchUpdate#// $wgDisableSearchUpdate#' "${IP}/LocalSettings.php"
 
printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5
 
## STEP 4
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Bootstrap the Search Index" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/ForceSearchIndex.php" --skipLinks --indexOnSkip --conf "${IP}/LocalSettings.php"
sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/ForceSearchIndex.php" --skipParse --conf "${IP}/LocalSettings.php"
 
## STEP 5
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Enable Cirrus Search" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^// $wgSearchType#$wgSearchType#' "${IP}/LocalSettings.php"
 
printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5
 
## Step 6
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Update Cirrus Search Suggestions (if the option is enabled in LocalSettings.php)" "${IP##*/}"
sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/UpdateSuggesterIndex.php" --conf "${IP}/LocalSettings.php"
 
## Step 7 - this is the most time-consuming step; you could skip it and run it later...
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Run the Job Queue" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" /usr/bin/php "${IP}/maintenance/runJobs.php" --conf "${IP}/LocalSettings.php"
</syntaxhighlight><syntaxhighlight lang="shell" class="code-continue" line="1">
sudo nano /usr/local/bin/"mlw-cirrussearch-elasticsearch-build-index-wiki-family.sh"
</syntaxhighlight>
<syntaxhighlight lang="bash" line="1" class="mlw-pre-max-height-320">
#!/bin/bash
 
# @author    Spas Z. Spasov <spas.z.spasov@metalevel.tech>
# @copyright 2022 Spas Z. Spasov
# @license  https://www.gnu.org/licenses/gpl-3.0.html GNU General Public License, version 3 (or later)
#
# @source    https://phabricator.wikimedia.org/source/extension-cirrussearch/browse/master/README
# @home      https://wiki.metalevel.tech/wiki/Elasticsearch_and_MediaWiki_CirrusSearch
# @install  Create an executable file with this code as its content:
#            /usr/local/bin/mlw-cirrussearch-elasticsearch-build-index-wiki-family.sh
#
# @desc      Create an Elasticsearch index for a MediaWiki family
#            Note: In this scenario the family members share the same DocumentRoot
#            and LocalSettings.php file (and Apache2 virtual host configuration).
 
: ${IP:="/var/www/wiki-family.example.com"} # The DocumentRoot directory of the wiki
: ${OWNER:="www-data"}                      # The user that owns the $IP directory
: ${WIKI_IDs:="bg en ru commons"}          # The IDs of the wikis in the family
 
CS_MAINT_DIR="${IP}/extensions/CirrusSearch/maintenance"
 
## STEP 1
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Disable CirrusSearch and Search update" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^$wgSearchType#// $wgSearchType#' "${IP}/LocalSettings.php"
sudo -u "$OWNER" sed -i 's#^// $wgDisableSearchUpdate#$wgDisableSearchUpdate#' "${IP}/LocalSettings.php"
 
printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5
 
## STEP 2
for WIKI_ID in ${WIKI_IDs[@]}
do
    printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Generate Elasticsearch Index" "${IP##*/}::${WIKI_ID}"
    echo; sleep 5
    sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/UpdateSearchIndexConfig.php" --startOver --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
done
 
## STEP 3
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Enable Search Update" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^$wgDisableSearchUpdate#// $wgDisableSearchUpdate#' "${IP}/LocalSettings.php"
 
printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5
 
## STEP 4
for WIKI_ID in ${WIKI_IDs[@]}
do
    printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Bootstrap the Search Index" "${IP##*/}::${WIKI_ID}"
    echo; sleep 5
    sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/ForceSearchIndex.php" --skipLinks --indexOnSkip --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
    sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/ForceSearchIndex.php" --skipParse --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
    sleep 2
done
 
## STEP 5
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Enable Cirrus Search" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^// $wgSearchType#$wgSearchType#' "${IP}/LocalSettings.php"
 
printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5
 
## STEP 6
for WIKI_ID in ${WIKI_IDs[@]}
do
    printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Update Cirrus Search Suggestions (if the option is enabled in LocalSettings.php)" "${IP##*/}::${WIKI_ID}"
    echo; sleep 5
    sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/UpdateSuggesterIndex.php" --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
done
 
## STEP 7 - this is the most time-consuming step; you could skip it and run it later...
for WIKI_ID in ${WIKI_IDs[@]}
do
    printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Run Job Queue"  "${IP##*/}::${WIKI_ID}"
    echo; sleep 5
    sudo -u "$OWNER" /usr/bin/php "${IP}/maintenance/runJobs.php" --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
done
 
 
</syntaxhighlight>
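Before running the script against a live wiki, the <code>sed</code> toggles it relies on can be tried on a scratch file. The following is only a sketch: the two sample lines are an assumption about how <code>$wgSearchType</code> and <code>$wgDisableSearchUpdate</code> appear in your <code>LocalSettings.php</code> (at the start of a line, with the disabled one prefixed by <code>// </code>), which is what the patterns in the script expect.

```bash
#!/bin/bash
# Dry run of the LocalSettings.php toggles on a scratch file (sample content is hypothetical).
set -e

SCRATCH="$(mktemp)"
trap 'rm -f "$SCRATCH"' EXIT

# Two lines as they might look before the rebuild starts.
cat > "$SCRATCH" <<'EOF'
$wgSearchType = 'CirrusSearch';
// $wgDisableSearchUpdate = true;
EOF

# STEP 1: comment out $wgSearchType, uncomment $wgDisableSearchUpdate.
sed -i 's#^$wgSearchType#// $wgSearchType#' "$SCRATCH"
sed -i 's#^// $wgDisableSearchUpdate#$wgDisableSearchUpdate#' "$SCRATCH"
STEP1="$(cat "$SCRATCH")"

# STEP 3 and STEP 5: put both lines back into their original state.
sed -i 's#^$wgDisableSearchUpdate#// $wgDisableSearchUpdate#' "$SCRATCH"
sed -i 's#^// $wgSearchType#$wgSearchType#' "$SCRATCH"
STEP5="$(cat "$SCRATCH")"

printf 'After STEP 1:\n%s\n\nAfter STEP 5:\n%s\n' "$STEP1" "$STEP5"
```

If the lines in your file are indented or use a different comment style, the patterns will not match and the audit <code>grep</code> in the script will show unchanged lines.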
 
== Update ==
When you update CirrusSearch and/or Elasticsearch, there are several possible cases, which are well described in CirrusSearch's documentation files [https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/README README] and [https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/UPGRADE UPGRADE].
 
One possible way is to use the extension's maintenance scripts as shown below; alternatively, you can rebuild the entire search index as shown above.
 
<syntaxhighlight lang="shell" line="1">
cd "$IP/extensions/CirrusSearch"
sudo -u $OWNER php maintenance/UpdateSearchIndexConfig.php --reindexAndRemoveOk --indexIdentifier now
sudo -u $OWNER php maintenance/Metastore.php --upgrade
</syntaxhighlight>To monitor the filesystem changes during the update you can use a command like the following.<syntaxhighlight lang="shell" line="1" class="multi-line-cmd mlw-shell-gray">
sudo watch "du -hs /var/lib/mysql/wiki_id && \
du -hs /var/lib/elasticsearch && \
php /var/www/wiki.metalevel.tech/maintenance/showJobs.php --type cirrusSearchElasticaWrite"
 
</syntaxhighlight>
 
== Additional Setup ==
 
=== Access Elasticsearch via SSH Tunnel ===
This approach is suitable only for testing purposes. Here is a manual on how to set it up:
 
* [[SSH Persistent Tunnel and SSHFS Mount via "systemd" units]].
 
=== Elasticsearch Watch Scripts ===
Here are two example scripts that cover the following scenarios: [1] the Elasticsearch service runs on the same instance where it is used; and [2] the Elasticsearch service runs on another host (instance) and we must be sure it is available there.<syntaxhighlight lang="shell" class="code-continue" line="1">
sudo nano /usr/local/bin/"mlw-elasticsearch-watch-local.sh"
</syntaxhighlight>
<syntaxhighlight lang="bash" line="1" class="mlw-pre-max-height-320">
#!/bin/bash -e
 
# @author    Spas Z. Spasov <spas.z.spasov@metalevel.tech>
# @copyright 2022 Spas Z. Spasov
# @license  https://www.gnu.org/licenses/gpl-3.0.html GNU General Public License, version 3 (or later)
#
# @home      https://wiki.metalevel.tech/wiki/Elasticsearch_and_MediaWiki_CirrusSearch
# @install  Create executable file within and place this code as its content:
#            /usr/local/bin/mlw-elasticsearch-watch-local.sh
#
# @desc      Test whether Elasticsearch is accessible; if not, attempt to restart it and send an email notification
 
: ${EMAIL_SENDER:="admin@example.com"}                 # The email address of the responsible person (the recipient)
: ${EMAIL_ADMIN:="admin@example.com"}                  # The email address which sends the email (the From address)
: ${EMAIL_BODY:="/tmp/elasticsearch-watch.email.body"} # Temporary file where the email body will be stored
 
if /usr/bin/curl 'http://127.0.0.1:9200' 2>&1 | /bin/grep -q 'Connection refused'
then
    {
        /bin/date
        echo
        echo "Elasticsearch failed and will be restarted..."
        /usr/bin/systemctl start elasticsearch.service
        /usr/bin/systemctl restart elasticsearch.service
    } > "$EMAIL_BODY" 2>&1
 
    /usr/bin/mail  -r "ElasticSearch Watch ${EMAIL_ADMIN}" \
                    -s "ElasticSearch was Restarted" "${EMAIL_SENDER}" \
                    -a "MIME-Version: 1.0" -a "Content-Type: text/html; charset=UTF-8" < "$EMAIL_BODY"
fi
 
</syntaxhighlight><syntaxhighlight lang="shell" class="code-continue" line="1">
sudo nano /usr/local/bin/"mlw-elasticsearch-watch-remote.sh"
</syntaxhighlight>
<syntaxhighlight lang="bash" line="1" class="mlw-pre-max-height-320">
#!/bin/bash -e
 
# @author    Spas Z. Spasov <spas.z.spasov@metalevel.tech>
# @copyright 2021 Spas Z. Spasov
# @license  https://www.gnu.org/licenses/gpl-3.0.html GNU General Public License, version 3 (or later)
#
# @home      https://wiki.metalevel.tech/wiki/Elasticsearch_and_MediaWiki_CirrusSearch
# @install  Create executable file within and place this code as its content:
#            /usr/local/bin/mlw-elasticsearch-watch-remote.sh
#
# @desc      Test whether Elasticsearch is accessible; if not, attempt to restart it and send an email notification
#            Here the test is done via SSH login to a remote instance where Elasticsearch is used
 
: ${EMAIL_SENDER:="admin@example.com"}                 # The email address of the responsible person (the recipient)
: ${EMAIL_ADMIN:="admin@example.com"}                  # The email address which sends the email (the From address)
: ${EMAIL_BODY:="/tmp/elasticsearch-watch.email.body"} # Temporary file where the email body will be stored
# Note: Bash presets HOSTNAME to the local host name, so a default assigned to it would never apply.
: ${REMOTE_HOST:="example.com"}                        # A host defined in the ssh/config file
 
if /usr/bin/ssh "$REMOTE_HOST" "curl 'http://127.0.0.1:9200' 2>&1" | /bin/grep -q 'Connection refused'
then
    {
        /bin/date
        echo
        echo "Elasticsearch on the remote instance ${REMOTE_HOST} failed and will be restarted..."
        /usr/bin/systemctl start autossh-port-forward.service
        /usr/bin/systemctl start elasticsearch.service
        /usr/bin/systemctl restart autossh-trivictoria.service
        /usr/bin/systemctl restart elasticsearch.service
    } > "$EMAIL_BODY" 2>&1
 
    /usr/bin/mail  -r "ElasticSearch Watch ${EMAIL_ADMIN}" \
                    -s "ElasticSearch was Restarted" "${EMAIL_SENDER}" \
                    -a "MIME-Version: 1.0" -a "Content-Type: text/html; charset=UTF-8" < "$EMAIL_BODY"
fi
 
</syntaxhighlight>To run the test periodically you can create a simple systemd service and timer. A pretty simple example of this approach can be found in [https://docs.gunicorn.org/en/latest/deploy.html#systemd Gunicorn's documentation]. Another way is to create a <code>crontab</code> entry as follows.<syntaxhighlight lang="shell" line="1">
sudo crontab -e
</syntaxhighlight><syntaxhighlight lang="bash">
# ElasticSearch Watch
*/5 * * * * /usr/local/bin/mlw-elasticsearch-watch-local.sh
</syntaxhighlight>
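For the systemd variant, a minimal sketch could look like the following. The function name <code>write_watch_units</code> and the unit name <code>elasticsearch-watch</code> are assumptions made for this example; the script path is the one of the local watch script above, and the 5 minute interval mirrors the <code>crontab</code> entry.

```bash
#!/bin/bash
# write_watch_units UNIT_DIR [WATCH_SCRIPT]
# Writes a oneshot service and a timer that runs it every 5 minutes.
# The unit name "elasticsearch-watch" is a hypothetical choice.
write_watch_units() {
    local unit_dir="$1"
    local watch_script="${2:-/usr/local/bin/mlw-elasticsearch-watch-local.sh}"

    cat > "${unit_dir}/elasticsearch-watch.service" <<EOF
[Unit]
Description=Elasticsearch watch

[Service]
Type=oneshot
ExecStart=${watch_script}
EOF

    cat > "${unit_dir}/elasticsearch-watch.timer" <<EOF
[Unit]
Description=Run the Elasticsearch watch every 5 minutes

[Timer]
OnBootSec=5min
OnUnitActiveSec=5min

[Install]
WantedBy=timers.target
EOF
}

# Intended use on a real host (as root):
#   write_watch_units /etc/systemd/system
#   systemctl daemon-reload
#   systemctl enable --now elasticsearch-watch.timer
```

Use either the timer or the <code>crontab</code> entry, not both, otherwise the watch script runs twice per interval.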
=== Disable CirrusSearch via PHP if Elasticsearch is not available ===
Another thing that can be done to make sure your wiki keeps working correctly is to check within <code>LocalSettings.php</code> whether Elasticsearch is available. This can be done with code like the following.<syntaxhighlight lang="php" line="1">
<?php
function isPortOpen($ipAddress, $portToCheck) {
    $fp = @fsockopen($ipAddress, $portToCheck, $errno, $errstr, 0.1);
    if (!$fp) {
        return false;
    } else {
        fclose($fp);
        return true;
    }
}
if (isPortOpen('127.0.0.1', 9300)) {
    echo '9300 Open';
} else {
    echo '9300 Closed';
}
$wgSearchType = 'CirrusSearch';
</syntaxhighlight>The <code>LocalSettings.php</code> implementation could look like:<syntaxhighlight lang="php" line="1" start="763">
if ($fp = @fsockopen('127.0.0.1', 9300, $errno, $errstr, 0.1)) {
    fclose($fp);
    $wgSearchType = 'CirrusSearch';
}
</syntaxhighlight>
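The same kind of reachability test can be tried from the shell before changing <code>LocalSettings.php</code>. This is only a sketch: the function name <code>is_port_open</code> is chosen to mirror the PHP example, and it probes port 9200 (the HTTP port used by the <code>curl</code> checks in the watch scripts) rather than 9300; adjust the port to whichever one you want to test.

```bash
#!/bin/bash
# is_port_open HOST PORT
# Succeeds if a TCP connection can be opened within 1 second.
# Relies on Bash's /dev/tcp redirection and on "timeout" from coreutils.
is_port_open() {
    local host="$1" port="$2"
    timeout 1 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

if is_port_open 127.0.0.1 9200; then
    echo "9200 open"
else
    echo "9200 closed"
fi
```

An open port only shows that something is listening; it does not prove that Elasticsearch answers queries correctly.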


* Elasticsearch Documentation: [https://www.elastic.co/guide/en/elasticsearch/reference/6.8/gc-logging.html#gc-logging GC logging]


== Access Elasticsearch via SSH ==
<noinclude>
 
* [[SSH Persistent Tunnel and SSHFS Mount via "systemd" units]]
 
----
 
== -REVIEW- ==
 
=== ElasticSearch ===
Required by [[mediawikiwiki:Topic:Vo4jribwur9xzm9z|MW:Extension:CirrusSearch]], some other mw:extensions and some extensions of [[NextCloud Installation|NextCloud]].<syntaxhighlight lang="bash">
cd ~/Downloads/elastic-search/
sudo apt install ./elasticsearch-5.6.16.deb
</syntaxhighlight>Add an automatic restart for <code>elasticsearch.service</code>, because it sometimes breaks:<syntaxhighlight lang="bash">
sudo nano /lib/systemd/system/elasticsearch.service
 
    [Service]
    ...
    Restart=always
    RestartSec=3
</syntaxhighlight>
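Changes made directly in <code>/lib/systemd/system/elasticsearch.service</code> can be overwritten when the package is upgraded. A systemd drop-in file is a common alternative; the sketch below writes the same two settings into an <code>override.conf</code>. The function name <code>write_restart_dropin</code> is hypothetical.

```bash
#!/bin/bash
# write_restart_dropin DROPIN_DIR
# Creates DROPIN_DIR/override.conf with the restart policy shown above.
write_restart_dropin() {
    local dir="$1"
    mkdir -p "$dir"
    cat > "${dir}/override.conf" <<'EOF'
[Service]
Restart=always
RestartSec=3
EOF
}

# Intended use on a real host (as root):
#   write_restart_dropin /etc/systemd/system/elasticsearch.service.d
#   systemctl daemon-reload
#   systemctl restart elasticsearch.service
```

Afterwards <code>systemctl cat elasticsearch.service</code> should list the drop-in below the packaged unit.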
 
=== Java ===
 
Downgrade required by [[mediawikiwiki:Topic:Vo4jribwur9xzm9z|MW:Extension:CirrusSearch]] and ElasticSearch 5.6.16.<noinclude>
<div id='devStage'>
{{devStage 
 | Прндл1 = DevOps and SRE
 | Прндл2 = Linux Server
 | Стадий = 6
 | Фаза   = Утвърждаване
 | Статус = Утвърден
 | ИдтПт  = Spas
 | РзбПт  = Spas
 | АвтПт  = Spas
 | УтвПт  = {{REVISIONUSER}}
 | ИдтДт  = 5.07.2022
 | РзбДт  = 5.03.2023
 | АвтДт  = 5.03.2023
 | УтвДт  = {{Today}}
 | ИдтРв  = [[Special:Permalink/27720|27720]]
 | РзбРв  = [[Special:Permalink/32371|32371]]
 | АвтРв  = [[Special:Permalink/32375|32375]]
 | РзАРв  = [[Special:Permalink/32347|32347]]
 | УтвРв  = {{REVISIONID}}
 | РзУРв  = [[Special:Permalink/32358|32358]]
}}
</div>
</noinclude>
</noinclude>

Latest revision as of 20:22, 5 March 2023

This is a short manual on how to set up Elasticsearch for use with the MediaWiki extension CirrusSearch, which communicates with the service through the extension Elastica. You should choose an appropriate Elasticsearch version depending on your MediaWiki version. Currently I'm using MediaWiki 1.39 and it is recommended to use Elasticsearch 7.10.2 with it. This version runs well on openjdk-11, which is the default Java version on Ubuntu Server 22.04.

Elasticsearch and the extension Elastica are required by some other MediaWiki extensions, such as the extension Translate, where they are used as translation memory. Elasticsearch is also used by Nextcloud's application Full text search, and more…

See also MediaWiki Job Queue.

Java Setup

On Ubuntu Server the default JDK and JRE packages can be installed with the following command.

sudo apt install -y apt-transport-https default-jdk default-jre

To check and switch the current version of Java and Javac you can use the following commands.

sudo update-alternatives --config java
#Output
There are 2 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                            Priority   Status
------------------------------------------------------------
  0            /usr/lib/jvm/java-11-openjdk-amd64/bin/java      1111      auto mode
* 1            /usr/lib/jvm/java-11-openjdk-amd64/bin/java      1111      manual mode
  2            /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java   1081      manual mode

Press <enter> to keep the current choice[*], or type selection number: 1
sudo update-alternatives --config javac
#Output
There are 2 choices for the alternative javac (providing /usr/bin/javac).

  Selection    Path                                          Priority   Status
------------------------------------------------------------
  0            /usr/lib/jvm/java-11-openjdk-amd64/bin/javac   1111      auto mode
* 1            /usr/lib/jvm/java-11-openjdk-amd64/bin/javac   1111      manual mode
  2            /usr/lib/jvm/java-8-openjdk-amd64/bin/javac    1081      manual mode

Press <enter> to keep the current choice[*], or type selection number: 1

If you are using Elasticsearch 5.x, it requires openjdk-8, which can be installed with the following commands. After the installation, use the above commands to switch the version in use.

#Details
sudo apt install openjdk-8-jre-headless 
sudo apt install openjdk-8-jdk-headless

After switching the version of Java you need to restart the Elasticsearch service if it is already installed.

sudo systemctl restart elasticsearch.service 
curl 'http://127.0.0.1:9200' # do a test

Elasticsearch Setup

Installation

There are several ways to install Elasticsearch: via Docker, via the Apt repository, via .deb or .rpm packages, etc. I prefer to manually download and install it via the .deb package. As mentioned above, for MediaWiki 1.39+ we need Elasticsearch version 7.10.2.

cd ~/Downloads
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.2-amd64.deb
sudo apt install ./elasticsearch-7.10.2-amd64.deb
#Versions
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.23.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
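The URLs above differ only in the version number and in an <code>-amd64</code> suffix that appears for the 7.x packages. A small helper can build the URL from a version string. This is a sketch: the function name <code>es_deb_url</code> is hypothetical, and the rule "suffix from major version 7 on" is inferred only from the URLs listed here, so check the download page for versions that are not in this list.

```bash
#!/bin/bash
# es_deb_url VERSION
# Prints the download URL of the Elasticsearch .deb package.
# Assumption: packages of major version 7 and later carry an "-amd64"
# suffix, older ones do not (inferred from the URLs listed above).
es_deb_url() {
    local version="$1" major suffix=""
    major="${version%%.*}"
    if [ "$major" -ge 7 ]; then
        suffix="-amd64"
    fi
    printf 'https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-%s%s.deb\n' \
        "$version" "$suffix"
}

es_deb_url 7.10.2
es_deb_url 6.8.23
```

The result can be passed straight to <code>wget</code>, for example <code>wget "$(es_deb_url 7.10.2)"</code>.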

After installing the package the Elasticsearch service must be enabled and started.

sudo systemctl enable --now elasticsearch.service   # enable and start the service
systemctl status elasticsearch.service              # check the status of the service
systemctl cat elasticsearch.service                 # check the current service's configuration

Check

You can check whether the service works properly as follows.

curl 'http://127.0.0.1:9200'
#Output
{
  "name" : "metalevel.tech",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "znG-mCHAQU6L3oVR9UIthg",
  "version" : {
    "number" : "7.10.2",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
    "build_date" : "2021-01-13T00:42:12.435326Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
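If you only need the version number from this response, for example to compare it with the version recommended for your MediaWiki release, it can be extracted with <code>sed</code>. This is a minimal sketch that assumes the pretty-printed layout shown above, with <code>"number"</code> on its own line; the function name <code>es_version</code> is hypothetical.

```bash
#!/bin/bash
# es_version
# Reads the JSON banner on stdin and prints the value of the "number" field.
# Line-based: it expects the pretty-printed output shown above.
es_version() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# On a host where Elasticsearch is running:
#   curl -s 'http://127.0.0.1:9200' | es_version

# Demonstration with two lines copied from the output above:
printf '%s\n' '    "number" : "7.10.2",' '    "build_type" : "deb",' | es_version
```

For anything beyond a quick check, a real JSON parser is the safer choice.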

More detailed information can be obtained with the next command.

curl -XGET 'http://localhost:9200/_nodes?pretty'
#Output
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elasticsearch",
  "nodes" : {
    "BNfFzNWMTF20Xd5nlcwt6w" : {
      "name" : "metalevel.tech",
      "transport_address" : "127.0.0.1:9300",
      "host" : "127.0.0.1",
      "ip" : "127.0.0.1",
      "version" : "7.10.2",
      "build_flavor" : "default",
      "build_type" : "deb",
      "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
      "total_indexing_buffer" : 214748364,
      "roles" : [
        "data",
        "data_cold",
        "data_content",
        "data_hot",
        "data_warm",
        "ingest",
        "master",
        "ml",
        "remote_cluster_client",
        "transform"
      ],
      "attributes" : {
        "ml.machine_memory" : "25112887296",
        "xpack.installed" : "true",
        "transform.node" : "true",
        "ml.max_open_jobs" : "20"
      },
      "settings" : {
        "client" : {
          "type" : "node"
        },
        "cluster" : {
          "name" : "elasticsearch",
          "election" : {
            "strategy" : "supports_voting_only"
          }
        },
        "http" : {
          "type" : "security4",
          "type.default" : "netty4"
        },
        "node" : {
          "attr" : {
            "transform" : {
              "node" : "true"
            },
            "xpack" : {
              "installed" : "true"
            },
            "ml" : {
              "machine_memory" : "25112887296",
              "max_open_jobs" : "20"
            }
          },
          "name" : "metalevel.tech",
          "pidfile" : "/var/run/elasticsearch/elasticsearch.pid"
        },
        "path" : {
          "data" : [
            "/var/lib/elasticsearch"
          ],
          "logs" : "/var/log/elasticsearch",
          "home" : "/usr/share/elasticsearch"
        },
        "transport" : {
          "type" : "security4",
          "features" : {
            "x-pack" : "true"
          },
          "type.default" : "netty4"
        }
      },
      "os" : {
        "refresh_interval_in_millis" : 1000,
        "name" : "Linux",
        "pretty_name" : "Ubuntu 22.04.2 LTS",
        "arch" : "amd64",
        "version" : "5.15.0-67-generic",
        "available_processors" : 16,
        "allocated_processors" : 16
      },
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 422075,
        "mlockall" : false
      },
      "jvm" : {
        "pid" : 422075,
        "version" : "15.0.1",
        "vm_name" : "OpenJDK 64-Bit Server VM",
        "vm_version" : "15.0.1+9",
        "vm_vendor" : "AdoptOpenJDK",
        "bundled_jdk" : true,
        "using_bundled_jdk" : true,
        "start_time_in_millis" : 1677873247146,
        "mem" : {
          "heap_init_in_bytes" : 2147483648,
          "heap_max_in_bytes" : 2147483648,
          "non_heap_init_in_bytes" : 7667712,
          "non_heap_max_in_bytes" : 0,
          "direct_max_in_bytes" : 0
        },
        "gc_collectors" : [
          "G1 Young Generation",
          "G1 Old Generation"
        ],
        "memory_pools" : [
          "CodeHeap 'non-nmethods'",
          "Metaspace",
          "CodeHeap 'profiled nmethods'",
          "Compressed Class Space",
          "G1 Eden Space",
          "G1 Old Gen",
          "G1 Survivor Space",
          "CodeHeap 'non-profiled nmethods'"
        ],
        "using_compressed_ordinary_object_pointers" : "true",
        "input_arguments" : [
          "-Xshare:auto",
          "-Des.networkaddress.cache.ttl=60",
          "-Des.networkaddress.cache.negative.ttl=10",
          "-XX:+AlwaysPreTouch",
          "-Xss1m",
          "-Djava.awt.headless=true",
          "-Dfile.encoding=UTF-8",
          "-Djna.nosys=true",
          "-XX:-OmitStackTraceInFastThrow",
          "-XX:+ShowCodeDetailsInExceptionMessages",
          "-Dio.netty.noUnsafe=true",
          "-Dio.netty.noKeySetOptimization=true",
          "-Dio.netty.recycler.maxCapacityPerThread=0",
          "-Dio.netty.allocator.numDirectArenas=0",
          "-Dlog4j.shutdownHookEnabled=false",
          "-Dlog4j2.disable.jmx=true",
          "-Djava.locale.providers=SPI,COMPAT",
          "-Xms2g",
          "-Xmx2g",
          "-XX:+UseG1GC",
          "-XX:G1ReservePercent=25",
          "-XX:InitiatingHeapOccupancyPercent=30",
          "-Des.networkaddress.cache.ttl=60",
          "-Des.networkaddress.cache.negative.ttl=10",
          "-XX:+AlwaysPreTouch",
          "-Xss1m",
          "-Djava.awt.headless=true",
          "-Dfile.encoding=UTF-8",
          "-Djna.nosys=true",
          "-XX:-OmitStackTraceInFastThrow",
          "-XX:+ShowCodeDetailsInExceptionMessages",
          "-Dio.netty.noUnsafe=true",
          "-Dio.netty.noKeySetOptimization=true",
          "-Dio.netty.recycler.maxCapacityPerThread=0",
          "-Dlog4j.shutdownHookEnabled=false",
          "-Dlog4j2.disable.jmx=true",
          "-Dlog4j2.formatMsgNoLookups=true",
          "-Djava.io.tmpdir=/tmp/elasticsearch-5908758710646640467",
          "-XX:+HeapDumpOnOutOfMemoryError",
          "-XX:HeapDumpPath=/var/lib/elasticsearch",
          "-XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log",
          "-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m",
          "-Djava.locale.providers=COMPAT",
          "-XX:UseAVX=0",
          "-XX:MaxDirectMemorySize=1073741824",
          "-Des.path.home=/usr/share/elasticsearch",
          "-Des.path.conf=/etc/elasticsearch",
          "-Des.distribution.flavor=default",
          "-Des.distribution.type=deb",
          "-Des.bundled_jdk=true"
        ]
      },
      "thread_pool" : {
        "force_merge" : {
          "type" : "fixed",
          "size" : 1,
          "queue_size" : -1
        },
        "ml_datafeed" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 512,
          "keep_alive" : "1m",
          "queue_size" : -1
        },
        "searchable_snapshots_cache_fetch_async" : {
          "type" : "scaling",
          "core" : 0,
          "max" : 32,
          "keep_alive" : "30s",
          "queue_size" : -1
        },
        "fetch_shard_started" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 32,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "listener" : {
          "type" : "fixed",
          "size" : 8,
          "queue_size" : -1
        },
        "rollup_indexing" : {
          "type" : "fixed",
          "size" : 4,
          "queue_size" : 4
        },
        "search" : {
          "type" : "fixed_auto_queue_size",
          "size" : 25,
          "queue_size" : 1000
        },
        "security-crypto" : {
          "type" : "fixed",
          "size" : 8,
          "queue_size" : 1000
        },
        "ccr" : {
          "type" : "fixed",
          "size" : 32,
          "queue_size" : 100
        },
        "flush" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 5,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "fetch_shard_store" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 32,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "ml_utility" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 2048,
          "keep_alive" : "10m",
          "queue_size" : -1
        },
        "get" : {
          "type" : "fixed",
          "size" : 16,
          "queue_size" : 1000
        },
        "system_read" : {
          "type" : "fixed",
          "size" : 5,
          "queue_size" : 2000
        },
        "transform_indexing" : {
          "type" : "fixed",
          "size" : 4,
          "queue_size" : 4
        },
        "write" : {
          "type" : "fixed",
          "size" : 16,
          "queue_size" : 10000
        },
        "watcher" : {
          "type" : "fixed",
          "size" : 50,
          "queue_size" : 1000
        },
        "security-token-key" : {
          "type" : "fixed",
          "size" : 1,
          "queue_size" : 1000
        },
        "refresh" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 8,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "system_write" : {
          "type" : "fixed",
          "size" : 5,
          "queue_size" : 1000
        },
        "generic" : {
          "type" : "scaling",
          "core" : 4,
          "max" : 128,
          "keep_alive" : "30s",
          "queue_size" : -1
        },
        "warmer" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 5,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "management" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 5,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "analyze" : {
          "type" : "fixed",
          "size" : 1,
          "queue_size" : 16
        },
        "searchable_snapshots_cache_prewarming" : {
          "type" : "scaling",
          "core" : 0,
          "max" : 32,
          "keep_alive" : "30s",
          "queue_size" : -1
        },
        "ml_job_comms" : {
          "type" : "scaling",
          "core" : 4,
          "max" : 2048,
          "keep_alive" : "1m",
          "queue_size" : -1
        },
        "snapshot" : {
          "type" : "scaling",
          "core" : 1,
          "max" : 5,
          "keep_alive" : "5m",
          "queue_size" : -1
        },
        "search_throttled" : {
          "type" : "fixed_auto_queue_size",
          "size" : 1,
          "queue_size" : 100
        }
      },
      "transport" : {
        "bound_address" : [
          "[::1]:9300",
          "127.0.0.1:9300"
        ],
        "publish_address" : "127.0.0.1:9300",
        "profiles" : { }
      },
      "http" : {
        "bound_address" : [
          "[::1]:9200",
          "127.0.0.1:9200"
        ],
        "publish_address" : "127.0.0.1:9200",
        "max_content_length_in_bytes" : 104857600
      },
      "plugins" : [ ],
      "modules" : [
        {
          "name" : "aggs-matrix-stats",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Adds aggregations whose input are a list of numeric fields and output includes a matrix.",
          "classname" : "org.elasticsearch.search.aggregations.matrix.MatrixAggregationPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "analysis-common",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Adds \"built in\" analyzers to Elasticsearch.",
          "classname" : "org.elasticsearch.analysis.common.CommonAnalysisPlugin",
          "extended_plugins" : [
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "constant-keyword",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for the constant-keyword field type, which is a specialization of keyword for the case when all documents have the same value.",
          "classname" : "org.elasticsearch.xpack.constantkeyword.ConstantKeywordMapperPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "flattened",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for the flattened field type, which allows JSON objects to be flattened into a single field.",
          "classname" : "org.elasticsearch.xpack.flattened.FlattenedMapperPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "frozen-indices",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for the frozen indices functionality",
          "classname" : "org.elasticsearch.xpack.frozen.FrozenIndices",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "ingest-common",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for ingest processors that do not require additional security permissions or have large dependencies and resources",
          "classname" : "org.elasticsearch.ingest.common.IngestCommonPlugin",
          "extended_plugins" : [
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "ingest-geoip",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Ingest processor that uses looksup geo data based on ip adresses using the Maxmind geo database",
          "classname" : "org.elasticsearch.ingest.geoip.IngestGeoIpPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "ingest-user-agent",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Ingest processor that extracts information from a user agent",
          "classname" : "org.elasticsearch.ingest.useragent.IngestUserAgentPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "kibana",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Plugin exposing APIs for Kibana system indices",
          "classname" : "org.elasticsearch.kibana.KibanaPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "lang-expression",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Lucene expressions integration for Elasticsearch",
          "classname" : "org.elasticsearch.script.expression.ExpressionPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "lang-mustache",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Mustache scripting integration for Elasticsearch",
          "classname" : "org.elasticsearch.script.mustache.MustachePlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "lang-painless",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "An easy, safe and fast scripting language for Elasticsearch",
          "classname" : "org.elasticsearch.painless.PainlessPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "mapper-extras",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Adds advanced field mappers",
          "classname" : "org.elasticsearch.index.mapper.MapperExtrasPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "mapper-version",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for a field type to store sofware versions",
          "classname" : "org.elasticsearch.xpack.versionfield.VersionFieldPlugin",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "parent-join",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "This module adds the support parent-child queries and aggregations",
          "classname" : "org.elasticsearch.join.ParentJoinPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "percolator",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Percolator module adds capability to index queries and query these queries by specifying documents",
          "classname" : "org.elasticsearch.percolator.PercolatorPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "rank-eval",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "The Rank Eval module adds APIs to evaluate ranking quality.",
          "classname" : "org.elasticsearch.index.rankeval.RankEvalPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "reindex",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "The Reindex module adds APIs to reindex from one index to another or update documents in place.",
          "classname" : "org.elasticsearch.index.reindex.ReindexPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "repositories-metering-api",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Repositories metering API",
          "classname" : "org.elasticsearch.xpack.repositories.metering.RepositoriesMeteringPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "repository-url",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for URL repository",
          "classname" : "org.elasticsearch.plugin.repository.url.URLRepositoryPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "search-business-rules",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for applying business rules to search result rankings",
          "classname" : "org.elasticsearch.xpack.searchbusinessrules.SearchBusinessRules",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "searchable-snapshots",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for the searchable snapshots functionality",
          "classname" : "org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshots",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "spatial",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for Basic Spatial features",
          "classname" : "org.elasticsearch.xpack.spatial.SpatialPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "systemd",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Integrates Elasticsearch with systemd",
          "classname" : "org.elasticsearch.systemd.SystemdPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "transform",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin to transform data",
          "classname" : "org.elasticsearch.xpack.transform.Transform",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "transport-netty4",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Netty 4 based transport implementation",
          "classname" : "org.elasticsearch.transport.Netty4Plugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "unsigned-long",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Module for the unsigned long field type",
          "classname" : "org.elasticsearch.xpack.unsignedlong.UnsignedLongMapperPlugin",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "vectors",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for working with vectors",
          "classname" : "org.elasticsearch.xpack.vectors.Vectors",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "wildcard",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A plugin for a keyword field type with efficient wildcard search",
          "classname" : "org.elasticsearch.xpack.wildcard.Wildcard",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-analytics",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Analytics",
          "classname" : "org.elasticsearch.xpack.analytics.AnalyticsPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-async",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A module which handles common async operations",
          "classname" : "org.elasticsearch.xpack.async.AsyncResultsIndexPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-async-search",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "A module which allows to track the progress of a search asynchronously.",
          "classname" : "org.elasticsearch.xpack.search.AsyncSearch",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-autoscaling",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Autoscaling",
          "classname" : "org.elasticsearch.xpack.autoscaling.Autoscaling",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-ccr",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - CCR",
          "classname" : "org.elasticsearch.xpack.ccr.Ccr",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-core",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Core",
          "classname" : "org.elasticsearch.xpack.core.XPackPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-data-streams",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Data Streams",
          "classname" : "org.elasticsearch.xpack.datastreams.DataStreamsPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-deprecation",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Deprecation",
          "classname" : "org.elasticsearch.xpack.deprecation.Deprecation",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-enrich",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Enrich",
          "classname" : "org.elasticsearch.xpack.enrich.EnrichPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-eql",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "The Elasticsearch plugin that powers EQL for Elasticsearch",
          "classname" : "org.elasticsearch.xpack.eql.plugin.EqlPlugin",
          "extended_plugins" : [
            "x-pack-ql",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-graph",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Graph",
          "classname" : "org.elasticsearch.xpack.graph.Graph",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-identity-provider",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Identity Provider",
          "classname" : "org.elasticsearch.xpack.idp.IdentityProviderPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-ilm",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Index Lifecycle Management",
          "classname" : "org.elasticsearch.xpack.ilm.IndexLifecycle",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-logstash",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Logstash",
          "classname" : "org.elasticsearch.xpack.logstash.Logstash",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-ml",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Machine Learning",
          "classname" : "org.elasticsearch.xpack.ml.MachineLearning",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : true
        },
        {
          "name" : "x-pack-monitoring",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Monitoring",
          "classname" : "org.elasticsearch.xpack.monitoring.Monitoring",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-ql",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch infrastructure plugin for EQL and SQL for Elasticsearch",
          "classname" : "org.elasticsearch.xpack.ql.plugin.QlPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-rollup",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Rollup",
          "classname" : "org.elasticsearch.xpack.rollup.Rollup",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-security",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Security",
          "classname" : "org.elasticsearch.xpack.security.Security",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-sql",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "The Elasticsearch plugin that powers SQL for Elasticsearch",
          "classname" : "org.elasticsearch.xpack.sql.plugin.SqlPlugin",
          "extended_plugins" : [
            "x-pack-ql",
            "lang-painless"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-stack",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Stack",
          "classname" : "org.elasticsearch.xpack.stack.StackPlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-voting-only-node",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Voting-only node",
          "classname" : "org.elasticsearch.cluster.coordination.VotingOnlyNodePlugin",
          "extended_plugins" : [
            "x-pack-core"
          ],
          "has_native_controller" : false
        },
        {
          "name" : "x-pack-watcher",
          "version" : "7.10.2",
          "elasticsearch_version" : "7.10.2",
          "java_version" : "1.8",
          "description" : "Elasticsearch Expanded Pack Plugin - Watcher",
          "classname" : "org.elasticsearch.xpack.watcher.Watcher",
          "extended_plugins" : [
            "x-pack-core",
            "lang-painless"
          ],
          "has_native_controller" : false
        }
      ],
      "ingest" : {
        "processors" : [
          {
            "type" : "append"
          },
          {
            "type" : "bytes"
          },
          {
            "type" : "circle"
          },
          {
            "type" : "convert"
          },
          {
            "type" : "csv"
          },
          {
            "type" : "date"
          },
          {
            "type" : "date_index_name"
          },
          {
            "type" : "dissect"
          },
          {
            "type" : "dot_expander"
          },
          {
            "type" : "drop"
          },
          {
            "type" : "enrich"
          },
          {
            "type" : "fail"
          },
          {
            "type" : "foreach"
          },
          {
            "type" : "geoip"
          },
          {
            "type" : "grok"
          },
          {
            "type" : "gsub"
          },
          {
            "type" : "html_strip"
          },
          {
            "type" : "inference"
          },
          {
            "type" : "join"
          },
          {
            "type" : "json"
          },
          {
            "type" : "kv"
          },
          {
            "type" : "lowercase"
          },
          {
            "type" : "pipeline"
          },
          {
            "type" : "remove"
          },
          {
            "type" : "rename"
          },
          {
            "type" : "script"
          },
          {
            "type" : "set"
          },
          {
            "type" : "set_security_user"
          },
          {
            "type" : "sort"
          },
          {
            "type" : "split"
          },
          {
            "type" : "trim"
          },
          {
            "type" : "uppercase"
          },
          {
            "type" : "urldecode"
          },
          {
            "type" : "user_agent"
          }
        ]
      },
      "aggregations" : {
        "adjacency_matrix" : {
          "types" : [
            "other"
          ]
        },
        "auto_date_histogram" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "avg" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "boxplot" : {
          "types" : [
            "histogram",
            "numeric"
          ]
        },
        "cardinality" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "geopoint",
            "geoshape",
            "ip",
            "numeric",
            "range"
          ]
        },
        "children" : {
          "types" : [
            "other"
          ]
        },
        "composite" : {
          "types" : [
            "other"
          ]
        },
        "date_histogram" : {
          "types" : [
            "boolean",
            "date",
            "numeric",
            "range"
          ]
        },
        "date_range" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "diversified_sampler" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "numeric"
          ]
        },
        "extended_stats" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "filter" : {
          "types" : [
            "other"
          ]
        },
        "filters" : {
          "types" : [
            "other"
          ]
        },
        "geo_bounds" : {
          "types" : [
            "geopoint",
            "geoshape"
          ]
        },
        "geo_centroid" : {
          "types" : [
            "geopoint",
            "geoshape"
          ]
        },
        "geo_distance" : {
          "types" : [
            "geopoint"
          ]
        },
        "geohash_grid" : {
          "types" : [
            "geopoint",
            "geoshape"
          ]
        },
        "geotile_grid" : {
          "types" : [
            "geopoint",
            "geoshape"
          ]
        },
        "global" : {
          "types" : [
            "other"
          ]
        },
        "histogram" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric",
            "range"
          ]
        },
        "ip_range" : {
          "types" : [
            "ip"
          ]
        },
        "matrix_stats" : {
          "types" : [
            "other"
          ]
        },
        "max" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "median_absolute_deviation" : {
          "types" : [
            "numeric"
          ]
        },
        "min" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "missing" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "geopoint",
            "ip",
            "numeric",
            "range"
          ]
        },
        "nested" : {
          "types" : [
            "other"
          ]
        },
        "parent" : {
          "types" : [
            "other"
          ]
        },
        "percentile_ranks" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "percentiles" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "range" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "rare_terms" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "ip",
            "numeric"
          ]
        },
        "rate" : {
          "types" : [
            "boolean",
            "numeric"
          ]
        },
        "reverse_nested" : {
          "types" : [
            "other"
          ]
        },
        "sampler" : {
          "types" : [
            "other"
          ]
        },
        "scripted_metric" : {
          "types" : [
            "other"
          ]
        },
        "significant_terms" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "ip",
            "numeric"
          ]
        },
        "significant_text" : {
          "types" : [
            "other"
          ]
        },
        "stats" : {
          "types" : [
            "boolean",
            "date",
            "numeric"
          ]
        },
        "string_stats" : {
          "types" : [
            "bytes"
          ]
        },
        "sum" : {
          "types" : [
            "boolean",
            "date",
            "histogram",
            "numeric"
          ]
        },
        "t_test" : {
          "types" : [
            "numeric"
          ]
        },
        "terms" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "ip",
            "numeric"
          ]
        },
        "top_hits" : {
          "types" : [
            "other"
          ]
        },
        "top_metrics" : {
          "types" : [
            "other"
          ]
        },
        "value_count" : {
          "types" : [
            "boolean",
            "bytes",
            "date",
            "geopoint",
            "geoshape",
            "histogram",
            "ip",
            "numeric",
            "range"
          ]
        },
        "variable_width_histogram" : {
          "types" : [
            "numeric"
          ]
        },
        "weighted_avg" : {
          "types" : [
            "numeric"
          ]
        }
      }
    }
  }
}

Tweaks

Elasticsearch can use a huge amount of RAM, but I have tested it on thin instances and it works even with only 128m. The main configuration files are located in the directory /etc/elasticsearch/. You can adjust the amount of RAM in use by changing the relevant lines in the file jvm.options. Note that the Xms and Xmx values must be equal.

sudo nano /etc/elasticsearch/jvm.options
# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

#-Xms512m
#-Xmx512m
-Xms4g
-Xmx4g
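
Since the two values must match, a quick check can catch a mismatch before the service is restarted. The following is only a minimal sketch: the function name heap_sizes_match is hypothetical (it is not part of Elasticsearch), and it assumes each option sits on its own uncommented line, as in the example above.

```shell
# Hypothetical helper: succeed only if the file sets -Xms and -Xmx to the same value.
heap_sizes_match() {
    local file="$1"
    local xms xmx
    # Read the value of the last uncommented -Xms / -Xmx line.
    xms=$(sed -n 's/^-Xms//p' "$file" | tail -n 1)
    xmx=$(sed -n 's/^-Xmx//p' "$file" | tail -n 1)
    # Fail when nothing was found or when the two values differ.
    [ -n "$xms" ] && [ "$xms" = "$xmx" ]
}
```

It could be used as: heap_sizes_match /etc/elasticsearch/jvm.options && echo "heap sizes match"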

Add a Restart=always directive to Elasticsearch's systemd unit.

sudo systemctl edit elasticsearch.service
[Service]
# SZS/MLT Tweak
Restart=always
RestartSec=3

To apply the changes, use the following commands.

sudo systemctl daemon-reload
sudo systemctl restart elasticsearch.service
systemctl status elasticsearch.service
systemctl cat elasticsearch.service
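
After a restart Elasticsearch may need some time before it answers requests, so it can be handy to wait for it before going on. This is a minimal sketch only. It assumes that curl is installed, that the service listens on the default http://localhost:9200 and that no authentication is required; the function name and its parameters are hypothetical.

```shell
# Hypothetical helper: poll the Elasticsearch HTTP endpoint until it answers.
# Usage: wait_for_elasticsearch [url] [tries] [delay-in-seconds]
wait_for_elasticsearch() {
    local url="${1:-http://localhost:9200}"
    local tries="${2:-30}"
    local delay="${3:-2}"
    local i=1
    while [ "$i" -le "$tries" ]; do
        # --fail makes curl return non-zero on HTTP errors as well.
        if curl --silent --fail --output /dev/null "$url"; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

It could be used as: wait_for_elasticsearch || echo "Elasticsearch did not come up in time"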

MediaWiki Setup

The main purpose of this guide is to show how to set up Elasticsearch for use by MediaWiki's extension CirrusSearch, so this section describes how to do that. In addition, the extension AdvancedSearch will be installed and configured.

If you have installed the extension PdfHandler (or some other file handling extension), CirrusSearch will show results from the files' content; the configuration below shows how to boost these results. How to configure the extension Translate to use Elasticsearch is described in MediaWiki's documentation, in the article Translation memories.

Install the Extensions Bundle

First of all, you need to install the extensions within MediaWiki's document root. The following example uses the Download from Git approach.

IP="/var/www/wiki.example.com" # The DocumentRoot directory of the wiki
OWNER="www-data"               # The user that owns the $IP directory
BRANCH="REL1_39"               # The MediaWiki's branch in use
cd "$IP/extensions"
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/AdvancedSearch --branch ${BRANCH}
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/Elastica --branch ${BRANCH}
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/CirrusSearch --branch ${BRANCH}
sudo chown -R ${OWNER}:${OWNER} AdvancedSearch/ Elastica/ CirrusSearch/
for ext in Elastica CirrusSearch; do (cd "$ext" && sudo -u ${OWNER} composer install --no-dev); done
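
Before editing LocalSettings.php it may be worth confirming that every clone succeeded. This is a small sketch, assuming each extension ships an extension.json file in its top-level directory; the function name missing_extensions is hypothetical.

```shell
# Hypothetical helper: report the extensions that have no extension.json.
# Usage: missing_extensions <extensions-dir> <name>...
missing_extensions() {
    local ext_dir="$1"
    local ext
    local missing=0
    shift
    for ext in "$@"; do
        if [ ! -f "$ext_dir/$ext/extension.json" ]; then
            echo "missing: $ext"
            missing=1
        fi
    done
    # Return 0 when everything was found, 1 otherwise.
    return "$missing"
}
```

It could be used as: missing_extensions "$IP/extensions" AdvancedSearch Elastica CirrusSearch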

LocalSettings.php Configuration

Open the configuration file with your favorite editor and place the following lines at a suitable place (the end of the file is a good place). The example below shows the current configuration of this wiki. After the search index is built (next section), CirrusSearch should work even without the advanced setup. More options are described on the Extension:CirrusSearch page; some undocumented options can also be found within its extension.json file.

sudo nano "$IP/LocalSettings.php"
## Extension:AdvancedSearch
wfLoadExtension( 'AdvancedSearch' );
$wgAdvancedSearchDeepcatEnabled = false;    // https://www.mediawiki.org/wiki/Topic:Uw036nwsilvb6w3t
$wgAdvancedSearchBetaFeature = false;       // (enable it by default) https://m.mediawiki.org/wiki/Topic:Upflskaswcvrunka
$wgAdvancedSearchHighlighting = true;       // https://www.mediawiki.org/wiki/Manual:Configuration_settings_(alphabetical)
$wgOpenSearchDescriptionLength = 2500;      // https://www.mediawiki.org/wiki/Manual:$wgOpenSearchDescriptionLength

## Extension:Elastica
wfLoadExtension( 'Elastica' );

## Extension:CirrusSearch
wfLoadExtension( 'CirrusSearch' );
// $wgDisableSearchUpdate = true;
$wgSearchType = 'CirrusSearch';
$wgDebugLogGroups['CirrusSearch'] = "$IP/cache/CirrusSearch.log";
// $wgCirrusSearchIndexBaseName = 'wiki_db_name';       // https://www.mediawiki.org/wiki/Extension:CirrusSearch#Configuration
// $wgCirrusSearchServers = [ '10.120.201.1' ];         // The address of the Elasticsearch server if it is not available at 'localhost'

## Extension:CirrusSearch Advanced Setup
$wgCirrusSearchRescoreProfile = 'classic_noboostlinks';
// $wgCirrusSearchFullTextQueryBuilderProfiles = 'perfield_builder';
// $wgCirrusSearchCompletionProfiles = 'normal';
$wgCirrusSearchPhraseSuggestUseText = true;
$wgCirrusSearchCompletionSuggesterHardLimit = 200; // 50
$wgCirrusSearchFragmentSize = 200;
// $wgCirrusExploreSimilarResults = true;

// Give more weight to "file_text" in order to show
// results from the PDFs' content. This requires PdfHandler.
$wgCirrusSearchWeights = [
    "title" => 20,
    "redirect" => 15,
    "category" => 8,
    "heading" => 5,
    "opening_text" => 3,
    "text" => 5,
    "auxiliary_text" => 15,
    "file_text" => 25
];

// https://www.mediawiki.org/wiki/Help:Namespaces#Localisation
$wgCirrusSearchNamespaceWeights = [
    "2" => 0.05,    // User
    "4" => 0.3,     // Project
    "6" => 0.2,     // File
    "8" => 0.05,    // MediaWiki
    "10" => 0.005,  // Template
    "12" => 0.2,    // Help
    "14" => 0.1     // Category
];
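
To double-check which extensions the file actually loads, the uncommented wfLoadExtension() calls can be listed. This is only a sketch with a hypothetical function name; it assumes one call per line, written with single quotes as in the example above.

```shell
# Hypothetical helper: print the name given to each uncommented wfLoadExtension() call.
list_loaded_extensions() {
    # Lines starting with "//" or "#" do not match, so commented calls are skipped.
    sed -n "s/^[[:space:]]*wfLoadExtension([[:space:]]*'\([^']*\)'.*/\1/p" "$1"
}
```

It could be used as: list_loaded_extensions "$IP/LocalSettings.php"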

Build Search Index

How to build and update the CirrusSearch/Elasticsearch index is well described in the README and UPGRADE documents which come with the extension. Here the important steps for building the index for the first time are extracted and converted to a script, one for a single wiki and one for a wiki family.

sudo nano /usr/local/bin/"mlw-cirrussearch-elasticsearch-build-index-single-wiki.sh"
#!/bin/bash

# @author    Spas Z. Spasov <spas.z.spasov@metalevel.tech>
# @copyright 2022 Spas Z. Spasov
# @license   https://www.gnu.org/licenses/gpl-3.0.html GNU General Public License, version 3 (or later)
#
# @source    https://phabricator.wikimedia.org/source/extension-cirrussearch/browse/master/README
# @home      https://wiki.metalevel.tech/wiki/Elasticsearch_and_MediaWiki_CirrusSearch
# @install   Create executable file within and place this code as its content:
#            /usr/local/bin/mlw-cirrussearch-elasticsearch-build-index-single-wiki.sh
#
# @desc      Create Elasticsearch index for a single MediaWiki

: ${IP:="/var/www/wiki.example.com"} # The DocumentRoot directory of the wiki
: ${OWNER:="www-data"}               # The user that owns the $IP directory

CS_MAINT_DIR="${IP}/extensions/CirrusSearch/maintenance"

## STEP 1
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Disable CirrusSearch and Search update" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^$wgSearchType#// $wgSearchType#' "${IP}/LocalSettings.php"
sudo -u "$OWNER" sed -i 's#^// $wgDisableSearchUpdate#$wgDisableSearchUpdate#' "${IP}/LocalSettings.php"

printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5

## STEP 2
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Generate Elasticsearch Index" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/UpdateSearchIndexConfig.php" --startOver --conf "${IP}/LocalSettings.php"

## STEP 3
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Enable Search Update" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^$wgDisableSearchUpdate#// $wgDisableSearchUpdate#' "${IP}/LocalSettings.php"

printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5

## STEP 4
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Bootstrap the Search Index" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/ForceSearchIndex.php" --skipLinks --indexOnSkip --conf "${IP}/LocalSettings.php"
sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/ForceSearchIndex.php" --skipParse --conf "${IP}/LocalSettings.php"

## STEP 5
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Enable Cirrus Search" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^// $wgSearchType#$wgSearchType#' "${IP}/LocalSettings.php"

printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5

## STEP 6
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Update Cirrus Search Suggestions (if the option is enabled in LocalSettings.php)" "${IP##*/}"
sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/UpdateSuggesterIndex.php" --conf "${IP}/LocalSettings.php"

## Step 7 - this is the most time-consuming step; you could skip it and run it later...
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Run Jobs Queue" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" /usr/bin/php "${IP}/maintenance/runJobs.php" --conf "${IP}/LocalSettings.php"

sudo nano /usr/local/bin/"mlw-cirrussearch-elasticsearch-build-index-wiki-family.sh"
#!/bin/bash

# @author    Spas Z. Spasov <spas.z.spasov@metalevel.tech>
# @copyright 2022 Spas Z. Spasov
# @license   https://www.gnu.org/licenses/gpl-3.0.html GNU General Public License, version 3 (or later)
#
# @source    https://phabricator.wikimedia.org/source/extension-cirrussearch/browse/master/README
# @home      https://wiki.metalevel.tech/wiki/Elasticsearch_and_MediaWiki_CirrusSearch
# @install   Create executable file within and place this code as its content:
#            /usr/local/bin/mlw-cirrussearch-elasticsearch-build-index-wiki-family.sh
#
# @desc      Create elastic search index for a MediaWiki Family
#            Note: In this scenario the family members share the same DocumentRoot
#            and LocalSettings.php file (and Apache2 virtual host configuration).

: ${IP:="/var/www/wiki-family.example.com"} # The DocumentRoot directory of the wiki
: ${OWNER:="www-data"}                      # The user that owns the $IP directory
: ${WIKI_IDs:="bg en ru commons"}           # Space-separated IDs of the wikis in the family

CS_MAINT_DIR="${IP}/extensions/CirrusSearch/maintenance"

## STEP 1
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Disable CirrusSearch and Search update" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^$wgSearchType#// $wgSearchType#' "${IP}/LocalSettings.php"
sudo -u "$OWNER" sed -i 's#^// $wgDisableSearchUpdate#$wgDisableSearchUpdate#' "${IP}/LocalSettings.php"

printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5

## STEP 2
for WIKI_ID in ${WIKI_IDs[@]}
do
    printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Generate Elasticsearch Index" "${IP##*/}::${WIKI_ID}"
    echo; sleep 5
    sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/UpdateSearchIndexConfig.php" --startOver --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
done

## STEP 3
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Enable Search Update" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^$wgDisableSearchUpdate#// $wgDisableSearchUpdate#' "${IP}/LocalSettings.php"

printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5

## STEP 4
for WIKI_ID in ${WIKI_IDs[@]}
do
    printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Bootstrap the Search Index" "${IP##*/}::${WIKI_ID}"
    echo; sleep 5
    sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/ForceSearchIndex.php" --skipLinks --indexOnSkip --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
    sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/ForceSearchIndex.php" --skipParse --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
    sleep 2
done

## STEP 5
printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Enable Cirrus Search" "${IP##*/}"
echo; sleep 5
sudo -u "$OWNER" sed -i 's#^// $wgSearchType#$wgSearchType#' "${IP}/LocalSettings.php"

printf -- '\n**\n%s\n*\n' "LocalSettings.php Audit"
sudo -u "$OWNER" grep '$wgSearchType\|$wgDisableSearchUpdate = true' "${IP}/LocalSettings.php"
echo; sleep 5

## Step 6
for WIKI_ID in ${WIKI_IDs[@]}
do
    printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Update Cirrus Search Suggestions (if the option is enabled in LocalSettings.php)" "${IP##*/}::${WIKI_ID}"
    echo; sleep 5
    sudo -u "$OWNER" /usr/bin/php "${CS_MAINT_DIR}/UpdateSuggesterIndex.php" --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
done

## Step 7 - this is the most time-consuming step; you could skip it and run it later...
for WIKI_ID in ${WIKI_IDs[@]}
do
    printf -- '\n**\n%s for Wiki:"%s" \n*\n\n' "Run Jobs Queue"  "${IP##*/}::${WIKI_ID}"
    echo; sleep 5
    sudo -u "$OWNER" /usr/bin/php "${IP}/maintenance/runJobs.php" --wiki "${WIKI_ID}" #--conf "${IP}/LocalSettings.php"
done
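
The comment/uncomment toggling used in Steps 1, 3 and 5 can be tried safely on a scratch file before it touches a real LocalSettings.php. The following is only a minimal sketch: the helper names `comment_setting` and `uncomment_setting` are hypothetical and not part of the scripts above, and GNU `sed` (as shipped with Ubuntu) is assumed.

```shell
#!/bin/bash
# Hypothetical helpers that mirror the sed expressions of Steps 1, 3 and 5.
# The variable name is passed WITHOUT the leading dollar sign.

comment_setting() {     # usage: comment_setting wgSearchType /path/to/file
    sed -i 's#^\$'"$1"'#// $'"$1"'#' "$2"
}

uncomment_setting() {   # usage: uncomment_setting wgSearchType /path/to/file
    sed -i 's#^// \$'"$1"'#$'"$1"'#' "$2"
}

# Demo on a scratch file instead of the real LocalSettings.php
scratch="$(mktemp)"
printf '%s\n' "\$wgSearchType = 'CirrusSearch';" \
              '// $wgDisableSearchUpdate = true;' > "$scratch"

comment_setting   wgSearchType           "$scratch"   # what Step 1 does
uncomment_setting wgDisableSearchUpdate  "$scratch"   # what Step 1 does
cat "$scratch"
rm -f "$scratch"
```

Because both patterns are anchored at the start of the line, running the same helper twice does not add a second `//` or strip anything further, so the steps can be repeated if the script is interrupted.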

Update

When you update CirrusSearch and/or Elasticsearch there are several possible cases, which are well described in CirrusSearch's documentation files README and UPGRADE.

One possible way is to use the extension's maintenance scripts as shown below, or you can rebuild the entire search index as shown above :)

cd "$IP/extensions/CirrusSearch"
sudo -u $OWNER php maintenance/UpdateSearchIndexConfig.php --reindexAndRemoveOk --indexIdentifier now
sudo -u $OWNER php maintenance/Metastore.php --upgrade

To monitor the filesystem changes during the update you can use a command like the following.

sudo watch "du -hs /var/lib/mysql/wiki_id && \
du -hs /var/lib/elasticsearch && \
php /var/www/wiki.metalevel.tech/maintenance/showJobs.php --type cirrusSearchElasticaWrite"

Additional Setup

Access Elasticsearch via SSH Tunnel

This approach is suitable only for testing purposes. Here is a manual on how to set it up:

Elasticsearch Watch Scripts

Here are two example scripts that cover the following scenarios: [1] the Elasticsearch service runs on the same instance where it is used; and [2] the Elasticsearch service runs on another host (instance) and we must be sure it is available there.

sudo nano /usr/local/bin/"mlw-elasticsearch-watch-local.sh"
#!/bin/bash -e

# @author    Spas Z. Spasov <spas.z.spasov@metalevel.tech>
# @copyright 2022 Spas Z. Spasov
# @license   https://www.gnu.org/licenses/gpl-3.0.html GNU General Public License, version 3 (or later)
#
# @home      https://wiki.metalevel.tech/wiki/Elasticsearch_and_MediaWiki_CirrusSearch
# @install   Create executable file within and place this code as its content:
#            /usr/local/bin/mlw-elasticsearch-watch-local.sh
#
# @desc      Test whether Elasticsearch is accessible; if not, attempt to restart it and send an email notification

: ${EMAIL_SENDER:="admin@example.com"}                 # The email address of the responsible person (the recipient)
: ${EMAIL_ADMIN:="admin@example.com"}                  # The email address which sends the email (the From address)
: ${EMAIL_BODY:="/tmp/elasticsearch-watch.email.body"} # Temporary file where the email body will be stored

if /usr/bin/curl 'http://127.0.0.1:9200' 2>&1 | /bin/grep -q 'Connection refused'
then
    {
        /bin/date
        echo
        echo "ElasticSearch failed and will be restarted..."
        /usr/bin/systemctl start elasticsearch.service
        /usr/bin/systemctl restart elasticsearch.service
    } > "$EMAIL_BODY" 2>&1

    /usr/bin/mail   -r "ElasticSearch Watch ${EMAIL_ADMIN}" \
                    -s "ElasticSearch was Restarted" "${EMAIL_SENDER}" \
                    -a "MIME-Version: 1.0" -a "Content-Type: text/html; charset=UTF-8" < "$EMAIL_BODY"
fi

sudo nano /usr/local/bin/"mlw-elasticsearch-watch-remote.sh"
#!/bin/bash -e

# @author    Spas Z. Spasov <spas.z.spasov@metalevel.tech>
# @copyright 2021 Spas Z. Spasov
# @license   https://www.gnu.org/licenses/gpl-3.0.html GNU General Public License, version 3 (or later)
#
# @home      https://wiki.metalevel.tech/wiki/Elasticsearch_and_MediaWiki_CirrusSearch
# @install   Create executable file within and place this code as its content:
#            /usr/local/bin/mlw-elasticsearch-watch-remote.sh
#
# @desc      Test whether Elasticsearch is accessible; if not, attempt to restart it and send an email notification
#            Here the test is done via SSH login to a remote instance where Elasticsearch is used

: ${EMAIL_SENDER:="admin@example.com"}                 # The email address of the responsible person (the recipient)
: ${EMAIL_ADMIN:="admin@example.com"}                  # The email address which sends the email (the From address)
: ${EMAIL_BODY:="/tmp/elasticsearch-watch.email.body"} # Temporary file where the email body will be stored
# Note: $HOSTNAME is preset by Bash to the local host name, so a default assigned to it
# would never apply; a separate variable is used for the remote host instead.
: ${REMOTE_HOST:="example.com"}                        # A hostname defined in the ssh/config file

if /usr/bin/ssh "$REMOTE_HOST" "curl 'http://127.0.0.1:9200' 2>&1" | /bin/grep -q 'Connection refused'
then
    {
        /bin/date
        echo
        echo "ElasticSearch on remote instance - ${REMOTE_HOST} failed and will be restarted..."
        /usr/bin/systemctl start autossh-port-forward.service
        /usr/bin/systemctl start elasticsearch.service
        /usr/bin/systemctl restart autossh-port-forward.service
        /usr/bin/systemctl restart elasticsearch.service
    } > "$EMAIL_BODY" 2>&1

    /usr/bin/mail   -r "ElasticSearch Watch ${EMAIL_ADMIN}" \
                    -s "ElasticSearch was Restarted" "${EMAIL_SENDER}" \
                    -a "MIME-Version: 1.0" -a "Content-Type: text/html; charset=UTF-8" < "$EMAIL_BODY"
fi
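
Both watch scripts react only when curl prints "Connection refused", so a timeout or a name-resolution failure would go unnoticed. A possible refinement, sketched here under assumptions, is to look at curl's exit status instead of its output. The function names `describe_curl_exit` and `es_probe` and the five second limit are illustrative choices, not part of the original scripts.

```shell
#!/bin/bash
# describe_curl_exit: map a curl exit status to a short label.
describe_curl_exit() {
    case "$1" in
        0)  echo "up" ;;
        6)  echo "dns-failure" ;;          # curl: could not resolve host
        7)  echo "connection-refused" ;;   # curl: failed to connect to host
        22) echo "http-error" ;;           # curl --fail: HTTP status >= 400
        28) echo "timeout" ;;              # curl: operation timed out
        *)  echo "error-$1" ;;
    esac
}

# es_probe: quiet request against Elasticsearch; returns curl's exit status.
es_probe() {
    curl --silent --fail --max-time 5 --output /dev/null \
         "${1:-http://127.0.0.1:9200}"
}

# The "|| status=$?" form keeps the check usable under "bash -e".
status=0
es_probe "http://127.0.0.1:9200" || status=$?
echo "Elasticsearch: $(describe_curl_exit "$status")"
```

In a watch script the restart and notification branch could then run for any label other than `up`, rather than for one specific error message.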

To run the test periodically you can create a simple systemd service and timer. A pretty simple example of this approach can be found within Gunicorn's documentation. Another way is to create a crontab entry as follows.

sudo crontab -e
# ElasticSearch Watch
*/5 * * * * /usr/local/bin/mlw-elasticsearch-watch-local.sh
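
As an alternative to the crontab entry, the systemd service and timer mentioned above could look roughly like the units printed by this sketch. The unit name `elasticsearch-watch` and the two helper functions are assumptions made for illustration; the five minute interval simply mirrors the cron schedule.

```shell
#!/bin/bash
# Hypothetical unit name; the script path is the local watch script from above.
UNIT_NAME="elasticsearch-watch"
WATCH_SCRIPT="/usr/local/bin/mlw-elasticsearch-watch-local.sh"

# Print a oneshot service that runs the watch script once.
print_service_unit() {
    cat <<EOF
[Unit]
Description=Check Elasticsearch and restart it if it is down

[Service]
Type=oneshot
ExecStart=${WATCH_SCRIPT}
EOF
}

# Print a timer that triggers the service every 5 minutes.
print_timer_unit() {
    cat <<EOF
[Unit]
Description=Run ${UNIT_NAME}.service every 5 minutes

[Timer]
OnCalendar=*:0/5
Unit=${UNIT_NAME}.service

[Install]
WantedBy=timers.target
EOF
}

print_service_unit
echo
print_timer_unit
```

To install them, the output of each function can be written to `/etc/systemd/system/elasticsearch-watch.service` and `/etc/systemd/system/elasticsearch-watch.timer` (for example with `sudo tee`), followed by `sudo systemctl daemon-reload` and `sudo systemctl enable --now elasticsearch-watch.timer`.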

Disable CirrusSearch via PHP if Elasticsearch is not available

Another thing that can be done to make sure your wiki keeps working correctly is to check within LocalSettings.php whether Elasticsearch is available. This can be done by implementing the following code.

<?php
function isPortOpen($ipAddress, $portToCheck) {
    $fp = @fsockopen($ipAddress, $portToCheck, $errno, $errstr, 0.1);
    if (!$fp) {
        return false;
    } else {
        fclose($fp);
        return true;
    }
}

if (isPortOpen('127.0.0.1', 9300)) {
    echo '9300 Open';
} else {
    echo '9300 Closed';
}
$wgSearchType = 'CirrusSearch';

The LocalSettings.php implementation could look like:

// Keep the socket handle in $fp so that it can be closed again
if ($fp = @fsockopen('127.0.0.1', 9300, $errno, $errstr, 0.1)) {
    fclose($fp);
    $wgSearchType = 'CirrusSearch';
}
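
Before relying on the PHP check, the same port test can be reproduced from the command line. This is only a sketch: `is_port_open` is a hypothetical helper, and it assumes Bash (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command are installed. The PHP snippet probes 9300, the Elasticsearch transport port; 9200 is the HTTP port queried by the curl checks earlier, and either can be passed as the second argument.

```shell
#!/bin/bash
# is_port_open: succeed if a TCP connection to host:port opens within 1 second.
is_port_open() {
    local host="$1" port="$2"
    timeout 1 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

if is_port_open 127.0.0.1 9300; then
    echo "9300 Open"
else
    echo "9300 Closed"
fi
```

If this reports the port as closed while the wiki is expected to use CirrusSearch, the condition in LocalSettings.php will leave `$wgSearchType` unset and MediaWiki will use its default search.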

References