MediaWiki Extension CirrusSearch and Elasticsearch Setup
Revision as of 10:33, 30 August 2022
This is a short guide to setting up Elasticsearch for use with the MediaWiki extension CirrusSearch. Choose an Elasticsearch version appropriate for your MediaWiki version. I am currently using MediaWiki 1.38, for which Elasticsearch 6.8.23+ is recommended. This version runs well on openjdk-11, which is the default Java version on Ubuntu Server 22.04.
Setup Java and Javac
To check and switch the current version of Java and Javac you can use the following commands.
sudo update-alternatives --config java
There are 2 choices for the alternative java (providing /usr/bin/java).
Selection Path Priority Status
------------------------------------------------------------
0 /usr/lib/jvm/java-11-openjdk-amd64/bin/java 1111 auto mode
* 1 /usr/lib/jvm/java-11-openjdk-amd64/bin/java 1111 manual mode
2 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java 1081 manual mode
Press <enter> to keep the current choice[*], or type selection number: 1
sudo update-alternatives --config javac
There are 2 choices for the alternative javac (providing /usr/bin/javac).
Selection Path Priority Status
------------------------------------------------------------
0 /usr/lib/jvm/java-11-openjdk-amd64/bin/javac 1111 auto mode
* 1 /usr/lib/jvm/java-11-openjdk-amd64/bin/javac 1111 manual mode
2 /usr/lib/jvm/java-8-openjdk-amd64/bin/javac 1081 manual mode
Press <enter> to keep the current choice[*], or type selection number: 1
If you are using Elasticsearch 5.x, it requires openjdk-8. Install it, select it as the default, and restart the service:
sudo apt install openjdk-8-jre-headless
sudo apt install openjdk-8-jdk-headless
sudo update-alternatives --config java
sudo update-alternatives --config javac
sudo systemctl restart elasticsearch.service
curl 'http://127.0.0.1:9200' # do a test
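After switching alternatives, it can help to confirm which major Java version is active before restarting Elasticsearch. The following is a minimal sketch and not part of the original setup: the helper name `java_major` is hypothetical, and it assumes version strings shaped like `1.8.0_342` (Java 8 style) or `11.0.16` (Java 9 and later).

```shell
# Print the major Java version for a given version string.
# Assumption: input looks like "1.8.0_342" or "11.0.16".
java_major() {
  v="$1"
  case "$v" in
    1.*)
      # Old scheme: "1.8.0_342" -> "8"
      v="${v#1.}"
      echo "${v%%.*}"
      ;;
    *)
      # New scheme: "11.0.16" -> "11"
      echo "${v%%[.+-]*}"
      ;;
  esac
}

# Possible use on a real system (assumes `java` is on PATH):
#   java_major "$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
```

A result of 8 would match what Elasticsearch 5.x expects, and 11 what the 6.x setup described here uses.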
ElasticSearch
Elasticsearch 6.5.4, which is installed here, works with openjdk-11, the default Java version in Ubuntu 20.04. If stability problems occur, it will be necessary to switch to Elasticsearch 5.6.16, which works with openjdk-8. For details, see the CirrusSearch talk topic on Java version compatibility (mw:Topic:Vo4jribwur9xzm9z).
sudo apt install -y apt-transport-https default-jdk default-jre
cd ~/Downloads
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.23.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
sudo apt install ./elasticsearch-6.5.4.deb
# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
sudo systemctl enable elasticsearch.service
sudo systemctl start elasticsearch.service
sudo systemctl status elasticsearch.service
Reduce the allowed heap memory. Even this value seems high to me; on bg.trivictoria.org it is 128m. With many simultaneous requests, Elasticsearch may exceed the available memory and crash. On the other hand, if the heap is too small, it crashes as well.
sudo nano /etc/elasticsearch/jvm.options
#-Xms1g
#-Xmx1g
-Xms512m
-Xmx512m
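The same edit can be scripted instead of done by hand in an editor. This is a sketch under assumptions: the function name `set_heap` is hypothetical, and it assumes the heap settings appear as lines starting with `-Xms` or `-Xmx`, as in the excerpt above. It comments out the active lines and appends the new values, which mirrors the manual change.

```shell
# Comment out active -Xms/-Xmx lines in a jvm.options file and append new ones.
# $1 = path to the file, $2 = heap size (for example 512m)
set_heap() {
  file="$1"
  size="$2"
  # Prefix existing heap settings with '#', keeping them for reference.
  # Writing back with cat preserves the owner and mode of the original file.
  sed -E 's/^(-Xm[sx].*)$/#\1/' "$file" > "$file.tmp" \
    && cat "$file.tmp" > "$file" \
    && rm -f "$file.tmp"
  # Append the new minimum and maximum heap size (kept equal, as above).
  printf '%s\n' "-Xms$size" "-Xmx$size" >> "$file"
}
```

It is safer to try this on a copy of `/etc/elasticsearch/jvm.options` first; running it twice would append the new values a second time.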
Add directives for automatic restart to the systemd unit.
sudo cp /lib/systemd/system/elasticsearch.service ~/Downloads/elasticsearch.service.default
sudo nano /lib/systemd/system/elasticsearch.service
[Service]
# At the end of the section
Restart=always
RestartSec=3
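Editing the packaged unit file under `/lib/systemd/system` works, but a package upgrade may overwrite it. As an alternative sketch (not the method used above), systemd also reads drop-in files from a `<unit>.d` directory under `/etc/systemd/system`. The function name `write_restart_dropin` and the file name `override.conf` are illustrative choices.

```shell
# Write a systemd drop-in that adds the restart directives.
# $1 = drop-in directory, e.g. /etc/systemd/system/elasticsearch.service.d
write_restart_dropin() {
  dir="$1"
  mkdir -p "$dir"
  cat > "$dir/override.conf" <<'EOF'
[Service]
Restart=always
RestartSec=3
EOF
}
```

Run as root against `/etc/systemd/system/elasticsearch.service.d`, this leaves the packaged unit untouched; `systemctl daemon-reload` is still needed afterwards.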
Apply the changes.
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch.service
sudo systemctl status elasticsearch.service
Verify.
curl 'http://127.0.0.1:9200'
{
"name" : "HFrziWt",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "5qMVv8CHT3q2vv1sd8hLOw",
"version" : {
"number" : "6.5.4",
"build_flavor" : "default",
"build_type" : "deb",
"build_hash" : "d2ef93d",
"build_date" : "2018-12-17T21:17:40.758843Z",
"build_snapshot" : false,
"lucene_version" : "7.5.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
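For scripted checks, the version number can be pulled out of that response. A minimal sketch, assuming the response is pretty-printed with `"number"` on its own line as in the output above; the function name `es_version` is hypothetical, and a proper JSON parser would be more robust if one is available.

```shell
# Read the Elasticsearch root response on stdin and print version.number.
# Assumption: the "number" field sits on its own line, as in the sample output.
es_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Possible use:
#   curl --silent 'http://127.0.0.1:9200' | es_version
```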
To start regular indexing of the wiki content, following the configuration in /var/www/*/LocalSettings.php and the documentation of mw:Extension:CirrusSearch, four steps are needed: run the initial indexing, run the jobs it creates, rebuild the content index, and empty the job queue again. The maintenance scripts described in the MediaWiki section can be used for this.
mw-maintenance-elasticsearch-index.sh
mw-maintenance-runJobs.sh cli
mw-maintenance-rebuildAll.sh
mw-maintenance-runJobs.sh cli
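Because each step depends on the previous one, it may be worth stopping at the first failure rather than running all four blindly. The wrapper below is only a sketch: `run_in_order` is a hypothetical helper, and the `mw-maintenance-*.sh` scripts are the ones named above, whose contents are not reproduced here.

```shell
# Run each command string in order; stop and return 1 at the first failure.
run_in_order() {
  for step in "$@"; do
    echo ">>> $step"
    sh -c "$step" || {
      echo "step failed: $step" >&2
      return 1
    }
  done
}

# Possible use with the scripts from this section:
#   run_in_order \
#     "mw-maintenance-elasticsearch-index.sh" \
#     "mw-maintenance-runJobs.sh cli" \
#     "mw-maintenance-rebuildAll.sh" \
#     "mw-maintenance-runJobs.sh cli"
```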
In addition, the script elasticsearch-watch.sh was developed: a crontab job runs it periodically to check the service and restart it if necessary. The script sends an email to vectoria@altclavis.com if an event occurs.
sudo crontab -e
# ElasticSearch Watch
*/5 * * * * /usr/local/bin/elasticsearch-watch.sh
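The contents of elasticsearch-watch.sh are not shown in this article. The following is a hedged sketch of what such a check might look like, not the actual script: the names `ES_URL`, `RESTART_CMD`, `es_is_up` and `watch_once` are assumptions, and the email notification is left out because the mail tool in use is not specified.

```shell
# Hypothetical watchdog sketch: restart Elasticsearch if the HTTP endpoint
# does not answer. Both settings can be overridden from the environment.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"
RESTART_CMD="${RESTART_CMD:-systemctl restart elasticsearch.service}"

# Succeeds only if the endpoint answers with a non-error HTTP status.
es_is_up() {
  curl --silent --fail --max-time 10 "$ES_URL" > /dev/null 2>&1
}

# One check cycle, suitable for calling from a cron job.
watch_once() {
  if es_is_up; then
    return 0
  fi
  echo "Elasticsearch not responding at $ES_URL, restarting" >&2
  $RESTART_CMD
}
```

Calling `watch_once` at the end of a script run from the root crontab, as in the entry above, would give a check every five minutes.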
References
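- BitLaunch: How to install Elasticsearch on Ubuntu 20.04 LTS (https://bitlaunch.io/blog/install-elasticsearch-on-ubuntu-20-04-lts/)
- Computing for Geeks: Install Elasticsearch 6.x on Ubuntu 18.04 LTS (https://computingforgeeks.com/how-to-install-elasticsearch-6-x-on-ubuntu-18-04-lts-bionic-beaver-linux/)
- MediaWiki: Extension:CirrusSearch (mw:Extension:CirrusSearch)
- Phabricator: Extension:CirrusSearch (https://phabricator.wikimedia.org/source/extension-cirrussearch/browse/master/README)
- MediaWiki: CirrusSearch Talk - Java version compatibility (mw:Topic:Vo4jribwur9xzm9z)
- Mincong's blog: GC in Elasticsearch - Basic information about garbage collection (GC) in Elasticsearch, JVM options, GC logging (https://mincong.io/2020/08/30/gc-in-elasticsearch/)
- Foojay.io: Handling JDK & GC Options Dynamically in Elasticsearch (https://foojay.io/today/handling-jdk-and-gc-options-dynamically-in-elasticsearch/)
- Elasticsearch Documentation: Important Elasticsearch configuration (https://www.elastic.co/guide/en/elasticsearch/reference/6.8/important-settings.html)
- Elasticsearch Documentation: GC logging (https://www.elastic.co/guide/en/elasticsearch/reference/6.8/gc-logging.html#gc-logging)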
Access Elasticsearch via SSH
-REVIEW-
ElasticSearch
Required by MW:Extension:CirrusSearch, some other mw:extensions and some extensions of NextCloud.
~/Downloads/elastic-search/
sudo apt install ./elasticsearch-5.6.16.deb
Add automatic restart for elasticsearch.service, since it sometimes "breaks":
sudo nano /lib/systemd/system/elasticsearch.service
[Service]
...
Restart=always
RestartSec=3
Java
Downgrade required by MW:Extension:CirrusSearch and ElasticSearch 5.6.16.