MediaWiki Extension CirrusSearch and Elasticsearch Setup
Revision as of 11:03, 30 August 2022
This is a short guide on how to set up Elasticsearch for use with the MediaWiki extension CirrusSearch. You should choose an appropriate Elasticsearch version depending on your MediaWiki version. I am currently using MediaWiki 1.38, for which Elasticsearch 6.8.23+ is recommended. This version runs well on openjdk-11
, which is the default Java version on Ubuntu Server 22.04.
Setup Java and Javac
On Ubuntu Server, the default jdk
and jre
packages can be installed with the following command.
sudo apt install -y apt-transport-https default-jdk default-jre
To check and switch the current version of Java and Javac you can use the following commands.
sudo update-alternatives --config java
There are 2 choices for the alternative java (providing /usr/bin/java).
Selection Path Priority Status
------------------------------------------------------------
0 /usr/lib/jvm/java-11-openjdk-amd64/bin/java 1111 auto mode
* 1 /usr/lib/jvm/java-11-openjdk-amd64/bin/java 1111 manual mode
2 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java 1081 manual mode
Press <enter> to keep the current choice[*], or type selection number: 1
sudo update-alternatives --config javac
There are 2 choices for the alternative javac (providing /usr/bin/javac).
Selection Path Priority Status
------------------------------------------------------------
0 /usr/lib/jvm/java-11-openjdk-amd64/bin/javac 1111 auto mode
* 1 /usr/lib/jvm/java-11-openjdk-amd64/bin/javac 1111 manual mode
2 /usr/lib/jvm/java-8-openjdk-amd64/bin/javac 1081 manual mode
Press <enter> to keep the current choice[*], or type selection number: 1
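The interactive menu is awkward to use from a script. As a minimal sketch, the active Java major version can be read from the output of `java -version` instead. The helper name `java_major` is hypothetical, and the parsing assumes the usual first line of the form `openjdk version "11.0.16"` (or the legacy `"1.8.0_342"` scheme for Java 8).

```shell
# Print the major Java version (8, 11, ...) parsed from `java -version`
# output supplied on stdin. Hypothetical helper, not part of any package.
java_major() {
    sed -n -E 's/.* version "([0-9]+)(\.([0-9]+))?.*/\1 \3/p' | head -n 1 | {
        read -r first second
        # Legacy numbering: "1.8.0_342" means Java 8.
        if [ "$first" = "1" ]; then
            echo "$second"
        else
            echo "$first"
        fi
    }
}
```

Typical use would be `java -version 2>&1 | java_major`, since `java -version` writes to standard error.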
Elasticsearch 5.x requires openjdk‑8, which can be installed with the following commands. After the installation, use the commands above to switch the version in use.
sudo apt install openjdk-8-jre-headless
sudo apt install openjdk-8-jdk-headless
After switching the version of Java you need to restart the Elasticsearch service if it is already installed.
sudo systemctl restart elasticsearch.service
curl 'http://127.0.0.1:9200' # do a test
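Elasticsearch usually needs some seconds after a restart before it answers on port 9200, so a test made immediately can fail even when everything is fine. The following is a sketch of a polling loop, assuming `curl` is installed; the function name `wait_for_es` and the default values are my own choices.

```shell
# Poll the given URL until it answers or the attempts run out.
# Arguments: URL, number of attempts, seconds to sleep between attempts.
wait_for_es() {
    url="${1:-http://127.0.0.1:9200}"
    tries="${2:-30}"
    delay="${3:-2}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # --fail makes curl return non-zero on HTTP errors as well.
        if curl --silent --fail --max-time 5 "$url" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example: `sudo systemctl restart elasticsearch.service && wait_for_es && echo "Elasticsearch is up"`.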
Elasticsearch
There are several ways to install Elasticsearch: via Docker, via the Apt repository, via .deb or .rpm packages, etc. I prefer to download it manually and install it from the .deb package. As I said before, for MediaWiki 1.38 we need version 6.8.23+.
cd ~/Downloads
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.23.deb
sudo apt install ./elasticsearch-6.8.23.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
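Note that the package file name differs between versions: in the URLs above, the 5.x and 6.x packages end in `.deb`, while the 7.0.1 package ends in `-amd64.deb`. A small sketch that builds the download URL from a version number follows. The function name `es_deb_url` is hypothetical, and the naming rule is only inferred from the four URLs listed here, so check it for any other version.

```shell
# Build the .deb download URL for a given Elasticsearch version.
# Rule inferred from the URLs in this guide: 7.x and later carry "-amd64".
es_deb_url() {
    version="$1"
    major="${version%%.*}"
    base="https://artifacts.elastic.co/downloads/elasticsearch"
    if [ "$major" -ge 7 ]; then
        echo "$base/elasticsearch-$version-amd64.deb"
    else
        echo "$base/elasticsearch-$version.deb"
    fi
}
```

For example: `wget "$(es_deb_url 6.8.23)"`.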
Elasticsearch 6.5.4, which is the installed version, works with openjdk-11, the default version in Ubuntu 20.04. If stability problems occur, it will be necessary to switch to Elasticsearch 5.6.16, which works with openjdk‑8. For details: see here.
cd ~/Downloads
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.23.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
sudo apt install ./elasticsearch-6.5.4.deb
# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb
sudo systemctl enable elasticsearch.service
sudo systemctl start elasticsearch.service
sudo systemctl status elasticsearch.service
Reduce the allowed memory. Even the value used here seems high to me; on bg.trivictoria.org it is 128m. If there are many concurrent requests, Elasticsearch may exceed the available memory and crash. On the other hand, if the value is too small, it crashes as well.
sudo nano /etc/elasticsearch/jvm.options
#-Xms1g
#-Xmx1g
-Xms512m
-Xmx512m
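The same edit can be made without opening an editor. This is only a sketch: the function name `set_es_heap` is hypothetical, it assumes GNU `sed`, and it assumes the active heap settings start at the beginning of a line as in the packaged `jvm.options`.

```shell
# Comment out the active -Xms/-Xmx lines in a jvm.options file and
# append new ones. Arguments: path to the file, heap size (e.g. 512m).
set_es_heap() {
    file="$1"
    size="$2"
    # Prefix every active -Xms.../-Xmx... line with '#'.
    sed -i -E 's/^(-Xm[sx].*)$/#\1/' "$file"
    # Append the new initial and maximum heap size.
    printf '%s\n' "-Xms${size}" "-Xmx${size}" >> "$file"
}
```

Against the real file this has to be run as root, for example from a root shell: `set_es_heap /etc/elasticsearch/jvm.options 512m`. Running it repeatedly leaves the older values behind as comments.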
Add directives for automatic restart to the systemd unit.
sudo cp /lib/systemd/system/elasticsearch.service ~/Downloads/elasticsearch.service.default
sudo nano /lib/systemd/system/elasticsearch.service
[Service]
# At the end of the section
Restart=always
RestartSec=3
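A file under /lib/systemd/system belongs to the package and may be replaced on upgrade. An alternative, which is not what I did above, is a systemd drop-in file that adds the same two directives. The sketch below assumes the standard drop-in location for this unit; the function name `write_restart_dropin` and the file name `restart.conf` are my own choices.

```shell
# Write a drop-in with the restart policy, leaving the packaged unit untouched.
# Argument: drop-in directory (defaults to the standard location for this unit).
write_restart_dropin() {
    dropin_dir="${1:-/etc/systemd/system/elasticsearch.service.d}"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/restart.conf" <<'EOF'
[Service]
Restart=always
RestartSec=3
EOF
}
```

Writing to the default location requires root, and `sudo systemctl daemon-reload` is still needed afterwards.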
Apply the changes.
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch.service
sudo systemctl status elasticsearch.service
Verify.
curl 'http://127.0.0.1:9200'
{
"name" : "HFrziWt",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "5qMVv8CHT3q2vv1sd8hLOw",
"version" : {
"number" : "6.5.4",
"build_flavor" : "default",
"build_type" : "deb",
"build_hash" : "d2ef93d",
"build_date" : "2018-12-17T21:17:40.758843Z",
"build_snapshot" : false,
"lucene_version" : "7.5.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
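To confirm from a script that the expected version is running, the `number` field can be pulled out of this response. A minimal sketch without `jq` follows; the function name `es_version` is hypothetical, and the pattern assumes the response contains a single `"number"` key as in the output above.

```shell
# Extract the value of "number" from the JSON response read on stdin.
es_version() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

For example: `curl -s 'http://127.0.0.1:9200' | es_version`.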
To start regular indexing of the wiki content, according to the configuration in /var/www/*/LocalSettings.php
and the documentation of mw:Extension:CirrusSearch, you need to run the initial indexing, run the jobs it creates, regenerate the content index, and then empty the job queue again. The maintenance scripts described in the MediaWiki section can be used for this purpose.
mw-maintenance-elasticsearch-index.sh
mw-maintenance-runJobs.sh cli
mw-maintenance-rebuildAll.sh
mw-maintenance-runJobs.sh cli
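The order of these four steps matters, and it makes little sense to continue when one of them fails. A sketch of a small runner that stops at the first failing step is shown below; `run_in_order` is a hypothetical helper, not one of the maintenance scripts.

```shell
# Run each argument as a command, in order; stop at the first failure.
run_in_order() {
    for step in "$@"; do
        echo "==> $step"
        # $step is intentionally unquoted so "script.sh arg" splits into words.
        $step || { echo "failed: $step" >&2; return 1; }
    done
}
```

For example: `run_in_order mw-maintenance-elasticsearch-index.sh "mw-maintenance-runJobs.sh cli" mw-maintenance-rebuildAll.sh "mw-maintenance-runJobs.sh cli"`.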
In addition, I wrote the script elasticsearch-watch.sh
, which is run from a crontab
job to check the service periodically and restart it when necessary. The script sends an email to vectoria@altclavis.com
when such an event occurs.
sudo crontab -e
# ElasticSearch Watch
*/5 * * * * /usr/local/bin/elasticsearch-watch.sh
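The content of elasticsearch-watch.sh is not listed here. The following is only a hypothetical sketch of what the core of such a watchdog might look like, not the actual script: it checks the HTTP endpoint with `curl` and runs a restart command when there is no answer. The names `es_watch_once`, `ES_URL` and `RESTART_CMD` are my own, and the email notification is left out because the mail command depends on the system.

```shell
# Hypothetical watchdog sketch; NOT the real elasticsearch-watch.sh.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"
RESTART_CMD="${RESTART_CMD:-systemctl restart elasticsearch.service}"

# Check the endpoint once; restart the service if it does not answer.
es_watch_once() {
    if curl --silent --fail --max-time 10 "$ES_URL" >/dev/null 2>&1; then
        return 0
    fi
    echo "$(date) Elasticsearch not responding at $ES_URL, restarting" >&2
    # Intentionally unquoted so the command string splits into words.
    $RESTART_CMD
}
```

A script run from root's crontab would define the function as above and then call `es_watch_once` on its last line.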
References
- BitLaunch: How to install Elasticsearch on Ubuntu 20.04 LTS
- Computing for Geeks: Install Elasticsearch 6.x on Ubuntu 18.04 LTS
- MediaWiki: Extension:CirrusSearch
- Phabricator: Extension:CirrusSearch
- MediaWiki: CirrusSearch Talk – Java version compatibility
- Mincong's blog: GC in Elasticsearch – Basic information about garbage collection (GC) in Elasticsearch, JVM options, GC logging
- Foojay.io: Handling JDK & GC Options Dynamically in Elasticsearch
- Elasticsearch Documentation: Important Elasticsearch configuration
- Elasticsearch Documentation: GC logging
Access Elasticsearch via SSH
-REVIEW-
ElasticSearch
Required by MW:Extension:CirrusSearch, some other mw:extensions and some extensions of NextCloud.
~/Downloads/elastic-search/
sudo apt install ./elasticsearch-5.6.16.deb
Add an automatic restart for elasticsearch.service
, because it sometimes "breaks":
sudo nano /lib/systemd/system/elasticsearch.service
[Service]
...
Restart=always
RestartSec=3
Java
Downgrade required by MW:Extension:CirrusSearch and ElasticSearch 5.6.16.