MediaWiki Extension CirrusSearch and Elasticsearch Setup

From WikiMLT

This is a short man­u­al how to set-up Elas­tic­search to be used with the MediaWiki's ex­ten­sion Cir­rusSearch which com­mu­ni­cate to the ser­vice by the ex­ten­sion Elas­ti­ca. You should choice an ap­pro­pri­ate Elas­tic­search ver­sion de­pend­ing on your Me­di­aWi­ki ver­sion. Cur­rent­ly I'm us­ing Me­di­aWi­ki 1.38 and it is rec­om­mend­ed to use Elas­tic­search 6.8.23+ with it. This ver­sion runs well over open­jdk-11 which is the de­fault Ja­va ver­sion on Ubun­tu Serv­er 22.04.

Elas­tic­search and the ex­ten­sion Elas­ti­ca are re­quired by some oth­er Me­di­aWi­ki ex­ten­sions as ex­ten­sion Trans­late where it is used as trans­la­tion mem­o­ry. It is al­so used by the NextCoud's ap­pli­ca­tion Full text search and more…

Set­up Ja­va and Javac

On Ubun­tu Serv­er the de­fault jdk and jre pack­ages can be in­stalled by the fol­low­ing com­mand.

sudo apt install -y apt-transport-https default-jdk default-jre

To check and switch the cur­rent ver­sion of Ja­va and Javac you can use the fol­low­ing com­mands.

sudo update-alternatives --config java
#Out­put
There are 2 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                            Priority   Status
------------------------------------------------------------
  0            /usr/lib/jvm/java-11-openjdk-amd64/bin/java      1111      auto mode
* 1            /usr/lib/jvm/java-11-openjdk-amd64/bin/java      1111      manual mode
  2            /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java   1081      manual mode

Press <enter> to keep the current choice[*], or type selection number: 1
sudo update-alternatives --config javac
#Out­put
There are 2 choices for the alternative javac (providing /usr/bin/javac).

  Selection    Path                                          Priority   Status
------------------------------------------------------------
  0            /usr/lib/jvm/java-11-openjdk-amd64/bin/javac   1111      auto mode
* 1            /usr/lib/jvm/java-11-openjdk-amd64/bin/javac   1111      manual mode
  2            /usr/lib/jvm/java-8-openjdk-amd64/bin/javac    1081      manual mode

Press <enter> to keep the current choice[*], or type selection number: 1

If you are us­ing Elas­tic­search 5.x it re­quires openjdk‑8 which can be in­stalled by the fol­low­ing com­mands. Af­ter the in­stal­la­tion use the above com­mands to switch the ver­sion in use.

#De­tails
sudo apt install openjdk-8-jre-headless 
sudo apt install openjdk-8-jdk-headless

Af­ter switch­ing the ver­sion of Ja­va you need to restart the Elas­tic­search ser­vice if it is al­ready in­stalled.

sudo systemctl restart elasticsearch.service 
curl 'http://127.0.0.1:9200' # do a test

Elas­tic­search

There is a cou­ple of ways how to In­stalling Elas­tic­search – via Dock­er, via Apt repos­i­to­ry, via .deb or .rpm pack­ages, etc. I pre­fer to man­u­al­ly down­load and in­stall it via .deb pack­age. Is I said be­fore for Me­di­aWi­ki 1.38 we need ver­sion 6.8.23+.

cd ~/Downloads
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.23.deb
sudo apt install ./elasticsearch-6.8.23.deb
#Ver­sions
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-amd64.deb

Af­ter in­stalling the pack­age the Elas­tic­search ser­vice must be en­abled and start­ed.

sudo systemctl enable --now elasticsearch.service
sudo systemctl status elasticsearch.service

You can check does the ser­vice work prop­er­ly by the fol­low­ing ap­proach.

curl 'http://127.0.0.1:9200'
#Out­put
{
  "name" : "W2uxKNc",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "JwkNoPi_THuiCA123-HKMg",
  "version" : {
    "number" : "6.8.23",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "4f67856",
    "build_date" : "2022-01-06T21:30:50.087716Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.3",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

На­ма­ля­ва­не на раз­ре­ше­на­та па­мет. До­ри и та­зи ми се виж­да мно­го в bg​.trivictoria​.org128m. Ако има мно­го ед­нов­ре­мен­ни за­яв­ки мо­же да над­хвър­ли на­лич­на­та па­мет и да се счу­пи. От дру­га сра­на ако е мно­го мал­ко пак се чу­пи.

sudo nano /etc/elasticsearch/jvm.options
#-Xms1g
#-Xmx1g
-Xms512m
-Xmx512m

До­ба­вя­не на ди­рек­ти­ви за ав­то­ма­тич­но рес­тар­ти­ра­не в system.d unit‑а.

sudo cp /lib/systemd/system/elasticsearch.service ~/Downloads/elasticsearch.service.default
sudo nano /lib/systemd/system/elasticsearch.service
[Service]
# В края на секцията
Restart=always
RestartSec=3

При­ла­га­не на про­ме­ни­те.

sudo systemctl daemon-reload
sudo systemctl restart elasticsearch.service
sudo systemctl status elasticsearch.service


За да за­поч­не ре­гу­ляр­но ин­дек­си­ра­не на съ­дър­жа­ни­е­то на уики­то, спря­мо кон­фи­гу­ра­ци­я­та, напра­ве­на в /var/­www/­*/­Local­Sett­ings.php и до­ку­мен­та­ци­я­та на mw:Extension­:­CirrusSearch тряб­ва да напра­ви пър­во­на­чал­на ин­дек­са­ция, да се из­пъл­нят за­да­чи­те, ко­и­то ще съз­да­де тя, да се ре­ге­не­ри­ра ин­дек­са на съ­дър­жа­ни­е­то и от­но­во да се из­праз­ни опаш­ка­та със за­да­чи­те. За цел­та мо­гат да се из­пол­з­ват скрип­то­ве­те за под­дръж­ка, опи­са­ни в сек­ци­я­та Me­di­aWi­ki.

mw-maintenance-elasticsearch-index.sh
mw-maintenance-runJobs.sh cli
mw-maintenance-rebuildAll.sh
mw-maintenance-runJobs.sh cli

В до­пъл­не­ние е раз­ра­бо­тен скрип­та elasticsearch​-watch​.sh, ка­то чрез crontab за­да­ча се пра­ви пе­ри­о­дич­на про­вер­ка и при не­об­хо­ди­мост рес­тар­ти­ра­не. Скрип­та из­пра­ща пис­мо до vectoria@altclavis.com, ако настъ­пи съ­би­тие.

sudo crontab -e
# ElasticSearch Watch
*/5 * * * * /usr/local/bin/elasticsearch-watch.sh

Ref­er­ences

Ac­cess Elas­tic­search via SSH


-RE­VIEW-

Elas­tic­Search

Re­quired by MW:Extension:CirrusSearch, some oth­er mw:extensions and some ex­ten­sions of NextCloud.

~/Downloads/elastic-search/
sido apt install ./elasticsearch-5.6.16.deb

До­ба­вя­не на фун­к­ция за ав­то­ма­ти­чен рес­тарт за elasticsearch.service, тъй ка­то по ня­кой път се „чу­пи“:

sudo nano /lib/systemd/system/elasticsearch.service

    [Service]
    ...
    Restart=always
    RestartSec=3

Ja­va

Down­grade re­quired by MW:Extension:CirrusSearch and Elas­tic­Search 5.6.16.