Linux I/O Monitoring and Analyze

From WikiMLT
Spas (talk | contribs)

Revision as of 17:37, 31 August 2022

Several tools are available for monitoring and analyzing the disk I/O performance of a Linux system. A few of them are listed here, together with installation instructions and examples of their basic usage.

The htop command

If your distribution provides a recent version of htop (v3.2+), an additional tab shows the I/O metrics (Screen 1). Here is how to check the available version and install htop.

sudo apt show htop 2>/dev/null | grep '^Version'
sudo apt install htop
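Whether the packaged version is new enough for the I/O tab can also be checked in a script. The following is a minimal sketch, assuming GNU sort (for the -V version sort) and taking the 3.2 threshold from the caption of Screen 1. The helper names version_ge and has_io_tab are hypothetical, and 3.0.5-7 is only an illustrative older version string.

```shell
#!/usr/bin/env bash
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on GNU sort -V; the smaller of the two versions sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# has_io_tab VERSION: succeed when this htop version should offer the I/O tab.
has_io_tab() {
    # Strip a Debian revision suffix such as "-1" before comparing.
    version_ge "${1%%-*}" "3.2"
}

if has_io_tab "3.2.1-1"; then echo "3.2.1-1: I/O tab expected"; fi
if ! has_io_tab "3.0.5-7"; then echo "3.0.5-7: I/O tab not expected"; fi
```

On a Debian-based system the installed version could be passed in with has_io_tab "$(dpkg-query -W -f='${Version}' htop)". Version strings with an epoch prefix are not handled by this sketch.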

Install htop 3.2.1-1 (the latest version as of this revision) on Ubuntu Server 22.04.1 from a .deb package.

cd /tmp
wget --no-check-certificate https://http.us.debian.org/debian/pool/main/h/htop/htop_3.2.1-1_amd64.deb
sudo apt install ./htop_3.2.1-1_amd64.deb

In most cases you need to run the tool as root to see all the data:

sudo htop
Screen 1. The new I/O Metrics tab of htop (v3.2+). Use Tab to switch to the I/O tab, then press F6 to open the Sort by menu and sort by IO_WRITE_RATE. The screenshot was taken on Kali Linux 2022.

The iostat command

iostat – Report Central Processing Unit (CPU) statistics and input/output statistics for devices and partitions. The iostat command is used for monitoring system input/output device loading by observing the time the devices are active in relation to their average transfer rates…

The first report generated by the iostat command provides statistics concerning the time since the system was booted, unless the -y option is used. Each subsequent report covers the time since the previous report. All statistics are reported each time the iostat command is run. The report consists of a CPU header row followed by a row of CPU statistics. On multiprocessor systems, CPU statistics are calculated system-wide as averages among all processors. A device header row is displayed followed by a line of statistics for each device that is configured…

Here is how to get the general report in human-readable format.

iostat -h
Linux 5.18.0-kali5-amd64 (kali-x) 	08/31/2022 	_x86_64_	(24 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.4%    0.0%    0.2%    0.1%    0.0%   99.2%

      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
    11.50       224.2k        20.9k         0.0k       1.2G     119.2M       0.0k dm-0
    10.90       221.8k        20.9k         0.0k       1.2G     119.2M       0.0k dm-1
    10.99       315.7k        56.2k         0.0k       1.8G     320.0M       0.0k nvme0n1
     0.06         1.6k         0.0k         0.0k       8.9M       0.0k       0.0k sda
     4.14        55.1k         0.0k         0.0k     313.8M     152.0k       0.0k sdb

Here is how to get a report for a single device every minute, with a timestamp and in human-readable format. Note that the first report covers the time since the system was booted; each later report covers the preceding 60 seconds.

iostat -h /dev/nvme0n1 -d 60 -t
Linux 5.18.0-kali5-amd64 (kali-x) 	08/31/2022 	_x86_64_	(24 CPU)

08/31/2022 05:39:22 PM
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
    10.45       297.5k        53.6k         0.0k       1.8G     324.5M       0.0k nvme0n1

08/31/2022 05:40:22 PM
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     1.10         0.6k        10.2k         0.0k      36.0k     612.0k       0.0k nvme0n1

Here is how to get the same statistics for an LVM logical volume.

iostat -h /dev/mapper/kali--x--vg-home -d 60 -t
Linux 5.18.0-kali5-amd64 (kali-x) 	08/31/2022 	_x86_64_	(24 CPU)

08/31/2022 05:45:11 PM
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     4.02        68.4k        19.7k         0.0k     437.2M     125.9M       0.0k dm-5

08/31/2022 05:46:11 PM
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     0.30         0.0k         1.1k         0.0k       0.0k      68.0k       0.0k dm-5
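The report above lists the logical volume as dm-5 rather than by its /dev/mapper path. Entries under /dev/mapper are normally symbolic links to the kernel's dm-N device nodes, and LVM doubles every hyphen inside a volume group or logical volume name when it builds the mapper name. The sketch below illustrates both points. The helper names mapper_name and dm_name are hypothetical, the volume group name kali-x-vg is inferred from the path used above, and readlink -f assumes GNU coreutils.

```shell
#!/usr/bin/env bash
# mapper_name VG LV: build the /dev/mapper name LVM uses for a logical volume.
# Hyphens inside either name are doubled; a single hyphen joins the two parts.
mapper_name() {
    local vg=${1//-/--} lv=${2//-/--}
    printf '%s-%s\n' "$vg" "$lv"
}

# dm_name PATH: follow the symlink and print the kernel device name.
dm_name() {
    basename "$(readlink -f "$1")"
}

mapper_name kali-x-vg home    # prints kali--x--vg-home
```

On the system shown above, dm_name /dev/mapper/kali--x--vg-home would be expected to print dm-5, the name that appears in the Device column.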

Watch the changes in the full statistics at short intervals. In the following example:

  • iostat -y -h – suppress the since-boot report and use human-readable format,
  • iostat 1 1 – the [ interval [ count ] ] arguments: one report covering a one-second interval,
  • watch -n 0.5 -d – rerun the command 0.5 seconds after it finishes and highlight the differences. Because each iostat run takes one second, the display effectively updates about every 1.5 seconds.
watch -n 0.5 -d "iostat -h -y 1 1"

Display extended statistics for the whole system, in human-readable format, with a timestamp, every 6 seconds.

iostat -x -h -t 6
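For scripting, the plain layout (without -h) is easier to parse, because the device name comes first and the values carry no unit suffix. The following sketch filters such output for devices whose write rate exceeds a limit. The function name busy_writers and the limit of 100 kB/s are illustrative assumptions; the sample row uses the sda2 figures that appear later on this page.

```shell
#!/usr/bin/env bash
# busy_writers LIMIT: read plain `iostat -d` output on stdin and print the
# devices whose kB_wrtn/s value (4th column) is above LIMIT.
busy_writers() {
    awk -v limit="$1" '
        $1 == "Device" { in_table = 1; next }   # header row starts a table
        NF == 0        { in_table = 0; next }   # blank line ends it
        in_table && $4 + 0 > limit { print $1, $4 }
    '
}

# Sample input in the plain layout, where the device name is the first column.
busy_writers 100 <<'EOF'
Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
sda2             55.71       454.89       741.44       421.11   32361849   52747168   29958424
EOF
```

With live data this could be used as LC_ALL=C iostat -d -y 5 1 | busy_writers 1000; forcing the C locale keeps the decimal separator a dot.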

The iotop command

The dstat command

dstat is a versatile tool for generating system resource statistics and a versatile replacement for vmstat, iostat and ifstat. It is unique in letting you aggregate block device throughput for a certain diskset or network bandwidth for a group of interfaces, i.e. you can see the throughput for all the block devices that make up a single filesystem or storage system.

sudo apt install dstat

dstat offers a large number of options and plugins. Here is one usage example (Screen 3), which applies the following options.

  • -D sda – adds a column that reports the I/O rate of /dev/sda.
  • -t, --time – enable time/date output.
  • -a, --all – equals -cdngy (-c enable cpu stats; -d enable disk stats; -n enable network stats; -g enable page stats; -y enable system stats).
  • --top-io – show the most expensive I/O process.
  • --top-bio – show the most expensive block I/O process.
  • --top-mem – show the process using the most memory.
sudo dstat -D sda -ta --top-io --top-bio --top-mem
Screen 3. Example of usage of the dstat command.

Here is another example that will output the average I/O rate per minute.

dstat -tdD total 60
----system--------dsk/total--
     time      |  read  writ
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
29-08 08:40:13 | 1138M 1782M
29-08 08:41:13 |  234k  744k
29-08 08:42:13 |  293k  171k
29-08 08:43:13 |  268k  113k
29-08 08:44:13 | 1100k  129k
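To compare or sum such figures in a script, the unit suffixes have to be converted to plain numbers first. The following is a minimal sketch of that conversion. The function name to_bytes is hypothetical, and the conversion assumes that the suffixes k, M and G are powers of 1024, which should be checked against the dstat version in use.

```shell
#!/usr/bin/env bash
# to_bytes VALUE: convert a dstat-style figure such as 234k or 1138M to bytes.
# Assumption: the suffixes k, M and G are powers of 1024.
to_bytes() {
    awk -v v="$1" 'BEGIN {
        n = v + 0                      # numeric part; the suffix is ignored
        s = substr(v, length(v))       # last character of the value
        m = (s == "k") ? 1024 : (s == "M") ? 1024^2 : (s == "G") ? 1024^3 : 1
        printf "%.0f\n", n * m
    }'
}

to_bytes 234k     # prints 239616
to_bytes 1138M    # prints 1193279488
```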

The vmstat command

vmstat – virtual memory statistics – reports information about processes, memory, paging, block IO, traps, disks and cpu activity. The first report produced gives averages since the last reboot. Additional reports give information on a sampling period of length delay. The process and memory reports are instantaneous in either case.

Here is how to get statistics about the block devices (-d). The -S option selects the unit: -Sm for megabytes (1000000 bytes) or -SM for mebibytes (1048576 bytes).

sudo vmstat -d -Sm
disk-   ------------reads--------------- --------------writes--------------- ------IO-----
        total  merged    sectors      ms    total  merged   sectors       ms    cur    sec
loop0      84       0      2554       81        0       0         0        0      0      0
loop1      84       0      2382       36        0       0         0        0      0      0
loop2      52       0       856       39        0       0         0        0      0      0
loop3      60       0       814       43        0       0         0        0      0      0
loop4      52       0       764       14        0       0         0        0      0      0
loop5     539       0     11200      240        0       0         0        0      0      0
loop6      87       0      2498      105        0       0         0        0      0      0
loop7     493       0     34854      228        0       0         0        0      0      1
sda   1284880  157343  65276608   752987  3936160 2077213 135566208  4487853      0   4123
sdc    437039  119521   4760810  2594193    96132  145500  54786232 10747453      0   3024
sdd   2614304  458294  24597746  6017154    63873 1360249  19094048  7394053      0   5798
sdb    136351    1445  34564266   329162    25383    2759  47585536  4447925      0    424
sr0       120       0      897        14        0       0         0        0      0      0
loop8      49       0      752        24        0       0         0        0      0      0
loop9      88       0     3334        42        0       0         0        0      0      0
loop10     11       0       28         0        0       0         0        0      0      0
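The sector counters in this table can be turned into data volumes. The sketch below does this with awk, under the assumption that one sector is 512 bytes (the unit the kernel uses in /proc/diskstats), so that 2048 sectors equal 1 MiB. The function name sectors_to_mib is hypothetical; the two sample rows are copied from the output above.

```shell
#!/usr/bin/env bash
# sectors_to_mib: read `vmstat -d` output on stdin and print, per device,
# the data read and written in whole MiB (2048 sectors of 512 bytes = 1 MiB).
sectors_to_mib() {
    # Header rows are skipped because their second field is not a number.
    awk '$2 ~ /^[0-9]+$/ && NF >= 11 {
        printf "%s %d %d\n", $1, $4 / 2048, $8 / 2048
    }'
}

sectors_to_mib <<'EOF'
sda   1284880  157343  65276608   752987  3936160 2077213 135566208  4487853      0   4123
sdb    136351    1445  34564266   329162    25383    2759  47585536  4447925      0    424
EOF
```

For sda this gives 31873 MiB read and 66194 MiB written since boot. With live data the function can be fed directly: vmstat -d | sectors_to_mib.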

The sar command

The sar command is part of the sysstat package. It outputs the contents of selected cumulative activity counters in the operating system. The activity data is collected by sysstat.service. After installing the package, we need to enable the collector service and wait until some statistics are collected.

sudo apt install sysstat
sudo sed -i 's/ENABLED="false"/ENABLED="true"/' /etc/default/sysstat
sudo systemctl enable --now sysstat.service
systemctl cat sysstat-collect.timer
# /lib/systemd/system/sysstat-collect.timer
# (C) 2014 Tomasz Torcz <tomek@pipebreaker.pl>
#
# sysstat-12.5.2 systemd unit file:
#        Activates activity collector every 10 minutes

[Unit]
Description=Run system activity accounting tool every 10 minutes

[Timer]
OnCalendar=*:00/10

[Install]
WantedBy=sysstat.service
sar
Linux 5.15.39-4-pve (ubuntu-lxc-pve) 	08/28/22 	_x86_64_	(24 CPU)

20:41:48     LINUX RESTART	(24 CPU)

20:50:05        CPU     %user     %nice   %system   %iowait    %steal     %idle
21:00:00        all      1.66      0.00      0.26      0.02      0.00     98.06
21:10:10        all      2.66      0.00      0.27      0.03      0.00     97.03
21:20:13        all      1.92      0.00      0.29      0.02      0.00     97.76
Average:        all      2.09      0.00      0.27      0.03      0.00     97.62
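The collected intervals can be scanned for the one with the highest %iowait, which is often the first hint of an I/O bottleneck. The following is a minimal sketch, assuming the default CPU report with 24-hour timestamps as shown above (with an AM/PM column the field numbers shift by one). The function name peak_iowait is hypothetical; the sample rows are copied from the output above.

```shell
#!/usr/bin/env bash
# peak_iowait: read the default `sar` CPU report on stdin and print the
# interval with the highest %iowait (6th column) as "TIME VALUE".
peak_iowait() {
    awk '$2 == "all" && $1 != "Average:" {
        if (!seen || $6 + 0 > max) { max = $6 + 0; at = $1; seen = 1 }
    }
    END { if (seen) printf "%s %.2f\n", at, max }'
}

peak_iowait <<'EOF'
20:50:05        CPU     %user     %nice   %system   %iowait    %steal     %idle
21:00:00        all      1.66      0.00      0.26      0.02      0.00     98.06
21:10:10        all      2.66      0.00      0.27      0.03      0.00     97.03
21:20:13        all      1.92      0.00      0.29      0.02      0.00     97.76
Average:        all      2.09      0.00      0.27      0.03      0.00     97.62
EOF
```

For the sample this prints 21:10:10 0.03. With live data, LC_ALL=C sar | peak_iowait should work the same way. The sysstat man page also documents I/O-specific reports such as sar -b and sar -d, which have different columns and would need their own field numbers.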

Monitor File Size Changes Recursively

The following command shows which files larger than 800 KiB have been modified in the past 10 minutes, which points to the most heavily written files. The search is done recursively in the directories /var/lib and /var/log, and files whose names end in .gz or in a digit (typically rotated logs) are filtered out. The output of the command is shown in Screen 3.

sudo watch -n 3 -d \
"find /var/lib /var/log -type f -size +800k -mmin -10 -printf '%-30s \t %t %p\n' | grep -Pv '\.(gz|[0-9])$'"
Screen 3. Use watch and find to monitor file change in real time.
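The same search can be wrapped in a small function so that the size limit, the time window and the directories become parameters. This is a sketch rather than a drop-in replacement: the function name recent_big_files is hypothetical, it assumes GNU find (for -printf), it uses grep -E instead of grep -P, and it additionally sorts the result by size, largest first.

```shell
#!/usr/bin/env bash
# recent_big_files MIN_KIB MINUTES DIR...: list files larger than MIN_KIB KiB
# that were modified within the last MINUTES minutes, largest first.
# Output columns: size in bytes, modification time, path (tab separated).
recent_big_files() {
    local min_kib=$1 minutes=$2
    shift 2
    find "$@" -type f -size "+${min_kib}k" -mmin "-${minutes}" \
         -printf '%s\t%t\t%p\n' 2>/dev/null |
        grep -Ev '\.(gz|[0-9])$' |     # drop compressed and rotated files
        sort -rn                       # sort numerically by the size column
}
```

To reproduce the view from Screen 3, the function could be called as recent_big_files 800 10 /var/lib /var/log, run as root so that all files are readable.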

Here is an extended version :) that also outputs additional data generated by iostat:

sudo watch -n 3 -d \
"find /var/lib /var/log -type f -size +800k -mmin -10 -printf '%-30s \t %t %p\n' | grep -Pv '\.(gz|[0-9])$';
 echo;
 iostat /dev/sda2"
# Output
Every 3.0s: find /var/lib /var/log -type f -size +800k -mmin -10 -printf '%s \t %t %p\n'...; iostat /dev/sda2...

4362053          Mon Aug 29 08:25:15.1573410540 2022 /var/lib/redis/dump.rdb
67108864         Mon Aug 29 08:29:34.6284410810 2022 /var/lib/mysql/undo_001
3276800          Mon Aug 29 08:29:36.1604593970 2022 /var/lib/mysql/#innodb_redo/#ib_redo10127
83886080         Mon Aug 29 08:29:33.6604295080 2022 /var/lib/mysql/mysql.ibd
31459279         Mon Aug 29 08:29:34.2284362990 2022 /var/lib/mysql/binlog.005694
5242880          Mon Aug 29 08:29:33.6284291260 2022 /var/lib/mysql/SCloud/oc_authtoken.ibd
6291456          Mon Aug 29 08:29:34.6284410810 2022 /var/lib/mysql/SCloud/oc_jobs.ibd
50331648         Mon Aug 29 08:29:34.6284410810 2022 /var/lib/mysql/undo_002
79691776         Mon Aug 29 08:29:34.6284410810 2022 /var/lib/mysql/ibdata1
11873325         Mon Aug 29 08:29:01.5600457430 2022 /var/log/syslog
31441772         Mon Aug 29 08:29:08.0921238280 2022 /var/log/auth.log
1504046          Mon Aug 29 08:25:15.2373420090 2022 /var/log/redis/redis-server.log
8388608          Mon Aug 29 08:27:04.5226471030 2022 /var/log/journal/e8dsfe54457bd2f6a44344e1/user-1000.journal
33554432         Mon Aug 29 08:29:08.5241289930 2022 /var/log/journal/e8dsfe54457bd2f6a44344e1/system.journal
1183995          Mon Aug 29 08:25:17.2973666000 2022 /var/log/apache2/wiki.error.log
1170720          Mon Aug 29 08:25:17.3013666480 2022 /var/log/apache2/wiki.access.log
1166389          Mon Aug 29 08:25:14.0613279700 2022 /var/log/apache2/cloud.access.log
1243871          Mon Aug 29 08:25:13.9613267760 2022 /var/log/apache2/bg.mirror.access.log

Linux 5.15.0-46-generic (szs.space) 	08/29/22 	_x86_64_	(16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.50    0.00    0.73    1.06    0.00   95.71

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
sda2             55.71       454.89       741.44       421.11   32361849   52747168   29958424

Benchmark and Monitoring Tools

  • iozone – filesystem benchmark. The benchmark generates and measures a variety of file operations. iozone has been ported to many machines and runs under many operating systems. Its man page covers the different types of operations that are tested as well as all of the command-line options. Man page: https://manpages.ubuntu.com/manpages/jammy/en/man1/iozone.1.html
  • htop [-dCFhpustvH] – interactive process viewer. It is similar to top, but allows you to scroll vertically and horizontally, and interact using a pointing device (mouse). You can observe all processes running on the system, along with their command line arguments, view them in a tree format, and select multiple processes and act on them all at once. Tasks related to processes (killing, renicing) can be done without entering their PIDs. Man page: https://manpages.ubuntu.com/manpages/jammy/en/man1/htop.1.html
  • top [-hv|-bcEeHiOSs1 -d secs -n max -u|U user -p pids -o field -w [cols]] – display Linux processes. It provides a dynamic real-time view of a running system. It can display system summary information as well as a list of processes or threads currently being managed by the Linux kernel. The types of system summary information shown and the types, order and size of information displayed for processes are all user configurable, and that configuration can be made persistent across restarts. Man page: https://manpages.ubuntu.com/manpages/kinetic/en/man1/top.1.html
  • atop – Advanced System & Process Monitor. The program atop is an interactive monitor to view the load on a Linux system. It shows the occupation of the most critical hardware resources (from a performance point of view) on system level, i.e. cpu, memory, disk and network. It also shows which processes are responsible for the indicated load with respect to cpu and memory load on process level. Disk load is shown per process if "storage accounting" is active in the kernel. Network load is shown per process if the kernel module netatop has been installed. Man page: https://manpages.ubuntu.com/manpages/jammy/en/man1/atop.1.html

References