Linux I/O Monitoring and Analysis

From WikiMLT

There are a couple of tools available that allow you to monitor and analyze the disk I/O performance of a Linux system. A few of them are listed here, along with how to install them and examples of their basic usage.

The htop com­mand

If a newer version of htop is available for your distribution, it has an additional tab that shows the I/O metrics of the instance (Screen 1). Here is how to check the available version and install htop.

sudo apt show htop 2>/dev/null | grep '^Version'
sudo apt install htop
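
Whether the installed htop is recent enough can also be checked from a script. The following is a minimal sketch, assuming GNU sort (for its -V version ordering); the helper name version_ge is hypothetical, and the 3.2 threshold is the one given in the caption of Screen 1.

```shell
#!/bin/sh
# version_ge A B: succeed if version A is greater than or equal to version B.
# Hypothetical helper; relies on the -V (version sort) option of GNU sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: report whether the installed htop is recent enough for the I/O tab.
if command -v htop >/dev/null 2>&1; then
  installed=$(htop --version | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
  if version_ge "$installed" 3.2; then
    echo "htop $installed: I/O tab should be available"
  else
    echo "htop $installed: older than 3.2"
  fi
fi
```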

Install the newer htop 3.2.1-1 on Ubuntu Server 22.04.1 from a .deb package:

cd /tmp
wget --no-check-certificate https://http.us.debian.org/debian/pool/main/h/htop/htop_3.2.1-1_amd64.deb
sudo apt install ./htop_3.2.1-1_amd64.deb

To see all the data, in most cases you need to run the tool as root:

sudo htop
Screen 1. The new I/O Metrics tab of htop (v 3.2+). Use Tab to switch to the I/O tab, then use F6 to open the Sort by menu, and sort by IO_WRITE_RATE. The screenshot is taken on Kali Linux 2022.
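
On Linux, the kernel exposes per-process I/O counters in /proc/<PID>/io, a file that is readable only by the owner of the process and by root, which is consistent with the need for sudo. As a rough sketch (the helper name proc_io_bytes is hypothetical, and it is an assumption here that htop derives its I/O tab from these same counters), the cumulative bytes read from and written to storage by one process can be extracted like this:

```shell
#!/bin/sh
# proc_io_bytes FILE: print "read_bytes write_bytes" from a file in the
# /proc/<PID>/io format. Hypothetical helper for illustration.
proc_io_bytes() {
  awk -F': *' '
    BEGIN { r = 0; w = 0 }
    $1 == "read_bytes"  { r = $2 }
    $1 == "write_bytes" { w = $2 }
    END { print r, w }
  ' "$1"
}

# Example: counters of the current shell, if the kernel provides them.
if [ -r "/proc/$$/io" ]; then
  proc_io_bytes "/proc/$$/io"
fi
```

Sampling these counters twice and dividing the difference by the elapsed time gives a rate comparable to the IO_WRITE_RATE column.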

The iostat com­mand

iostat – report Central Processing Unit (CPU) statistics and input/output statistics for devices and partitions. The iostat command is used for monitoring system input/output device loading by observing the time the devices are active in relation to their average transfer rates…

The first re­port gen­er­at­ed by the io­stat com­mand pro­vides sta­tis­tics con­cern­ing the time since the sys­tem was boot­ed, un­less the -y op­tion is used. Each sub­se­quent re­port cov­ers the time since the pre­vi­ous re­port. All sta­tis­tics are re­port­ed each time the io­stat com­mand is run. The re­port con­sists of a CPU head­er row fol­lowed by a row of CPU sta­tis­tics. On mul­ti­proces­sor sys­tems, CPU sta­tis­tics are cal­cu­lat­ed sys­tem-wide as av­er­ages among all proces­sors. A de­vice head­er row is dis­played fol­lowed by a line of sta­tis­tics for each de­vice that is con­fig­ured…

Here is how to get the general report in human-readable format.

iostat -h
Linux 5.18.0-kali5-amd64 (kali-x) 	08/31/2022 	_x86_64_	(24 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.4%    0.0%    0.2%    0.1%    0.0%   99.2%

      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
    11.50       224.2k        20.9k         0.0k       1.2G     119.2M       0.0k dm-0
    10.90       221.8k        20.9k         0.0k       1.2G     119.2M       0.0k dm-1
    10.99       315.7k        56.2k         0.0k       1.8G     320.0M       0.0k nvme0n1
     0.06         1.6k         0.0k         0.0k       8.9M       0.0k       0.0k sda
     4.14        55.1k         0.0k         0.0k     313.8M     152.0k       0.0k sdb
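
The cumulative kB_read and kB_wrtn columns can be reproduced from the kernel counters in /proc/diskstats, which the iostat manual lists among the files it reads. A minimal sketch, assuming the documented layout of that file (device name in column 3, sectors read in column 6, sectors written in column 10, always in 512-byte units); the function name diskstats_totals is hypothetical:

```shell
#!/bin/sh
# diskstats_totals [FILE]: print "device kB_read kB_written" since boot.
# FILE defaults to /proc/diskstats; sectors there are 512 bytes, so kB = sectors / 2.
diskstats_totals() {
  awk '{ printf "%s %.0f %.0f\n", $3, $6 / 2, $10 / 2 }' "${1:-/proc/diskstats}"
}

# Example: show totals for all block devices and partitions on this machine.
if [ -r /proc/diskstats ]; then
  diskstats_totals
fi
```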

Here is how to get a report for a single device every minute, with a timestamp, in human-readable format. Note that the first report provides statistics for the time since the system was booted, while the later reports provide statistics for each 60-second interval.

iostat -h /dev/nvme0n1 -d 60 -t
Linux 5.18.0-kali5-amd64 (kali-x) 	08/31/2022 	_x86_64_	(24 CPU)

08/31/2022 05:39:22 PM
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
    10.45       297.5k        53.6k         0.0k       1.8G     324.5M       0.0k nvme0n1

08/31/2022 05:40:22 PM
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     1.10         0.6k        10.2k         0.0k      36.0k     612.0k       0.0k nvme0n1
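
The second report can be checked by hand: its rate columns are the per-interval totals divided by the interval length. A small sketch using the figures above (36.0k read and 612.0k written over 60 seconds); the helper name rate_per_second is hypothetical:

```shell
#!/bin/sh
# rate_per_second TOTAL_KB SECONDS: average rate in kB/s, one decimal place.
rate_per_second() {
  awk -v total="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", total / secs }'
}

rate_per_second 36.0 60    # kB_read/s for the 05:40:22 PM report
rate_per_second 612.0 60   # kB_wrtn/s for the same report
```

The results, 0.6 and 10.2, match the kB_read/s and kB_wrtn/s columns of that report.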

Here is how to get the same statistics as above, but for an LVM logical volume.

iostat -h /dev/mapper/kali--x--vg-home -d 60 -t
Linux 5.18.0-kali5-amd64 (kali-x) 	08/31/2022 	_x86_64_	(24 CPU)

08/31/2022 05:45:11 PM
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     4.02        68.4k        19.7k         0.0k     437.2M     125.9M       0.0k dm-5

08/31/2022 05:46:11 PM
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     0.30         0.0k         1.1k         0.0k       0.0k      68.0k       0.0k dm-5
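
Note that the report names the device dm-5 rather than the logical volume. On typical LVM setups the entries under /dev/mapper are symbolic links to the /dev/dm-N nodes, so the mapping can be looked up with readlink. A minimal sketch under that assumption (the helper name dm_name is hypothetical):

```shell
#!/bin/sh
# dm_name PATH: print the kernel device name (for example dm-5) behind a
# /dev/mapper/... path by resolving symbolic links.
dm_name() {
  basename "$(readlink -f "$1")"
}

# Example with the volume used above; skipped if the volume does not exist.
if [ -e /dev/mapper/kali--x--vg-home ]; then
  dm_name /dev/mapper/kali--x--vg-home
fi
```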

Watch the changes in the full statistics every half second. In the following example:

  • iostat -y -h – suppress the report since boot time and use a human-readable format,
  • 1 1 – the [ interval [ count ] ] arguments of iostat: a single report covering one second,
  • watch -n 0.5 -d – refresh every 0.5 seconds and highlight the differences.
watch -n 0.5 -d "iostat -h -y 1 1"

Display extended statistics for the whole system, in human-readable format, with timestamps, every 6 seconds.

iostat -x -h -t 6

The iotop com­mand

The dstat com­mand

dstat is a versatile tool for generating system resource statistics and a replacement for vmstat, iostat and ifstat. dstat is unique in letting you aggregate block device throughput for a certain diskset or network bandwidth for a group of interfaces, i.e. you can see the throughput for all the block devices that make up a single filesystem or storage system.

sudo apt install dstat

dstat has a large number of options and plugins. Here is one usage example (Screen 3), which applies the following options.

  • -D sda – adds a column that reports the I/O rate of /dev/sda.
  • -t, --time – enable time/date output.
  • -a, --all – equivalent to -cdngy (-c CPU stats, -d disk stats, -n network stats, -g page stats, -y system stats).
  • --top-io – show the most expensive I/O process.
  • --top-bio – show the most expensive block I/O process.
  • --top-mem – show the process using the most memory.
sudo dstat -D sda -ta --top-io --top-bio --top-mem
Screen 3. Example of usage of the dstat command.

Here is another example, which outputs the average I/O rate for each minute.

dstat -tdD total 60
----system--------dsk/total--
     time      |  read  writ
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
29-08 08:40:13 | 1138M 1782M
29-08 08:41:13 |  234k  744k
29-08 08:42:13 |  293k  171k
29-08 08:43:13 |  268k  113k
29-08 08:44:13 | 1100k  129k

The vmstat com­mand

vmstat – vir­tu­al mem­o­ry sta­tis­tics – re­ports in­for­ma­tion about process­es, mem­o­ry, pag­ing, block IO, traps, disks and cpu ac­tiv­i­ty. The first re­port pro­duced gives av­er­ages since the last re­boot. Ad­di­tion­al re­ports give in­for­ma­tion on a sam­pling pe­ri­od of length de­lay. The process and mem­o­ry re­ports are in­stan­ta­neous in ei­ther case.

Here is how to get statistics about the block devices (-d), with the unit set to megabytes: -Sm (1000000 bytes) or -SM (1048576 bytes).

sudo vmstat -d -Sm
disk-   ------------reads--------------- --------------writes--------------- ------IO-----
        total  merged   sectors       ms    total  merged   sectors       ms    cur    sec
loop0      84       0      2554       81        0       0         0        0      0      0
loop1      84       0      2382       36        0       0         0        0      0      0
loop2      52       0       856       39        0       0         0        0      0      0
loop3      60       0       814       43        0       0         0        0      0      0
loop4      52       0       764       14        0       0         0        0      0      0
loop5     539       0     11200      240        0       0         0        0      0      0
loop6      87       0      2498      105        0       0         0        0      0      0
loop7     493       0     34854      228        0       0         0        0      0      1
sda   1284880  157343  65276608   752987  3936160 2077213 135566208  4487853      0   4123
sdc    437039  119521   4760810  2594193    96132  145500  54786232 10747453      0   3024
sdd   2614304  458294  24597746  6017154    63873 1360249  19094048  7394053      0   5798
sdb    136351    1445  34564266   329162    25383    2759  47585536  4447925      0    424
sr0       120       0      897        14        0       0         0        0      0      0
loop8      49       0      752        24        0       0         0        0      0      0
loop9      88       0     3334        42        0       0         0        0      0      0
loop10     11       0       28         0        0       0         0        0      0      0
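
The sectors columns are easier to compare once converted to bytes. A minimal sketch, assuming the counters are in 512-byte sectors (as in /proc/diskstats) and the column order shown above; the function name vmstat_mib is hypothetical:

```shell
#!/bin/sh
# vmstat_mib: read `vmstat -d` output on stdin and print, per device,
# the MiB read and written since boot (sectors * 512 / 1048576 = sectors / 2048).
# The two header lines are skipped; data rows have 11 columns.
vmstat_mib() {
  awk 'NR > 2 && NF >= 11 { printf "%s %.1f %.1f\n", $1, $4 / 2048, $8 / 2048 }'
}

# Example usage:
#   vmstat -d | vmstat_mib
```

For the sda row above, this gives roughly 31873 MiB read and 66194 MiB written.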

The sar com­mand

The sar command is part of the sysstat package. It outputs the contents of selected cumulative activity counters in the operating system. The activities are collected by sysstat.service. After installing the package, we need to enable the collector service and wait until some statistics are collected.

sudo apt install sysstat
sudo sed -i 's/ENABLED="false"/ENABLED="true"/' /etc/default/sysstat
sudo systemctl enable --now sysstat.service
systemctl cat sysstat-collect.timer
# /lib/systemd/system/sysstat-collect.timer
# /lib/systemd/system/sysstat-collect.timer
# (C) 2014 Tomasz Torcz <tomek@pipebreaker.pl>
#
# sysstat-12.5.2 systemd unit file:
#        Activates activity collector every 10 minutes

[Unit]
Description=Run system activity accounting tool every 10 minutes

[Timer]
OnCalendar=*:00/10

[Install]
WantedBy=sysstat.service
sar
Linux 5.15.39-4-pve (ubuntu-lxc-pve) 	08/28/22 	_x86_64_	(24 CPU)

20:41:48     LINUX RESTART	(24 CPU)

20:50:05        CPU     %user     %nice   %system   %iowait    %steal     %idle
21:00:00        all      1.66      0.00      0.26      0.02      0.00     98.06
21:10:10        all      2.66      0.00      0.27      0.03      0.00     97.03
21:20:13        all      1.92      0.00      0.29      0.02      0.00     97.76
Average:        all      2.09      0.00      0.27      0.03      0.00     97.62
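
For I/O analysis, the %iowait column is the most relevant one in this report. A minimal sketch that picks the sample with the highest %iowait from the default sar report, assuming 24-hour timestamps as in the output above (with AM/PM timestamps the column positions shift by one); the function name peak_iowait is hypothetical:

```shell
#!/bin/sh
# peak_iowait: read the default `sar` CPU report on stdin and print the
# timestamp and value of the sample with the highest %iowait.
# Header, restart and "Average:" lines are ignored.
peak_iowait() {
  awk '$1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $2 == "all" {
         if (!seen || $6 + 0 > max) { max = $6 + 0; when = $1; value = $6; seen = 1 }
       }
       END { if (seen) print when, value }'
}

# Example usage:
#   sar | peak_iowait
```

For block-device activity rather than CPU figures, sar also provides the -b and -d reports.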

Monitor File Size Changes Recursively

With the following command we can monitor which files larger than 800 KiB have been modified within the past 10 minutes. This is done recursively for the directories /var/lib and /var/log. The output of the command is shown in Screen 3.

sudo watch -n 3 -d \
"find /var/lib /var/log -type f -size +800k -mmin -10 -printf '%-30s \t %t %p\n' | grep -Pv '\.(gz|[0-9])$'"
Screen 3. Use watch and find to monitor file change in real time.

Here is an advanced version :) that also outputs additional data generated by iostat:

sudo watch -n 3 -d \
"find /var/lib /var/log -type f -size +800k -mmin -10 -printf '%-30s \t %t %p\n' | grep -Pv '\.(gz|[0-9])$';
 echo;
 iostat /dev/sda2"
# Output
Every 3.0s: find /var/lib /var/log -type f -size +800k -mmin -10 -printf '%s \t %t %p\n'...; iostat /dev/sda2...

4362053          Mon Aug 29 08:25:15.1573410540 2022 /var/lib/redis/dump.rdb
67108864         Mon Aug 29 08:29:34.6284410810 2022 /var/lib/mysql/undo_001
3276800          Mon Aug 29 08:29:36.1604593970 2022 /var/lib/mysql/#innodb_redo/#ib_redo10127
83886080         Mon Aug 29 08:29:33.6604295080 2022 /var/lib/mysql/mysql.ibd
31459279         Mon Aug 29 08:29:34.2284362990 2022 /var/lib/mysql/binlog.005694
5242880          Mon Aug 29 08:29:33.6284291260 2022 /var/lib/mysql/SCloud/oc_authtoken.ibd
6291456          Mon Aug 29 08:29:34.6284410810 2022 /var/lib/mysql/SCloud/oc_jobs.ibd
50331648         Mon Aug 29 08:29:34.6284410810 2022 /var/lib/mysql/undo_002
79691776         Mon Aug 29 08:29:34.6284410810 2022 /var/lib/mysql/ibdata1
11873325         Mon Aug 29 08:29:01.5600457430 2022 /var/log/syslog
31441772         Mon Aug 29 08:29:08.0921238280 2022 /var/log/auth.log
1504046          Mon Aug 29 08:25:15.2373420090 2022 /var/log/redis/redis-server.log
8388608          Mon Aug 29 08:27:04.5226471030 2022 /var/log/journal/e8dsfe54457bd2f6a44344e1/user-1000.journal
33554432         Mon Aug 29 08:29:08.5241289930 2022 /var/log/journal/e8dsfe54457bd2f6a44344e1/system.journal
1183995          Mon Aug 29 08:25:17.2973666000 2022 /var/log/apache2/wiki.error.log
1170720          Mon Aug 29 08:25:17.3013666480 2022 /var/log/apache2/wiki.access.log
1166389          Mon Aug 29 08:25:14.0613279700 2022 /var/log/apache2/cloud.access.log
1243871          Mon Aug 29 08:25:13.9613267760 2022 /var/log/apache2/bg.mirror.access.log

Linux 5.15.0-46-generic (szs.space) 	08/29/22 	_x86_64_	(16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.50    0.00    0.73    1.06    0.00   95.71

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
sda2             55.71       454.89       741.44       421.11   32361849   52747168   29958424
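
The byte counts in the first column are hard to compare at a glance. A minimal sketch that sorts the files by size and prints the sizes in binary units, assuming the find output is switched to a tab-separated '%s\t%p\n' format (a change from the command above) and that paths contain no tab characters; the function name humanize_sizes is hypothetical:

```shell
#!/bin/sh
# humanize_sizes: read "BYTES<TAB>PATH" lines on stdin, sort them by size
# (largest first) and print the size in binary units followed by the path.
humanize_sizes() {
  sort -t "$(printf '\t')" -k1,1nr |
    awk -F '\t' '{
      size = $1; unit = "B"
      if      (size >= 1073741824) { size /= 1073741824; unit = "GiB" }
      else if (size >= 1048576)    { size /= 1048576;    unit = "MiB" }
      else if (size >= 1024)       { size /= 1024;       unit = "KiB" }
      printf "%.1f %s\t%s\n", size, unit, $2
    }'
}

# Example usage, with the same filters as the find command above:
#   sudo find /var/lib /var/log -type f -size +800k -mmin -10 -printf '%s\t%p\n' | humanize_sizes
```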

Mis­cel­la­neous

Bench­mark Tools

  • iozone – filesystem benchmark – a filesystem benchmark tool that generates and measures a variety of file operations. iozone has been ported to many machines and runs under many operating systems. Its documentation covers the many different types of operations that are tested as well as all of the command line options. Manual with examples: kongll.github.io/iozone
  • hdparm – get/set SATA/IDE device parameters – provides a command line interface to various kernel interfaces supported by the Linux SATA/PATA/SAS "libata" subsystem and the older IDE driver subsystem. Perform a read test:
sudo hdparm -tT /dev/nvme0n1
  • dd – convert and copy a file – perform write and read tests (heavily simplified, just as a note):
dd if=/dev/zero of=./test.file bs=4096k count=4096
dd if=./test.file of=/dev/null bs=4096k count=4096
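
Because the kernel page cache can absorb writes and serve reads from memory, the plain dd invocations above may report optimistic figures. A minimal sketch of a more conservative write test, assuming GNU dd (conv=fdatasync makes dd flush the data to storage before it reports the rate); the function name dd_write_test and the default size are hypothetical choices:

```shell
#!/bin/sh
# dd_write_test FILE [MIB]: write MIB mebibytes of zeros to FILE, flush them
# to storage, and print the summary line of dd. FILE is left in place so it
# can be reused for a read test.
dd_write_test() {
  dd if=/dev/zero of="$1" bs=1M count="${2:-1024}" conv=fdatasync 2>&1 | tail -n 1
}

# Example usage (writes 1 GiB into the current directory, then cleans up):
#   dd_write_test ./test.file && rm -f ./test.file
```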

Mon­i­tor­ing Tools

  • htop [-dCFhpustvH] – interactive process viewer – It is similar to top, but allows you to scroll vertically and horizontally, and interact using a pointing device (mouse). You can observe all processes running on the system, along with their command line arguments, as well as view them in a tree format, select multiple processes and act on them all at once. Tasks related to processes (killing, renicing) can be done without entering their PIDs.
  • top [-hv|-bcEeHiOSs1 -d secs -n max -u|U user -p pids -o field -w [cols]] – display Linux processes – it provides a dynamic real-time view of a running system. It can display system summary information as well as a list of processes or threads currently being managed by the Linux kernel. The types of system summary information shown and the types, order and size of information displayed for processes are all user configurable and that configuration can be made persistent across restarts.
  • atop – Advanced System & Process Monitor – The program atop is an interactive monitor to view the load on a Linux system. It shows the occupation of the most critical hardware resources (from a performance point of view) on system level, i.e. cpu, memory, disk and network. It also shows which processes are responsible for the indicated load with respect to cpu and memory load on process level. Disk load is shown per process if "storage accounting" is active in the kernel. Network load is shown per process if the kernel module netatop has been installed.

Ref­er­ences