added md disks in down state #3007
base: master
Conversation
Commits:
- Signed-off-by: Finomosec <[email protected]>
- … node_md_blocks_synced_pct (Signed-off-by: Finomosec <[email protected]>)
- Signed-off-by: Finomosec <[email protected]>
- Signed-off-by: Finomosec <[email protected]>
- changed sync_minutes_remaining to sync_time_remaining (in seconds) (Signed-off-by: Frederic <[email protected]>)
- fixed code formatting (Signed-off-by: Frederic <[email protected]>)
- Signed-off-by: Finomosec <[email protected]>
- Signed-off-by: Finomosec <[email protected]>
@SuperQ commented on this code:

```go
blockSyncedSpeedDesc = prometheus.NewDesc(
	prometheus.BuildFQName(namespace, "md", "blocks_synced_speed"),
	"current sync speed (in Kilobytes/sec)",
	[]string{"device"},
	nil,
)
```
This doesn't seem necessary; we should be able to compute this from something like `rate(node_md_blocks_synced[1m]) * <blocksize>`.
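The rate()-based alternative could be expressed as a PromQL query along these lines. Note the block size is an assumption here: md typically reports 1 KiB blocks in /proc/mdstat, but the exporter does not expose it, which is part of the objection raised below.

```promql
# Approximate resync speed in bytes/sec from the blocks-synced counter.
# The factor 1024 assumes 1 KiB md blocks; this is an assumption, not
# something the exporter reports.
rate(node_md_blocks_synced[1m]) * 1024
```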
I think it is useful. It is the CURRENT speed, as shown in /proc/mdstat, and I have it showing in my Grafana board. Plus, I guess `<blocksize>` is not included in the data, so it would require additional configuration for each md-device.
My Grafana Board: https://grafana.com/grafana/dashboards/20989-node-exporter-mdadm-status/
Note that the groundwork has already been laid for #1085, and we probably should not add any new parsing functionality relating to /proc/mdstat.

Maybe instead of exposing the sync percent, we should expose the "TODO" blocks value. This way the completion ratio can be correctly calculated.
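With such a metric, the ratio could be derived at query time. The metric name below is hypothetical (it is not part of this PR), just to illustrate the idea:

```promql
# Completion ratio of the resync; node_md_blocks_to_be_synced is a
# hypothetical name for the "TODO" blocks total suggested above.
node_md_blocks_synced / node_md_blocks_to_be_synced
```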
- renamed pct to percent (Co-authored-by: Ben Kochie <[email protected]>, Signed-off-by: Frederic <[email protected]>)
- added unit "seconds" (Co-authored-by: Ben Kochie <[email protected]>, Signed-off-by: Frederic <[email protected]>)
That was my first idea, too, but the data source (https://github.com/prometheus/procfs/blob/master/mdstat.go) does not (yet) capture/expose this value. But hey! We could calculate it using blocks_synced and the percentage... but it might be imprecise, especially for low percentage values, plus it might yield slightly different results over time, which would be kind of awkward. So maybe better not after all. I added a request to add it:
Signed-off-by: Finomosec <[email protected]>
Yeah I agree, let's add the TODO blocks to procfs
Released updated procfs: https://github.com/prometheus/procfs/releases/tag/v0.15.0
Added missing mdadm stats:
- node_md_disks: added {state="down"}
- node_md_sync_time_remaining (seconds)
- node_md_blocks_synced_speed
- node_md_blocks_synced_pct

Notes:
- Disks in state="down" (recovering) were not reported in the output.
- Using node_md_blocks_synced / node_md_blocks as progress percentage created wrong results on my system, as the total-blocks differed from the total-to-be-synced-blocks. This may be due to the raid-level being used (raid5).