@yonran yonran commented Dec 30, 2025

Add an ATA smartctl_device_power_mode gauge to implement feature requests #310 and #195. The value appears to be a number from 0 to 255, or -1 for sleep (taken from the smartmontools source, ataprint.cpp, which follows the latest ATA specification; see the draft of ACS-3):

  • -1=sleep
  • 0x00=0=standby
  • 0x01=1=standby_y
  • 0x40=64=active_nv_down
  • 0x41=65=active_nv_up
  • 0x80=128=idle
  • 0x81=129=idle_a
  • 0x82=130=idle_b
  • 0x83=131=idle_c
  • 0xff=255=active_or_idle

Note that this does not return NVMe power states (nvmeprint.cpp).
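
To make the value list above easier to use on the consuming side, here is a minimal Go sketch of a lookup table for the gauge values. The helper names `PowerModeName` and `IsStandbyOrSleep` are hypothetical and are not part of this PR or of smartctl_exporter; the labels are copied from the list above.

```go
package main

import "fmt"

// powerModeNames maps power_mode.ata_value to the labels listed in the
// PR description. -1 is the value listed there for sleep.
var powerModeNames = map[int]string{
	-1:   "sleep",
	0x00: "standby",
	0x01: "standby_y",
	0x40: "active_nv_down",
	0x41: "active_nv_up",
	0x80: "idle",
	0x81: "idle_a",
	0x82: "idle_b",
	0x83: "idle_c",
	0xff: "active_or_idle",
}

// PowerModeName is a hypothetical helper that returns the label for a
// gauge value, or "unknown" for values not in the list.
func PowerModeName(v int) string {
	if name, ok := powerModeNames[v]; ok {
		return name
	}
	return "unknown"
}

// IsStandbyOrSleep reports whether the value is sleep or one of the
// two standby states from the list.
func IsStandbyOrSleep(v int) bool {
	return v == -1 || v == 0x00 || v == 0x01
}

func main() {
	for _, v := range []int{0, 128, 255, 7} {
		fmt.Printf("%d => %s (standby or sleep: %t)\n", v, PowerModeName(v), IsStandbyOrSleep(v))
	}
}
```

A dashboard or alert rule could apply the same grouping, for example treating values 0 and 1 as "not spinning".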

Screenshot 2025-12-29 at 6 35 43 PM
  • Add smartctl_device_power_mode from power_mode.ata_value, and allow low‑power --nocheck=standby responses to be cached, so that sleeping drives still export smartctl_device_power_mode and smartctl_device_smartctl_exit_status when exit‑status bit 1 is set and power_mode is present.
    • Side effect: when a drive that was active goes into standby, the exporter no longer returns the cached values from when it was active. Instead, the standby JSON, which has no SMART metrics, is cached.
  • Skip emitting metrics when required fields are missing in standby JSON:
    • smartctl_device
    • smartctl_device_capacity_blocks
    • smartctl_device_capacity_bytes
    • smartctl_device_block_size
    • nvme
      • smartctl_device_percentage_used
      • smartctl_device_available_spare
      • smartctl_device_available_spare_threshold
      • smartctl_device_critical_warning
      • smartctl_device_media_errors
      • smartctl_device_num_err_log_entries
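
The "skip when missing" behaviour in the list above can be sketched as follows. This is an illustration using only the Go standard library, not the exporter's actual collector code; the struct `deviceJSON` and the function `collect` are hypothetical, and the JSON keys (`user_capacity.blocks`, `user_capacity.bytes`, `logical_block_size`) are assumed from typical smartctl `--json` output. Pointer fields stay nil when a key is absent, so a missing field can be told apart from a genuine zero such as `ata_value` 0 for standby.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deviceJSON models a small, assumed subset of smartctl --json output.
// Pointer fields are nil when the key is absent from the JSON.
type deviceJSON struct {
	UserCapacity *struct {
		Blocks *float64 `json:"blocks"`
		Bytes  *float64 `json:"bytes"`
	} `json:"user_capacity"`
	LogicalBlockSize *float64 `json:"logical_block_size"`
	PowerMode        *struct {
		ATAValue *float64 `json:"ata_value"`
	} `json:"power_mode"`
}

// collect returns only the metrics whose source fields are present,
// so absent fields produce no sample instead of a sample of 0.
func collect(raw []byte) (map[string]float64, error) {
	var d deviceJSON
	if err := json.Unmarshal(raw, &d); err != nil {
		return nil, err
	}
	out := map[string]float64{}
	set := func(name string, v *float64) {
		if v != nil {
			out[name] = *v
		}
	}
	if d.UserCapacity != nil {
		set("smartctl_device_capacity_blocks", d.UserCapacity.Blocks)
		set("smartctl_device_capacity_bytes", d.UserCapacity.Bytes)
	}
	set("smartctl_device_block_size", d.LogicalBlockSize)
	if d.PowerMode != nil {
		set("smartctl_device_power_mode", d.PowerMode.ATAValue)
	}
	return out, nil
}

func main() {
	// Minimal JSON of the kind assumed for a drive in standby.
	standby := []byte(`{"power_mode":{"ata_value":0,"name":"STANDBY"}}`)
	metrics, err := collect(standby)
	if err != nil {
		panic(err)
	}
	fmt.Println(metrics)
}
```

With the standby input, only smartctl_device_power_mode is produced (with value 0), and the capacity and block-size gauges are simply absent for that scrape.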

I skip emitting these metrics when the corresponding field does not exist in the JSON, so that they don't drop to 0. Without that commit (28d568e), the gauges jump to 0 each time the disk is in standby. For example, here is a disk that shuts down after 45s of idle because I ran sudo hdparm -S 10 /dev/sdc:
Screenshot 2025-12-29 at 5 14 44 PM

Note that other metrics, such as smartctl_device_power_on_seconds, were already empty instead of 0 when their fields don't exist in the JSON:
Screenshot 2025-12-29 at 6 04 50 PM

@Preclowski

Hey @yonran, thank you for your effort. Are you able to finalize this by signing your commits? I believe that's why it can't be merged, as the DCO check fails. It would be really nice to see that metric included ;)

I'm not affiliated with this project; that's just my guess about the DCO.

Export power mode state from smartctl's power_mode JSON field. This allows monitoring which drives are spinning vs sleeping without waking them up during collection.

Gauge: smartctl_device_power_mode{device}. The value is the ata_value: 0=standby, 255=active or idle, etc.

Cache JSON for standby drives (when smartctl --nocheck=standby exit code is 2) instead of returning stale data from when it was active.

Signed-off-by: Yonathan Randolph <yonathan@gmail.com>

When smartctl returns standby (exit status bit 1), we cache the minimal JSON so power_mode can still be exported. That JSON omits capacity, block size, device info, and NVMe health fields, so collectors must skip those metrics when fields are missing to avoid emitting zeros or empty-label series.

Signed-off-by: Yonathan Randolph <yonathan@gmail.com>
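
The caching rule described in the commit messages above can be sketched roughly as below. This is an assumption-laden illustration, not the code in this PR: `shouldCache` and the `result` struct are hypothetical, the JSON keys `smartctl.exit_status` and `power_mode.ata_value` are assumed from smartctl's JSON output, and only bit 1 of the exit status is considered. Per the smartctl man page, bit 1 is set when the device open failed or the device is in a low-power mode, so the presence of `power_mode` is used here to tell the two cases apart.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// exitBitLowPower is bit 1 of the smartctl exit status (numeric value 2).
const exitBitLowPower = 1 << 1

// result models the assumed subset of smartctl JSON needed for the decision.
type result struct {
	Smartctl struct {
		ExitStatus int `json:"exit_status"`
	} `json:"smartctl"`
	PowerMode *struct {
		ATAValue int `json:"ata_value"`
	} `json:"power_mode"`
}

// shouldCache is a hypothetical helper: it accepts a response when bit 1
// is clear, or when bit 1 is set and power_mode is present (standby).
// A response with bit 1 set and no power_mode is treated as a failed open.
func shouldCache(raw []byte) (bool, error) {
	var r result
	if err := json.Unmarshal(raw, &r); err != nil {
		return false, err
	}
	if r.Smartctl.ExitStatus&exitBitLowPower == 0 {
		return true, nil
	}
	return r.PowerMode != nil, nil
}

func main() {
	standby := []byte(`{"smartctl":{"exit_status":2},"power_mode":{"ata_value":0,"name":"STANDBY"}}`)
	ok, err := shouldCache(standby)
	if err != nil {
		panic(err)
	}
	fmt.Println("cache standby response:", ok)
}
```

Caching the standby response is what replaces the stale active-state data: the next scrape sees only the fields present in the minimal JSON.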
@yonran yonran force-pushed the smartctl_device_power_mode branch from 0c1f17a to f9023ae Compare January 29, 2026 00:51