[Bug]: lack of cpu temperature metrics on netdata 1.46 #18065

Closed
k0ste opened this issue Jul 3, 2024 · 5 comments
Labels
bug needs triage Issues which need to be manually labelled

Comments

@k0ste
Contributor

k0ste commented Jul 3, 2024

Bug description

After upgrading the netdata packages from 1.45.6 to 1.46.1, the netdata_sensors_temperature_Celsius_average metric is no longer available.

Expected behavior

Netdata provides CPU temperature metrics.

Steps to reproduce

  1. Build RPM packages for 1.45.6
  2. Install the packages
  3. Check for the metric netdata_sensors_temperature_Celsius_average (present)
  4. Build RPM packages for 1.46.1 and upgrade to them
  5. Check for the metric netdata_sensors_temperature_Celsius_average (missing)

Installation method

from source

System info

Linux ceph-osd3.opentech.local 4.18.0-553.6.1.el8.x86_64 #1 SMP Thu May 30 04:13:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
/etc/centos-release:CentOS Stream release 8
/etc/os-release:NAME="CentOS Stream"
/etc/os-release:VERSION="8"
/etc/os-release:ID="centos"
/etc/os-release:ID_LIKE="rhel fedora"
/etc/os-release:VERSION_ID="8"
/etc/os-release:PLATFORM_ID="platform:el8"
/etc/os-release:PRETTY_NAME="CentOS Stream 8"
/etc/os-release:ANSI_COLOR="0;31"
/etc/os-release:CPE_NAME="cpe:/o:centos:centos:8"
/etc/os-release:REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
/etc/os-release:REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"
/etc/redhat-release:CentOS Stream release 8
/etc/system-release:CentOS Stream release 8

Netdata build info

Packaging:
    Netdata Version ____________________________________________ : v1.46.1
    Installation Type __________________________________________ : custom
    Package Architecture _______________________________________ : unknown
    Package Distro _____________________________________________ : unknown
    Configure Options __________________________________________ : dummy-configure-command
Default Directories:
    User Configurations ________________________________________ : /etc/netdata
    Stock Configurations _______________________________________ : /usr/lib/netdata/conf.d
    Ephemeral Databases (metrics data, metadata) _______________ : /var/cache/netdata
    Permanent Databases ________________________________________ : /var/lib/netdata
    Plugins ____________________________________________________ : /usr/libexec/netdata/plugins.d
    Static Web Files ___________________________________________ : /usr/share/netdata/web
    Log Files __________________________________________________ : /var/log/netdata
    Lock Files _________________________________________________ : /var/lib/netdata/lock
    Home _______________________________________________________ : /var/lib/netdata
Operating System:
    Kernel _____________________________________________________ : Linux
    Kernel Version _____________________________________________ : 4.18.0-553.6.1.el8.x86_64
    Operating System ___________________________________________ : CentOS Stream
    Operating System ID ________________________________________ : centos
    Operating System ID Like ___________________________________ : rhel fedora
    Operating System Version ___________________________________ : 8
    Operating System Version ID ________________________________ : none
    Detection __________________________________________________ : /etc/os-release
Hardware:
    CPU Cores __________________________________________________ : 24
    CPU Frequency ______________________________________________ : 3500000000
    RAM Bytes __________________________________________________ : 269569814528
    Disk Capacity ______________________________________________ : 174427403943936
    CPU Architecture ___________________________________________ : x86_64
    Virtualization Technology __________________________________ : none
    Virtualization Detection ___________________________________ : systemd-detect-virt
Container:
    Container __________________________________________________ : none
    Container Detection ________________________________________ : systemd-detect-virt
    Container Orchestrator _____________________________________ : none
    Container Operating System _________________________________ : none
    Container Operating System ID ______________________________ : none
    Container Operating System ID Like _________________________ : none
    Container Operating System Version _________________________ : none
    Container Operating System Version ID ______________________ : none
    Container Operating System Detection _______________________ : none
Features:
    Built For __________________________________________________ : Linux
    Netdata Cloud ______________________________________________ : YES
    Health (trigger alerts and send notifications) _____________ : YES
    Streaming (stream metrics to parent Netdata servers) _______ : YES
    Back-filling (of higher database tiers) ____________________ : YES
    Replication (fill the gaps of parent Netdata servers) ______ : YES
    Streaming and Replication Compression ______________________ : YES (zstd gzip)
    Contexts (index all active and archived metrics) ___________ : YES
    Tiering (multiple dbs with different metrics resolution) ___ : YES (5)
    Machine Learning ___________________________________________ : YES
Database Engines:
    dbengine (compression) _____________________________________ : YES (zstd)
    alloc ______________________________________________________ : YES
    ram ________________________________________________________ : YES
    none _______________________________________________________ : YES
Connectivity Capabilities:
    ACLK (Agent-Cloud Link: MQTT over WebSockets over TLS) _____ : YES
    static (Netdata internal web server) _______________________ : YES
    h2o (web server) ___________________________________________ : YES
    WebRTC (experimental) ______________________________________ : NO
    Native HTTPS (TLS Support) _________________________________ : YES
    TLS Host Verification ______________________________________ : YES
Libraries:
    LZ4 (extremely fast lossless compression algorithm) ________ : NO
    ZSTD (fast, lossless compression algorithm) ________________ : YES
    zlib (lossless data-compression library) ___________________ : YES
    Brotli (generic-purpose lossless compression algorithm) ____ : NO
    protobuf (platform-neutral data serialization protocol) ____ : YES (system)
    OpenSSL (cryptography) _____________________________________ : YES
    libdatachannel (stand-alone WebRTC data channels) __________ : NO
    JSON-C (lightweight JSON manipulation) _____________________ : YES
    libcap (Linux capabilities system operations) ______________ : NO
    libcrypto (cryptographic functions) ________________________ : YES
    libyaml (library for parsing and emitting YAML) ____________ : YES
Plugins:
    apps (monitor processes) ___________________________________ : YES
    cgroups (monitor containers and VMs) _______________________ : YES
    cgroup-network (associate interfaces to CGROUPS) ___________ : YES
    proc (monitor Linux systems) _______________________________ : YES
    tc (monitor Linux network QoS) _____________________________ : YES
    diskspace (monitor Linux mount points) _____________________ : YES
    freebsd (monitor FreeBSD systems) __________________________ : NO
    macos (monitor MacOS systems) ______________________________ : NO
    statsd (collect custom application metrics) ________________ : YES
    timex (check system clock synchronization) _________________ : YES
    idlejitter (check system latency and jitter) _______________ : YES
    bash (support shell data collection jobs - charts.d) _______ : YES
    debugfs (kernel debugging metrics) _________________________ : YES
    cups (monitor printers and print jobs) _____________________ : YES
    ebpf (monitor system calls) ________________________________ : YES
    freeipmi (monitor enterprise server H/W) ___________________ : YES
    nfacct (gather netfilter accounting) _______________________ : NO
    perf (collect kernel performance events) ___________________ : YES
    slabinfo (monitor kernel object caching) ___________________ : YES
    Xen ________________________________________________________ : NO
    Xen VBD Error Tracking _____________________________________ : NO
    Logs Management ____________________________________________ : YES
Exporters:
    AWS Kinesis ________________________________________________ : NO
    GCP PubSub _________________________________________________ : NO
    MongoDB ____________________________________________________ : NO
    Prometheus (OpenMetrics) Exporter __________________________ : YES
    Prometheus Remote Write ____________________________________ : YES
    Graphite ___________________________________________________ : YES
    Graphite HTTP / HTTPS ______________________________________ : YES
    JSON _______________________________________________________ : YES
    JSON HTTP / HTTPS __________________________________________ : YES
    OpenTSDB ___________________________________________________ : YES
    OpenTSDB HTTP / HTTPS ______________________________________ : YES
    All Metrics API ____________________________________________ : YES
    Shell (use metrics in shell scripts) _______________________ : YES
Debug/Developer Features:
    Trace All Netdata Allocations (with charts) ________________ : NO
    Developer Mode (more runtime checks, slower) _______________ : NO

Additional info

# netdata 1.45.6
~ % curl -Ss "http://192.168.100.11:19999/api/v1/allmetrics?format=prometheus_all_hosts&server=srv2&source=average&variables=yes" | grep -c sensors
27    <----------
# netdata 1.46.1
~ % curl -Ss "http://192.168.100.10:19999/api/v1/allmetrics?format=prometheus_all_hosts&server=srv2&source=average&variables=yes" | grep -c sensors
0     <----------

We have 3 netdata hosts: one running 1.45.6 and two running 1.46.1. The 1.46.1 hosts export no temperature metrics.
Screenshot 2024-07-03 at 21 19 10
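
For anyone who wants to script the check above instead of running curl by hand, here is a minimal Python sketch. It reuses the URL path and query parameters from the commands in this report; the function names `fetch_allmetrics` and `count_matching_lines` are made up for illustration and are not part of Netdata.

```python
import urllib.request


def fetch_allmetrics(base_url: str) -> str:
    """Download the Prometheus-format export, using the query from this report.

    base_url is an assumption, e.g. "http://192.168.100.10:19999".
    """
    url = (
        f"{base_url}/api/v1/allmetrics"
        "?format=prometheus_all_hosts&server=srv2&source=average&variables=yes"
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def count_matching_lines(text: str, needle: str = "sensors") -> int:
    """Rough equivalent of `grep -c needle`: count lines containing the substring."""
    return sum(1 for line in text.splitlines() if needle in line)
```

Usage would look like `count_matching_lines(fetch_allmetrics("http://192.168.100.10:19999"))`; a result of 0 corresponds to the 1.46.1 output shown above.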

@k0ste k0ste added bug needs triage Issues which need to be manually labelled labels Jul 3, 2024
@ilyam8
Member

ilyam8 commented Jul 3, 2024

Hi. See v1.46.0 changed collectors.

@ilyam8 ilyam8 closed this as not planned Jul 3, 2024
@k0ste
Contributor Author

k0ste commented Jul 3, 2024

> Hi. See v1.46.0 changed collectors.

As far as I can see, python/sensors was rewritten as go/sensors.


How sensors worked previously: the Python plugin used the lm_sensors C library. That package was installed by Anaconda in a minimal installation.
Currently, the package does not provide a weak dependency to resolve this. I will try to fix this in #18067.

@k0ste
Contributor Author

k0ste commented Jul 3, 2024

Furthermore, I don't see any mention that the metric would be renamed. The old and the new names:

  • netdata_sensors_temperature_Celsius_average ->
  • netdata_sensors_sensor_temperature_Celsius_average

To me, it seems this should have been in the deprecation warnings...

And third, the labels were also renamed.

The old labels:

  • chart="sensors.coretemp-isa-0000_temperature"
  • dimension="Package id 0"
  • family="temperature"

The new labels:

  • chart="sensors.sensor_chip_coretemp-isa-0000_feature_package_id_0_subfeature_temp1_input_temperature"
  • chip="coretemp-isa-0000"
  • dimension="temperature"
  • family="temperature"
  • feature="Package id 0"

So now netdata provides two labels with the same value, dimension and family, in one metric. Also, the sensor name, which used to be in the dimension label, has moved to the feature label. Was this change really necessary?


Altogether, this looks like a degradation.
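
To illustrate the rename for anyone post-processing the export, here is a hedged Python sketch that reads the sensor name from either layout. It is based only on the single old/new example listed in this comment; the helper names `parse_sample` and `sensor_name` are hypothetical, and other sensor types may not follow the same pattern.

```python
import re

# Metric names exactly as listed in the comment above.
OLD_NAME = "netdata_sensors_temperature_Celsius_average"
NEW_NAME = "netdata_sensors_sensor_temperature_Celsius_average"

# Simplified patterns for one Prometheus text-format sample line with labels.
_SAMPLE_RE = re.compile(r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+\S')
_LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str):
    """Split a labelled sample line into (metric_name, labels_dict)."""
    match = _SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a labelled sample line: {line!r}")
    labels = dict(_LABEL_RE.findall(match.group("labels")))
    return match.group("name"), labels


def sensor_name(name: str, labels: dict) -> str:
    """Return the sensor name: `dimension` in the old layout, `feature` in the new."""
    if name == OLD_NAME:
        return labels["dimension"]
    if name == NEW_NAME:
        return labels["feature"]
    raise ValueError(f"unexpected metric name: {name}")
```

With the labels from this comment, both layouts would yield "Package id 0", which shows that the information is still there but under a different label.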

@ilyam8
Member

ilyam8 commented Jul 3, 2024

Yes, this change was really necessary, and there is a mention of the changed metric names.

@k0ste
Contributor Author

k0ste commented Jul 3, 2024

> there is mention about changed metric names.

Yes, there is a general mention. There is just no list of what was changed to what, or any example of how to get such a list. It's one thing if the metric is used on charts: after the update the change will be noticeable. If the metric is used for an alert, the change will most likely only be noticed when the alert does not arrive 🥲
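
One way to build such a list yourself, sketched in Python under the assumption that you saved the Prometheus-format export from a host before and after the upgrade. The function names `metric_names` and `diff_metric_names` are hypothetical, and this is not an official Netdata tool.

```python
import re

# A metric name is the leading identifier on each non-comment sample line.
_NAME_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)')


def metric_names(text: str) -> set:
    """Collect the distinct metric names found in a Prometheus text export."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE/comment lines
        match = _NAME_RE.match(line)
        if match:
            names.add(match.group(1))
    return names


def diff_metric_names(before: str, after: str):
    """Return (removed, added) metric names between two exports, sorted."""
    old, new = metric_names(before), metric_names(after)
    return sorted(old - new), sorted(new - old)
```

Every name in the `removed` list is a candidate to search for in alert rules and dashboards before rolling the upgrade out to more hosts.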

2 participants