[Bug]: web_log log_type config not working #18036

Closed
tete1030 opened this issue Jun 29, 2024 · 1 comment · Fixed by #18037


tete1030 commented Jun 29, 2024

Bug description

go.d/web_log.conf:

jobs:
  - name: nginx
    path: /host/npm_logs/*_access.log
    parser:
      log_type: regexp
      regexp_config:
        pattern: '^(?P<time_local>\[[^\]]+\]) (?:(?P<upstream_cache_status>-|MISS|BYPASS|EXPIRED|STALE|UPDATING|REVALIDATED|HIT) (?P<upstream_status>:\d+) )?(?P<status>\d+) - (?P<request_method>[A-Z]+) (?P<scheme>\w+) (?P<host>[^ ]+) "(?P<request_uri>[^"]+)" \[Client (?P<remote_addr>[^ ]+)\] \[Length (?P<body_bytes_sent>\d+)\] \[Gzip (?P<gzip_ratio>[^ ]+)\] (?:\[Sent-to (?P<server>[^ ]+)\] )?"(?P<http_user_agent>[^"]+)" "(?P<http_referer>[^"]+)"'
    

Given the above config, Netdata should use the pattern I provided to parse the logs. However, it still tries to use log_type: auto. The debug log clearly shows the inconsistency.

Running ./go.d.plugin -d -m web_log, I noted interesting lines between the rows of asterisks:

time=2024-06-29T19:35:05.016+08:00 level=debug msg="plugin: name=go.d, version=v1.46.1" plugin=go.d component=agent
time=2024-06-29T19:35:05.016+08:00 level=debug msg="current user: name=root, uid=0" plugin=go.d component=agent
time=2024-06-29T19:35:05.016+08:00 level=info msg="env HTTP_PROXY '', HTTPS_PROXY ''" plugin=go.d component=agent
time=2024-06-29T19:35:05.017+08:00 level=info msg="instance is started" plugin=go.d component=agent
time=2024-06-29T19:35:05.017+08:00 level=info msg="loading config file" plugin=go.d component=agent
time=2024-06-29T19:35:05.017+08:00 level=debug msg="looking for 'go.d.conf' in [/etc/netdata /usr/lib/netdata/conf.d]" plugin=go.d component=agent
time=2024-06-29T19:35:05.017+08:00 level=info msg="found '/etc/netdata/go.d.conf" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=info msg="config successfully loaded" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=info msg="using config: enabled 'true', default_run 'true', max_procs '0'" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=info msg="loading modules" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=info msg="enabled/registered modules: 1/92" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=info msg="building discovery config" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=debug msg="looking for 'web_log.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=debug msg="found '/etc/netdata/go.d/web_log.conf" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=info msg="dummy/read/watch paths: 0/1/0" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=info msg="registered discoverers: [file discovery: [file reader] service discovery]" plugin=go.d component="discovery manager"
time=2024-06-29T19:35:05.018+08:00 level=debug msg="looking for 'vnodes/' in [/etc/netdata /usr/lib/netdata/conf.d]" plugin=go.d component=agent
time=2024-06-29T19:35:05.018+08:00 level=info msg="found '/usr/lib/netdata/conf.d/vnodes' (0 vhosts)" plugin=go.d component=agent
time=2024-06-29T19:35:05.019+08:00 level=info msg="instance is started" plugin=go.d component="discovery manager"
time=2024-06-29T19:35:05.019+08:00 level=info msg="instance is started" plugin=go.d component="functions manager"
time=2024-06-29T19:35:05.019+08:00 level=info msg="instance is started" plugin=go.d component="job manager"
time=2024-06-29T19:35:05.019+08:00 level=debug msg="registering function 'config'" plugin=go.d component="functions manager"
CONFIG go.d:collector:web_log create accepted template /collectors/jobs internal 'internal' 'add schema enable disable test userconfig' 0x0000 0x0000

time=2024-06-29T19:35:05.019+08:00 level=info msg="instance is started" plugin=go.d component=discovery discoverer=file
time=2024-06-29T19:35:05.019+08:00 level=info msg="instance is started" plugin=go.d component=discovery discoverer=file
time=2024-06-29T19:35:05.020+08:00 level=info msg="instance is stopped" plugin=go.d component=discovery discoverer=file
time=2024-06-29T19:35:05.020+08:00 level=info msg="instance is started" plugin=go.d component="service discovery"
time=2024-06-29T19:35:05.020+08:00 level=debug msg="received configs: 1/+1/-0 ('/etc/netdata/go.d/web_log.conf')" plugin=go.d component="job manager"
CONFIG go.d:collector:web_log:nginx create accepted job /collectors/jobs user 'discoverer=file_reader,file=/etc/netdata/go.d/web_log.conf' 'schema get enable disable update restart test userconfig' 0x0000 0x0000

************************************************* This line shows the config parsing was correct *********************
time=2024-06-29T19:35:05.020+08:00 level=debug msg="creating web_log[nginx] job, config: map[__provider__:file reader __source__:discoverer=file_reader,file=/etc/netdata/go.d/web_log.conf __source_type__:user autodetection_retry:0 module:web_log name:nginx parser:map[log_type:regexp regexp_config:map[pattern:^(?P<time_local>\\[[^\\]]+\\]) (?:(?P<upstream_cache_status>-|MISS|BYPASS|EXPIRED|STALE|UPDATING|REVALIDATED|HIT) (?P<upstream_status>:\\d+) )?(?P<status>\\d+) - (?P<request_method>[A-Z]+) (?P<scheme>\\w+) (?P<host>[^ ]+) \"(?P<request_uri>[^\"]+)\" \\[Client (?P<remote_addr>[^ ]+)\\] \\[Length (?P<body_bytes_sent>\\d+)\\] \\[Gzip (?P<gzip_ratio>[^ ]+)\\] (?:\\[Sent-to (?P<server>[^ ]+)\\] )?\"(?P<http_user_agent>[^\"]+)\" \"(?P<http_referer>[^\"]+)\"]] path:/host/npm_logs/*_access.log priority:70000 update_every:1]" plugin=go.d component="job manager"
********************************************************************************************************************

time=2024-06-29T19:35:05.020+08:00 level=debug msg="skipping URL patterns creating, no patterns provided" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.021+08:00 level=debug msg="skipping custom fields creating, no custom fields provided" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.021+08:00 level=debug msg="skipping custom time fields creating, no custom time fields provided" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.021+08:00 level=debug msg="no custom time fields provided" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.021+08:00 level=debug msg="starting log reader creating" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.021+08:00 level=debug msg="open log file: /host/npm_logs/proxy-host-5_access.log" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.022+08:00 level=debug msg="created log reader, current file '/host/npm_logs/proxy-host-5_access.log'" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.022+08:00 level=debug msg="starting parser creating" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.022+08:00 level=debug msg="last line: '[29/Jun/2024:16:46:14 +0800] - - 200 - GET http XXXXXX \"/XXXXXXX\" [Client XXXXX] [Length 104] [Gzip -] [Sent-to XXXXX] \"XXXXX\" \"-\"'" plugin=go.d collector=web_log job=nginx

************************************************* Somehow it's still using auto log_type *********************
time=2024-06-29T19:35:05.022+08:00 level=debug msg="log_type is auto, will try format auto-detection" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.022+08:00 level=debug msg="starting log type auto-detection" plugin=go.d collector=web_log job=nginx
********************************************************************************************************************

time=2024-06-29T19:35:05.022+08:00 level=debug msg="log type is CSV" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.022+08:00 level=debug msg="starting csv log format auto-detection" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.022+08:00 level=debug msg="config: {FieldsPerRecord:-1 Delimiter:  TrimLeadingSpace:false Format: CheckField:0x22de6e0}" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.023+08:00 level=debug msg="trying format: '$host:$server_port $remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time'" plugin=go.d collector=web_log job=nginx
time=2024-06-29T19:35:05.023+08:00 level=debug msg="parse: csv parse: assign 'host:$server_port': assign '[29/Jun/2024:16:46:14' : bad vhost with port" plugin=go.d collector=web_log job=nginx

Expected behavior

Should respect my config parser.log_type: regexp

Steps to reproduce

  1. Copy above config
  2. Run ./go.d.plugin -d -m web_log
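
When reproducing, it can help to rule out the pattern itself by checking a named-group regex against the last line from the debug output, independently of the plugin. The sketch below is only an illustration: it uses Python's re module rather than the Go regexp engine that go.d.plugin uses (the two share the (?P<name>...) syntax but differ in other features), the pattern is a simplified, hypothetical one covering only the leading fields, and the timezone sign in the sample line is assumed to be "+".

```python
import re

# Last line from the debug output, placeholders as posted.
# Assumption: the timezone offset is written "+0800".
LINE = (
    '[29/Jun/2024:16:46:14 +0800] - - 200 - GET http XXXXXX "/XXXXXXX" '
    '[Client XXXXX] [Length 104] [Gzip -] [Sent-to XXXXX] "XXXXX" "-"'
)

# Simplified, hypothetical pattern: it is NOT the pattern from the config,
# it only covers the first five fields of the line.
PATTERN = re.compile(
    r'^(?P<time_local>\[[^\]]+\]) (?P<upstream_cache_status>\S+) '
    r'(?P<upstream_status>\S+) (?P<status>\d+) - (?P<request_method>[A-Z]+) '
)

match = PATTERN.match(LINE)
# An empty dict means the pattern did not match the line.
fields = match.groupdict() if match else {}
print(fields)
```

If fields comes back empty, the pattern needs fixing before the collector configuration is worth debugging.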

Installation method

docker

System info

# Host:

Linux TexotUnraid 6.1.79-Unraid #1 SMP PREEMPT_DYNAMIC Fri Mar 29 13:34:03 PDT 2024 x86_64 Intel(R) Pentium(R) Silver N6005 @ 2.00GHz GenuineIntel GNU/Linux
/etc/os-release:NAME=Slackware
/etc/os-release:VERSION="15.0"
/etc/os-release:ID=slackware
/etc/os-release:VERSION_ID=15.0
/etc/os-release:PRETTY_NAME="Slackware 15.0 x86_64 (post 15.0 -current)"
/etc/os-release:ANSI_COLOR="0;34"
/etc/os-release:CPE_NAME="cpe:/o:slackware:slackware_linux:15.0"
/etc/os-release:VERSION_CODENAME=current

# Container:

Linux TexotUnraid 6.1.79-Unraid #1 SMP PREEMPT_DYNAMIC Fri Mar 29 13:34:03 PDT 2024 x86_64 GNU/Linux
/etc/os-release:PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
/etc/os-release:NAME="Debian GNU/Linux"
/etc/os-release:VERSION_ID="12"
/etc/os-release:VERSION="12 (bookworm)"
/etc/os-release:VERSION_CODENAME=bookworm
/etc/os-release:ID=debian

Netdata build info

Packaging:
    Netdata Version ____________________________________________ : v1.46.1
    Installation Type __________________________________________ : oci
    Package Architecture _______________________________________ : x86_64
    Package Distro _____________________________________________ : unknown
    Configure Options __________________________________________ : dummy-configure-command
Default Directories:
    User Configurations ________________________________________ : /etc/netdata
    Stock Configurations _______________________________________ : /usr/lib/netdata/conf.d
    Ephemeral Databases (metrics data, metadata) _______________ : /var/cache/netdata
    Permanent Databases ________________________________________ : /var/lib/netdata
    Plugins ____________________________________________________ : /usr/libexec/netdata/plugins.d
    Static Web Files ___________________________________________ : /usr/share/netdata/web
    Log Files __________________________________________________ : /var/log/netdata
    Lock Files _________________________________________________ : /var/lib/netdata/lock
    Home _______________________________________________________ : /var/lib/netdata
Operating System:
    Kernel _____________________________________________________ : Linux
    Kernel Version _____________________________________________ : 6.1.79-Unraid
    Operating System ___________________________________________ : Slackware
    Operating System ID ________________________________________ : slackware
    Operating System ID Like ___________________________________ : unknown
    Operating System Version ___________________________________ : 15.0
    Operating System Version ID ________________________________ : 12
    Detection __________________________________________________ : /host/etc/os-release
Hardware:
    CPU Cores __________________________________________________ : 4
    CPU Frequency ______________________________________________ : 2000000000
    RAM Bytes __________________________________________________ : 16625799168
    Disk Capacity ______________________________________________ : 22050095939584
    CPU Architecture ___________________________________________ : x86_64
    Virtualization Technology __________________________________ : none
    Virtualization Detection ___________________________________ : none
Container:
    Container __________________________________________________ : docker
    Container Detection ________________________________________ : dockerenv
    Container Orchestrator _____________________________________ : none
    Container Operating System _________________________________ : Debian GNU/Linux
    Container Operating System ID ______________________________ : debian
    Container Operating System ID Like _________________________ : unknown
    Container Operating System Version _________________________ : 12 (bookworm)
    Container Operating System Version ID ______________________ : 12
    Container Operating System Detection _______________________ : /etc/os-release
Features:
    Built For __________________________________________________ : Linux
    Netdata Cloud ______________________________________________ : YES
    Health (trigger alerts and send notifications) _____________ : YES
    Streaming (stream metrics to parent Netdata servers) _______ : YES
    Back-filling (of higher database tiers) ____________________ : YES
    Replication (fill the gaps of parent Netdata servers) ______ : YES
    Streaming and Replication Compression ______________________ : YES (zstd lz4 gzip)
    Contexts (index all active and archived metrics) ___________ : YES
    Tiering (multiple dbs with different metrics resolution) ___ : YES (5)
    Machine Learning ___________________________________________ : YES
Database Engines:
    dbengine (compression) _____________________________________ : YES (zstd lz4)
    alloc ______________________________________________________ : YES
    ram ________________________________________________________ : YES
    none _______________________________________________________ : YES
Connectivity Capabilities:
    ACLK (Agent-Cloud Link: MQTT over WebSockets over TLS) _____ : YES
    static (Netdata internal web server) _______________________ : YES
    h2o (web server) ___________________________________________ : YES
    WebRTC (experimental) ______________________________________ : NO
    Native HTTPS (TLS Support) _________________________________ : YES
    TLS Host Verification ______________________________________ : YES
Libraries:
    LZ4 (extremely fast lossless compression algorithm) ________ : YES
    ZSTD (fast, lossless compression algorithm) ________________ : YES
    zlib (lossless data-compression library) ___________________ : YES
    Brotli (generic-purpose lossless compression algorithm) ____ : NO
    protobuf (platform-neutral data serialization protocol) ____ : YES (system)
    OpenSSL (cryptography) _____________________________________ : YES
    libdatachannel (stand-alone WebRTC data channels) __________ : NO
    JSON-C (lightweight JSON manipulation) _____________________ : YES
    libcap (Linux capabilities system operations) ______________ : NO
    libcrypto (cryptographic functions) ________________________ : YES
    libyaml (library for parsing and emitting YAML) ____________ : YES
Plugins:
    apps (monitor processes) ___________________________________ : YES
    cgroups (monitor containers and VMs) _______________________ : YES
    cgroup-network (associate interfaces to CGROUPS) ___________ : YES
    proc (monitor Linux systems) _______________________________ : YES
    tc (monitor Linux network QoS) _____________________________ : YES
    diskspace (monitor Linux mount points) _____________________ : YES
    freebsd (monitor FreeBSD systems) __________________________ : NO
    macos (monitor MacOS systems) ______________________________ : NO
    statsd (collect custom application metrics) ________________ : YES
    timex (check system clock synchronization) _________________ : YES
    idlejitter (check system latency and jitter) _______________ : YES
    bash (support shell data collection jobs - charts.d) _______ : YES
    debugfs (kernel debugging metrics) _________________________ : YES
    cups (monitor printers and print jobs) _____________________ : NO
    ebpf (monitor system calls) ________________________________ : NO
    freeipmi (monitor enterprise server H/W) ___________________ : YES
    nfacct (gather netfilter accounting) _______________________ : NO
    perf (collect kernel performance events) ___________________ : YES
    slabinfo (monitor kernel object caching) ___________________ : YES
    Xen ________________________________________________________ : NO
    Xen VBD Error Tracking _____________________________________ : NO
    Logs Management ____________________________________________ : YES
Exporters:
    AWS Kinesis ________________________________________________ : NO
    GCP PubSub _________________________________________________ : NO
    MongoDB ____________________________________________________ : YES
    Prometheus (OpenMetrics) Exporter __________________________ : YES
    Prometheus Remote Write ____________________________________ : YES
    Graphite ___________________________________________________ : YES
    Graphite HTTP / HTTPS ______________________________________ : YES
    JSON _______________________________________________________ : YES
    JSON HTTP / HTTPS __________________________________________ : YES
    OpenTSDB ___________________________________________________ : YES
    OpenTSDB HTTP / HTTPS ______________________________________ : YES
    All Metrics API ____________________________________________ : YES
    Shell (use metrics in shell scripts) _______________________ : YES
Debug/Developer Features:
    Trace All Netdata Allocations (with charts) ________________ : NO
    Developer Mode (more runtime checks, slower) _______________ : NO

Additional info

The problem also happens on netdata 1.45.0

tete1030 added the bug and needs triage labels Jun 29, 2024
ilyam8 added the area/docs and collectors/go.d labels and removed the needs triage label Jun 29, 2024
Member

ilyam8 commented Jun 29, 2024

Hi, @tete1030. That is a documentation bug: there is no parser key (the parser options are inlined at the job level). I updated the docs in #18037.

  - name: nginx
    parser:
      log_type: regexp
      regexp_config:
        pattern: 'pattern'

=>

  - name: nginx
    log_type: regexp
    regexp_config:
      pattern: 'pattern'
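
The effect of the two layouts above can be sketched in a few lines. This is a minimal illustration, not the collector's actual code: effective_log_type and hoist_parser_options are hypothetical helpers, and the only behavior assumed is what the debug log shows, namely that log_type falls back to auto when it is not found at the job level.

```python
def effective_log_type(job: dict) -> str:
    """Mimic a collector that reads log_type only at the job level.

    Assumption: the default is "auto", as the debug log suggests.
    """
    return job.get("log_type", "auto")


def hoist_parser_options(job: dict) -> dict:
    """Return a copy of job with keys nested under 'parser' moved to the job level."""
    fixed = {k: v for k, v in job.items() if k != "parser"}
    nested = job.get("parser")
    if isinstance(nested, dict):
        for key, value in nested.items():
            # Keep an explicit job-level value if one already exists.
            fixed.setdefault(key, value)
    return fixed


# Job as written in the old documentation (nested under 'parser').
documented = {
    "name": "nginx",
    "parser": {"log_type": "regexp", "regexp_config": {"pattern": "pattern"}},
}
corrected = hoist_parser_options(documented)

print(effective_log_type(documented))  # auto
print(effective_log_type(corrected))   # regexp
```

The nested form is not rejected; the unknown parser key is simply carried along and ignored, which is why the job map in the debug output looks correct while auto-detection still runs.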
