File provider parsing errors should be limited to a single file #10890

Closed
2 tasks done
TheRealGramdalf opened this issue Jul 8, 2024 · 2 comments

TheRealGramdalf commented Jul 8, 2024

Welcome!

  • Yes, I've searched similar issues on GitHub and didn't find any.
  • Yes, I've searched similar issues on the Traefik community forum and didn't find any.

What did you expect to see?

Reopening of #8921, as part of my effort to write a better NixOS Module.
As mentioned in the previous issue, having at least one file which fails to parse causes all dynamic file configuration to be ignored.

This is an issue in part because Traefik currently provides no way to validate a configuration without applying it to a running daemon (see also #10889, #10804), but more importantly it is a matter of stability. A single error or typo can bring down the entire provider, which often includes the dashboard itself, cutting off access to valuable debugging utilities.
The infrastructure for failing middlewares/routers/services is already in place: if a router references a service that does not exist, the router is marked as failed on the dashboard, considered unhealthy, and ignored as a possible route. The directory provider should follow the same logic. If files need to be separated so that middlewares are defined (and thus fail) separately from routers, that should be up to the administrator.
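
To illustrate the kind of separation meant here, a minimal sketch of a directory-based file provider setup follows. The directory path, file names, and backend URL are hypothetical; only the option names (`providers.file.directory`, `providers.file.watch`, `ipAllowList`, `loadBalancer`) are taken from Traefik's configuration format.

```yaml
# Static configuration (e.g. traefik.yml): load every file in one directory.
providers:
  file:
    directory: /etc/traefik/dynamic   # hypothetical path
    watch: true
---
# /etc/traefik/dynamic/middlewares.yml (hypothetical file)
http:
  middlewares:
    local-only:
      ipAllowList:
        sourceRange:
          - "192.168.1.0/24"
---
# /etc/traefik/dynamic/whoami.yml (hypothetical file)
http:
  routers:
    whoami:
      rule: "Host(`localhost`)"
      service: whoami
      middlewares:
        - local-only
  services:
    whoami:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:8081"   # hypothetical backend
```

Under the behaviour requested in this issue, a parse failure in `middlewares.yml` would leave the `whoami` router marked as failed (its middleware is missing), while anything defined in other, valid files would keep working.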

Contributor

jspdown commented Jul 12, 2024

Hello @TheRealGramdalf and thanks for opening this issue.

> having at least one file which fails to parse causes all dynamic file configuration to be ignored

As stated in #8921 (comment) this is by design.
This is done to prevent Traefik from using an incomplete configuration, potentially disrupting a working setup.

Deep down, the issue lies in validating the configuration before it is applied. As you've noted, #10889 is the key. We've replied on that issue, and any help there is very welcome.

jspdown closed this as not planned on Jul 12, 2024
jspdown closed this as completed on Jul 15, 2024
Author

TheRealGramdalf commented Jul 18, 2024

> This is done to prevent Traefik from using an incomplete configuration, potentially disrupting a working setup.

The way this currently works, it actually does disrupt a working setup. If I add a new service and make a simple syntax mistake, it brings down every other service Traefik currently serves. I don't believe this is the intention behind the dynamic provider.

In fact, take the following example:

services:
  traefik:
    image: "traefik:v3.0.3"
    ports:
      - 80:80
      - 8080:8080
    command:
      # Log level: INFO|DEBUG|ERROR
      - --log.level=DEBUG
      - --api.dashboard=true
      - --api.insecure=true
      - --providers.docker.watch=true
    labels:
      traefik.http.routers.api.rule: "Host(`*`)"
      traefik.http.routers.api.service: api@internal
      traefik.http.routers.api.middlewares: "local-only"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
  #whoami:
  #  image: traefik/whoami
  #  labels:
  #    traefik.http.middlewares.local-only.ipallowlist.sourcerange: "192.168.1.0/24"
  whoami:
    image: traefik/whoami
    labels:
      traefik.http.services.whoami.loadbalancer.server.port: 80
      traefik.http.routers.whoami.service: whoami
      traefik.http.routers.whoami.rule: "Host(`localhost`)"

This is a minimal example with no syntax errors, but with a misconfigured middleware: the api router references local-only, which is never defined. The dashboard router is marked as failed (which can be seen by accessing the dashboard directly on :8080), but the whoami service continues to function.
All well and good; the dynamic directory provider behaves the same way in this case.
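
For reference, the api router would presumably stop being marked as failed once `local-only` is actually defined, for example by enabling the label that is commented out in the example. A minimal sketch, assuming the label is attached to the whoami container (the traefik service is unchanged and omitted):

```yaml
services:
  whoami:
    image: traefik/whoami
    labels:
      # Defines the middleware that the api router references.
      traefik.http.middlewares.local-only.ipallowlist.sourcerange: "192.168.1.0/24"
      traefik.http.services.whoami.loadbalancer.server.port: 80
      traefik.http.routers.whoami.service: whoami
      traefik.http.routers.whoami.rule: "Host(`localhost`)"
```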

Now, introduce a syntax error in the labels of the traefik service:

services:
  traefik:
    image: "traefik:v3.0.3"
    ports:
      - 80:80
      - 8080:8080
    command:
      # Log level: INFO|DEBUG|ERROR
      - --log.level=DEBUG
      - --api.dashboard=true
      - --api.insecure=true
      - --providers.docker.watch=true
    labels:
      traefik.http.routers.api.rule: "Host(`*`)"
      traefik.http.routers.api.service: api@internal
      traefik.http.routers.api.middlewar: "local-only" # <<< Syntax error here (middlewar, not middlewares)
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
  #whoami:
  #  image: traefik/whoami
  #  labels:
  #    traefik.http.middlewares.local-only.ipallowlist.sourcerange: "192.168.1.0/24"
  whoami:
    image: traefik/whoami
    labels:
      traefik.http.services.whoami.loadbalancer.server.port: 80
      traefik.http.routers.whoami.service: whoami
      traefik.http.routers.whoami.rule: "Host(`localhost`)"

According to the aforementioned design choices, the whoami service/router should now be unavailable. Instead, navigating to the dashboard reveals whoami to be functioning flawlessly, with the api service nowhere to be found.

This implies two main things:

  • Behaviour is not consistent across dynamic providers regarding syntax errors
  • Syntax errors are only reported in logs, and do not show up in the dashboard
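
To make the comparison concrete, the same typo in a file provider setup might look like the sketch below. The file names and directory are hypothetical, and the outcome described in the comments is the behaviour reported in this issue and in #8921, not something verified here.

```yaml
# /etc/traefik/dynamic/dashboard.yml (hypothetical file)
http:
  routers:
    api:
      rule: "Host(`traefik.localhost`)"
      service: api@internal
      middlewar:            # typo for "middlewares", as in the Docker example
        - local-only
---
# /etc/traefik/dynamic/whoami.yml (hypothetical file, itself valid)
http:
  routers:
    whoami:
      rule: "Host(`localhost`)"
      service: whoami
  services:
    whoami:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:8081"   # hypothetical backend
# Reported behaviour: the error in dashboard.yml causes whoami.yml to be
# ignored as well, whereas the Docker provider only drops the broken router.
```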

I don't think saying "It's not a bug, it's a feature" is really correct here: as demonstrated within Traefik itself, this behaviour is perfectly acceptable, and the logic clearly exists to some extent.
