Do not attempt to renew certificate no longer used #3376
Comments
I'm seeing this behavior on Traefik 1.7-rc2 as well |
Is there any known workaround for this? Edit: Removing the affected certificate from acme.json and restarting Traefik did the trick. |
I'm seeing this too. Given the ephemerality of the services that Traefik targets, it would make sense not to attempt to renew certificates that are not present on any service. It would be even better if those certificates were removed, perhaps after a grace period in case a service is just momentarily offline. |
Any news on this? I also ran into an issue where my Traefik had requested a huge number of certs for non-existent frontends. Being a Bash one-liner guy, I hacked some commands together to clean those old certs out of acme.json. I use the API endpoint of the dashboard for this. I leave this here for anyone, but please think before you do something. I only use the Kubernetes backend.
How I removed unused certs
I take no warranty for your copy-paste job! What worked for me may fail in your environment.
Note: This could be achieved in many ways. I did not choose the shortest, nor did I play code golf. This is meant to be at least a little bit human readable. I simply jumped directly into my Traefik container and did the following:
apk update && apk add jq curl
curl -s "https://<USERNAME>:<PASSWORD>@<TRAEFIK_DASHBOARD_URL>/api" | jq ".kubernetes.frontends" | jq "keys" | jq -r ".[]" | sed "s/\/.*//" | sort | uniq > existing_frontends
cat /acme/acme.json | jq ".Certificates" | jq ".[]" | jq ".Domain" | jq -r ".Main" | sort | uniq > existing_certs
diff existing_certs existing_frontends | grep "^-[^-]" | sed "s/^-//" > certs_to_remove
cp /acme/acme.json /acme/acme.json.new
cat certs_to_remove | xargs -i sh -c "jq 'del(.Certificates[]| select(.Domain.Main == \"{}\"))' /acme/acme.json.new > /acme/acme.json.new2; mv /acme/acme.json.new2 /acme/acme.json.new"
cat /acme/acme.json.new | jq ".Certificates[].Domain.Main"
chmod 600 /acme/acme.json.new
cp /acme/acme.json /acme/acme.json.bak
echo "remove next # - think before you do something"
#cp /acme/acme.json.new /acme/acme.json
After all that work, I deleted my Traefik pod to clean everything up and enjoyed a cup of coffee as a reward. I take no warranty for your copy-paste job! |
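The comparison step in the script above (certificate domains minus frontend domains) can also be done with comm instead of diff, which avoids depending on the diff output format. A minimal sketch, assuming both input files hold one domain per line; the function name unused_domains is hypothetical and not part of Traefik or the original script:

```shell
# Print the domains listed in the first file but absent from the second.
# comm requires sorted input, so both lists are sorted and de-duplicated first.
unused_domains() {
    certs_file=$1
    frontends_file=$2
    tmp_certs=$(mktemp)
    tmp_frontends=$(mktemp)
    LC_ALL=C sort -u "$certs_file" > "$tmp_certs"
    LC_ALL=C sort -u "$frontends_file" > "$tmp_frontends"
    # -2 drops lines unique to the second file, -3 drops lines common to both.
    LC_ALL=C comm -23 "$tmp_certs" "$tmp_frontends"
    rm -f "$tmp_certs" "$tmp_frontends"
}
```

Usage would look like `unused_domains existing_certs existing_frontends > certs_to_remove`. Reviewing certs_to_remove by hand before deleting anything is still advisable.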
This is especially annoying when the certificates are stored in a KV store (Consul in our case), which limits the size of the acme.json object. We spin up instances on demand and tear them down after a couple of days, but the certificates stay in the file and eventually prevent new certificates from being created. The only workaround for me is to stop Traefik, semi-manually remove the obsolete certificates, push the new file to the KV store, and start Traefik again. For a non-existent URL it does not make sense to renew the certificate, so it can be removed. Once the URL is used again, Traefik can request the certificate again. |
I can also confirm this. These are the logs from the traefik:v1.7.9 Docker container:
This is a script I wrote a while back, with some manual changes, to remove the old/bad/unused/migrated certs from the Consul acme.json: After this, a manual push of the cert is required. Ensure that you don't corrupt your acme.json while editing (you'll still have the original backup at FILE_ORIGINAL_BASE64). You need to run this on one of your Consul servers (script tested on Ubuntu 16.04). In some cases I had to reboot the entire setup to make the changes visible; I still haven't found a stable/easy/logical way of making the update.
|
For what it's worth, I created a Makefile that is tested with Traefik 2.2.x:
acmefile = acme.json
traefik_dashboard = <TRAEFIK_DASHBOARD_URL>
auth_user = <USERNAME>
auth_password = <PASSWORD>
.SILENT: clean
.PHONY: clean
clean:
	curl -s "https://$(auth_user):$(auth_password)@$(traefik_dashboard)/api/http/routers" | jq -r ".[]" | jq ".rule" | sed "s/\"Host(\`//g;s/\`)\"//g" | uniq > existing_frontends;
	cat $(acmefile) | jq ".default.Certificates[].domain.main" | sort | uniq | sed "s/\"//g" > existing_certs;
	awk 'NR==FNR{a[$$0];next}!($$0 in a)' existing_frontends existing_certs > certs_to_remove;
	cp $(acmefile) $(acmefile).new;
	cat certs_to_remove | xargs -I'{}' sh -c "jq 'del(.default.Certificates[]| select(.domain.main == \"{}\"))' $(acmefile).new > $(acmefile).new2; mv $(acmefile).new2 $(acmefile).new";
	chmod 600 $(acmefile).new;
	chown traefik:docker $(acmefile).new;
	mv $(acmefile) $(acmefile).bak;
	mv $(acmefile).new $(acmefile);
	rm certs_to_remove existing_certs existing_frontends;
	docker-compose restart;
|
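The sed expression in the Makefile above only yields a clean hostname when a router rule is exactly Host(`name`). A sketch of a more tolerant extraction, assuming rules may combine Host(...) with other matchers or list several hosts in one Host(...) call (as Traefik 2.x rule syntax allows); extract_hosts is a hypothetical helper name, and it reads one rule per line on stdin, for example from jq -r ".[].rule":

```shell
# Read router rules on stdin, one per line, and print every hostname that
# appears inside a Host(...) matcher, sorted and de-duplicated.
# Other matchers such as PathPrefix(...) or HostRegexp(...) are ignored.
extract_hosts() {
    grep -o 'Host(`[^)]*)' \
        | grep -o '`[^`]*`' \
        | tr -d '`' \
        | LC_ALL=C sort -u
}
```

The first grep isolates each Host(...) call, the second pulls out the backtick-quoted names, and tr strips the backticks. HostRegexp patterns are skipped on purpose, since they do not map to a single certificate domain.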
Just a side-note: There was a related ticket, where there is an API call mentioned that should be implemented: #7082 |
I am also worried about hitting the rate limit for URLs that may have been pointed to other addresses. It has been ~5 months since the last comment; have any of you found a better solution? (Or is the solution still to run one of the above scripts?) |
@MarkErik Basically it's still one of those scripts. I ran Traefik in k8s and started migrating to ingress-nginx and cert-manager: that scales better (multi-pod with certs) and has better cert management. |
@ldez is anyone on this issue?
@derjohn Better cert management? Would you mind elaborating? And how was the general setup of both nginx and cert-manager compared to Traefik 2.x? Edit: Just found that e.g. https://voyagermesh.com/docs/v2021.04.24-rc.0/guides/certificate/delete/ has a cert delete feature; this would also be handy for Traefik. |
@205g0 well you just need to read up on / try the cert-manager. Certs are not buried in a json file that can be in any of the supported PVs somewhere, and instead are exposed as first class Kubernetes Secret objects, there are traceable Order / Challenge objects for acme (with lifecycle events) and crs/cert for all certs. Also there is support for Vault, self-signed and external CAs that's absent in traefik. All that amounts to better cert-management. |
@AndrewSav thanks! Yesterday, I had checked cert-manager but somehow I was afraid or just too lazy to set it up. Which is the best ingress to pair it with? The Kubernetes nginx or the one from Nginx Inc. or an entirely different ingress? Or just with Traefik? |
@205g0 cert-manager works well with all of them. I'm using Traefik as ingress in my Kubernetes clusters, and it's working well for me. |
Can definitely recommend the cert-manager and acme-dns way. Using this to get wildcard Let's Encrypt certs refreshed works nicely for me. I ditched Traefik for Ambassador a while ago and never looked back. |
@MaxWinterstein cert-manager is definitely the way to go, even when paired with Traefik; it is better than Traefik's built-in resolver. Ambassador is on my list. I checked Contour and Gloo first because people compared them to Ambassador all the time. I got Contour running; it has nice docs, but it does not seem to be very feature-rich, though it has somewhat of a community. Gloo looks best on paper, but I couldn't get it to run. So you're happy with Ambassador? Any drawbacks? |
I filed bug #9162. It was closed as a duplicate of this one. While the issue is similar, I do not feel it's a duplicate: this issue relates to a whole certificate that is no longer used, whereas mine concerned a SAN that was removed from a certificate that is still in use. |
Do you want to request a feature or report a bug?
Bug
What did you do?
I booted one service on one server, behind Traefik, configured with the Docker backend and correctly labeled.
Everything worked fine:
But, after some time, I moved the service to another equally-configured server.
What did you expect to see?
Traefik on the old server stops renewing the cert if no container is actively using it.
What did you see instead?
These logs:
Output of traefik version: (What version of Traefik are you using?)
What is your environment & configuration (arguments, toml, provider, platform, ...)?
Docker backend, CLI flags configuration... but I don't think it is relevant for this issue, where everything is working as expected.