-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
socket.gaierror
is raised if the host is not known
#130
Comments
Wow this is some... old code. This was originally added in a completely different place in #6, and has moved along for the last 8 years. I'd go so far as to say this is roughly deliberate but certainly open to re-evaluation, especially if there's a way to make it backwards compatible. It's also vaguely related to #125, since that would move the logic for when DNS resolution happens. I don't have a super strong opinion here, other than "we don't want to do DNS lookups on every metric". |
Ok; I had a few ideas for how this could be done, hopefully at some point I'll be able to open a PR. I'll note down a few ideas here in case anyone else wants to try:
|
* Allow the start node to not be an answer node * Upgrade requirements
Since f45589b, the application is instrumented and sends metrics to the Wikimedia Cloud Services statsd endpoint. It was initially sent to cloudmetrics1002, then switched to cloudmetrics1001 (6de2308). Since a few weeks, logs report issues reaching that endpoint [1]. This started on 2022-11-20 according to the otherwise fine Shell logs. (Not sure exactly when it started for the Python webservice, as the uwsgi logfile ballooned to 31 million lines, making precise investigation a tad difficult). It turns out cloudmetrics1002 (as well as cloudmetrics1001) were decommissioned on November 16th (see T297444). It also turns out the pystatsd library plain crashes if the statsd host is unreachable [2]. This replaces the host with the new cloudmetrics1003. Bug: T325936 [1] Python traceback: ``` Traceback (most recent call last): File "/data/project/integraality/www/python/src/app.py", line 10, in <module> from pages_processor import PagesProcessor, ProcessingException File "./pages_processor.py", line 16, in <module> from property_statistics import PropertyStatistics, QueryException File "./property_statistics.py", line 16, in <module> from statsd.defaults.env import statsd File "/data/project/integraality/www/python/venv/lib/python3.7/site-packages/statsd/defaults/env.py", line 17, in <module> maxudpsize=maxudpsize, ipv6=ipv6) File "/data/project/integraality/www/python/venv/lib/python3.7/site-packages/statsd/client/udp.py", line 35, in __init__ host, port, fam, socket.SOCK_DGRAM)[0] File "/usr/lib/python3.7/socket.py", line 748, in getaddrinfo for res in _socket.getaddrinfo(host, port, family, type, proto, flags): ``` [2] jsocol/pystatsd#130
Curious to know if this is intended behaviour or not.
Example:
I'm wondering if it would be better to catch-pass this error? On the one hand, it's more in-line with the philosophy that errors with stats should not bring down the application, but on the other, I could imagine wanting to be sure that my application can log anything at all.
Perhaps a middle ground would be some kind of
strict=True/False
argument, which toggles between the behaviours?The text was updated successfully, but these errors were encountered: