Statsd Cluster Proxy #284
Conversation
```js
  , net = require('net')
  , events = require('events')
  , hashring = require('hashring')
  , config = require('./proxyConfig');
```
would it make sense to use the config module here? (and adapt it where we need to)
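For illustration, a minimal sketch of what reusing the existing config module could look like; the `configFile` helper mirrors how stats.js loads its config, and the option names (`nodes`, `udp_version`) are assumptions here, not part of this diff:

```js
// Sketch only: load the proxy settings through the shared config module instead
// of a separate proxyConfig file. configFile() is the helper stats.js uses; the
// path and option names (nodes, udp_version) are assumptions.
var configlib = require('./lib/config');

configlib.configFile(process.argv[2], function (config) {
  var nodes = config.nodes || [];                  // backend statsd instances
  var udpVersion = config.udp_version || 'udp4';   // socket family to listen on
  // ...build the hashring from `nodes` and bind the UDP listener here...
});
```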
I think the functions (regular and anonymous in the event listener) could use better/more formal documentation. But otherwise 🚢
Switch to using the health status
@draco2003 do we have any concerns about merging this at this point? Do we need more testing?
The test suite fails on 2 tests, at least on my box, with the cluster_proxy branch; the suite passes on the master branch.
- added a logger like stats.js
- converted call to `util.log()` to use logger `l.log()` (would throw an exception otherwise since util was not imported)
- if no backends are available, the app would exit with an error calling `split()` on undefined `statsd_host` returned from `ring.get()`. Instead, catch this and report an error in the logs, but do not exit.
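A rough sketch of the `ring.get()` guard described in the last bullet (the function name and return shape are illustrative, not verbatim from the patch; the real code logs through `l.log()` rather than `console.log`):

```js
// Illustrative guard: ring.get() returns undefined when no backends are in the
// ring, so log and drop the metric instead of crashing on split().
var HashRing = require('hashring');

var ring = new HashRing([]);            // empty ring simulates "no healthy backends"

function backendFor(key) {
  var statsd_host = ring.get(key);      // undefined when nothing is in the ring
  if (statsd_host === undefined) {
    console.log('no backend statsd servers available for key: ' + key);
    return null;                        // report in the logs, but do not exit
  }
  var parts = statsd_host.split(':');
  return { host: parts[0], port: Number(parts[1]) };
}

console.log(backendFor('my.metric'));   // -> null while the ring is empty
```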
I sent a PR against this branch to fix a few things I noticed in testing. #319
Add logging to notify when a backend node is brought into the ring. Also there is a slight change in logic here, as previously `ring.add(node)` would be called on every healthcheck (every 1s). This was probably harmless since it appears to be idempotent, but I don't believe it's necessary and it would make the logging of node additions excessive.
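A sketch of that logic change, assuming a `healthStatus` map and an `onHealthCheck` callback (the names and the console-backed logger are illustrative, not necessarily what the commit uses):

```js
// Illustrative sketch: only add a backend to the hashring (and log it) when it
// transitions to healthy, rather than calling ring.add() on every 1s check.
var HashRing = require('hashring');

var ring = new HashRing([]);
var l = { log: console.log };            // stand-in for the stats.js-style logger
var healthStatus = {};                   // 'host:port' -> 'up' | 'down'

function onHealthCheck(node, isHealthy) {
  var server = node.host + ':' + node.port;

  if (isHealthy && healthStatus[server] !== 'up') {
    healthStatus[server] = 'up';
    l.log('adding backend to the hashring: ' + server);
    ring.add(server);                    // idempotent, but now only called once
  } else if (!isHealthy && healthStatus[server] === 'up') {
    healthStatus[server] = 'down';
    l.log('removing backend from the hashring: ' + server);
    ring.remove(server);
  }
}
```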
add additional logging
I"m assuming it"s a good idea to use |
In the current iteration it does require `deleteIdleStats: true`, and it makes sense to put that in the docs.
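For the docs, the backend-side setting would look something like this in each statsd instance's config file (a sketch; only the `deleteIdleStats` key matters here, the rest are the usual defaults):

```js
// Sketch of the relevant setting in each backend statsd instance's config
// (other settings omitted; port numbers are the statsd defaults).
{
  port: 8125
, mgmt_port: 8126
, deleteIdleStats: true   // lets stats drop off a node once the proxy's
                          // hashring stops routing that metric to it
, backends: [ "./backends/graphite" ]
}
```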
@draco2003 thanks for the heads-up. What is the reason why you can't use …? Is your branch w/ the admin flush command in a shareable state yet? I'd like to test.
@draco2003 I didn"t notice that the statsd mgmt port already had some support for resetting stats, although it doesn"t support all stats yet. It does support the ones we commonly use. Currently testing this new patch: joemiller@456860b |
Not shareable at this point, but your patch looks good. The only issue I see is that you are doing a single write, which the admin interface will only see as one command and won't reset the other two stat types, I believe. You might have to send three separate writes.
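A sketch of that suggestion; the host, port, and metric name are placeholders, and it assumes the management port's `delcounters`/`deltimers`/`delgauges` commands, which take the names of the stats to remove:

```js
// Sketch: send each delete as its own connection/write so the admin interface
// sees three separate commands instead of one blob. Host, port, and the metric
// name are placeholders; the del* commands take the names of the stats to drop.
var net = require('net');

function sendAdminCommand(host, port, cmd, done) {
  var conn = net.connect(port, host, function () {
    conn.end(cmd + '\n');                 // one command per connection
  });
  conn.on('error', function (err) {
    console.log('admin command "' + cmd + '" failed: ' + err.message);
  });
  conn.on('close', function () { done(); });
}

// Reset the three stat types on one backend, one write per stat type.
sendAdminCommand('127.0.0.1', 8126, 'delcounters some.metric', function () {
  sendAdminCommand('127.0.0.1', 8126, 'deltimers some.metric', function () {
    sendAdminCommand('127.0.0.1', 8126, 'delgauges some.metric', function () {});
  });
});
```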
Yeah, I noticed that as well after some quick testing. Updated: joemiller/statsd@etsy:cluster_proxy...cluster_proxy
Does it catch the error?
thanks for the heads up. fixed
FYI, it does not appear IPv6 is working. It looked like it was (no errors, things are running, metrics are flowing, etc.), but upon closer inspection metrics were only being sent to the statsd instance on the same node as the statsd-proxy instance. Switching back to …
Also, to aid in anyone"s testing, here"s a quick one-liner I have been using to ensure that no more than one instance of statsd owns any particular metric:
There should not be any metrics listed with a count > 1.
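A rough Node stand-in for that kind of check (not the original one-liner; the backend addresses are placeholders and the parsing of the management port's `counters` output is a loose assumption):

```js
// Rough stand-in for the duplicate-ownership check (not the original one-liner).
// It asks each backend's management port for its counters and flags any metric
// reported by more than one node. Adjust the parsing to your statsd version.
var net = require('net');

var backends = [
  { host: 'statsd1.example.com', port: 8126 },
  { host: 'statsd2.example.com', port: 8126 }
];

var owners = {};                          // metric name -> number of owning nodes
var pending = backends.length;

backends.forEach(function (b) {
  var buf = '';
  var conn = net.connect(b.port, b.host, function () {
    conn.write('counters\n');
  });
  conn.on('data', function (chunk) {
    buf += chunk;
    if (buf.indexOf('END') !== -1) { conn.end(); }
  });
  conn.on('close', function () {
    var m, re = /'([^']+)'\s*:/g;         // quoted keys in the inspected object
    while ((m = re.exec(buf)) !== null) {
      var name = m[1];
      if (name.indexOf('statsd.') === 0) { continue; }  // internal counters live on every node
      owners[name] = (owners[name] || 0) + 1;
    }
    if (--pending === 0) {
      Object.keys(owners).forEach(function (name) {
        if (owners[name] > 1) {
          console.log('DUPLICATE: ' + name + ' owned by ' + owners[name] + ' nodes');
        }
      });
    }
  });
});
```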
Been using this code for a few weeks now and it is working well. With a 4-node cluster, here is the distribution and rate of metrics coming in via statsd-proxy (the stats are actually collected via statsd itself, not statsd-proxy, since the proxy does not expose its internals as metrics the way statsd does):

The patch I proposed above to delete stats upon cluster membership change was not working as expected, perhaps due to a race condition, so I went back to setting `deleteIdleStats: true`.
@joemiller are those 1-minute buckets?
Yes.
@joemiller and I"m assuming the load was spread over two boxes? |
4 boxes. They are 8GB Rackspace Cloud VMs.
@joemiller last question I promise! What did the box health stats look like? CPU, ram, oops?
Oops -> iops
Thanks for sharing @joemiller, this is very interesting. @draco2003 is there anything keeping us from merging this into master?
At this point, no. I think we should merge it in prior to rolling the next release.
Looking forward to the full release! We're still using this in production with good success. I hope to do a write-up.
Awesome! @draco2003 can you merge it then? @joemiller looking forward to that write up :)
Sounds like a plan!
Merging in the Statsd Cluster Proxy
@joemiller Mind sending over a pull request with your latest changes to the proxy? I'll get those merged in as well.
Sure. I sent over PR #348. However, my attempt at resetting stats on ring-membership change was not working as expected.
Initial P.O.C. of a statsd cluster proxy
Was initially built at Monitorama Hackathon
monitorama/hackathon#56