-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deduplicate all netdata strings #13570
Conversation
Hello @ktsaou, Your PR is not in Best regards! |
@thiagoftsm I am done I think. Let's test it and merge it, so that I can continue with other changes. |
Thank you for the update, I will start testing it right now. 🤝 |
Hello @ktsaou , Except for the already mentioned warnings from Codacy, I ran the following successful tests:
After you take decision about Codacy, we can merge the PR. |
Thank you @thiagoftsm ! I decided to proceed a little more. There are a few changes I need to do, that require this work to be merged. So, I will push more commits and then we will review it again. I am doing thousands of little changes. If you or anyone else can actually review the changes, it would be great! Most of the changes are simple search-replace, but in a few cases, especially conditions, some changes were required. So, if you can review the code, please do. |
Hello @ktsaou , While I was testing your code, I also reviewed it, right now I am missing only the last commit. I will review it and test the modified codes. 🤝 |
@ktsaou on the same 2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0032@streaming/compressio:lz4_compressor_): STREAM_COMPRESSION: Compressor Reset
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0561@streaming/sender.c :rrdpush_sender_): STREAM potinho [send to 192.168.1.6:19999]: established communication with a parent using protocol version 5
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0753@streaming/sender.c :rrdpush_queue_i): STREAM potinho [send to 192.168.1.6:19999]: sending metric definitions...
2022-09-02 16:52:03: netdata FATAL : STREAM_SENDER[potinho] : (0012@database/rrdset.c :__rrdset_check_): RRDSET 'system.idlejitter' should be read-locked, but it is not, at function need_to_send_chart_definition() at line 178 of file 'streaming/rrdpush.c' # : Success
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0031@daemon/main.c :netdata_cleanup): EXIT: netdata prepares to exit with code 1...
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0042@daemon/main.c :netdata_cleanup): EXIT: cleaning up the database...
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (1504@database/rrdhost.c :rrdhost_cleanup): Cleaning up database [1 hosts(s)]...
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (1459@database/rrdhost.c :rrdhost_cleanup): Cleaning up database of host 'potinho'...
2022-09-02 16:52:03: netdata ERROR : STREAM_SENDER[potinho] : (0223@libnetdata/locks/loc:__netdata_rwloc): RW_LOCK: failed to obtain read lock (code 35)
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0140@database/sqlite/sqli:sql_close_conte): Closing context SQLite database
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0564@database/sqlite/sqli:sql_close_datab): Closing SQLite database
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0144@database/sqlite/sqli:add_stmt_to_lis): No statements pending to finalize
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0072@daemon/main.c :netdata_cleanup): EXIT: removing netdata PID file '/run/netdata/netdata.pid'...
2022-09-02 16:52:03: netdata INFO : STREAM_SENDER[potinho] : (0080@daemon/main.c :netdata_cleanup): EXIT: all done - netdata is now exiting - bye bye...
EOF found in spawn pipe.
Shutting down spawn server event loop. When I disable streaming I can run |
Thank you @thiagoftsm ! Good find! I updated |
@thiagoftsm can you please check again if the previous crash of |
@ktsaou a quick update:
|
4bbce60
to
d618a40
Compare
@ktsaou the issues were fixed, I will rebase the branches and start another round of test. |
@stelfrag and @ktsaou, yesterday I ran another round of tests after commits, and I observed system stable and running fine on |
…exes from rrdhost; and many more fixes
… lockless when they dont need to touch Judy - only Judy is protected with a read/write lock
…ed; string_length() renamed to string_strlen() to follow the paradigm of all other functions, STRING internal statistics are now only compiled with NETDATA_INTERNAL_CHECKS
Guys, I don't have any open issues about this PR. Please review it. If no problem is found I want it merged today. |
All right, it will be my high priority for today/tomorrow (September 5th). 🤝 |
…sts do not index hosts
Considering I am going to do different tests during the whole day, I will use this comment and I will be updating it until the tests end. Compilation on different platformsThis PR was tested with success on:
StreamingThe streaming worked as expected on the following environment:
I also tested in the scenario where
I am rebasing branch now with latest @stelfrag commit before to proceed. After the PR to be merged, I began to use master branch for next tests: API accessI used our script to access thousands of URLs and everything worked as expected. HealthAll alerts present in this link were raised as expected. ExporterThe exporting engine is working as expected, I used |
It is merged guys! Thank you! |
@ktsaou and @stelfrag I am still testing this branch, I won't approve it, but I will update the comment #13570 (comment). |
This PR replaces all
char *
in RRD structures withSTRING *
.Once this is done in all structures and is merged, it is expected to have a big impact on the memory footprint on busy servers, since
STRING *
deduplicate the strings. So, all these contexts, dimensions, variables, etc that have the same name across dimensions, charts, nodes, alarms, etc will use the memory only once.Most of the changes are trivial (mostly a search/replace).
Also, this PR replaces AVL trees with DICTIONARIES for the following:
This change was required, because with the STRINGs we now don't need hashes to compare them, but in order to lookup a value in the AVL tree, we had to create a STRING, which resulted in thousands of string lookup per second on the string dictionary, which made things slower. By replacing AVL with DICTIONARIES, we can now enjoy all the benefits of string deduplication, without any performance penalties.
The following indexes were not useful and have been removed:
This was only needed during loading from disk on startup. It was replaced with a temporary dictionary.
This was only needed to find if an alarm entry of an rrdcalc is repeating. It is now turned to a flag of the alarm entries for the index is not needed.
Although linked lists should be replaced by dictionaries, which also support reference counting, this still needs some work. Still all the code base relies on linked lists.
Until we eliminate them, most of the linked lists are now double linked, to provide constant insertion and deletion times:
For the double linked list management, there are 5 very useful macros in libnedata.h, so this is an attempt to standardize them a bit.
The value for users is important:
With the current implementation, on a single node, strings are allocated 10 times on the average (6k entries, referenced 60k times). I expect the final to be around 20 times. The impact on busy parent agents will be really great.
We also saved about 200 bytes per chart, because now the
RRDSET.id
is not preallocated.On busy parents we observed several GBs in memory reduction of the agent.
Other fixes