total_items seems have a weird behavior #1061

Open
YueTang-Vanessa opened this issue Jul 27, 2023 · 11 comments

Comments

@YueTang-Vanessa

Hello, while testing memcached I found that total_items behaves oddly. According to protocol.txt, this stat is the "Total number of items stored since the server started." That suggests it should count every item ever created and stored in the server.

When I ran the following test on memcached 1.6.21, I found that prepend, append, and sometimes incr increase this stat (incr does so when it changes the item's original size). So I think total_items changes whenever an item's size is updated.

# prepend, append
version
VERSION 1.6.21
set test 0 0 5
hello
STORED
prepend test 0 0 3
hi!
STORED
append test 0 0 2
hi
STORED
stats
......
STAT curr_items 1
STAT total_items 3
......
set test2 0 0 1
5
STORED
incr test2 10
15
......
STAT curr_items 2
STAT total_items 5
......
incr test2 50
65
......
STAT curr_items 2
STAT total_items 5
decr test2 100
0
......
STAT curr_items 2
STAT total_items 5

But when I set an existing key with the same value, flags, and expiration time, total_items still increases even though no extra space needs to be allocated. And since touch and gat don't increase this stat, a TTL change doesn't appear to be what increments this metric either.

set test 0 0 1
5
STORED
......
STAT curr_items 2
STAT total_items 6
......
touch test 10000
TOUCHED
......
STAT curr_items 2
STAT total_items 6
......
gat 0 test
VALUE test 0 1
5
END
stats
......
STAT curr_items 2
STAT total_items 6
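
When repeating an experiment like the one above, it can help to turn each stats dump into a dictionary so that snapshots can be compared programmatically. The following is a minimal sketch; `parse_stats` is a hypothetical helper written for this illustration, not part of memcached or any client library, and it only assumes the `STAT <name> <value>` line format shown in the session.

```python
def parse_stats(text):
    """Collect 'STAT <name> <value>' lines into a dict; other lines are ignored."""
    stats = {}
    for line in text.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            name, value = parts[1], parts[2]
            # Keep numeric counters as ints, everything else as a string.
            stats[name] = int(value) if value.isdigit() else value
    return stats


# Sample taken from the last stats dump in the session above.
sample = """......
STAT curr_items 2
STAT total_items 6
......"""

snapshot = parse_stats(sample)
print(snapshot["curr_items"], snapshot["total_items"])
```

Taking one snapshot before and one after a command makes it easy to see which counters that command moved.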

Could you help explain how total_items is expected to increase?

Thanks

@dormando
Member

Hi.

total_items is increased every time memcached internally allocates memory. Memcached largely operates as "Copy on write" for updating data. So most operations allocate new data and replace an existing item with it, but some don't.

set: always allocates a new item, overwrites an existing item.
append/prepend: always allocates a new item, overwrites the existing item. internally copies the old item into the new item and appends the new data.
incr/decr: in most cases it will rewrite the data in place. if an item is very busy it cannot do that safely and will allocate and replace the item. the same happens if the length of the number no longer fits in the original slab class.

touch/gat: these commands only update item metadata while held under a lock, so they do not cause the item to be re-created.

under most workloads total_items ends up tracking the set count closely.
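
The rules above can be sketched as a toy model and replayed against the session from the original report. This is an illustration only, not memcached's code: `ToyCache` and its methods are hypothetical names, the "very busy item" case for incr/decr is left out, and the model assumes as a simplification that incr/decr rewrites in place whenever the new number is no longer than the stored data (padding with spaces), which is enough to match the numbers in this issue.

```python
class ToyCache:
    """Toy model of the counters only; not memcached's actual implementation."""

    def __init__(self):
        self.items = {}       # key -> stored bytes
        self.total_items = 0  # bumped each time a freshly allocated item is stored

    @property
    def curr_items(self):
        return len(self.items)

    def _store_new(self, key, data):
        # Copy-on-write path: a new item replaces whatever was there before.
        self.items[key] = data
        self.total_items += 1

    def set(self, key, data):
        self._store_new(key, data)

    def append(self, key, data):
        self._store_new(key, self.items[key] + data)

    def prepend(self, key, data):
        self._store_new(key, data + self.items[key])

    def _delta(self, key, delta):
        old = self.items[key]
        value = max(0, int(old.decode().strip()) + delta)  # decr floors at 0
        new = str(value).encode()
        if len(new) <= len(old):
            # Assumed in-place path: overwrite and pad with spaces, no new item.
            self.items[key] = new.ljust(len(old))
        else:
            self._store_new(key, new)
        return value

    def incr(self, key, amount):
        return self._delta(key, amount)

    def decr(self, key, amount):
        return self._delta(key, -amount)

    def touch(self, key, exptime):
        pass  # metadata-only update; the item is not re-created


cache = ToyCache()
cache.set("test", b"hello")
cache.prepend("test", b"hi!")
cache.append("test", b"hi")
print(cache.curr_items, cache.total_items)
cache.set("test2", b"5")
cache.incr("test2", 10)   # "5" -> "15" no longer fits, so a new item is stored
print(cache.curr_items, cache.total_items)
cache.incr("test2", 50)   # "15" -> "65" fits in place
cache.decr("test2", 100)  # floors at 0 and fits in place
print(cache.curr_items, cache.total_items)
cache.set("test", b"5")
cache.touch("test", 10000)
print(cache.curr_items, cache.total_items)
```

Under these assumptions the replay reproduces the curr_items/total_items pairs reported in the issue: 1/3, 2/5, 2/5, and 2/6.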

@QuChen88
Contributor

@dormando Does the user need to know whether memcached internally allocated memory or not in a write command? Wouldn't it be more straight-forward for the user if you simply define total_items to be the total number of new items you create in the cache so that updates made to an existing item wouldn't increment this counter?

@dormando
Member

It's been like that since long before I was involved in the project.

Issuing a set against memcached replaces an item in the cache, so semantically it should count as a new item. In practice total_items is the same as the number of set/append/prepend calls. It's a misunderstanding to think that it's about expanding item size. incr is the only odd one because it has an optimization, so it usually doesn't count as a new item.

@QuChen88
Contributor

As you mentioned, something about this metric doesn't entirely make sense, namely the part about item size expanding in incr operations. I actually think the original implementation, which tracks memory allocation, is not entirely correct either. But we can probably leave it as it is for now, since this has been the legacy behavior for a long time.

@dormando
Member

@QuChen88 what information were you hoping to get from the counter?

It's still useful from a developer perspective since it helps me understand what's happening under the hood of a user instance. As far as users are concerned I'm not sure what the potential usage could ever be. I'd hide the value from a standard dashboard.

I can see "how many sets weren't in the cache before" but not exactly sure what you would do with that information, and "total_items - cmd_set" gets pretty close to that regardless.

@QuChen88
Contributor

The name implies to me that it is about how many keys I stored in memcached as a user. A user is typically less concerned about how the memory allocation scheme works under the hood.

@dormando
Member

Okay so you're just looking through all of the counters and writing internal documentation maybe?

I do the best I can by documenting every counter in doc/protocol.txt - I hope you've seen it. I know a lot of people who forked it have renamed the counters, and people who write graphs/monitoring connectors rename the counters... but all existing monitoring systems are based on these names. I don't have a lot of options for cleaning up counter names since many of these systems aren't maintained and people don't read release notes.

If the doc/protocol.txt explanations can be improved please ask or PR.

@QuChen88
Contributor

QuChen88 commented Jul 29, 2023

Yes we were going over the list of documented stats counters and found this particular one to be a bit odd. Thanks for the clarification.

@dormando
Member

I agree the documentation for that stat looks out of date. Please leave this issue open until it's updated.

@dormando
Member

For fun I did a git dive.

total_items has had that documentation line since July 2003, when the file was first created. The code behavior has been the same since the first commit in May 2003. It's always meant "the number of times an item was linked into the hash table". That was before incr/decr/append/prepend/etc were all added, so back then it pretty much meant the (successful?) set count.
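
The "linked into the hash table" definition can be pictured with a small model of the two counters. This is a sketch under assumptions rather than memcached's source: `HashTableCounters` and its methods are hypothetical, and the only behavior modeled is that linking bumps both counters, unlinking lowers only curr_items, and replacing an item is an unlink followed by a link.

```python
class HashTableCounters:
    """Toy model: curr_items tracks live entries, total_items counts every link."""

    def __init__(self):
        self.table = {}
        self.curr_items = 0
        self.total_items = 0

    def link(self, key, item):
        self.table[key] = item
        self.curr_items += 1
        self.total_items += 1  # never decremented afterwards

    def unlink(self, key):
        if key in self.table:
            del self.table[key]
            self.curr_items -= 1

    def replace(self, key, item):
        # Swapping in a new item for an existing key still counts as a link.
        self.unlink(key)
        self.link(key, item)


counters = HashTableCounters()
counters.link("test", "hello")
counters.replace("test", "hi!hello")
counters.replace("test", "hi!hellohi")
print(counters.curr_items, counters.total_items)
```

In this model the three links for a single key leave curr_items at 1 and total_items at 3, the same pattern as the set/prepend/append sequence in the original report.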

interesting.

@dormando
Member

just took another look at this. Not super sure what to do about it.

Leaning towards just removing the counter entirely. As I noted above it lost its meaning after incr/decr/append/prepend/etc were added. There are counters for all of the various commands and counters for the current number of items linked into the hash table.

I could add the bump instead to do_store_item only if it wasn't replacing an existing item... but then what is this telling you exactly? The information seems useless. It doesn't tell me as a developer anything useful with user reports since it doesn't represent any useful metric (total_items is currently at least somewhat useful as a developer). As a user it would tell you new items instead of replaced items... but replaced items are also technically new items? How are they different from expired items that got re-set later?

Seems like a semantic minefield so probably removal is the best idea.
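
To make the ambiguity concrete, here is a hypothetical sketch of the alternative floated above, a counter bumped only when a set is not replacing a live item. Nothing here is memcached code: `NewItemCounter`, `new_items`, and the explicit `now` clock are invented for the illustration, and expiry is modeled as a simple absolute timestamp.

```python
class NewItemCounter:
    """Toy model of a 'count only non-replacing sets' counter."""

    def __init__(self):
        self.expiry = {}    # key -> absolute expiry time, or None for no expiry
        self.new_items = 0

    def _is_live(self, key, now):
        if key not in self.expiry:
            return False
        exp = self.expiry[key]
        return exp is None or exp > now

    def set(self, key, now, ttl=None):
        if not self._is_live(key, now):
            self.new_items += 1  # nothing live was replaced
        self.expiry[key] = None if ttl is None else now + ttl


counter = NewItemCounter()
counter.set("a", now=0, ttl=10)   # key absent: counted
counter.set("a", now=5, ttl=10)   # replaces a live item: not counted
counter.set("a", now=20, ttl=10)  # previous item expired at 15: counted again
print(counter.new_items)
```

The same client action, a set on key "a", is counted or not depending only on whether the earlier copy happened to expire first, which is the distinction the comment above questions.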


3 participants