-
Notifications
You must be signed in to change notification settings - Fork 1.1k
bug(rdb save): snapshot: on push data to channel check serializer len #532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: adi_holden <adi@dragonflydb.io>
@dranikpg Now the sink in snapshot is just a temporary sink before pushing to channel, serializer holds all the data till calling FlushDefaultBuffer, therefore we compare the default_serializer_->SerializedLen() . |
src/server/snapshot.cc
Outdated
if (auto comp = zstd_serializer_->Compress(payload); comp) { | ||
if (auto comp = zstd_serializer_->Compress(payload)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this on purpose, see #508
Not sure what side you will join 🙂
In this case, with declaring a new variable, it probably doesn't matter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I will undo this change
|
||
// TODO : drop default_buffer from this class, we dont realy need it. | ||
std::unique_ptr<io::StringFile> default_buffer_; // filled by default_serializer_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, because currently we always move out of the default buffer
However, in an optimized version, we shouldn't do this and should keep the default buffer.
With it, we can make use of the following:
- If we do compression, then we already copy the compressed part into a new string, so there is no reason the steal the default buffer and then drop it
- We just use it do to copy over data from the rdb_serializer, so why should we allocate a new buffer each time for this if we end up with case 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, instead of using FlushToSink at all, we should be able to get the io::Bytes/string_view from the serializer directly, so that we can do compression without an intermediate copy
But I guess that's another topic, off from the current discussion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to move the compression under RdbSerilaizer, so the flow will change, you can see the PR created already.
I agree on the the optimization for not allocating new buffer for each copy, I believe the flow will change more till I get to this TODO but in the current flow I believe we can call it tmp_buffer_ and use it in the temporary Serializer as will, right?
Anyway I am not going to change this now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I just looked at this first, the other PR solves this issue
bool SliceSnapshot::FlushDefaultBuffer(bool force) { | ||
if (!force && default_buffer_->val.size() < 4096) | ||
if (!force && default_serializer_->SerializedLen() < 4096) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻
Signed-off-by: adi_holden <adi@dragonflydb.io>
No description provided.