Skip to content

bug(rdb save): snapshot: on push data to channel check serializer len #532

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 5, 2022

Conversation

adiholden
Copy link
Contributor

No description provided.

Signed-off-by: adi_holden <adi@dragonflydb.io>
@adiholden adiholden requested a review from dranikpg December 5, 2022 10:04
@adiholden
Copy link
Contributor Author

@dranikpg Now the sink in snapshot is just a temporary sink before pushing to channel, serializer holds all the data till calling FlushDefaultBuffer, therefore we compare the default_serializer_->SerializedLen() .
This was a change in my PR, which I got lost when you fetched my changes.

if (auto comp = zstd_serializer_->Compress(payload); comp) {
if (auto comp = zstd_serializer_->Compress(payload)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this on purpose, see #508

Not sure what side you will join 🙂

In this case, with declaring a new variable, it probably doesn't matter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I will undo this change

Comment on lines +147 to 149

// TODO : drop default_buffer from this class, we dont realy need it.
std::unique_ptr<io::StringFile> default_buffer_; // filled by default_serializer_
Copy link
Contributor

@dranikpg dranikpg Dec 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, because currently we always move out of the default buffer

However, in an optimized version, we shouldn't do this and should keep the default buffer.

With it, we can make use of the following:

  • If we do compression, then we already copy the compressed part into a new string, so there is no reason the steal the default buffer and then drop it
  • We just use it do to copy over data from the rdb_serializer, so why should we allocate a new buffer each time for this if we end up with case 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, instead of using FlushToSink at all, we should be able to get the io::Bytes/string_view from the serializer directly, so that we can do compression without an intermediate copy

But I guess that's another topic, off from the current discussion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to move the compression under RdbSerilaizer, so the flow will change, you can see the PR created already.
I agree on the the optimization for not allocating new buffer for each copy, I believe the flow will change more till I get to this TODO but in the current flow I believe we can call it tmp_buffer_ and use it in the temporary Serializer as will, right?
Anyway I am not going to change this now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I just looked at this first, the other PR solves this issue

Comment on lines 255 to +256
bool SliceSnapshot::FlushDefaultBuffer(bool force) {
if (!force && default_buffer_->val.size() < 4096)
if (!force && default_serializer_->SerializedLen() < 4096)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

dranikpg
dranikpg previously approved these changes Dec 5, 2022
Signed-off-by: adi_holden <adi@dragonflydb.io>
@adiholden adiholden merged commit e803432 into main Dec 5, 2022
@romange romange deleted the fix_flush_default_buffer branch December 27, 2022 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants