Remove extra GET for tarball uncompressed size#1812
Remove extra GET for tarball uncompressed size#1812chriskrycho merged 1 commit intovolta-cli:mainfrom
Conversation
Currently, we make two overlapping GET requests when fetching a tarball: First, the main request to get the headers and the data stream, and then a second request to attempt to fetch the ISIZE field which holds the uncompressed size. Getting the uncompressed size lets us use the stream data after uncompressing to populate our progress bar for tarballs. However, the extra request causes issues for some users, where the second request for the ISIZE field hangs during the TLS handshake. By removing the uncompressed size entirely and connecting the progress bar to the compressed read data, we can avoid the extra request entirely.
| let decoded = GzDecoder::new(ProgressRead::new(self.data, (), progress)); | ||
| let mut tarball = tar::Archive::new(decoded); |
There was a problem hiding this comment.
Just for my own clarity: what motivated changing up where we put the ProgressRead? It seems like it should be the same behavior either way, since ultimately in both cases it’s just wrapping the Read stream?
There was a problem hiding this comment.
Yeah, I can definitely explain! For some background:
ProgressReadis a thin wrapper around a reader that calls the providedFnMutevery time data is read, passing in the amount of data that was read from the wrapped reader.GzDecoderis aReadwrapper around a stream of compressed bytes, which produces a stream of uncompressed bytes.
Given that, the motivation behind the change is that previously we were wrapping the decoded reader with ProgressRead, meaning the number of bytes it would report would be from the stream of uncompressed bytes—they would add up to the total uncompressed size. Since we're no longer fetching / relying on the uncompressed size, we instead move the ProgressRead to wrap the stream of compressed bytes, which allows the total amount of data recorded to match the compressed size.
chriskrycho
left a comment
There was a problem hiding this comment.
This looks eminently reasonable, and local testing on my Mac seems to be working well with no visual hiccups.
(Intuitively, this makes a lot of sense to me: decoding the file should more or less always be dwarfed by the download time, and the download time is going to be a function of the compressed, not the uncompressed, size.)
I had one question, but it’s absolutely not blocking. ![]()
Fixes #1744
Info
Currently, we make two overlapping GET requests when fetching a tarball: First, the main request to get the headers and the data stream, and then a second request to attempt to fetch the ISIZE field which holds the uncompressed size. Getting the uncompressed size lets us use the stream data after uncompressing to populate our progress bar for tarballs.
However, the extra request causes issues for some users, where the second request for the ISIZE field hangs during the TLS handshake. By removing the uncompressed size entirely and connecting the progress bar to the compressed read data, we can avoid the extra request entirely.
Changes
uncompressed_sizemethod from theArchivetrait.Tarballimplementation ofArchiveTarballto emit progress off the compressed reader, rather than the output of the decoder (so that progress values will be based on the compressed size).Tested
Notes