Buffered download [Answered]

Hello. For example, I can download a file like this:
var client = new HttpClient();
var result = await client.GetByteArrayAsync(url);
Doing that loads the whole file into memory. But what if the file is quite big (4 GB+)? How could I download it in chunks and write each chunk to persistent storage as it arrives?
17 Replies
Klarth, 3y ago
Use streaming instead, e.g.:
var uri = @"https://raw.githubusercontent.com/stevemonaco/AvaloniaDemos/master/BlockPatternAnimation/Assets/demoScreenCapture.gif";
var client = new HttpClient();

//var result = await client.GetByteArrayAsync(uri);
//File.WriteAllBytes("demo.gif", result);

using var stream = await client.GetStreamAsync(uri);
using var output = new FileStream("demo.gif", FileMode.Create, FileAccess.Write, FileShare.None);
await stream.CopyToAsync(output);
(Don't mind the URI, it's just for testing.) That will get you around the allocations at least... I'm not sure how to handle resuming.
var uri = @"https://raw.githubusercontent.com/stevemonaco/AvaloniaDemos/master/BlockPatternAnimation/Assets/demoScreenCapture.gif";
var client = new HttpClient();

// Read only the headers here; a plain GetAsync(uri) would buffer the whole body in memory.
long? length;
using (var response = await client.GetAsync(uri, HttpCompletionOption.ResponseHeadersRead))
{
    length = response.Content.Headers.ContentLength;
}

// Range bounds are inclusive, so the last byte of the file is at index length - 1.
var chunks = new[] { (0, 200_000), (200_001, length - 1) };

File.Delete("demo.gif");

for (int i = 0; i < chunks.Length; i++)
{
    client.DefaultRequestHeaders.Range = new(chunks[i].Item1, chunks[i].Item2);
    using var stream = await client.GetStreamAsync(uri);
    using var output = File.OpenWrite("demo.gif");
    // Append this chunk after whatever has been written so far.
    output.Seek(0, SeekOrigin.End);
    await stream.CopyToAsync(output);
}
And there's one that does it in chunks. You'll need to write the chunking range logic yourself, though.
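For the chunking range logic mentioned here, a minimal sketch might look like the following. `SplitIntoRanges` is a hypothetical helper name, not something from the thread or from .NET; it assumes the total length is already known (for example from `Content-Length`) and produces inclusive start/end pairs, since HTTP `Range` bounds are inclusive on both ends.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: split [0, totalLength) into inclusive (Start, End) byte ranges,
// matching the inclusive semantics of the HTTP Range header.
static IEnumerable<(long Start, long End)> SplitIntoRanges(long totalLength, long chunkSize)
{
    if (totalLength < 0) throw new ArgumentOutOfRangeException(nameof(totalLength));
    if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

    for (long start = 0; start < totalLength; start += chunkSize)
    {
        // The last chunk is usually shorter; clamp it to the final byte index.
        long end = Math.Min(start + chunkSize, totalLength) - 1;
        yield return (start, end);
    }
}

// Example: a 10-byte resource in 4-byte chunks.
foreach (var (chunkStart, chunkEnd) in SplitIntoRanges(10, 4))
    Console.WriteLine($"bytes={chunkStart}-{chunkEnd}");
// prints bytes=0-3, bytes=4-7, bytes=8-9 (one per line)
```

Each pair can be passed straight to `new RangeHeaderValue(start, end)` in a loop like the one in the second example.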
Fruity Mike (OP), 3y ago
Thanks, I've got the first example working 🙂
Klarth, 3y ago
Well, the first sample doesn't chunk. It downloads the entire thing in one go.
Fruity Mike (OP), 3y ago
ummm
Klarth, 3y ago
The second example downloads in two chunks and, with some work, is a better starting point for resuming, etc.
Fruity Mike (OP), 3y ago
So you're saying that by the time the stream is acquired, the file as a whole has actually been downloaded?
Klarth, 3y ago
We might have slightly different definitions of chunking here. The first one does download + write portions to disk as received...which might be ok. The second one uses multiple requests to the server to download the file in portions (doing the same write-as-downloaded as before for the particular chunk range).
Fruity Mike (OP), 3y ago
Okay, but I still have a question. With the first example, is there ever a moment when the whole file is in memory? Or is it just being downloaded and written to the file chunk by chunk? Because the second one seems to have 2 chunks in total and that's all.
Klarth, 3y ago
They're both memory-friendly, so the file is never entirely in memory. Not sure if there's only one buffer's worth (4096 bytes or so?) or multiple... but it's not going to be a lot. The second still does write-as-you-go. So if the file were 40 GB, it would stream 200 KB and then the remaining 39.9 GB+, with only small portions in memory at a time.
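To make the "only small portions in memory" point concrete, here is a rough sketch of what a buffered stream copy does. It is not the actual `CopyToAsync` implementation; `CopyInBlocksAsync` is a hypothetical helper, and the 81920-byte default is an assumption based on the default copy buffer size commonly documented for `Stream.CopyToAsync`. In-memory streams stand in for the network and the file so the sketch runs offline.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper: copy source to destination through one fixed-size buffer.
// Returns the total number of bytes copied.
static async Task<long> CopyInBlocksAsync(Stream source, Stream destination, int bufferSize = 81920)
{
    var buffer = new byte[bufferSize];   // the only sizeable allocation, reused for every block
    long total = 0;
    int read;

    // ReadAsync returns 0 at end of stream and may return fewer bytes than requested.
    while ((read = await source.ReadAsync(buffer.AsMemory(0, buffer.Length))) > 0)
    {
        await destination.WriteAsync(buffer.AsMemory(0, read));
        total += read;
    }
    return total;
}

// Demo with in-memory streams standing in for the download and the output file.
using var fakeDownload = new MemoryStream(new byte[1_000_000]);
using var fakeFile = new MemoryStream();
long copied = await CopyInBlocksAsync(fakeDownload, fakeFile, bufferSize: 4096);
Console.WriteLine(copied); // prints 1000000
```

However large the source is, the loop itself only ever holds one `bufferSize` block at a time.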
Fruity Mike (OP), 3y ago
Ohhh, thanks a lot for clarifying. You're just defining the size of the chunk, my bad.
Klarth, 3y ago
If you more sensibly wrote the chunking to every 16MB for instance (or some other reasonable size)...then if you had a connection error, it would be easier to detect and resume without much lost progress. But if that functionality doesn't matter to your software, then it doesn't matter. 🤷‍♂️
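A hedged sketch of the resume idea with 16 MB chunks follows. `NextRange` and `DownloadResumableAsync` are hypothetical names; the sketch assumes the total length is already known, that the server honors `Range` and answers with 206 Partial Content, and that the bytes already on disk are a valid prefix of the file. Only the pure helper is exercised here; the download function is defined but not called.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical helper: given how many bytes are already on disk, return the next
// inclusive byte range to request, or null when the file is complete.
static (long Start, long End)? NextRange(long bytesOnDisk, long totalLength, long chunkSize)
{
    if (bytesOnDisk >= totalLength) return null;
    return (bytesOnDisk, Math.Min(bytesOnDisk + chunkSize, totalLength) - 1);
}

// Hypothetical download loop. Calling it again after a failure continues
// from whatever is already in the file.
static async Task DownloadResumableAsync(HttpClient client, string url, string path,
                                         long totalLength, long chunkSize = 16 * 1024 * 1024)
{
    // Append mode: an existing partial file is continued rather than overwritten.
    using var output = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.None);

    while (true)
    {
        var range = NextRange(output.Length, totalLength, chunkSize);
        if (range is null) break;

        // Set Range per request instead of mutating DefaultRequestHeaders.
        using var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Range = new RangeHeaderValue(range.Value.Start, range.Value.End);

        using var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
        if (response.StatusCode != HttpStatusCode.PartialContent)
            throw new InvalidOperationException("Server did not return 206 Partial Content.");

        using var body = await response.Content.ReadAsStreamAsync();
        await body.CopyToAsync(output);
        await output.FlushAsync();
    }
}

Console.WriteLine(NextRange(32, 100, 16)); // prints (32, 47)
```

A caller could wrap `DownloadResumableAsync` in a retry loop: a connection error loses at most the chunk in flight, and the next attempt picks up at the current file length.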
Fruity Mike (OP), 3y ago
Yeah, for now my concern is doing it in chunks.
Anchy, 3y ago
If you are going to do it in ranges, though, make sure the CDN or server you are using supports range requests.
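One way to check this up front is to look for `Accept-Ranges: bytes` in the response headers, sketched here under a few assumptions: `AdvertisesByteRanges` and `SupportsRangesAsync` are hypothetical helper names, the server is assumed to answer HEAD requests, and some servers honor `Range` without advertising it, so a 206 status on an actual ranged request remains the more reliable signal. The demo uses a hand-built response so it runs offline.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: true when the response advertises byte-range support.
static bool AdvertisesByteRanges(HttpResponseMessage response) =>
    response.Headers.AcceptRanges.Any(
        unit => string.Equals(unit, "bytes", StringComparison.OrdinalIgnoreCase));

// Hypothetical probe: send a HEAD request and inspect the headers only.
static async Task<bool> SupportsRangesAsync(HttpClient client, string url)
{
    using var request = new HttpRequestMessage(HttpMethod.Head, url);
    using var response = await client.SendAsync(request);
    return response.IsSuccessStatusCode && AdvertisesByteRanges(response);
}

// Offline demo with a hand-built response instead of a real server.
using var sample = new HttpResponseMessage();
sample.Headers.AcceptRanges.Add("bytes");
Console.WriteLine(AdvertisesByteRanges(sample)); // prints True
```

If the check fails, falling back to a single streamed download like the first example is a reasonable default.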
Fruity Mike (OP), 3y ago
Being able to retry chunk retrieval might not be a concern. What's the "buzzword" for this capability?
Klarth, 3y ago
https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests You can look through that for more info if you do need to go to range requests.
Fruity Mike (OP), 3y ago
Ummm, thanks a lot, you've really helped me 🙂
Accord, 3y ago
✅ This post has been marked as answered!