Buffered download [Answered]

Hello, for example I can download a file like this:
var client = new HttpClient();
var result = await client.GetByteArrayAsync(url);
By doing that I load it all into memory. But what if the file is quite big (4GB+)? How could I download it in chunks while writing them to disk as I go?
17 Replies
Klarth
Klarth3y ago
Use streaming instead, e.g.
var uri = @"https://raw.githubusercontent.com/stevemonaco/AvaloniaDemos/master/BlockPatternAnimation/Assets/demoScreenCapture.gif";
var client = new HttpClient();

//var result = await client.GetByteArrayAsync(uri);
//File.WriteAllBytes("demo.gif", result);

using var stream = await client.GetStreamAsync(uri);
using var output = new FileStream("demo.gif", FileMode.Create, FileAccess.Write, FileShare.None);
await stream.CopyToAsync(output);
var uri = @"https://raw.githubusercontent.com/stevemonaco/AvaloniaDemos/master/BlockPatternAnimation/Assets/demoScreenCapture.gif";
var client = new HttpClient();

//var result = await client.GetByteArrayAsync(uri);
//File.WriteAllBytes("demo.gif", result);

using var stream = await client.GetStreamAsync(uri);
using var output = new FileStream("demo.gif", FileMode.Create, FileAccess.Write, FileShare.None);
await stream.CopyToAsync(output);
(Don't mind the URI, just for testing) That will get you around the allocations at least...I'm not sure how to get around resuming.
var uri = @"https://raw.githubusercontent.com/stevemonaco/AvaloniaDemos/master/BlockPatternAnimation/Assets/demoScreenCapture.gif";
var client = new HttpClient();

var response = await client.GetAsync(uri);
var length = response.Content.Headers.ContentLength;

var chunks = new[] { (0, 200_000), (200_001, length) };

File.Delete("demo.gif");

for (int i = 0; i < chunks.Length; i++)
{
client.DefaultRequestHeaders.Range = new(chunks[i].Item1, chunks[i].Item2);
using var stream = await client.GetStreamAsync(uri);
using var output = File.OpenWrite("demo.gif");
output.Seek(0, SeekOrigin.End);
await stream.CopyToAsync(output);
}
var uri = @"https://raw.githubusercontent.com/stevemonaco/AvaloniaDemos/master/BlockPatternAnimation/Assets/demoScreenCapture.gif";
var client = new HttpClient();

var response = await client.GetAsync(uri);
var length = response.Content.Headers.ContentLength;

var chunks = new[] { (0, 200_000), (200_001, length) };

File.Delete("demo.gif");

for (int i = 0; i < chunks.Length; i++)
{
client.DefaultRequestHeaders.Range = new(chunks[i].Item1, chunks[i].Item2);
using var stream = await client.GetStreamAsync(uri);
using var output = File.OpenWrite("demo.gif");
output.Seek(0, SeekOrigin.End);
await stream.CopyToAsync(output);
}
And there's one that does it in chunks. You'll need to write the chunking range logic yourself though.
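Something like this could generate the ranges, assuming a fixed chunk size (just a sketch, the size is arbitrary and length comes from Content-Length as above):
// Sketch: build inclusive (from, to) byte ranges, chunkSize bytes each.
long totalLength = length ?? 0;
const long chunkSize = 16 * 1024 * 1024; // 16MB per chunk, pick whatever suits

var ranges = new List<(long From, long To)>();
for (long from = 0; from < totalLength; from += chunkSize)
{
    // Clamp the last chunk to the end of the file.
    ranges.Add((from, Math.Min(from + chunkSize - 1, totalLength - 1)));
}
Then you'd loop over ranges the same way as above, setting the Range header per request.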
Fruity Mike
Fruity MikeOP3y ago
thanks, I've got the first example working 🙂
Klarth
Klarth3y ago
Well, the first sample doesn't chunk. It downloads the entire thing in one go.
Fruity Mike
Fruity MikeOP3y ago
ummm
Klarth
Klarth3y ago
Second example downloads in two chunks, has better insight into resuming with some work, etc.
Fruity Mike
Fruity MikeOP3y ago
so you're saying that when the stream is acquired, the file as a whole is actually downloaded?
Klarth
Klarth3y ago
We might have slightly different definitions of chunking here. The first one does download + write portions to disk as received...which might be ok. The second one uses multiple requests to the server to download the file in portions (doing the same write-as-downloaded as before for the particular chunk range).
Fruity Mike
Fruity MikeOP3y ago
okay, but I still have a question about the first example: is there ever a moment when the file as a whole is in memory, or is it just being downloaded and written to the file chunk by chunk? because for the second one it seems there are only 2 chunks in total and that's all
Klarth
Klarth3y ago
They're both memory-friendly, so the file is never entirely in memory. Not sure if there's only one buffer's worth (4096 bytes or so?) or several...but it's not going to be a lot. The second still does write-as-you-go. So if the file were 40GB, it would stream 200KB and then the remaining 39.9GB+, with only small portions in memory at a time.
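If you want to control that buffer yourself, CopyToAsync takes an explicit buffer size (just a sketch, the number is arbitrary):
using var stream = await client.GetStreamAsync(uri);
using var output = new FileStream("demo.gif", FileMode.Create, FileAccess.Write, FileShare.None);

// Only roughly one buffer's worth is in memory at a time; 81920 bytes is
// about what CopyToAsync uses when you don't pass a size.
await stream.CopyToAsync(output, 81920);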
Fruity Mike
Fruity MikeOP3y ago
ohhh, thanks a lot for clarifying, you're just defining the size of the chunk, my bad
Klarth
Klarth3y ago
If you wrote the chunking more sensibly, say every 16MB (or some other reasonable size)...then if you had a connection error, it would be easier to detect and resume without losing much progress. But if that functionality doesn't matter to your software, then it doesn't matter. 🤷‍♂️
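A rough sketch of resuming, assuming the partial file already on disk is intact and the server honours Range requests:
// Sketch: resume from however many bytes are already on disk.
long existing = File.Exists("demo.gif") ? new FileInfo("demo.gif").Length : 0;

var request = new HttpRequestMessage(HttpMethod.Get, uri);
request.Headers.Range = new(existing, null); // "bytes=<existing>-" = the rest of the file

using var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
using var stream = await response.Content.ReadAsStreamAsync();
using var output = new FileStream("demo.gif", FileMode.Append, FileAccess.Write, FileShare.None);
await stream.CopyToAsync(output);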
Fruity Mike
Fruity MikeOP3y ago
yeah, for now my concern is doing it in chunks
Anchy
Anchy3y ago
if you are going to do it in ranges, make sure the CDN or server you are using actually supports range requests though
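a quick way to check is a HEAD request and looking at Accept-Ranges, something like (just a sketch):
// Sketch: ask the server whether it advertises byte-range support.
var head = new HttpRequestMessage(HttpMethod.Head, uri);
using var headResponse = await client.SendAsync(head);

// "Accept-Ranges: bytes" means range requests should work; a missing header
// or "none" usually means they won't.
bool supportsRanges = headResponse.Headers.AcceptRanges.Contains("bytes");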
Fruity Mike
Fruity MikeOP3y ago
being able to retry chunk retrieval might not be a concern. what's the "buzzword" for this capability?
Klarth
Klarth3y ago
https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests You can look through that for more info if you do need to go to range requests.
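One thing from that page worth checking in code: the server answers 206 Partial Content when it honours the Range header, and a plain 200 with the full body when it doesn't (sketch):
// Sketch: confirm the server actually honoured the requested range.
var rangeRequest = new HttpRequestMessage(HttpMethod.Get, uri);
rangeRequest.Headers.Range = new(0, 200_000);

using var rangeResponse = await client.SendAsync(rangeRequest, HttpCompletionOption.ResponseHeadersRead);

// 206 = just the requested bytes; 200 = the server ignored the range and is
// sending the whole file.
bool isPartial = rangeResponse.StatusCode == System.Net.HttpStatusCode.PartialContent;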
Fruity Mike
Fruity MikeOP3y ago
ummm thanks a lot, you've really helped me 🙂
Accord
Accord3y ago
✅ This post has been marked as answered!