C#2y ago
Ben

Download file, decompress and keep in memory

I have an endpoint that serves a gzipped .tsv file. I want to download the file, decompress it, and work on the data without saving it to disk, keeping everything in memory. I tried:
HttpClient _client = new HttpClient(
    new HttpClientHandler()
    {
        AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
    });

var data = await _client.GetStringAsync(url + fileName);
The result eventually contains gibberish, so I guess the decompression doesn't work? Any ideas? If I download the file manually and extract it, I get a folder with the .tsv file in it. Maybe my issue is that I'm trying to read the string value of the folder and not its content? I've tried removing the AutomaticDecompression configuration and it seems to give the same result. Thanks!
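A likely explanation, offered as an assumption since the response headers are not shown in the thread: AutomaticDecompression only undoes HTTP-level compression, which the server signals with a Content-Encoding header. A .gz file served as an ordinary download usually carries no such header, so HttpClient hands back the raw gzip bytes and GetStringAsync decodes them as text. One quick check is whether the payload starts with the gzip magic number 0x1F 0x8B. A minimal sketch (GzipProbe and LooksLikeGzip are hypothetical names, not part of .NET):

```csharp
using System;

public static class GzipProbe
{
    // Returns true when the buffer starts with the gzip magic number (0x1F 0x8B).
    // Hypothetical helper for diagnosis only; it does not validate the rest of the stream.
    public static bool LooksLikeGzip(byte[] payload)
    {
        return payload != null
            && payload.Length >= 2
            && payload[0] == 0x1F
            && payload[1] == 0x8B;
    }
}
```

If the bytes returned by _client.GetByteArrayAsync(url + fileName) pass this check, the body is still compressed and needs to be decompressed explicitly.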
11 Replies
Ben2y ago
coach
Unknown User2y ago
Message Not Public
IcyPhoenix2y ago
Stack Overflow
How do you unzip a gz file in memory using GZipStream?
I'm probably doing something obviously stupid here. Please point it out! I have some C# code that is pulling down a bunch of .gz files from SFTP (using the SSH.NET Nuget package - works great!). E...
Ben2y ago
@IcyPhoenix @kaleb Yep, your ideas worked, and I understood kaleb's explanation. I've never worked with streams before, so I'll do some reading on the subject, ty! This is what I did:
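// Download the raw (still gzipped) bytes.
var data = await _client.GetByteArrayAsync(url + fileName);

using (var fileStream = new MemoryStream(data))
{
    // A new MemoryStream already starts at position 0, so this Seek is redundant but harmless.
    fileStream.Seek(0, SeekOrigin.Begin);

    using (var gzStream = new GZipStream(fileStream, CompressionMode.Decompress))
    {
        using (var outputStream = new MemoryStream())
        {
            // Decompress everything into a second in-memory buffer.
            gzStream.CopyTo(outputStream);
            byte[] outputBytes = outputStream.ToArray();

            // ASCII replaces any non-ASCII character with '?'; Encoding.UTF8 is safer if the file may contain them.
            var result = Encoding.ASCII.GetString(outputBytes);
        }
    }
}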
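The same idea can be wrapped in a small reusable helper. This is only a sketch under assumptions: GzipText, Decompress, and Compress are hypothetical names, the encoding is passed in by the caller because the thread does not say which encoding the .tsv file uses, and Compress exists only so the round trip can be tried without a network call.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipText
{
    // Decompresses a gzipped buffer and decodes it as text, entirely in memory.
    public static string Decompress(byte[] gzipped, Encoding encoding)
    {
        using (var input = new MemoryStream(gzipped))
        using (var gz = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gz, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Produces gzipped bytes from text; included only to exercise Decompress locally.
    public static byte[] Compress(string text, Encoding encoding)
    {
        using (var output = new MemoryStream())
        {
            using (var gz = new GZipStream(output, CompressionMode.Compress))
            {
                byte[] raw = encoding.GetBytes(text);
                gz.Write(raw, 0, raw.Length);
            }
            // The GZipStream must be disposed first so the gzip footer is written.
            return output.ToArray();
        }
    }
}
```

With the bytes from the download, the call would look like GzipText.Decompress(data, Encoding.UTF8). Reading through a StreamReader skips the intermediate byte array that CopyTo plus ToArray creates.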
Unknown User2y ago
Message Not Public
Ben2y ago
And by doing that, only decompress part of the stream each time? Can you show me a pseudo example?
IcyPhoenix2y ago
Are you sure you can do this? I looked into this a while ago and couldn't find a way to decompress only part of a zip from memory; you had to have the whole uncompressed zip in memory.
Unknown User2y ago
Message Not Public
Ben2y ago
Updating my database using batches of, let's say, 1000 rows at a time from the file. It shouldn't be over 10 MB, so I don't really have an issue (this is why I initially wanted to avoid saving to disk, etc.).
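For the batching described here, a gzip stream can be read forward incrementally even though it cannot be accessed at random positions, so rows can be decompressed as they are consumed. A rough sketch, not the approach from the hidden replies: GzipBatches and ReadLineBatches are hypothetical names, UTF-8 is an assumed encoding, and each yielded list is meant to be handed to whatever database update code is in use.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipBatches
{
    // Lazily decompresses a gzipped text stream and yields its lines in groups of batchSize.
    // Only the current batch is held in memory; the last batch may be smaller.
    public static IEnumerable<List<string>> ReadLineBatches(Stream gzippedSource, int batchSize)
    {
        using (var gz = new GZipStream(gzippedSource, CompressionMode.Decompress))
        using (var reader = new StreamReader(gz, Encoding.UTF8))
        {
            var batch = new List<string>(batchSize);
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                batch.Add(line);
                if (batch.Count == batchSize)
                {
                    yield return batch;
                    batch = new List<string>(batchSize);
                }
            }

            // Flush whatever is left after the final full batch.
            if (batch.Count > 0)
            {
                yield return batch;
            }
        }
    }
}
```

The source could be a MemoryStream over the downloaded bytes, or the stream returned by _client.GetStreamAsync(url + fileName) so that nothing is buffered up front. For a file under 10 MB either choice should be fine.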
Unknown User2y ago
Message Not Public
Ben2y ago
@kaleb Got it! Thanks a lot kaleb