C
C#12mo ago
Lume

Read stream twice without missing data

I want to grab the first byte from a file and then read the entire file (max. 128 bytes) starting from the beginning, including that first byte. However, the problem is that the first byte I initially read is missing. Here's the code:
c#
var bytesRead = await file.ReadAsync(new byte[1]);

// Check bytesRead
// ...

file.Seek(0, SeekOrigin.Begin);

byte[] buffer = new byte[128];
await file.ReadAsync(buffer, 0, buffer.Length);
Any ideas on how to fix this?
51 Replies
canton7
canton712mo ago
Note that bytesRead is the number of bytes which were read, not the value of the first byte
Lume
LumeOP12mo ago
Yes, I know. That's intended
canton7
canton712mo ago
The first byte shouldn't be missing in your subsequent read though. I've done similar things plenty of times
Lume
LumeOP12mo ago
It does. Another weird case is that sometimes the second buffer only contains the first byte and the rest of the data is lost.
canton7
canton712mo ago
Note that Read reads at most the specified number of bytes. It's expected that it can read fewer (but if it reads 0, that means you've reached the end). There's ReadExactly if you want to read exactly a specified number of bytes.
Lume
LumeOP12mo ago
That's the output with the code from above.
canton7
canton712mo ago
And what values does the second file.ReadAsync return?
Lume
LumeOP12mo ago
This is the second one
canton7
canton712mo ago
What value does it return? I.e. the number of bytes read:
int bytesRead = await file.ReadAsync(buffer, 0, buffer.Length);
Lume
LumeOP12mo ago
1
canton7
canton712mo ago
Right, so all as expected. Call ReadExactly, or call Read multiple times until either it returns 0 or you've got the total number of bytes you want. (afk, back in half an hour)
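For readers on .NET 7 or later, the two options mentioned above can be sketched roughly as follows. This is only a sketch under assumptions: a MemoryStream with made-up contents stands in for the file, and the variable names are illustrative. Note that ReadExactlyAsync throws EndOfStreamException if the stream ends before the buffer is full, so for a file of at most 128 bytes, ReadAtLeastAsync with throwOnEndOfStream: false is probably the closer fit.

```csharp
using System;
using System.IO;

// Stand-in for the file: 9 bytes of made-up data (assumption for the demo).
using var stream = new MemoryStream(new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 });

// Read the first byte, then rewind to the start.
var first = new byte[1];
await stream.ReadExactlyAsync(first); // throws EndOfStreamException on an empty stream
stream.Position = 0;

// Keep reading until 128 bytes have arrived or the stream ends, whichever comes first.
var buffer = new byte[128];
int totalRead = await stream.ReadAtLeastAsync(buffer, buffer.Length, throwOnEndOfStream: false);

Console.WriteLine($"first={first[0]}, totalRead={totalRead}"); // first=1, totalRead=9
```

Both methods were added in .NET 7, so they are not available on .NET 6.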
Lume
LumeOP12mo ago
Not really 😅. If I remove the file.Seek() call, I get everything I want except the first byte. Like so:
Bailey
Bailey12mo ago
What about just setting the position with file.Position?
Petris
Petris12mo ago
Does the location you're reading from support seeking?
Lume
LumeOP12mo ago
Yes. The behaviour is the same.
canton7
canton712mo ago
No, Stream is documented to work this way. There's no guarantee that it returns exactly the number of bytes you asked for.
Petris
Petris12mo ago
Steam? Well, it is if you use ReadExactly.
canton7
canton712mo ago
Please, fix this bug. If you still see an issue, then let's investigate. But there's no point digging into one thing when we know you're doing something wrong
MODiX
MODiX12mo ago
canton7
There's ReadExactly if you want to read exactly a specified number of bytes
canton7
canton712mo ago
I've told him to use ReadExactly twice now, explicitly. I think it's entirely expected that the first 1-byte read will read 1 byte into an internal cache, and then when you rewind / re-read, it will first read the entire cache (all 1 byte of it), and then the next read will actually read new data from the file. (I'd have to re-read the code again to be sure.)
Lume
LumeOP12mo ago
But I don't understand the why. And I also don't have access to ReadExactly, I am on .NET 6.
canton7
canton712mo ago
The "why" is that this is how Stream.Read is documented to work, and various Stream implementations take advantage of this for efficiency reasons. See, for example, my previous message. If you don't have ReadExactly, then you can do something like:
int totalRead = 0;
while (totalRead < buffer.Length)
{
    // ReadAsync may return fewer bytes than requested, so keep reading.
    int read = await stream.ReadAsync(buffer, totalRead, buffer.Length - totalRead);
    if (read == 0)
    {
        // End of stream
        break;
    }
    totalRead += read;
}
I'll remember how to do it eventually 😛
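For reuse on .NET 6, the loop above could be wrapped in a small helper along these lines. This is a sketch, not code from the thread: ReadUntilFullOrEndAsync is a hypothetical name rather than a framework API, and the MemoryStream with made-up contents only stands in for the real file.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper (not a framework API): keeps calling ReadAsync until the
// destination is full or the stream signals end-of-stream by returning 0.
static async Task<int> ReadUntilFullOrEndAsync(Stream source, byte[] destination)
{
    int totalRead = 0;
    while (totalRead < destination.Length)
    {
        // Each call may deliver fewer bytes than the remaining space.
        int read = await source.ReadAsync(destination.AsMemory(totalRead));
        if (read == 0)
        {
            break; // end of stream
        }
        totalRead += read;
    }
    return totalRead;
}

// Stand-in for the file: 9 bytes of made-up data (assumption for the demo).
using var stream = new MemoryStream(new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 });
var buffer = new byte[128];
int total = await ReadUntilFullOrEndAsync(stream, buffer);

Console.WriteLine(total); // 9
```

The return value tells the caller how many leading elements of the buffer actually hold data from the stream.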
Lume
LumeOP12mo ago
Thank you.
Lume
LumeOP12mo ago
I believe there might be a misunderstanding between us. My intention is not to remove the trailing zeros from the byte array; that is perfectly acceptable. Instead, my goal is to read one byte, perform certain checks on it, and then read from the start of the original stream until it reaches a maximum of 128 bytes. Example:
Stream: 1, 2, 3, 4, 5, 6, 7, 8, 9
var tmpBuffer = new byte[1];
Stream.Read(tmpBuffer) // tmpBuffer: 1
var buffer = new byte[128];
Stream.Read(buffer, 0, buffer.Length) // buffer: 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0, ...
This is expected because the "cursor" moves one position to the right side (right?).
canton7
canton712mo ago
Stream.Read(buffer, 0, 1) will only read 1 byte. I'm not sure why your buffer isn't 2, 0, 0, 0, ...
Lume
LumeOP12mo ago
Sorry, I made a typo. Now I thought: why not just reset the "cursor" to the starting position using Stream.Position = 0? Then, if I read it using Stream.Read, the output should be the following:
Stream.Read(buffer, 0, buffer.Length) // buffer: 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0, ...
canton7
canton712mo ago
Note that the second Stream.Read can return anywhere from 1 byte to 128 bytes. So your buffer might be 2, 0, 0, 0, ... or it might be 2, 3, 0, 0, ... or... You need to look at the return value to see how many bytes it actually read
Lume
LumeOP12mo ago
Yes, it depends on how many bytes there are in the byte array I would like to read. The remaining bytes will be "filled up" with zeros. Correct?
canton7
canton712mo ago
No. It will read between 1 byte and the maximum number of bytes you requested. It will write to the section of the array that you tell it to (the second and third parameters, offset and count). If it doesn't write to an array element, it just leaves it alone: it doesn't fill anything with zeros. From the docs:
Implementations of this method read a maximum of buffer.Length bytes from the current stream and store them in buffer. The current position within the stream is advanced by the number of bytes read; however, if an exception occurs, the current position within the stream remains unchanged. Implementations return the number of bytes read. If more than zero bytes are requested, the implementation will not complete the operation until at least one byte of data can be read (if zero bytes were requested, some implementations may similarly not complete until at least one byte is available, but no data will be consumed from the stream in such a case). Read returns 0 only if zero bytes were requested or when there is no more data in the stream and no more is expected (such as a closed socket or end of file). An implementation is free to return fewer bytes than requested even if the end of the stream has not been reached.
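A small sketch may help illustrate the point about the return value and untouched array elements. It assumes a MemoryStream holding three made-up bytes, and it pre-fills the buffer with 0xFF purely so that untouched elements are easy to spot.

```csharp
using System;
using System.IO;

// Stand-in stream with only 3 bytes available (assumption for the demo).
using var stream = new MemoryStream(new byte[] { 1, 2, 3 });

// Pre-fill the buffer with 0xFF so elements Read never wrote to stand out.
var buffer = new byte[8];
Array.Fill(buffer, (byte)0xFF);

int bytesRead = stream.Read(buffer, 0, buffer.Length);

// Only the first bytesRead elements hold data from the stream.
byte[] valid = buffer.AsSpan(0, bytesRead).ToArray();

Console.WriteLine(bytesRead);                     // 3
Console.WriteLine(BitConverter.ToString(buffer)); // 01-02-03-FF-FF-FF-FF-FF
Console.WriteLine(BitConverter.ToString(valid));  // 01-02-03
```

With new byte[128] the untouched elements happen to be zero because that is how the array was created, not because Read wrote zeros.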
Lume
LumeOP12mo ago
I mean this line fills the whole array with zeros:
byte[] buffer = new byte[128];
So yeah, makes sense. How can I reset the position back to the start once I've read 1 byte?
canton7
canton712mo ago
stream.Position = 0, or stream.Seek(0, SeekOrigin.Begin), as you have been doing
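Putting the pieces together, the whole read-one-byte, rewind, read-everything flow might look like the sketch below. It is written under assumptions: a temporary file with nine made-up bytes stands in for the real file, and the variable names are illustrative. The CanSeek check covers the earlier question about whether the source supports seeking.

```csharp
using System;
using System.IO;

// Write 9 made-up bytes to a temporary file to stand in for the real one.
string path = Path.GetTempFileName();
await File.WriteAllBytesAsync(path, new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 });

byte firstByte;
int totalRead = 0;
var buffer = new byte[128];

await using (var file = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    // 1. Read one byte and inspect it.
    var probe = new byte[1];
    int probeRead = await file.ReadAsync(probe, 0, 1);
    if (probeRead == 0) throw new EndOfStreamException("File is empty.");
    firstByte = probe[0];

    // 2. Rewind. Setting Position only works on streams that support seeking.
    if (!file.CanSeek) throw new NotSupportedException("Stream is not seekable.");
    file.Position = 0;

    // 3. Read from the start, looping because each call may return fewer bytes.
    int read;
    while (totalRead < buffer.Length &&
           (read = await file.ReadAsync(buffer, totalRead, buffer.Length - totalRead)) > 0)
    {
        totalRead += read;
    }
}
File.Delete(path);

Console.WriteLine($"firstByte={firstByte}, totalRead={totalRead}"); // firstByte=1, totalRead=9
```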
Lume
LumeOP12mo ago
Ok, but then I can only re-read the 1 byte that I already read and the rest of the stream is lost?
canton7
canton712mo ago
Please, please listen to what I'm telling you, repeatedly. I even gave you the code to fix your problem. The end of the stream is not lost. You simply need to call stream.Read again (and repeatedly) until it's read all the bytes you want to read. This is not hard to understand. Please try and listen to what I'm telling you.
Lume
LumeOP12mo ago
It's just weird that without using Seek it reads every byte (except first one, because cursor starts behind). 😅
canton7
canton712mo ago
Yes. This is internal buffering behaviour within FileStream. I even gave you the link to the code which does this, and the comment saying that this is how it behaves. I've also quoted the documentation saying that Stream is allowed to behave in this way. I'm not sure what else I can do.
Lume
LumeOP12mo ago
Thank you very much. I'll try to visualize it to understand it better.
MODiX
MODiX12mo ago
canton7
I think it's entirely expected the first 1-byte read will read 1 byte into an internal cache, and then when you rewind / re-read it will first read the entire cache (all 1 byte of it), then the next read will actually read new data from file
Lume
LumeOP12mo ago
What I expected ( | = cursor):
Stream: | 1, 2, 3, 4, 5, 6, 7, 8
- Read 1 byte
Stream: 1 | 2, 3, 4, 5, 6, 7, 8
- Reset to start using Seek
Stream: | 1, 2, 3, 4, 5, 6, 7, 8
- Read the whole stream
Stream: 1, 2, 3, 4, 5, 6, 7, 8 |
What actually happens:
Stream: | 1, 2, 3, 4, 5, 6, 7, 8
Internal cache:
- Read 1 byte
Stream: | 2, 3, 4, 5, 6, 7, 8
Internal cache: 1
- Reset to start using Seek
Stream: | 1, EOF + 2, 3, 4, 5, 6, 7, 8 (cache + stream)
- Read the whole stream
Stream: 1, EOF + | 2, 3, 4, 5, 6, 7, 8
So now I have to read it again to get the rest of it. Is this kinda accurate as a mental model?
canton7
canton712mo ago
I don't know where that internal "EOF" came from. There's no "end of stream". EOF only happens when stream.Read returns 0, which doesn't happen here. The best mental model is that the stream might be doing some caching internally, or it might be waiting for more data to arrive (e.g. over a pipe or network filesystem), and that therefore stream.Read will only give you bytes that it has readily available. Only if you call stream.Read and it doesn't have any bytes readily available will it go and look for more bytes to give you.
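One way to get a feel for this mental model is to test reading code against a stream that deliberately hands over very little per call. The sketch below is an illustration under assumptions: OneByteAtATimeStream is a hypothetical test double, not a framework type, and it wraps a MemoryStream with made-up contents. It returns at most one byte per Read call, which the Stream contract permits.

```csharp
using System;
using System.IO;

using var inner = new MemoryStream(new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 });
using var trickle = new OneByteAtATimeStream(inner);

var buffer = new byte[128];

// A single Read only gets what the stream hands over: here, 1 byte.
int firstCall = trickle.Read(buffer, 0, buffer.Length);

// Looping until Read returns 0 collects the rest.
int totalRead = firstCall;
int read;
while (totalRead < buffer.Length &&
       (read = trickle.Read(buffer, totalRead, buffer.Length - totalRead)) > 0)
{
    totalRead += read;
}

Console.WriteLine($"firstCall={firstCall}, totalRead={totalRead}"); // firstCall=1, totalRead=9

// Hypothetical test double: a read-only wrapper that never returns more than
// one byte per Read call.
sealed class OneByteAtATimeStream : Stream
{
    private readonly Stream _inner;
    public OneByteAtATimeStream(Stream inner) => _inner = inner;

    public override int Read(byte[] buffer, int offset, int count) =>
        _inner.Read(buffer, offset, Math.Min(count, 1));

    public override bool CanRead => true;
    public override bool CanSeek => _inner.CanSeek;
    public override bool CanWrite => false;
    public override long Length => _inner.Length;
    public override long Position
    {
        get => _inner.Position;
        set => _inner.Position = value;
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => _inner.Seek(offset, origin);
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

Code that works against a wrapper like this does not depend on how much a particular Stream implementation happens to return per call.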
Lume
LumeOP12mo ago
It serves as a separator. There is no End of File (EOF), and the internal cache is not concatenated with the stream. 😅
Lume
LumeOP12mo ago
This is the solution I came up with.
canton7
canton712mo ago
No, that is wrong, because the second call to ReadAsync might only return 1 byte. You need to call ReadAsync in a loop.
Lume
LumeOP12mo ago
As long as I only get 1 byte it works
canton7
canton712mo ago
I've already given you the code which does exactly that. It might work right now, but it's not guaranteed to work in the future.
Lume
LumeOP12mo ago
Yep I will use your version. Thank you very much for your help.
canton7
canton712mo ago
What's so hard about this? I really don't get it. I've explained what Stream will and won't do. I've explained what you need to do. I've even shown you the code which works properly and will continue working in the future
Lume
LumeOP12mo ago
It's just my brain. Sorry!
canton7
canton712mo ago
And you're still doing things which I've previously explained are not guaranteed to work
Lume
LumeOP12mo ago
It just clicked in my head. Now, most things make sense! Thank you soo much! 🙏
