jotalanusse
Processing files as fast as possible
Hi everyone! I have a question that I’m sure you can help me with. I’m making a library that needs to parse files as fast as possible. The files in question are binary files containing a sort of protobuf but not quite protobuf.
When parsing a file I only go forward through the file one time, and the data is not modified at all.
I want to know what is the fastest way to go through the file.
At the moment I'm using MemoryStreams, but when I want to process a part of what I just read I have to allocate a new byte[] and copy the data into it, and I think this copy can be avoided.
99% of the files I handle won't exceed 150MB, maybe 200MB at most, so memory isn't a problem. But the less memory I need to allocate beyond the file I'm reading, the better. The ideal scenario would be to keep only one copy of the data in memory and then reference that data by creating read-only slices to process it.
Is this even possible? Is there a performance overhead to using MemoryStream instead of byte[]?
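To make the "one copy, read-only slices" idea concrete, here is a minimal sketch of what I mean, assuming .NET Core 2.1+ (or the System.Memory package) where ReadOnlyMemory&lt;byte&gt; and ReadOnlySpan&lt;byte&gt; are available; the file name is just a placeholder:

```csharp
using System;
using System.IO;

class Parser
{
    static void Main()
    {
        // One copy of the data in memory (hypothetical file name).
        byte[] buffer = File.ReadAllBytes("data.bin");
        ReadOnlyMemory<byte> memory = buffer;

        // A slice is just a view over the buffer: no allocation, no copy.
        ReadOnlyMemory<byte> header = memory.Slice(0, 16);

        // A stack-only view for forward-only parsing of that slice.
        ReadOnlySpan<byte> span = header.Span;
        Console.WriteLine(span.Length);
    }
}
```

Is something like this the right direction, or is there a better-suited API for forward-only parsing?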