Deserializing binary data
I have the following structures:
I am curious: what is the best way to (de)serialize this? Currently I simply read it into a stream and read one attribute at a time. Coming from a C++ background, this feels highly inefficient.
When I try to Google the problem I don't get a lot of recent results, but I suspect there have been changes to the language in the last 10 years that have improved tasks like this?
10 Replies
What's this, YAML?
Looks like a custom format?
You'd probably have to write your own parser for it, although should be pretty simple
I'm just trying to explain the data layout. It's a custom format that has no real documentation or name.
I have made a parser which basically reads field by field, but I am wondering if there is a "smarter" way of doing this.
I have a lot of binary structures to serialize and deserialize, so I am looking for improvements 🙂
Other than general performance improvements (e.g. using spans), you can't get much better than a hand-crafted parser.
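To make the "using spans" suggestion concrete, here is a minimal sketch. The actual structures were not included in the thread, so the layout below (a 10-byte little-endian record with a uint32 id, a uint16 flags field, and a float32 value) and the name SampleHeader are purely hypothetical. It reads all fields from a ReadOnlySpan<byte> with BinaryPrimitives after a single length check, rather than going through a stream call per field.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical 10-byte record (not the layout from the question):
// uint32 id, uint16 flags, float32 value, all little-endian.
public readonly record struct SampleHeader(uint Id, ushort Flags, float Value)
{
    public const int Size = 4 + 2 + 4;

    // Parses one record from the start of 'source' without allocating.
    public static bool TryRead(ReadOnlySpan<byte> source, out SampleHeader header)
    {
        // One bounds check up front instead of one per field.
        if (source.Length < Size)
        {
            header = default;
            return false;
        }

        uint id = BinaryPrimitives.ReadUInt32LittleEndian(source);
        ushort flags = BinaryPrimitives.ReadUInt16LittleEndian(source.Slice(4));
        float value = BinaryPrimitives.ReadSingleLittleEndian(source.Slice(6));

        header = new SampleHeader(id, flags, value);
        return true;
    }
}
```

A byte[] converts implicitly to ReadOnlySpan<byte>, so a buffer that has already been read into memory can be passed directly, e.g. SampleHeader.TryRead(buffer, out var header). It is still a hand-written parser, just one that works on memory instead of a stream.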
There are libraries like Pidgin which allow you to write parsers in a simpler way, but you're still making a parser.
This is my current approach:
Where chunk is a helper around a memory stream
If you just need a binary (de)serializer, then you can use MemoryPack
GitHub - Cysharp/MemoryPack: Zero encoding extreme performance binary serializer for C# and Unity.
A decent way to write these, that I usually end up doing, is to use a SequenceReader, and a series of record structs.
I split it up like this. I have a ref struct named FooFormatReader. On that, I add some TryRead* methods, broken down into the basic operations of how the data is composed.
FooFormatReader holds a reference to a SequenceReader<byte>, and dispatches its TryReadBar methods to that.
Then I have a number of record structs, which represent the compound structures in the format. Each of those ends up with a TryRead(ref FooFormatReader reader, out FooRecord foo) method that calls into the FooFormatReader and initializes the record.
Here's an example: https://github.com/ikvmnet/ikvm/blob/main/src/IKVM.ByteCode/Parsing/ClassRecord.cs
ikvmnet/ikvm: A Java Virtual Machine and Bytecode-to-IL Converter for .NET
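For readers who don't want to dig through the linked file, the pattern described above could look roughly like this. This is a simplified sketch, not the ikvm code: FooFormatReader and FooRecord are the placeholder names from the description, while the fields (a uint16 kind and a uint32 length, little-endian) and the TryReadU16/TryReadU32 helpers are made up for illustration.

```csharp
using System;
using System.Buffers;

// Format-level reader: wraps a SequenceReader<byte> and exposes the
// primitive operations of the (hypothetical) format.
public ref struct FooFormatReader
{
    private SequenceReader<byte> _reader;

    public FooFormatReader(ReadOnlySequence<byte> data)
    {
        _reader = new SequenceReader<byte>(data);
    }

    // Number of bytes read so far.
    public long Consumed => _reader.Consumed;

    public bool TryReadU16(out ushort value)
    {
        // SequenceReader only offers signed reads, so reinterpret the bits.
        if (_reader.TryReadLittleEndian(out short raw))
        {
            value = unchecked((ushort)raw);
            return true;
        }
        value = 0;
        return false;
    }

    public bool TryReadU32(out uint value)
    {
        if (_reader.TryReadLittleEndian(out int raw))
        {
            value = unchecked((uint)raw);
            return true;
        }
        value = 0;
        return false;
    }
}

// One compound structure of the format (hypothetical fields).
public readonly record struct FooRecord(ushort Kind, uint Length)
{
    // Note: if this returns false after reading Kind, the reader
    // has already advanced past those two bytes.
    public static bool TryRead(ref FooFormatReader reader, out FooRecord record)
    {
        record = default;
        if (!reader.TryReadU16(out ushort kind)) return false;
        if (!reader.TryReadU32(out uint length)) return false;

        record = new FooRecord(kind, length);
        return true;
    }
}
```

Usage would be along the lines of: var reader = new FooFormatReader(new ReadOnlySequence<byte>(bytes)); followed by FooRecord.TryRead(ref reader, out var record). Larger structures compose by calling the TryRead of their nested records with the same reader.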
I've done the same for reading Minecraft's network protocol, though I made them extension methods on SequenceReader
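The use of ref for the reader (or ref this if it's an extension method) is really important: SequenceReader<byte> is a mutable struct, so passing it by value hands the method a copy and the caller's position never advances.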
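A small sketch of the difference, assuming a hypothetical length-prefixed blob (one length byte followed by that many payload bytes). The helper names and the ProtocolReaderExtensions class are invented for this example and are not taken from the Minecraft reader mentioned above.

```csharp
using System;
using System.Buffers;

public static class ProtocolReaderExtensions
{
    // 'ref this' lets the extension method advance the caller's reader.
    // Hypothetical layout: one length byte, then 'length' payload bytes.
    public static bool TryReadShortBlob(
        ref this SequenceReader<byte> reader,
        out ReadOnlySequence<byte> payload)
    {
        payload = default;
        if (!reader.TryRead(out byte length)) return false;

        if (!reader.TryReadExact(length, out payload))
        {
            // Not enough data yet: put the length byte back so the
            // caller can retry once more bytes have arrived.
            reader.Rewind(1);
            return false;
        }
        return true;
    }

    // Deliberately wrong, for comparison: 'reader' is a copy here, so the
    // byte is returned but the caller's position does not move.
    public static bool TryReadByteByValue(SequenceReader<byte> reader, out byte value)
    {
        return reader.TryRead(out value);
    }
}
```

With the ref this version, reader.TryReadShortBlob(out var blob) moves reader.Consumed forward on success. With the by-value version the call also succeeds, but the next read on the caller's reader returns the same byte again, which tends to show up as confusing duplicate or misaligned fields.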