C#•14mo ago
Ole (ping please)

Deserializing binary data

I have the following structures:
Root:
int32 nameLength
byte[nameLength] name
int32 numItems
Item[numItems] Items

Item:
int32 itemNameLength
byte[itemNameLength] itemName
int32 VertexLength
Vertex[VertexLength] VertexList

Vertex:
float X
float Y
float Z
I am curious: what is the best way to (de)serialize this? Currently I simply read it into a stream and read one attribute at a time. Coming from a C++ background, this feels highly inefficient. When I try to google the problem I don't get a lot of new results, but I suspect there might have been some changes to the language in the last 10 years which have improved tasks like this?
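For reference, the field-by-field approach described above can be sketched with `BinaryReader` roughly as follows. The type names (`Root`, `Item`, `Vertex`, `RootParser`) are made up for illustration, and the sketch assumes little-endian values, UTF-8 names, and .NET 6 or later (for `record struct`); none of that is confirmed by the format itself, and there is no validation of the length fields.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical types mirroring the layout above; names are illustrative only.
public readonly record struct Vertex(float X, float Y, float Z);
public sealed record Item(string Name, Vertex[] Vertices);
public sealed record Root(string Name, Item[] Items);

public static class RootParser
{
    // Assumption: little-endian int32/float (what BinaryReader always uses)
    // and UTF-8 encoded names. The real format may differ.
    public static Root Read(Stream stream)
    {
        using var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true);

        string name = ReadName(reader);
        var items = new Item[reader.ReadInt32()];

        for (int i = 0; i < items.Length; i++)
        {
            string itemName = ReadName(reader);
            var vertices = new Vertex[reader.ReadInt32()];

            // Arguments are evaluated left to right, so this reads X, Y, Z in order.
            for (int v = 0; v < vertices.Length; v++)
                vertices[v] = new Vertex(reader.ReadSingle(), reader.ReadSingle(), reader.ReadSingle());

            items[i] = new Item(itemName, vertices);
        }

        return new Root(name, items);
    }

    // Reads an int32 length followed by that many bytes of text.
    private static string ReadName(BinaryReader reader)
    {
        int length = reader.ReadInt32();
        return Encoding.UTF8.GetString(reader.ReadBytes(length));
    }
}
```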
10 Replies
TheRanger•14mo ago
What's this, YAML?
Thinker•14mo ago
Looks like a custom format? You'd probably have to write your own parser for it, although it should be pretty simple.
Ole (ping please)OP•14mo ago
I'm just trying to explain the data layout. It's a custom format that there is no real documentation or name for. I have made a parser which basically reads field by field, but I am wondering if there is a "smarter" way of doing this. I have a lot of binary structures to serialize and deserialize, so I am looking for improvements 🙂
Thinker•14mo ago
Other than general performance improvements (e.g. using spans), you can't get much better than a hand-crafted parser. There are libraries like Pidgin which allow you to write parsers in a simpler way, but you're still making a parser.
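As a rough idea of what "using spans" could look like here: the sketch below reads fields straight out of a `ReadOnlySpan<byte>` with `BinaryPrimitives`, and reinterprets a run of bytes as vertices in one go with `MemoryMarshal.Cast`, which is probably the closest thing to the C++ "cast the buffer to a struct" habit. The `SpanParsing` and `Vertex` names are hypothetical. It assumes little-endian data, a little-endian host for the bulk cast, tightly packed 12-byte vertices, and .NET 5 or later.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

// Hypothetical blittable struct: three floats, 12 bytes, no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Vertex
{
    public float X, Y, Z;
}

public static class SpanParsing
{
    // Field-by-field read directly from a span: no stream and no allocations.
    // Layout assumed: uint32 id, byte accumulator, float value (9 bytes).
    public static (uint PropertyId, byte RtpcAccum, float Value) ReadPluginProperty(ReadOnlySpan<byte> data)
    {
        uint id = BinaryPrimitives.ReadUInt32LittleEndian(data);
        byte accum = data[4];
        float value = BinaryPrimitives.ReadSingleLittleEndian(data.Slice(5));
        return (id, accum, value);
    }

    // Bulk read: reinterpret count * 12 bytes as Vertex values and copy them out.
    // Assumption: the host is little-endian, matching the file.
    public static Vertex[] ReadVertices(ReadOnlySpan<byte> data, int count)
    {
        ReadOnlySpan<byte> raw = data.Slice(0, checked(count * 12));
        return MemoryMarshal.Cast<byte, Vertex>(raw).ToArray();
    }
}
```

The bulk cast only works for structs without reference fields, so the variable-length names still have to be read field by field.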
Ole (ping please)OP•14mo ago
This is my current approach:
public class PluginPropertyValue
{
    public uint propertyId { get; set; }
    public byte rtpcAccum { get; set; }
    public float fValue { get; set; }

    public static PluginPropertyValue Create(ByteChunk chunk)
    {
        var instance = new PluginPropertyValue();

        instance.propertyId = chunk.ReadUInt32();
        instance.rtpcAccum = chunk.ReadByte();
        instance.fValue = chunk.ReadSingle();

        return instance;
    }
}
Where chunk is a helper around a memory stream
nukleer bomb•14mo ago
If you just need a binary (de)serializer, then you can use MemoryPack.
nukleer bomb•14mo ago
GitHub - Cysharp/MemoryPack: Zero encoding extreme performance binary serializer for C# and Unity.
wasabi•14mo ago
A decent way to write these, that I usually end up doing, is to use a SequenceReader and a series of record structs. I split it up like this: I have a ref struct named FooFormatReader. On that, I add some TryRead* methods, broken down into the basic operations of how the data is composed. FooFormatReader holds a SequenceReader<byte> and dispatches its TryReadBar methods to that. Then I have a number of record structs, which represent the compound structures in the format. Those end up with a TryRead(ref FooFormatReader reader, out FooRecord foo) method that calls the FooFormatReader and initializes the record.
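A minimal sketch of that shape, as I read the description: `FooFormatReader` is the name used above, while the `TryReadInt32`/`TryReadSingle` methods and the `Vertex` record are invented for illustration. The sketch stores the `SequenceReader<byte>` by value inside the ref struct (an assumption; a `ref` field would need C# 11) and assumes little-endian data and .NET 6 or later.

```csharp
using System;
using System.Buffers;

// Hypothetical reader for an imaginary "Foo" format; member names are illustrative.
public ref struct FooFormatReader
{
    private SequenceReader<byte> _reader;

    public FooFormatReader(ReadOnlySequence<byte> data) => _reader = new SequenceReader<byte>(data);

    // Basic operations, dispatched to the underlying SequenceReader<byte>.
    public bool TryReadInt32(out int value) => _reader.TryReadLittleEndian(out value);

    public bool TryReadSingle(out float value)
    {
        // SequenceReader has no float overload, so read the raw bits first.
        if (_reader.TryReadLittleEndian(out int bits))
        {
            value = BitConverter.Int32BitsToSingle(bits);
            return true;
        }
        value = default;
        return false;
    }
}

// A compound structure from the format, able to initialize itself from the reader.
public readonly record struct Vertex(float X, float Y, float Z)
{
    // The reader is passed by ref so that the caller's position advances.
    public static bool TryRead(ref FooFormatReader reader, out Vertex vertex)
    {
        if (reader.TryReadSingle(out float x) &&
            reader.TryReadSingle(out float y) &&
            reader.TryReadSingle(out float z))
        {
            vertex = new Vertex(x, y, z);
            return true;
        }
        vertex = default;
        return false;
    }
}
```

Note that this sketch does not rewind on a partial read: if the data ends in the middle of a vertex, the reader has already consumed the floats it did manage to read.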
jcotton42•14mo ago
I've done the same for reading Minecraft's network protocol, though I made them extension methods on SequenceReader. The use of ref for the reader (or ref this if it's an extension method) is really important.
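To illustrate why the `ref` matters: `SequenceReader<byte>` is a struct, so without `ref this` the extension method would work on a copy and the caller's position would never move. The sketch below is not the Minecraft code; it is a hypothetical extension (`TryReadLengthPrefixedString`, in a made-up `FormatReaderExtensions` class) for the int32-length-prefixed names from the original question, assuming little-endian lengths and UTF-8 text.

```csharp
using System;
using System.Buffers;
using System.Text;

public static class FormatReaderExtensions
{
    // 'ref this' makes the method advance the caller's reader rather than a copy.
    public static bool TryReadLengthPrefixedString(ref this SequenceReader<byte> reader, out string value)
    {
        value = string.Empty;
        SequenceReader<byte> start = reader; // copy of the state, used to rewind on failure

        if (!reader.TryReadLittleEndian(out int length) || length < 0 || reader.Remaining < length)
        {
            reader = start; // leave the caller's reader where it was
            return false;
        }

        // Slice out exactly 'length' bytes from the current position and decode them.
        ReadOnlySequence<byte> bytes = reader.Sequence.Slice(reader.Position, length);
        value = Encoding.UTF8.GetString(bytes);
        reader.Advance(length);
        return true;
    }
}
```

Called as `reader.TryReadLengthPrefixedString(out string name)` on a local `SequenceReader<byte>`, the local's position ends up just past the string.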
