C#
Abdesol (15mo ago)

How to dump and save objects to a file to load them later?

So, I was trying to use the Razorvine.Pickle library for this. Pickling the objects works, but unpickling results in casting errors. Is there any alternative I can use to save objects to a file? Thank you!
25 Replies
Abdesol (15mo ago)
SharpSerializer didn't work either. I guess the object is unserializable. It is an object of the faissmask class, and that class depends on an external DLL called faiss.dll.
jcotton42 (15mo ago)
pickling :corroButGrimacing: You could always roll the serialization yourself: serialize what you need to reconstruct it.
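A minimal sketch of what "serialize what you need to reconstruct it" could look like, assuming the only state worth saving is the raw vectors. `VectorStore`, `Save`, and `Load` are hypothetical names made up for illustration, and nothing here touches the faiss wrapper itself; the index would still have to be rebuilt from the loaded vectors.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of any faiss wrapper): it persists only the
// raw vectors, i.e. the data needed to rebuild the index later.
public static class VectorStore
{
    public static void Save(string path, float[][] vectors)
    {
        using var writer = new BinaryWriter(File.Create(path));
        int dim = vectors.Length == 0 ? 0 : vectors[0].Length;
        writer.Write(vectors.Length); // header: number of vectors
        writer.Write(dim);            // header: dimensions per vector
        foreach (var vector in vectors)
        {
            if (vector.Length != dim)
                throw new ArgumentException("All vectors must have the same length.");
            foreach (var value in vector)
                writer.Write(value);  // each float is written as 4 bytes
        }
    }

    public static float[][] Load(string path)
    {
        using var reader = new BinaryReader(File.OpenRead(path));
        int count = reader.ReadInt32();
        int dim = reader.ReadInt32();
        var vectors = new float[count][];
        for (int i = 0; i < count; i++)
        {
            vectors[i] = new float[dim];
            for (int j = 0; j < dim; j++)
                vectors[i][j] = reader.ReadSingle();
        }
        return vectors;
    }
}
```

This only avoids re-parsing or re-computing the vectors; any time spent training or populating the index after loading is still paid on every start.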
Abdesol (15mo ago)
The object I am going to serialize is very dense. It is basically an in-memory vector database with 1K+ dimensions.
JakenVeina (15mo ago)
So, it'll be a big file. Ever heard of databases? Storing lots of data in files is not a significantly difficult problem.
Abdesol (15mo ago)
The vector data can be stored in a file, but we need to load it into the index every time we want to run it, and that takes a lot of time. So what we want is to save the state of the index object as-is to a file and load it later, like we do in Python with the pickle library. I know this is easy in Python because it is a dynamic language, but saving some other objects with the C# equivalent of Python's pickle was working.
Monsieur Wholesome
https://docs.python.org/3/library/pickle.html
pickle — Python object serialization The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy. Warning The pickle module is not secure. Only unpickle data you trust
Monsieur Wholesome
Do I dare type out the C# equivalent? I won't; it's hugely frowned upon :DFrido_Derp_1:
Abdesol (15mo ago)
I really don't have a problem with the warning above for my use case.
jcotton42 (15mo ago)
My understanding is that pickle is equivalent to BinaryFormatter, which comes with about 30 warning stickers.
JakenVeina (15mo ago)
I mean, serialization is serialization, whatever format you wanna use. If you need something as optimal as possible due to space concerns, yeah, something binary is probably your best bet. protobuf is probably a better bet than the old-fashioned BinaryFormatter. Regardless, like, serialization isn't THAT complicated of a thing. What's the actual problem here?
Abdesol (15mo ago)
Serialization is possible, but this class has no constructor, so the serialization libs throw an error. Even if it had one and serialization worked, it needs some time for loading and training the index, I mean the object.
JakenVeina (15mo ago)
I mean, I don't see how that problem is related to serialization. The timing thing, that is.
Abdesol (15mo ago)
Basically, what we are looking for is to adopt the way Python's ML libs like sklearn, tensorflow, etc. do it. They save the model after it is trained, and the model is basically an object of some class. We want to do the same thing in C#.
JakenVeina (15mo ago)
If you need initialization time, then... you need initialization time, and you need to go implement that. So, like, you wanna take models saved out of Python and pull them into .NET?
Abdesol (15mo ago)
No, we want to create the object/model in .NET and save it with .NET.
JakenVeina (15mo ago)
Okay, so you have freedom over the data.
Abdesol (15mo ago)
Basically, the faiss framework is one of the fastest frameworks for vector similarity checks, sorting, etc.
JakenVeina (15mo ago)
If you're trying to use a serializer lib that requires a constructor, make one. Or pick a different one and model it to fit that.
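One way to read "make one" is to copy the state into a plain class of your own that the serializer can construct, rather than serializing the library object directly. The sketch below assumes that approach and uses System.Text.Json from the BCL; `IndexSnapshot` and `SnapshotFile` are hypothetical names, and the properties are guesses at what would need saving.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical DTO: a plain class with a public parameterless constructor
// and settable properties, so a serializer can create and populate it.
public class IndexSnapshot
{
    public IndexSnapshot() { }

    public int Dimensions { get; set; }
    public float[][] Vectors { get; set; } = Array.Empty<float[]>();
}

public static class SnapshotFile
{
    // Writes the snapshot to disk as JSON text.
    public static void Save(string path, IndexSnapshot snapshot) =>
        File.WriteAllText(path, JsonSerializer.Serialize(snapshot));

    // Reads the snapshot back; a file containing JSON "null" is rejected.
    public static IndexSnapshot Load(string path) =>
        JsonSerializer.Deserialize<IndexSnapshot>(File.ReadAllText(path))
        ?? throw new InvalidDataException("Snapshot file did not contain an object.");
}
```

JSON is used here only because it needs no extra packages. For a large vector set, a binary format such as the protobuf mentioned earlier in the thread would likely give smaller files.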
Abdesol (15mo ago)
Yeah, I am thinking of that.
JakenVeina (15mo ago)
It's not like there's a lot of variability in the different libs. They all take roughly the same approach to serialization. It's not that crazy of a topic.
Abdesol (15mo ago)
Yeah, basically the decision has been made not to dump the object to binary, and to just use it directly at runtime.
JakenVeina (15mo ago)
Uhh, what?
Abdesol (15mo ago)
But we were still looking for a solution, if there is one. I meant that the idea of saving the object is just shelved for now.
JakenVeina (15mo ago)
Okay. Well, yes, there's a solution. There's lots of solutions. Serialization is baked into the BCL, and there's a WIDE variety of serialization libraries as well. If this is an ML model, yeah, it's probably a decently large chunk of data, but... data is data. Model it and feed it through a serializer.
Abdesol (15mo ago)
Oh ok, thanks for the info @ReactiveVeina. Anyway, I appreciate the help. Let me close this thread for now 👍