dracule_redrose
ATApache TinkerPop
•Created by dracule_redrose on 3/22/2024 in #questions
Serialization Issue
I have a weird error, when I am connecting with JanusGraph gremlin client using
conf/remote-graph-binary.yaml
I am able to get results. But when I am trying to use my java application I am getting, java.io.IOException: Serializer for custom type 'janusgraph.RelationIdentifier' not found
. Googling around I got that this is due to serialization issue. It looks to me that the gremlin-client and my java application has similar configs but gremlin-client is not having any serialization problem.
Code setting up the serialization:
5 replies
ATApache TinkerPop
•Created by dracule_redrose on 3/19/2024 in #questions
Design decision related to multiple heterogenous relational graphs
I'm working with over 100k instances of heterogeneous, relational node-and-edge attributed graphs, each graph having around 5k vertices and 10k edges. Vertices are of 3 types with 10 attributes (7 numerical, 3 string), and edges are of 5 types with 8 attributes (4 numerical, 4 string). Considering the complexity and size of the data, running queries like traversal paths, average clustering coefficients, and identifying nodes in clustering triangles across all these instances presents a significant challenge.
I've been using a naive gremlin-server setup with an in-memory database to run my queries on one graph instance, but it's becoming clear that this approach isn't sustainable for multi-graph persistence or memory efficiency, as a single graph instance consumes about 1.2 GB of RAM. I'm exploring the possibility of switching to JanusGraph with a Berkeley DB backend to support persistent storage of multiple graphs (based on the feedback I got from the gremlin google group, https://groups.google.com/g/gremlin-users/c/UotOZFVvi3k/m/-hVd2oNNAQAJ).
Given the data structure and requirements, especially the need for efficient loading and querying of individual graph instances in a possibly serializable fashion, do you think JanusGraph with Berkeley DB is a viable solution, or are there alternative approaches I should consider for managing and querying this volume of graph data effectively?
I tried finding similar question, the closest matching question i found was https://discord.com/channels/838910279550238720/1087383361129037845, but was asking how to manage multiple graphs in gremlin-server.
9 replies