dracule_redrose
Apache TinkerPop
Created by dracule_redrose on 3/22/2024 in #questions
Serialization Issue
I have a weird error. When I connect with the JanusGraph Gremlin client using conf/remote-graph-binary.yaml, I get results. But when I use my Java application, I get: java.io.IOException: Serializer for custom type 'janusgraph.RelationIdentifier' not found. From searching around, this appears to be a serialization issue. As far as I can tell, the Gremlin client and my Java application have similar configs, yet only the Java application hits the serialization problem.
hosts: [localhost]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.util.ser.GraphBinaryMessageSerializerV1, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
Code setting up the serialization:
import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.structure.io.binary.TypeSerializerRegistry;
import org.apache.tinkerpop.gremlin.util.ser.GraphBinaryMessageSerializerV1;
import org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry;
...
// Register the JanusGraph custom types (e.g. RelationIdentifier) with GraphBinary
TypeSerializerRegistry typeSerializerRegistry = TypeSerializerRegistry.build()
    .addRegistry(JanusGraphIoRegistry.getInstance())
    .create();

// Build cluster and connect client
Cluster cluster = Cluster.build(host)
    .port(port)
    .serializer(new GraphBinaryMessageSerializerV1(typeSerializerRegistry))
    .maxConnectionPoolSize(1)
    .minConnectionPoolSize(1)
    .maxInProcessPerConnection(1)
    .minSimultaneousUsagePerConnection(1)
    .maxSimultaneousUsagePerConnection(1)
    .create();
Client client = cluster.connect();
...
5 replies
Apache TinkerPop
Created by dracule_redrose on 3/19/2024 in #questions
Design decision related to multiple heterogeneous relational graphs
I'm working with over 100k instances of heterogeneous, relational graphs with attributed nodes and edges, each graph having around 5k vertices and 10k edges. Vertices are of 3 types with 10 attributes (7 numerical, 3 string), and edges are of 5 types with 8 attributes (4 numerical, 4 string). Given the complexity and size of the data, running queries such as traversal paths, average clustering coefficients, and identifying nodes in clustering triangles across all these instances is a significant challenge.

I've been using a naive gremlin-server setup with an in-memory database to run my queries on one graph instance, but it's becoming clear that this approach isn't sustainable for multi-graph persistence or memory efficiency, as a single graph instance consumes about 1.2 GB of RAM. I'm exploring the possibility of switching to JanusGraph with a Berkeley DB backend to support persistent storage of multiple graphs (based on the feedback I got from the gremlin google group, https://groups.google.com/g/gremlin-users/c/UotOZFVvi3k/m/-hVd2oNNAQAJ).

Given the data structure and requirements, especially the need for efficient loading and querying of individual graph instances, possibly in a serializable fashion, do you think JanusGraph with Berkeley DB is a viable solution, or are there alternative approaches I should consider for managing and querying this volume of graph data effectively? I looked for similar questions; the closest match I found was https://discord.com/channels/838910279550238720/1087383361129037845, but it was asking how to manage multiple graphs in gremlin-server.
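To make the per-graph metrics concrete, here is a minimal sketch in plain Java (no Gremlin or JanusGraph involved) of counting the triangles through a vertex and averaging the local clustering coefficient for one graph instance. The class name ClusteringSketch, its method names, and the tiny edge list are all hypothetical and only for illustration. It assumes an undirected simple graph and ignores vertex/edge types and attributes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of TinkerPop or JanusGraph.
public class ClusteringSketch {

    // Build an undirected adjacency map from {u, v} pairs; self-loops are ignored.
    public static Map<Integer, Set<Integer>> fromEdges(int[][] edges) {
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new HashSet<>());
            adj.computeIfAbsent(e[1], k -> new HashSet<>());
            if (e[0] != e[1]) {
                adj.get(e[0]).add(e[1]);
                adj.get(e[1]).add(e[0]);
            }
        }
        return adj;
    }

    // Number of triangles passing through v: neighbor pairs that are themselves adjacent.
    public static int trianglesAt(Map<Integer, Set<Integer>> adj, int v) {
        Set<Integer> nbrs = adj.getOrDefault(v, Set.of());
        int count = 0;
        for (int a : nbrs) {
            for (int b : nbrs) {
                // a < b counts each unordered neighbor pair once
                if (a < b && adj.get(a).contains(b)) {
                    count++;
                }
            }
        }
        return count;
    }

    // Local clustering coefficient: triangles at v divided by the number of neighbor pairs.
    public static double localClustering(Map<Integer, Set<Integer>> adj, int v) {
        int k = adj.getOrDefault(v, Set.of()).size();
        if (k < 2) {
            return 0.0; // fewer than two neighbors: no pair can close a triangle
        }
        return 2.0 * trianglesAt(adj, v) / (k * (k - 1.0));
    }

    // Mean of the local coefficients over all vertices of one graph instance.
    public static double averageClustering(Map<Integer, Set<Integer>> adj) {
        if (adj.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (int v : adj.keySet()) {
            sum += localClustering(adj, v);
        }
        return sum / adj.size();
    }

    public static void main(String[] args) {
        // Toy graph: triangle 1-2-3 plus a pendant vertex 4 attached to 3.
        int[][] edges = {{1, 2}, {2, 3}, {1, 3}, {3, 4}};
        Map<Integer, Set<Integer>> adj = fromEdges(edges);
        System.out.println("triangles at 3: " + trianglesAt(adj, 3));
        System.out.println("average clustering: " + averageClustering(adj));
    }
}
```

Each metric needs only one graph's adjacency at a time, so under these assumptions the instances could be loaded, measured, and released one after another rather than all held in memory together.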
9 replies