JanusGraph

Join the community to ask questions about JanusGraph and get answers from other members.

Janusgraph-Cassandra Storage Backend issue and table clarification

Hello! 1) I would like to know how JanusGraph stores data in Cassandra as the storage backend: which algorithm or class (Java file) implements this process? 2) How is the data actually stored in the edgestore and graphindex tables in Cassandra? I am not able to see how the values are stored, so an explanation of the concept behind it would be helpful. 3) The data is stored as blobs; is there any way to retrieve it quickly through Java connectivity when a bulk data set (in lakhs) is stored in the Cassandra-JanusGraph database?...

Graph does not support the provided graph computer: SparkGraphComputer

Hi, I'm trying to instantiate a JanusGraph server that correctly exports a graph.traversal().withComputer(SparkGraphComputer). As I was unable to find specific documentation on this, I have tried following this: https://stackoverflow.com/questions/45323666/unable-to-use-sparkgraphcomputer-with-tinkerpop-3-2-3-and-janusgraph-0-1-1-in-re ...
Solution:
OK, finally I was able to figure out the issue on my own. To make things work, since JanusGraphFactory fails to create a graph, I replaced
graphManager: org.janusgraph.graphdb.management.JanusGraphManager
with...

Embedded graph with Scylla backend in Java

I'm trying to create an embedded graph in Java (using 11) connected to a Scylla cluster. I stood up the Scylla cluster with Docker using the Scylla University instructions linked to in the JanusGraph docs (https://university.scylladb.com/courses/the-mutant-monitoring-system-training-course/lessons/a-graph-data-system-powered-by-scylladb-and-janusgraph/). The cluster and the JanusGraph server are running just fine. I can connect to the server from a Gremlin console and run scripts without issue. However, when I try to create an embedded graph in Java, I get errors indicating the connection cannot be made. To create the graph, I'm using JanusGraphFactory.build().set("storage.backend","cql").set("storage.hostname","172.18.0.3").open(). The hostname IP is the IP of one of the Scylla nodes. The primary exception seems to be AllNodesFailedException: Could not reach any contact point, make sure you've provided a valid address....
Solution:
Update: I figured out what I was doing wrong. I needed to bind the port in the docker run command. Adding -p 9042:9042 to the docker command worked.
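
The fix above comes down to the CQL port not being reachable from the host running the embedded graph. A minimal sketch of a plain TCP reachability check that could be run before opening the graph is shown here. It uses only the JDK; the class name `CqlPortCheck`, the method `isReachable`, and the 2-second timeout are hypothetical choices for illustration, and port 9042 is taken from the `-p 9042:9042` mapping in the solution.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of JanusGraph or the CQL driver.
final class CqlPortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts, and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to localhost, which is where a published Docker port would appear.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = 9042; // CQL port from the docker command in the solution
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
    }
}
```

If this check fails for the address passed to `storage.hostname`, the problem is likely at the network or Docker level rather than in the JanusGraph configuration. A successful TCP connection only shows that something is listening; it does not prove that the listener speaks CQL.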

Wrote claim for id block threshold exception

I am using custom string vertex IDs and recently found that the exception below occurs intermittently:
Wrote claim for id block [9280001, 9360001] in PT1.2S => too slow, threshold is: PT0.3S
I am using the default configuration for all "ids" options. ...
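
The two values in that message are ISO-8601 durations: the claim took 1.2 seconds against a threshold of 0.3 seconds. A small sketch of how they compare, using only `java.time.Duration`, is shown here. The class and method names are hypothetical and this is not JanusGraph's own code. The 0.3 s threshold looks like it matches the default of the `ids.authority.wait-time` option, but that is an assumption worth checking against the configuration reference.

```java
import java.time.Duration;

// Hypothetical illustration of the comparison reported in the log message.
final class IdBlockClaimTiming {

    // True if the elapsed ISO-8601 duration is strictly longer than the threshold.
    static boolean exceedsThreshold(String elapsedIso, String thresholdIso) {
        Duration elapsed = Duration.parse(elapsedIso);
        Duration threshold = Duration.parse(thresholdIso);
        return elapsed.compareTo(threshold) > 0;
    }

    public static void main(String[] args) {
        // Values copied from the message in the post.
        String elapsed = "PT1.2S";
        String threshold = "PT0.3S";
        System.out.println("elapsed ms:   " + Duration.parse(elapsed).toMillis());
        System.out.println("threshold ms: " + Duration.parse(threshold).toMillis());
        System.out.println("too slow:     " + exceedsThreshold(elapsed, threshold));
    }
}
```

Read this way, the message says the storage backend took about four times longer than allowed to acknowledge the ID block claim, which points at backend write latency rather than at the custom string IDs themselves.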

Running OLAP queries on Janusgraph outside the Gremlin Console (from Java and G.V())

Hi, I'm able to run OLAP queries against my graph DB from the Gremlin Console, by following the directions provided here: https://docs.janusgraph.org/advanced-topics/hadoop/ However, I would like to also run OLAP queries without using the console, from an embedded Janusgraph Java application as well as from G.V(). In G.V(), I tried this while selecting Groovy Mode for query submission:...
Solution:
Thanks, @gdotv. It would be great if the JanusGraph folks can follow up on how to expose a GraphTraversalSource. In the meantime, I've been able to make progress on the question about using Java, by following this old-ish post by @Bo : https://li-boxuan.medium.com/spark-on-janusgraph-tinkerpop-a-pagerank-example-43950189b159 ...

Support for saving arrays of vectors

Hi, I know that JanusGraph by default does not support reading or writing arrays or vectors of floating-point numbers. Is there a reason why?

Why queries are slow when more than one mixed index in query

ConfiguredGraphFactory.open("tenant51").traversal().V().has('_t', 'infra:container').has('_it', gt(123)).limit(100) is taking 400ms while ConfiguredGraphFactory.open("tenant51").traversal().V().has('_it', gt(123)).limit(100) or ConfiguredGraphFactory.open("tenant51").traversal().V().has('t', 'infra:container').limit(100) ...

Index Creation Help

I need some help understanding the difference between Graph Index, Composite Index, and Vertex-Centric Index, and how to create them. I am currently working in Python, but I am unsure how to use these three types of indices to speed up my queries. Any suggestions or explanations would be greatly appreciated. Looking at this page from a previous thread that I had created here (https://docs.janusgraph.org/schema/index-management/index-performance/#composite-index), I wasn't sure where to execute some of the commands listed on the website. Is this going to work with Python?...

Transaction Recovery not working as expected

Hi everyone. I'm following the steps in the Transaction Failure section of the "Failure & Recovery" page in the JanusGraph docs for handling cases where persistence to indexing backends fails. I've enabled the tx write-ahead log and have tried setting up the recovery process. Based on the logs, it seems that the process is initializing properly, but I don't see much after that, and judging by queries that use this index, I don't believe indexing is being retried. I also see that getStatistics returns 0,0. I'm wondering if anyone has any insight into what's going on here or what I might be missing? We're using Cassandra as our backend storage and Lucene for indexing at the moment. Thank you!...

Mapping.STRING not working as expected?

Hi everyone! Based on the Janusgraph text search documentation: ``` When a string mapping is configured, the string value is indexed and can be queried "as-is" - including stop words and non-letter characters...

Problem with Custom Long IDs in JanusGraph

Hi, I'm experiencing issues with using custom Long IDs in JanusGraph. Although I'm aware of the limitations concerning signed long integers, I'm still facing problems with the range of IDs that can be utilized. Currently, I'm operating with two Cassandra machines as backend storage and a single JanusGraph machine. My database needs to handle at least 3 billion vertex nodes with unique IDs, and I'm trying to determine if this is feasible. I've experimented with various configuration settings, such as cluster.max-partitions and ids.authority.conflict-avoidance-bitwidth, attempting to expand the number of bits available for IDs, but without success....
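
As a back-of-the-envelope check on the 3 billion figure, the sketch below computes how many bits are needed to give that many vertices distinct IDs. It is plain arithmetic with the JDK only; the class and method names are hypothetical, and it deliberately does not model how JanusGraph splits a long ID into partition, type, or count bits.

```java
// Hypothetical illustration: minimum bit width for a given number of distinct IDs.
final class IdBitBudget {

    // Smallest b such that 2^b >= distinctIds (0 bits suffice for a single ID).
    static int bitsNeeded(long distinctIds) {
        if (distinctIds < 1) {
            throw new IllegalArgumentException("distinctIds must be positive");
        }
        return 64 - Long.numberOfLeadingZeros(distinctIds - 1);
    }

    public static void main(String[] args) {
        long vertices = 3_000_000_000L; // requirement stated in the post
        int needed = bitsNeeded(vertices);
        int positiveLongBits = 63;      // a non-negative signed long has 63 usable bits
        System.out.println("bits needed for " + vertices + " IDs: " + needed);
        System.out.println("bits left in a positive long: " + (positiveLongBits - needed));
    }
}
```

Since 2^31 is about 2.15 billion and 2^32 is about 4.29 billion, 3 billion distinct IDs need 32 bits, which leaves 31 of the 63 non-negative bits of a signed long for anything else the ID scheme reserves. Whether that remainder is enough depends on the bit widths configured for the graph.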

Gremlin Console in v1.0.0 and v1.1.0

A weird behaviour i am facing in v1.0.0 and v1.1.0 v1.0.0 ``` gremlin> :remote connect tinkerpop.server conf/remote.yaml session...

Query for JANUSGRAPH_RELATION_DELIMITER

I have a use case in which I need to create multiple instances of JanusGraph in a single service, and these instances use different values of JANUSGRAPH_RELATION_DELIMITER. I have gone through the source code and found the class RelationIdentifier.java, where I can see that the property JANUSGRAPH_RELATION_DELIMITER is read from the environment variable and not from the configurationBuilder. Is this the only way to provide the delimiter variable? If not, can you provide a workaround for this? ...

Custom Vertex ID and coalesce

I am trying to find a vertex by ID: if it exists, update a property; otherwise, create a vertex with that ID. `ConfiguredGraphFactory.open("tenant51").traversal(). V().hasId('45gjttOlN2+udTmQcJnHpp').fold() .coalesce(...

Can `CqlInputFormat` do predicate pushdowns/query based prefilters?

Hi! First of all, thank you all for your work on JanusGraph. In my use case, I have a medium-large graph, ~3TB currently, might be 1-2 orders of magnitude bigger later. The data in it is generally clustered in a time-based fashion, e.g. newer vertices are mostly connected to other newer vertices (a timestamp is stored as a vertex property). I am writing an OLAP pipeline with Spark where JanusGraph, backed by Cassandra, is the source, and I use Tinkerpop's hadoop-gremlin to build vertex programs and run OLAP gremlin queries. Per my understanding, in this setup the only point of contact with JanusGraph is through the CqlInputFormat and the server itself is not involved at all. Is that correct?...

Performance issues with bulk loading

We have a JanusGraph cluster with Cassandra as the storage backend. Our cluster is deployed into an AWS EKS cluster. We have 32 JanusGraph pods with 2 cores and 16 GB and 9 Cassandra pods with 8 cores and 36 GB. We're using AWS Lambda and the gremlin python library to load data in parallel, with 20 concurrent Lambda invocations. We're loading anywhere between 1 million and 20 million nodes (and at least that many edges if not more, hard to predict) in each run. Each Lambda invocation adds 500 nodes and associated relationships. The time it takes to load 500 nodes goes up as the load progresses, especially when the number of nodes is closer to 20 million. An AWS Lambda can run for a maximum of 15 minutes. Several AWS Lambda invocations time out, especially towards the end of the load. What can we do to improve the performance of the data loading? We have to load about 1 billion nodes and associated relationships. Our JanusGraph and Cassandra config is mostly default. Thanks....
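
To put the numbers in that post in perspective, here is a rough arithmetic sketch of how many Lambda invocations the stated batch size and concurrency imply for 1 billion nodes. The class and method names are hypothetical, and the per-batch duration in `main` is a made-up placeholder rather than a measurement.

```java
// Hypothetical illustration: invocation and wave counts for a batched parallel load.
final class BulkLoadEstimate {

    // Integer division rounded up, for positive operands.
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    // Number of invocations needed when each one loads batchSize nodes.
    static long invocationsNeeded(long totalNodes, long batchSize) {
        return ceilDiv(totalNodes, batchSize);
    }

    // Number of sequential "waves" when `concurrency` invocations run at once.
    static long wavesNeeded(long totalNodes, long batchSize, long concurrency) {
        return ceilDiv(invocationsNeeded(totalNodes, batchSize), concurrency);
    }

    public static void main(String[] args) {
        long totalNodes = 1_000_000_000L; // target from the post
        long batchSize = 500;             // nodes per Lambda invocation
        long concurrency = 20;            // concurrent invocations
        long assumedSecondsPerBatch = 1;  // placeholder assumption, not measured

        long waves = wavesNeeded(totalNodes, batchSize, concurrency);
        System.out.println("invocations: " + invocationsNeeded(totalNodes, batchSize));
        System.out.println("waves:       " + waves);
        System.out.println("hours at " + assumedSecondsPerBatch + " s/batch: "
                + (waves * assumedSecondsPerBatch) / 3600.0);
    }
}
```

With these inputs the load needs 2,000,000 invocations in 100,000 sequential waves, so the per-batch latency multiplies directly into total wall-clock time. That makes batch size and concurrency the first parameters to revisit alongside any JanusGraph or Cassandra tuning.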

When using janusgraph text predicates for fuzzy search, is it possible to control the fuzziness?

I am using text_contains_fuzzy method from janusgraph python library but I can't tell from the types whether the function accepts anything apart from the string value

Migration options for custom vertex id.

We are planning to use custom vertex IDs. I have a huge data set that has already been ingested with default vertex IDs, and now I want to use custom vertex IDs. Any suggestions on how I can migrate from default vertex IDs to custom vertex IDs?

Best practices recommendation for Kotlin?

What, if any, are best practices recommendations when using the Java API from a Kotlin coroutine environment? Are there plans to support an asynchronous Java client in the future?

Edge cardinality SIMPLE documentation not clear

I am using SIMPLE edge cardinality in my graph. According to the documentation: SIMPLE: Allows at most one edge of such label between any pair of vertices. In other words, the graph is a simple graph with respect to the label. Ensures that edges are unique for a given label and pairs of vertices. My question is: is direction considered here? That is, can I have both an edge A->B and an edge B->A?...