JanusGraph

Join the community to ask questions about JanusGraph and get answers from other members.

Can `CqlInputFormat` do predicate pushdowns/query based prefilters?

Hi! First of all, thank you all for your work on JanusGraph. In my use case, I have a medium-large graph, ~3TB currently, which might be 1-2 orders of magnitude bigger later. The data in it is generally clustered in a time-based fashion, e.g. newer vertices are mostly connected to other newer vertices (a timestamp is stored as a vertex property). I am writing an OLAP pipeline with Spark where JanusGraph, backed by Cassandra, is the source, and I use TinkerPop's hadoop-gremlin to build vertex programs and run OLAP Gremlin queries. Per my understanding, in this setup the only point of contact with JanusGraph is through the CqlInputFormat and the server itself is not involved at all. Is that correct?...

Performance issues with bulk loading

We have a JanusGraph cluster with Cassandra as the storage backend. Our cluster is deployed into an AWS EKS cluster. We have 32 JanusGraph pods with 2 cores and 16 GB and 9 Cassandra pods with 8 cores and 36 GB. We're using AWS Lambda and the gremlin python library to load data in parallel, with 20 concurrent Lambda invocations. We're loading anywhere between 1 million and 20 million nodes (and at least that many edges if not more, hard to predict) in each run. Each Lambda invocation adds 500 nodes and associated relationships. The time it takes to load 500 nodes goes up as the load progresses, especially when the number of nodes is closer to 20 million. An AWS Lambda can run for a maximum of 15 minutes. Several AWS Lambda invocations time out, especially towards the end of the load. What can we do to improve the performance of the data loading? We have to load about 1 billion nodes and associated relationships. Our JanusGraph and Cassandra config is mostly default. Thanks....
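To put the numbers in this post side by side, here is a rough back-of-the-envelope sketch. It only uses figures stated above (1 billion nodes, 500 nodes per invocation, 20 concurrent invocations, 15-minute Lambda limit); the function name `load_plan` is made up for illustration, and the "worst case" assumes every invocation runs right up to the timeout, which is an assumption and not a measurement.

```python
import math

# Figures taken from the post above.
TOTAL_NODES = 1_000_000_000
NODES_PER_INVOCATION = 500
CONCURRENT_INVOCATIONS = 20
LAMBDA_LIMIT_MINUTES = 15


def load_plan(total_nodes, batch_size, concurrency, limit_minutes):
    """Estimate invocation count, sequential waves, and a worst-case duration.

    Hypothetical helper: the worst case assumes each wave takes the full
    Lambda time limit, which is an upper bound, not an observed value.
    """
    invocations = math.ceil(total_nodes / batch_size)
    waves = math.ceil(invocations / concurrency)
    worst_case_hours = waves * limit_minutes / 60
    return invocations, waves, worst_case_hours


invocations, waves, worst_case_hours = load_plan(
    TOTAL_NODES, NODES_PER_INVOCATION, CONCURRENT_INVOCATIONS, LAMBDA_LIMIT_MINUTES
)
print(f"invocations needed: {invocations}")
print(f"sequential waves at current concurrency: {waves}")
print(f"worst case if every wave hits the limit: {worst_case_hours} hours")
```

With these inputs the load needs 2,000,000 invocations in 100,000 waves, so the time per 500-node batch matters far more than the timeout itself.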

When using janusgraph text predicates for fuzzy search, is it possible to control the fuzziness?

I am using the text_contains_fuzzy method from the janusgraph python library, but I can't tell from the types whether the function accepts anything apart from the string value.

Migration options for custom vertex id.

We are planning to use custom vertex ids. I have a huge data set that is already ingested with default vertex ids, and now I want to use custom vertex ids. Any suggestions on how I can migrate from default vertex ids to custom vertex ids?

Best practices recommendation for Kotlin?

What, if any, are best practices recommendations when using the Java API from a Kotlin coroutine environment? Are there plans to support an asynchronous Java client in the future?

Edge cardinality SIMPLE documentation not clear

I am using edge cardinality SIMPLE in my graph, and according to the documentation: SIMPLE: Allows at most one edge of such label between any pair of vertices. In other words, the graph is a simple graph with respect to the label. Ensures that edges are unique for a given label and pairs of vertices. My question is, is direction considered here? That is, can I have an edge A->B and B->A?...

ConfigurationManagementGraph fails with Must provide vertex id

I am new to JanusGraph. I am trying to create dynamic graphs, but I couldn't create a configuration template. Am I missing anything? Thanks. My config: ` graphManager: org.janusgraph.graphdb.management.JanusGraphManager graphs: {...

JanusGraph Bigtable rows exceed the 256MiB limit when exported via Dataflow in Parquet format

Hi team, we are currently using JanusGraph with Bigtable as the storage backend, and we want to export the data out of Bigtable to cloud storage using Dataflow in Parquet format. But the process failed because some of the rows are too large and exceed the limit, with the following error messages: See attachment...

custom vertex id (String) feature to avoid duplicate vertex

I have a graph (backed by Cassandra, with read/write consistency QUORUM) which has the vertex properties "recordId" and "type". I have disabled consistency locking on the property "recordId" in my graph, and I see duplicate vertices being created for the same "recordId" due to concurrent writes. Note: in the above case we were not providing any custom vertex id, but relying on JanusGraph to generate the vertex id. ...

Need some advice on using Edge Indexes efficiently.

I have a use case where I have to find a source vertex and then, from that source vertex, find the edges that match certain filters. To find the source vertex, I can use a graph index (mixed index)....

Decoding graphindex values for debugging purpose

I need help decoding graphindex table values. I see that the table has three columns, "key, column1, value", with PRIMARY KEY(key, column1). Why is column1 kept as part of the key? Is there any documentation to understand this design? ...

Unable to use text predicates like 'textContains' in gremlin python

I know that it's already mentioned in the JanusGraph documentation that these predicates are not supported by gremlin python because they aren't supported by TinkerGraph either. I saw in this thread: https://discord.com/channels/981533699378135051/1176188416078139503 that we should write our own serializer. Is that the only way to get it working, or is there any alternative...
Solution:
To use JanusGraph-specific predicates, you would need to swap gremlin-python with janusgraph-python I believe: https://github.com/JanusGraph/janusgraph-python#text-predicates Can you give this a try?...
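A minimal sketch of what that swap might look like, assuming the `janusgraph-python` package is installed and a JanusGraph server is reachable over WebSocket. The helper name `find_vertices_by_text` and its parameters are made up for illustration; the imports follow the layout described in the linked README (`JanusGraphSONSerializersV3d0` and `Text`), so check them against the version you install.

```python
def find_vertices_by_text(url, key, text):
    """Return vertices whose `key` property matches `text` via textContains.

    Hypothetical helper for illustration. Imports live inside the function
    so this file can be loaded even where the packages are not installed.
    """
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
    from gremlin_python.process.anonymous_traversal import traversal
    from janusgraph_python.driver.serializer import JanusGraphSONSerializersV3d0
    from janusgraph_python.process.traversal import Text

    # The JanusGraph serializer is what lets the JanusGraph-specific
    # predicates travel over the wire.
    connection = DriverRemoteConnection(
        url, "g", message_serializer=JanusGraphSONSerializersV3d0()
    )
    try:
        g = traversal().with_remote(connection)
        return g.V().has(key, Text.text_contains(text)).to_list()
    finally:
        connection.close()
```

Usage would be something like `find_vertices_by_text("ws://localhost:8182/gremlin", "name", "loves")`, where the URL and property name are placeholders for your own setup.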

Trying to connect to JanusGraph server/Cassandra with Kotlin client

I'm new to JanusGraph and attempting to connect to a remote server from a Kotlin client. I have created two Docker containers and everything looks good from the logs of JanusGraph:
```
docker run -d -p 7000:7000 -p 7001:7001 -p 7199:7199 -p 9042:9042 -p 9160:9160 -e CASSANDRA_START_RPC=true cassandra:4.0.6
```
...

Best way to migrate JanusGraph data into another JanusGraph instance

Hi team, we have a use case where we have to migrate data between two JanusGraph instances, and we are also planning to upgrade them. We are currently using 0.6.4 and want to move the data into a new JanusGraph instance running version 1.0.0. The data size is around 4-7TB. What is the best way to do this? Thank you in advance....

Unable to run two instances of JanusGraph using Scylla as the storage backend in Docker

I am trying to set up two different instances of JanusGraph with Scylla as the backend, but I am not able to get it working.

Permission denied with (almost) default docker-compose.yml

When starting the JanusGraph container I get an "Operation not permitted" error specifically for the janusgraph-default-data (named) volume (the directory for BerkeleyDB persistence). My docker-compose.yml file is as follows: ``` services: janusgraph: image: janusgraph/janusgraph:latest...

Metrics around cache usage

Embedded JanusGraph (Java): 0.6.3. Storage: Cassandra. Index: Elasticsearch. I enabled metrics and am collecting them using a custom reporter. Just wanted to know, are there any metrics for cache size and/or cache evictions? The only ones I see are misses and retrievals...

Need advice on setting up janusgraph as a microservice

I am trying to build a recommendation system using JanusGraph. I have already set up one JanusGraph instance with an index backend for just the products and their related entities. Now I need to add users to the mix as well, but I am unsure whether to introduce the User vertices into the product graph instance or to make a separate graph altogether. The reason for wanting a new instance for the users is that there is also a community feature, where creating personalities based on that user graph will become crucial....

What is the latest version of Cassandra supported by 1.1.0-SNAPSHOT?

I am trying to run the latest SNAPSHOT version of JanusGraph against a Cassandra backend. The Cassandra DB I have running in a container is v5. What is the latest supported version? I get the following error on the server startup:...

Syncing data between MongoDB and JanusGraph

My primary data source is MongoDB, where the collections already exist. I need a way to sync the data from MongoDB to JanusGraph. The manual way of doing this is using the MongoDB oplog, but I would like to know if there are any better options before I get into that....