JanusGraph

JanusGraph - Distributed, open source, massively scalable graph database.

Changing default ES index name prefix

Hi, is it possible to change the default prefix of Elasticsearch index names? Currently, when I define a mixed index, JanusGraph creates a corresponding index in ES named janusgraph_<index_name>. I would like to change the prefix for all indices to something like jgtest1_<index_name>. I tried following the documentation (https://docs.janusgraph.org/index-backend/elasticsearch/#janusgraph-indexx-and-indexxelasticsearch-options) and added this line to the JanusGraph server .properties file: index.search.elasticsearch.index-name = jgtest1. However, this does not seem to work. ...
Solution:
Your configuration is generally correct except for the index-name option. You added “elasticsearch” after “search”, which isn’t a valid configuration option. Instead, try “index.search.index-name = jgidxtest”. That should fix the issue you are having. Let me know otherwise.
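To catch this kind of misplaced key before starting the server, a small check along these lines might help. It is only a sketch: it assumes the properties file uses simple `key = value` lines, and the helper name `find_misplaced_index_name` is hypothetical, not part of JanusGraph.

```python
import re

# Matches keys like index.search.elasticsearch.index-name, where the
# "elasticsearch" sub-namespace was inserted before index-name by mistake.
_MISPLACED = re.compile(r"^index\.([^.]+)\.elasticsearch\.index-name$")


def find_misplaced_index_name(properties_text):
    """Return (wrong_key, suggested_key, value) tuples for misplaced keys."""
    findings = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments in the properties file.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key = key.strip()
        match = _MISPLACED.match(key)
        if match:
            # Suggest the key form given in the solution above.
            suggested = "index.{}.index-name".format(match.group(1))
            findings.append((key, suggested, value.strip()))
    return findings


example = (
    "index.search.backend = elasticsearch\n"
    "index.search.elasticsearch.index-name = jgtest1\n"
)
for wrong, right, value in find_misplaced_index_name(example):
    print("{} -> {} = {}".format(wrong, right, value))
```

The check only flags the specific mistake from this thread; it does not validate any other option.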

Migrating from Bigtable to Cassandra

Hello! I want to explore migrating our Janusgraph storage from Bigtable to Cassandra. One suggestion is checking Google's Dataflow export to Parquet, but I'm not sure if the underlying storage schema would be the same. I'm hoping to not resort to a GraphSON export, because we want to keep the vertex IDs, and still have some showstoppers when testing JG 1.0...

Drop the Janus Graph Schema

Hello!! Hope you are having a good day! I have corrupted the schema of my Janusgraph and now I am not able to access my graph. Is there a way to drop the schema without deleting the data....

Idempotent upsert, is that possible?

For our project, we need to be able to insert vertexes and edges at a very high pace using spark streaming. After many tests, we found an approach that seems very promising. In our context, we could have sporadic vertex collisions, so instead of checking for the existence of a vertex before inserting it, we decided to use a custom ID. As an id, we use a hash generated from the vertex property; the id somehow represents the vertex value, so if two vertexes have the same property values, they hav...
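The hashed-ID idea described in this post could look roughly like the sketch below. It is not the poster's actual code: the function name, the choice of SHA-256, the JSON canonicalisation, and the 32-character truncation are all assumptions made for illustration.

```python
import hashlib
import json


def vertex_id_from_properties(label, properties):
    """Derive a deterministic string ID from a vertex label and its properties."""
    # Sort keys so the same property values always serialize identically,
    # regardless of the order in which they were supplied.
    canonical = json.dumps(
        {"label": label, "properties": properties},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Truncation is an illustrative choice; shorter IDs raise collision risk.
    return digest[:32]


a = vertex_id_from_properties("person", {"name": "alice", "country": "IT"})
b = vertex_id_from_properties("person", {"country": "IT", "name": "alice"})
c = vertex_id_from_properties("person", {"name": "bob", "country": "IT"})
print(a == b, a == c)
```

Two vertices with equal property values map to the same ID, so a repeated insert targets the same vertex instead of creating a duplicate. How the database reacts to a write against an existing ID is a separate question that this sketch does not answer.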

Deserializing of Vertex ID with Custom String value

Hi, I've come across an issue with deserialization from users enabling custom vertex ID values and types as documented on https://docs.janusgraph.org/advanced-topics/custom-vertex-id/. Below is a sample GraphSON serialization error highlighting the issue, where a vertex has a custom string ID value (U933779): ```java.lang.IllegalArgumentException: Invalid id - each token expected to be a number...
Solution:
Hey, turns out it was a mismatch between driver and server version due to the user being on an older version of G.V(). I need to remember to make this the first thing I check 😅

TreeStep and MultiQuery support

On JanusGraph 1.0, a traversal like g.V().has(...).out(...).has(...).out(...).has(...) nicely leverages the MultiQuery optimisation and returns results in acceptable time. However, as soon as we add a tree() step, as in g.V().has(...).out(...).has(...).out(...).has(...).tree(), all MultiQuery optimisations are disabled and the traversal time increases drastically. Based on the following code, I think this applies to all Steps with PATH requirement (e.g. PathStep, TreeStep): https://github.com/JanusGraph/janusgraph/blob/v1.0/janusgraph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/optimize/JanusGraphTraversalUtil.java#L393 ...

Deleting static Vertex Labels from the database

Hello, hope you are having a good day! I have a vertexLabel in my schema which is sort of corrupted. Whenever I call isStatic on the vertexLabel, then it fails. ```...

Suggest an example for writing a Spring Boot based REST API using JanusGraph

My JSON documents look like the ones below. Please also suggest a relevant data model in JanusGraph (i.e. the choice of edges, vertices, and vertex properties). As shown in the JSON below, Lev1 is the parent of Lev2, Lev2 is the parent of Lev3, and so on...

Using custom vertex IDs for import/export

Hi All, according to some tests done a long time ago, exporting / importing data using io.graphson.read / write would not preserve vertex IDs. Will enabling custom vertex IDs allow us to preserve vertex IDs?...
Solution:
Prior to JanusGraph 1.0.0, it's fixed. Starting from JanusGraph 1.0.0, it's global offline

OLAP using Spark cluster taking much more time than expected

Hi All, we have set up a Spark cluster to run OLAP queries on JanusGraph with Bigtable as the storage backend. Details: ```Backend: Bigtable Vertices: ~4 Billion...

Reindexing using the Mgmt System

Hi all! We have an internal debate on how best to perform a reindex after adding a new index. On JanusGraph 0.6, which of these options is preferred, and why? ```...

Deleting duplicate connections from the schema?

Hi all! I have a Java app working with a Cassandra-based JG that checks the JG schema at startup and adds any missing elements via the JanusGraphManagement interface. Due to a bug that existed for a long time, the app created thousands of duplicate connections between the same node and edge labels via management.addConnection(). This has become a problem because these connections are cached in the StandardSchemaCache, which has unlimited size, and they have started taking up all the heap. I'm looking for a way to safely delete the duplicated connections from the schema without dropping the schema and without disrupting other instances of the app working with this graph. Does anyone have experience with anything similar? I'm currently exploring the internals of JanusGraphManagement and ways to use the tx.query() interface to remove the unwanted relations, but I'd really appreciate any tips and ideas for an easier/safer solution....

Accelerating the vertex upsert

We need to accelerate the ingestion rate; the scenario is pretty typical. We could have repeated vertexes with new relationships. So, at each vertex insertion, we should check if it's already been inserted. Is there any particular recipe to accelerate this step? I would assume that this check would cause contention for maintaining consistency. We are considering introducing an external memory-based cache where we can accumulate all the vertex IDs and check the cache before hitting the DB. Any ot...
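The external cache idea mentioned in this post can be sketched as a set of already-seen vertex IDs that is consulted before the database is touched. This is a single-process illustration only: the class name `SeenVertexCache` and the `insert_fn` callback are hypothetical, and it does not address several writers sharing one cache or the cache growing beyond memory.

```python
class SeenVertexCache:
    """In-memory set of vertex IDs that were already sent to the database."""

    def __init__(self):
        self._seen = set()

    def insert_if_new(self, vertex_id, insert_fn):
        """Call insert_fn(vertex_id) only for IDs not seen before.

        Returns True when an insert was issued, False when it was skipped.
        """
        if vertex_id in self._seen:
            return False
        insert_fn(vertex_id)
        # Record the ID only after the insert call returned without raising.
        self._seen.add(vertex_id)
        return True


written = []  # stands in for the database in this sketch
cache = SeenVertexCache()
results = [cache.insert_if_new(v, written.append) for v in ["a", "b", "a"]]
print(results, written)
```

The ID is recorded only after the insert callback returns, so a failed insert can be retried on the next occurrence of the same vertex.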

Janusgraph Tokenizer & Solr

I recently encountered what I believe is an incompatibility caused by the JanusGraph tokenizer being applied to queries before their submission to Solr. It appears this is done only for Solr and not for Elasticsearch, and only for one particular Solr predicate. Has anyone else bumped into this? Here's a link to my post on the mailing list that gives more detail and links to the code in question: https://lists.lfaidata.foundation/g/janusgraph-users/message/6760...

JanusGraph 1.0 full-text search predicate in python - broken

Hi All, with JanusGraph 0.6 and gremlin-python 3.5.4, I was able to use the following in Python to use JanusGraph full-text search predicate: ----- from gremlin_python.process.traversal import P...
Solution:
The problem here is probably that JanusGraph used to serialize its text predicates as if they were TinkerPop text predicates, just with a value corresponding to the value of the JanusGraph text predicate, e.g., TextP.textContains() was serialized as if it were P.textContains(). That was changed in version 0.6.0 of JanusGraph to let JanusGraph serialize its predicates with a JanusGraph specific type identifier, but the server kept a fallback mechanism so it could still deserialize predicates sent that way: https://docs.janusgraph.org/changelog/#serialization-of-janusgraph-predicates-has-changed This fallback mechanism was then removed in JanusGraph 1.0.0: https://docs.janusgraph.org/changelog/#remove-support-for-old-serialization-format-of-janusgraph-predicates ...

Unable to use next() in gremlin-python

Hi, has anyone tried to use gremlin-python with janusgraph 1.0.0? It seems that there is a bug that makes next() unusable. Here is an example of how to reproduce the issue: ...
Solution:
This is my config https://github.com/Citegraph/citegraph/blob/main/backend/src/main/resources/gremlin-server-cql.yaml. I just tested python driver and java driver and they both worked well.

Speeding up node adding to Janusgraph

Hi Everybody, I am using JanusGraph with Berkeley DB JE. In my use case I initially have to add nodes one by one. The number can be quite large in certain cases, i.e., 200k-500k. It currently takes quite some time compared to an in-memory test setup. I have tried the following to speed it up and got some improvement: .set("storage.berkeleyje.cache-percentage", CACHE_PERCENTAGE) .set("cache.db-cache",true) .set("cache.db-cache-size", DB_CACHE_SIZE)...

Splitting Backing ElasticSearch Index To Increase Primary Shards As JG Mixed Index Grows

Has anyone had to resize an Elasticsearch index that's backing a JanusGraph Mixed Index? Configuration-wise, it seems you're only able to give JanusGraph a single primary shard and replica count for when it creates an ES index. I'm projecting that a couple of Mixed Indices will eventually exceed what is reasonable for a single primary shard in the backing ES index (given the rule of thumb of 10-50GB or 200M documents). So as a configuration default it makes sense to leave it as 1 for the other Mixed Indices....
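The rule of thumb quoted in this post (10-50GB or 200M documents per primary shard) translates into a simple estimate. The sketch below takes the upper bounds as defaults; the function name and the sample numbers are made up for illustration and are not figures from the thread.

```python
import math


def primary_shards_needed(index_size_gb, doc_count,
                          max_shard_gb=50, max_shard_docs=200_000_000):
    """Estimate primary shards so each stays under both per-shard limits."""
    by_size = math.ceil(index_size_gb / max_shard_gb)
    by_docs = math.ceil(doc_count / max_shard_docs)
    # An index always has at least one primary shard.
    return max(1, by_size, by_docs)


# Hypothetical projection: a 120 GB index holding 500M documents.
print(primary_shards_needed(120, 500_000_000))
```

Whichever limit is hit first (size or document count) determines the result, which is why the function takes the maximum of the two estimates.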

Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory

Hi everybody, we are noticing weird behavior of JanusGraph regarding the tables edgestore_lock_ and graphindex_lock_. We are operating two JanusGraph clusters which use the same schema, both running on ScyllaDB. While one instance is managed by JanusGraphFactory, we have configured multiple graphs in the other instance using ConfiguredGraphFactory. Recently, we noticed an unexpected storage usage caused by the table edgestore_lock_, so we started comparing the utilization of these tables for both scenarios: ```...
Solution:
The lock is acquired by StandardJanusGraph when vp[~T$VertexExists->true] is deleted. This only affects deletions because on additions, the vertex is always "new" https://github.com/JanusGraph/janusgraph/blob/06526e728f468bf7fca072c3cf2c5d9024830be0/janusgraph-core/src/main/java/org/janusgraph/graphdb/database/StandardJanusGraph.java#L762

OLAP job failing with NullPointerException error

I am running an Apache Spark job on my graph and it fails with the below error: I am not sure what id : 525 is....