rngcntr
JanusGraph
Created by dgreco on 1/15/2024 in #questions
Idempotent upsert, is that possible?
Custom vertex IDs weren't a thing at the time, but we saw the same issue when multiple edges were added to the same vertex simultaneously. Maybe @Florian Hockmann can tell us more.
13 replies
JanusGraph
Created by dgreco on 1/15/2024 in #questions
Idempotent upsert, is that possible?
With Cassandra/ScyllaDB backends, we encountered ghost vertices when multiple clients issued writes to the same vertex at the same time. Doesn't that happen to you? Things may have changed, since our research on that topic dates back a few years.
13 replies
JanusGraph
Created by dgreco on 1/15/2024 in #questions
Idempotent upsert, is that possible?
The approach of using a hash as a custom ID sounds promising. What kind of hash do you use? At a pace of 600k vertices per second, I would be concerned about hash collisions appearing after only a few hours of operation.
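The concern can be made concrete with the birthday bound. The following is only a rough sketch: it assumes a uniformly distributed 64-bit hash (the thread does not say which hash or width is actually used), and the function names are made up for illustration. Only the rate of 600k vertices per second comes from the discussion.

```python
import math

RATE_PER_SECOND = 600_000  # ingestion rate mentioned in the thread
HASH_BITS = 64             # assumption: the hash is squeezed into a 64-bit ID


def collision_probability(num_ids: float, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2 * 2^bits))."""
    space = 2.0 ** hash_bits
    return -math.expm1(-num_ids * (num_ids - 1) / (2.0 * space))


def hours_until_probability(rate_per_second: float, hash_bits: int, target: float) -> float:
    """Hours of ingestion until the collision probability reaches `target`."""
    space = 2.0 ** hash_bits
    # Invert the approximation above (using n^2 instead of n(n-1)).
    num_ids = math.sqrt(2.0 * space * math.log(1.0 / (1.0 - target)))
    return num_ids / rate_per_second / 3600.0


if __name__ == "__main__":
    after_one_hour = collision_probability(RATE_PER_SECOND * 3600, HASH_BITS)
    print(f"collision probability after 1 hour: {after_one_hour:.3f}")
    print(f"hours until 50%: {hours_until_probability(RATE_PER_SECOND, HASH_BITS, 0.5):.2f}")
```

Under these assumptions, the chance of at least one collision passes 50% after a little more than two hours. A wider hash pushes that point out dramatically, but it only helps if the ID type can hold the extra bits.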
13 replies
JanusGraph
Created by cdegroc on 1/12/2024 in #questions
TreeStep and MultiQuery support
I can't tell anymore if I actually managed to figure out why NoOpBarrierStep is not allowed in path tracking traversals or not. But since that's part of TinkerPop, there may be test cases in their repository that should fail if the check in LazyBarrierStrategy is dropped.
10 replies
JanusGraph
Created by cdegroc on 1/12/2024 in #questions
TreeStep and MultiQuery support
Hi @Clément de Groc! The reasoning is explained in the Javadoc (https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/optimize/JanusGraphTraversalUtil.java#L385-L390): Similar to NoOpBarrierStep, the MultiQueryStep's purpose is to aggregate traversers before handling them and passing results to the next step. Not having path tracking enabled is a hard requirement for TinkerPop's NoOpBarrierStep (https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java#L46), so to be safe, I applied that requirement to MultiQueryStep as well.
10 replies
JanusGraph
Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
Sorry for the late response, I didn't see your comment. Yes, most of the locks were stale and months old. What prompted the question? Did you expect that to be the case?
15 replies
JanusGraph
Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
It all leads us back to ConsistencyModifier.LOCK, as you suggested. T$VertexExists has ConsistencyModifier.LOCK because it is a builtin type.
15 replies
JanusGraph
Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
The lock is acquired by StandardJanusGraph when vp[~T$VertexExists->true] is deleted. This only affects deletions because on additions, the vertex is always "new": https://github.com/JanusGraph/janusgraph/blob/06526e728f468bf7fca072c3cf2c5d9024830be0/janusgraph-core/src/main/java/org/janusgraph/graphdb/database/StandardJanusGraph.java#L762
15 replies
JanusGraph
Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
Checking the schema was one of the first things we did. ConsistencyModifier.LOCK is not used. But we have another discovery to share: ScyllaDB lets us inspect read and write accesses at the table level using nodetool tablestats. With these insights, we observed that edgestore_lock_ is only touched, whether by reads or writes, when we use drop() queries. Everything else passes without the _lock_ table being accessed. The traversal is as minimal as it can be, leaving nothing but the deletion to cause the lock usage: g.V(vertexId).drop().none().iterate().
15 replies
JanusGraph
Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
Hi @Boxuan Li, thanks for your response! Indeed, we were able to decode the timestamps from the _lock_ table and noticed that the few JanusGraphFactory locks date back to a short period in 2020. Most probably, we ran some configuration experiments back then. This only confirms that JanusGraphFactory doesn't use locks. For ConfiguredGraphFactory, the locks are much more recent and span longer time intervals. During our research, we found another difference between the two databases: in the ConfiguredGraphFactory case, we regularly drop vertices, which we don't do with JanusGraphFactory. I wonder if locking is somehow only employed when data is deleted. However, I can't find anything in the JanusGraph code that indicates a difference in how additions and deletions are handled within a transaction.
15 replies
Apache TinkerPop
Created by porunov on 5/26/2023 in #questions
Does bulking optimization provided by LazyBarrierStrategy improve query performance?
In relational databases, these types of aggregation operators are quite common as well. The principle is most commonly referred to as "vectorization" and helps utilize CPU caches efficiently. Vectorization is most helpful for CPU- or memory-intensive workloads. In JanusGraph, however, most of the query evaluation time is probably spent waiting for network traffic from the storage backend. I suppose that's why these barrier steps are not helpful in most queries.
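To illustrate what bulking buys in principle, here is a toy model of the idea. It is a simplified sketch, not TinkerPop's implementation: traversers are plain (location, bulk) pairs, and the `bulk` and `expand` helpers as well as the tiny adjacency map are invented for this example.

```python
from collections import Counter
from typing import Dict, List, Tuple

Traverser = Tuple[str, int]  # (current location, bulk count)


def bulk(traversers: List[Traverser]) -> List[Traverser]:
    """Merge traversers at the same location into one with a summed bulk."""
    merged: Counter = Counter()
    for location, count in traversers:
        merged[location] += count
    return list(merged.items())


def expand(traversers: List[Traverser],
           adjacency: Dict[str, List[str]]) -> Tuple[List[Traverser], int]:
    """Move each traverser to its neighbours; one adjacency lookup per traverser."""
    lookups = 0
    moved: List[Traverser] = []
    for location, count in traversers:
        lookups += 1
        moved.extend((neighbour, count) for neighbour in adjacency.get(location, []))
    return moved, lookups


# Hypothetical toy graph: "a" and "b" both point to "x", and "a" also to "y".
adjacency = {"a": ["x", "y"], "b": ["x"]}
unbulked = [("a", 1), ("a", 1), ("a", 1), ("b", 1)]

moved_plain, lookups_plain = expand(unbulked, adjacency)
moved_bulked, lookups_bulked = expand(bulk(unbulked), adjacency)

if __name__ == "__main__":
    print("lookups without bulking:", lookups_plain)
    print("lookups with bulking:   ", lookups_bulked)
    print("same result:", bulk(moved_plain) == bulk(moved_bulked))
```

In this model, the barrier saves redundant per-traverser work but does nothing about the latency of each remaining lookup, which fits the guess that the benefit is small when most of the time goes to waiting for the storage backend.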
9 replies
JanusGraph
Created by Bo on 5/25/2023 in #questions
JanusGraph stuck in SchemaStatus.INSTALLED status
Thanks for the info. I have checked the code and provided a quick fix.
4 replies