rngcntr
JanusGraph
•Created by dgreco on 1/15/2024 in #questions
Idempotent upsert, is that possible?
Custom vertex IDs weren't a thing at the time, but it was the same issue when multiple edges were added to the same vertex simultaneously.
Maybe @Florian Hockmann can tell more
13 replies
JanusGraph
•Created by dgreco on 1/15/2024 in #questions
Idempotent upsert, is that possible?
With Cassandra/ScyllaDB backends, we encountered ghost vertices when multiple clients issued writes to the same vertex at the same time. Doesn't that happen to you?
Things may have changed because our research on that topic was a few years back
13 replies
JanusGraph
•Created by dgreco on 1/15/2024 in #questions
Idempotent upsert, is that possible?
The approach of using a hash as a custom ID sounds promising. What kind of hash do you use? At a pace of 600k vertices per second, I would be concerned about hash collisions appearing after only a few hours of operation.
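To put the collision concern in numbers, here is a rough birthday-bound estimate. It assumes a uniformly distributed hash and, purely for illustration, compares a 64-bit and a 128-bit hash width; neither width is stated in the thread, and `collision_probability` is a hypothetical helper.

```python
import math

def collision_probability(num_ids: int, hash_bits: int) -> float:
    """Approximate chance of at least one collision among num_ids uniform hashes.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2 ** hash_bits
    exponent = num_ids * (num_ids - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)

RATE_PER_SECOND = 600_000  # ingestion rate mentioned in the thread

for hours in (1, 2, 3):
    n = RATE_PER_SECOND * 3600 * hours
    for bits in (64, 128):  # assumed hash widths, not taken from the thread
        p = collision_probability(n, bits)
        print(f"{hours} h, {bits}-bit hash: p(collision) ~= {p:.3g}")
```

Under these assumptions, a 64-bit hash reaches a noticeable collision probability within the first hours, while a 128-bit hash stays negligible over the same period.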
13 replies
JanusGraph
•Created by cdegroc on 1/12/2024 in #questions
TreeStep and MultiQuery support
I can't tell anymore whether I actually managed to figure out why NoOpBarrierStep is not allowed in path tracking traversals. But since that's part of TinkerPop, there may be test cases in their repository that should fail if the check in LazyBarrierStrategy is dropped.
10 replies
JanusGraph
•Created by cdegroc on 1/12/2024 in #questions
TreeStep and MultiQuery support
Hi @Clément de Groc! The reasoning is explained in the Javadoc (https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/optimize/JanusGraphTraversalUtil.java#L385-L390): Similar to NoOpBarrierStep, the MultiQueryStep's purpose is to aggregate traversers before handling them and passing results to the next step. Not having path tracking enabled is a hard requirement for TinkerPop's NoOpBarrierStep (https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java#L46), so to be safe, I applied that requirement to MultiQueryStep as well.
10 replies
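The aggregation idea can be illustrated with a small, self-contained sketch. It is not JanusGraph's MultiQueryStep implementation; `FakeBackend`, `fetch_many`, and `batched_lookup` are hypothetical names used only to show why collecting traversers first allows one batched backend request instead of one request per traverser.

```python
from typing import Dict, Iterable, List

class FakeBackend:
    """Hypothetical stand-in for a storage backend that counts round trips."""

    def __init__(self, data: Dict[str, List[str]]) -> None:
        self.data = data
        self.round_trips = 0

    def fetch_many(self, keys: List[str]) -> Dict[str, List[str]]:
        # One call models one network round trip, regardless of batch size.
        self.round_trips += 1
        return {key: self.data.get(key, []) for key in keys}

def batched_lookup(backend: FakeBackend, traversers: Iterable[str]) -> List[str]:
    """Collect all incoming traversers first (the barrier), then query once."""
    collected = list(traversers)               # barrier: drain the input
    results = backend.fetch_many(collected)    # single batched request
    # Emit neighbours per original traverser so multiplicity is preserved.
    return [n for key in collected for n in results[key]]

backend = FakeBackend({"a": ["b", "c"], "b": ["c"]})
print(batched_lookup(backend, ["a", "b", "a"]), backend.round_trips)
```

The sketch only models the batching benefit; it says nothing about how paths would be tracked across such a barrier, which is the part the requirement above is cautious about.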
JanusGraph
•Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
Sorry for the late response, I didn't see your comment. Yes, most of the locks were stale and months old. How did you come up with the question? Did you expect that to be the case?
15 replies
JanusGraph
•Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
It all leads us back to ConsistencyModifier.LOCK, as you suggested. T$VertexExists has ConsistencyModifier.LOCK because it is a built-in type.
15 replies
JanusGraph
•Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
The lock is acquired by StandardJanusGraph when vp[~T$VertexExists->true] is deleted. This only affects deletions because on additions, the vertex is always "new": https://github.com/JanusGraph/janusgraph/blob/06526e728f468bf7fca072c3cf2c5d9024830be0/janusgraph-core/src/main/java/org/janusgraph/graphdb/database/StandardJanusGraph.java#L762
15 replies
JanusGraph
•Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
Checking the schema was one of the first things we did. ConsistencyModifier.LOCK is not used. But we have another discovery to share: ScyllaDB gives us the ability to inspect read and write accesses on a table level using nodetool tablestats. Using these insights, we were able to observe that edgestore_lock_ is only touched (be it by read or write accesses) when we use drop() queries. Everything else passes without the _lock_ table being accessed. The traversal is as minimal as it can be, leaving no option but the deletion to cause the lock usage: g.V(vertexId).drop().none().iterate().
15 replies
JanusGraph
•Created by rngcntr on 11/10/2023 in #questions
Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
Hi @Boxuan Li, thanks for your response! Indeed, we were able to decode the timestamps from the _lock_ table and noticed that the few JanusGraphFactory locks date back to a short period in 2020. Most probably, we ran some configuration experiments back then. This only confirms that JanusGraphFactory doesn't use locks.
For ConfiguredGraphFactory, the locks are much more recent and span longer time intervals. During our research, we figured out there's another difference between both databases: in the ConfiguredGraphFactory case, we regularly drop vertices, which we don't do with JanusGraphFactory. I wonder if locking is somehow only employed when data is deleted. However, I can't find any indications in the JanusGraph code revealing a difference in the handling of additions and deletions within a transaction.
15 replies
Apache TinkerPop
•Created by porunov on 5/26/2023 in #questions
Does bulking optimization provided by LazyBarrierStrategy improves query performance?
In relational databases, these types of aggregation operators are quite common as well. The principle is most commonly referred to as "vectorization" and helps utilize CPU caches efficiently. Vectorization is most helpful for CPU- or memory-intensive workloads. In JanusGraph, however, most of the query evaluation time is probably spent waiting for network traffic from the storage backend. I suppose that's why these barrier steps are not helpful in most queries.
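As a rough illustration of the bulking idea from the thread title, the sketch below merges equal traversers behind a barrier and evaluates a step once per distinct element. It is a toy model, not TinkerPop's LazyBarrierStrategy; `map_with_bulking` and `traced_upper` are hypothetical names.

```python
from collections import Counter
from typing import Callable, Iterable, List

def map_with_bulking(items: Iterable[str], step: Callable[[str], str]) -> Counter:
    """Apply `step` once per distinct item and carry the multiplicity as a bulk."""
    bulks = Counter(items)            # barrier: merge equal traversers
    out: Counter = Counter()
    for item, bulk in bulks.items():
        out[step(item)] += bulk       # one evaluation stands for `bulk` traversers
    return out

calls: List[str] = []

def traced_upper(value: str) -> str:
    calls.append(value)  # record how often the step really runs
    return value.upper()

result = map_with_bulking(["v1", "v2", "v1", "v1"], traced_upper)
print(result, len(calls))
```

The saving here is purely in how often the step is evaluated. If the time is dominated by waiting on the storage backend, as suspected above, this kind of saving would matter much less.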
9 replies
JanusGraph
•Created by Bo on 5/25/2023 in #questions
JanusGraph stuck in SchemaStatus.INSTALLED status
Thanks for the info. I have checked the code and provided a quick fix
4 replies