Usage of _lock_ tables with ConfiguredGraphFactory vs. JanusGraphFactory
Hi everybody,
we are noticing weird behavior of JanusGraph regarding the tables edgestore_lock_ and graphindex_lock_. We are operating two JanusGraph clusters which use the same schema, both running on ScyllaDB. While one instance is managed by JanusGraphFactory, we have configured multiple graphs in the other instance using ConfiguredGraphFactory. Recently, we noticed an unexpected storage usage caused by the table edgestore_lock_, so we started comparing the utilization of these tables for both scenarios:
As you can see, in the JanusGraphFactory case, both _lock_ tables are hardly used. That is exactly what we expect, since we are not using any uniqueness constraints and only use MULTI edges. To our understanding, that should mean JanusGraph does not require locks.
In the ConfiguredGraphFactory case, however, the edgestore_lock_ table seems to be heavily utilized, while graphindex_lock_ does not contain a single byte of data. As stated above, that feels weird, because both cases are using the same schema.
Does anybody have a deeper understanding of JanusGraph's usage of _lock_ tables? Sadly, there's little to no documentation on that topic. Is there a way to disable lock usage for ConfiguredGraphFactory?
You should be able to decode the _lock_ table. IIRC every row records a timestamp from which you could see when it was written.
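A rough sketch of how such a timestamp could be pulled out of a lock column once the blob has been read from the table. It assumes, and this is not confirmed anywhere in this thread, that the column starts with an 8-byte big-endian timestamp in microseconds (JanusGraph's default timestamp resolution) followed by the id bytes of the instance that wrote the lock. The class and method names are made up for illustration; check the layout against your JanusGraph version before relying on it.

```java
import java.nio.ByteBuffer;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;

// Hypothetical helper, not part of JanusGraph.
final class LockColumnDecoder {
    private LockColumnDecoder() {}

    // ASSUMPTION: the first 8 bytes are a big-endian long holding microseconds since the epoch.
    static Instant decodeTimestamp(byte[] column) {
        requireMinLength(column);
        long micros = ByteBuffer.wrap(column).getLong(); // ByteBuffer is big-endian by default
        return Instant.EPOCH.plus(micros, ChronoUnit.MICROS);
    }

    // ASSUMPTION: everything after the timestamp identifies the writing instance.
    static byte[] decodeRid(byte[] column) {
        requireMinLength(column);
        return Arrays.copyOfRange(column, Long.BYTES, column.length);
    }

    private static void requireMinLength(byte[] column) {
        if (column.length < Long.BYTES) {
            throw new IllegalArgumentException("lock column is shorter than 8 bytes");
        }
    }

    public static void main(String[] args) {
        // Synthetic column for demonstration: timestamp plus three made-up id bytes.
        byte[] column = ByteBuffer.allocate(Long.BYTES + 3)
                .putLong(1_600_000_000_000_000L)
                .put(new byte[] {0x0a, 0x0b, 0x0c})
                .array();
        System.out.println("written at: " + decodeTimestamp(column));
        System.out.println("rid length: " + decodeRid(column).length);
    }
}
```

Comparing the decoded instants against the current time is then enough to see how old the locks in each cluster are.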
I don't know why the ConfiguredGraphFactory case has so much more usage. One possibility comes from the difference in schema. ConfiguredGraphFactory uses a trick to store configs - it records each config option as a vertex property. Lock usage is always on for schema changes.
Hi @Boxuan Li, thanks for your response! Indeed, we were able to decode the timestamps from the _lock_ table and noticed that the few JanusGraphFactory locks date back to a short period in 2020. Most probably, we ran some configuration experiments back then. This only confirms that JanusGraphFactory doesn't use locks.
For ConfiguredGraphFactory, the locks are much more recent and span longer time intervals. During our research, we figured out there's another difference between both databases: In the ConfiguredGraphFactory case, we regularly drop vertices, which we don't do with JanusGraphFactory. I wonder if locking is somehow only employed when data is deleted. However, I can't find any indications in the JanusGraph code revealing a difference in the handling of additions and deletions within a transaction.
Do you have ConsistencyModifier.LOCK for any label or index?
If you decode the KeyColumn from the lock table, you could tell what exactly is being locked.
Checking the schema was one of the first things we did. ConsistencyModifier.LOCK is not used. But we have another discovery to share: ScyllaDB gives us the ability to inspect read and write accesses on a table level using nodetool tablestats. Using these insights, we were able to observe that edgestore_lock_ is only touched, be it by read or write accesses, when we use drop() queries. Everything else passes without the _lock_ table being accessed. The traversal is as minimal as it can be, leaving no options but the deletion to cause the lock usage: g.V(vertexId).drop().none().iterate().
Solution:
The lock is acquired by StandardJanusGraph when vp[~T$VertexExists->true] is deleted. This only affects deletions because on additions, the vertex is always "new": https://github.com/JanusGraph/janusgraph/blob/06526e728f468bf7fca072c3cf2c5d9024830be0/janusgraph-core/src/main/java/org/janusgraph/graphdb/database/StandardJanusGraph.java#L762
It all leads us back to ConsistencyModifier.LOCK, as you suggested. T$VertexExists has ConsistencyModifier.LOCK because it is a builtin type.
Makes sense, I vaguely remember deletion requires locks but couldn't find the code yesterday.
I don't know if that makes sense or not, but that's the status quo
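For readers skimming the thread, the behavior described above can be condensed into a toy model. This is not the JanusGraph code, only a simplified sketch of the rule as discussed here: a relation whose type carries ConsistencyModifier.LOCK triggers a lock when it is deleted, while an addition skips the lock if the vertex was created in the same transaction. All class, field, and method names below are hypothetical, and uniqueness constraints are deliberately left out because the schema in question does not use them.

```java
// Toy model only; names are hypothetical and do not match JanusGraph internals.
final class LockDecisionSketch {
    // Mirrors the value names of JanusGraph's ConsistencyModifier enum.
    enum ConsistencyModifier { DEFAULT, LOCK, FORK }

    enum Change { ADDED, DELETED }

    // Flattened stand-in for one relation touched by a transaction.
    static final class Relation {
        final String typeName;
        final ConsistencyModifier consistency;
        final Change change;
        final boolean vertexIsNew; // true if the vertex was created in this transaction

        Relation(String typeName, ConsistencyModifier consistency, Change change, boolean vertexIsNew) {
            this.typeName = typeName;
            this.consistency = consistency;
            this.change = change;
            this.vertexIsNew = vertexIsNew;
        }
    }

    // Simplified rule: only LOCK types lock; additions on brand-new vertices are exempt.
    static boolean needsLock(Relation r) {
        if (r.consistency != ConsistencyModifier.LOCK) {
            return false;
        }
        return r.change == Change.DELETED || !r.vertexIsNew;
    }

    public static void main(String[] args) {
        // addVertex(): the builtin existence property is added to a new vertex.
        Relation create = new Relation("~T$VertexExists", ConsistencyModifier.LOCK, Change.ADDED, true);
        // drop(): the same builtin property is deleted from an existing vertex.
        Relation drop = new Relation("~T$VertexExists", ConsistencyModifier.LOCK, Change.DELETED, false);
        // A user-defined property without LOCK consistency being deleted.
        Relation userProp = new Relation("name", ConsistencyModifier.DEFAULT, Change.DELETED, false);

        System.out.println("create locks: " + needsLock(create));
        System.out.println("drop locks: " + needsLock(drop));
        System.out.println("user property locks: " + needsLock(userProp));
    }
}
```

Under this model, a workload that only inserts vertices never touches the lock table, while every drop() does, which matches the nodetool tablestats observation reported earlier in the thread.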
btw just curious: do you see a lot of stale locks that should have been purged?
Sorry for the late response, I didn't see your comment. Yes, most of the locks were stale and months old. How did you come up with the question? Did you expect that to be the case?
Sort of. I vaguely remember I saw the stale lock issue in production once a few years ago, but I never really dug into this.