cdegroc
JJanusGraph
Created by johndisandonato on 11/15/2024 in #questions
Can `CqlInputFormat` do predicate pushdowns/query based prefilters?
Yes I think a joint effort w/ the TinkerPop community would be needed
8 replies
JJanusGraph
Created by johndisandonato on 11/15/2024 in #questions
Can `CqlInputFormat` do predicate pushdowns/query based prefilters?
Moreover, Hadoop's CqlInputFormat is an old class which, IIRC, was deprecated in favor of https://github.com/datastax/spark-cassandra-connector. That project could be a better long-term solution.
8 replies
JJanusGraph
Created by johndisandonato on 11/15/2024 in #questions
Can `CqlInputFormat` do predicate pushdowns/query based prefilters?
👋🏻 Hey! We've used CqlInputFormat to dump entire graphs. I agree with your analysis. I think the WHERE clause you're referring to is https://github.com/JanusGraph/janusgraph/blob/v1.0/cassandra-hadoop-util/src/main/java/org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java#L61. I believe this is a CQL configuration option and not a JanusGraph one. Since JanusGraph encodes rows in its own binary format, I doubt this type of filtering would work well (happy to be wrong though!).
8 replies
JJanusGraph
Created by karthikraju on 10/15/2024 in #questions
Unable to use text predicates like 'textContains' in gremlin python
To use JanusGraph-specific predicates, I believe you would need to swap gremlin-python for janusgraph-python: https://github.com/JanusGraph/janusgraph-python#text-predicates. Can you give this a try?
5 replies
JJanusGraph
Created by b4lls4ck on 8/23/2024 in #questions
Speeding up Queries Made to JanusGraph
Thanks Florian for chiming in. I had overlooked that option!
31 replies
JJanusGraph
Created by b4lls4ck on 8/23/2024 in #questions
Speeding up Queries Made to JanusGraph
You can start Gremlin Console from a dedicated JanusGraph container or even JanusGraph server itself (https://docs.janusgraph.org/v0.3/basics/server/#connecting-to-gremlin-server)
# Start gremlin console
$ ./bin/gremlin.sh

# Connect to a remote JanusGraph server (configured in /etc/remote.yaml in this case)
gremlin> :remote connect tinkerpop.server /etc/remote.yaml session

# Enter remote console mode and send all commands to the server
gremlin> :remote console

# Now you can access open management
...
31 replies
JJanusGraph
Created by b4lls4ck on 8/23/2024 in #questions
Speeding up Queries Made to JanusGraph
Technically, I think so (AFAIK). A gRPC endpoint was added in JanusGraph 1.0 but it does not yet support index management (https://github.com/JanusGraph/janusgraph/tree/master/janusgraph-grpc#todo-1). That doesn't mean you need to implement something yourself, though. You could start a Gremlin Console and update indexes from there using Groovy.
31 replies
JJanusGraph
Created by b4lls4ck on 8/23/2024 in #questions
Speeding up Queries Made to JanusGraph
I would suggest that you create a Graph Index first. Since your condition does an exact match on "Bob", you can use a Composite Index. AFAIK, indexes cannot be created from Python, only through the JanusGraph management interface. We have sample commands in the docs showing how to create one: https://docs.janusgraph.org/schema/index-management/index-performance/#composite-index
31 replies
JJanusGraph
Created by b4lls4ck on 8/23/2024 in #questions
Speeding up Queries Made to JanusGraph
👋🏻 Hey. Your traversal g.V().has("person", "name", "Bob").outE("knows").has("weight", P.gte(0.5)).inV().values("name").toList() could probably benefit from different indexes. Without a Graph Index (https://docs.janusgraph.org/schema/index-management/index-performance/#graph-index), the first part g.V().has("person", "name", "Bob") has to filter through all vertices to find the ones with property value "Bob". Then, without a Vertex Centric Index (https://docs.janusgraph.org/schema/index-management/index-performance/#vertex-centric-indexes), JanusGraph needs to filter all of the matching vertices' "knows" out edges - .outE("knows") - to find the ones with a weight >= 0.5 - .has("weight", P.gte(0.5)).
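To make the difference concrete, here is a toy model in plain Python (not JanusGraph code). A dictionary stands in for a composite index and a weight-sorted edge list stands in for a vertex-centric index. All names (`vertices`, `name_index`, `knows_by_weight`) and the sample data are made up for illustration; real JanusGraph indexes live in the storage backend and work differently in detail.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical toy data: vertex id -> properties, and
# (out vertex, weight, in vertex) triples for "knows" edges.
vertices = {1: {"label": "person", "name": "Bob"},
            2: {"label": "person", "name": "Alice"},
            3: {"label": "person", "name": "Carol"}}
knows_edges = [(1, 0.2, 2), (1, 0.9, 3), (2, 0.7, 3)]

def find_by_name_scan(name):
    """No graph index: every vertex has to be inspected."""
    return [vid for vid, props in vertices.items()
            if props["label"] == "person" and props["name"] == name]

# Stand-in for a composite index: exact-match lookup from name to vertex ids.
name_index = defaultdict(list)
for vid, props in vertices.items():
    name_index[props["name"]].append(vid)

def find_by_name_indexed(name):
    """Direct lookup instead of a scan over all vertices."""
    return name_index.get(name, [])

# Stand-in for a vertex-centric index: per-vertex out edges sorted by weight.
knows_by_weight = defaultdict(list)
for out_v, weight, in_v in knows_edges:
    knows_by_weight[out_v].append((weight, in_v))
for edges in knows_by_weight.values():
    edges.sort()

def knows_with_min_weight(vid, min_weight):
    """Binary-search to the first edge with weight >= min_weight."""
    edges = knows_by_weight.get(vid, [])
    start = bisect_left(edges, (min_weight,))
    return [in_v for _, in_v in edges[start:]]

def names_known_by(name, min_weight):
    """Toy equivalent of the traversal discussed above."""
    return [vertices[in_v]["name"]
            for vid in find_by_name_indexed(name)
            for in_v in knows_with_min_weight(vid, min_weight)]
```

Calling `names_known_by("Bob", 0.5)` touches only Bob's entry in `name_index` and only the tail of his sorted edge list, which is the kind of work the two index types are meant to save.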
31 replies
JJanusGraph
Created by cdegroc on 2/20/2024 in #questions
Concurrent updates during a REINDEX
Thank you both! 🙇🏻
16 replies
JJanusGraph
Created by cdegroc on 2/20/2024 in #questions
Concurrent updates during a REINDEX
Also, as a follow-up, my understanding of the code is that a REINDEX does not clear the existing index data and will only reindex what's currently in the backend's edgestore. Is that correct? I imagine for such cases the work in https://github.com/JanusGraph/janusgraph/issues/1099 could help.
16 replies
JJanusGraph
Created by cdegroc on 2/20/2024 in #questions
Concurrent updates during a REINDEX
Thanks Boxuan. I'm trying to link that to the code. IIUC this is because keys are iterated and SliceQueries are built/emitted on the fly as the job is making progress. Is that right?
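A toy sketch of that understanding, in plain Python rather than JanusGraph code: if the scan looks up the next key lazily as it progresses, a write behind the cursor is missed by the current pass while a write ahead of it is picked up. The `scan` helper and the sample keys are hypothetical, and whether the real job behaves this way is exactly the question being asked here.

```python
from bisect import bisect_right

def scan(store):
    """Yield (key, value) pairs in key order, resolving the next key lazily."""
    cursor = None
    while True:
        keys = sorted(store)  # re-read the current key set on every step
        pos = 0 if cursor is None else bisect_right(keys, cursor)
        if pos == len(keys):
            return
        cursor = keys[pos]
        yield cursor, store[cursor]

# Hypothetical store contents, standing in for rows being scanned.
store = {"k1": "v1", "k2": "v2", "k4": "v4"}
seen = []
for key, value in scan(store):
    seen.append((key, value))
    if key == "k2":
        store["k1"] = "v1-updated"  # behind the cursor: this pass misses it
        store["k3"] = "v3"          # ahead of the cursor: this pass picks it up
```

After the loop, `seen` contains the old value for `k1` but does include `k3`.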
16 replies
ATApache TinkerPop
Created by cdegroc on 1/17/2024 in #questions
LazyBarrierStrategy/NoOpBarrierStep incompatible with path-tracking
FYI: thought I'd try mvn clean install -DskipIntegrationTests=false -DincludeNeo4j as well, and it also succeeds 100% with the change.
12 replies
JJanusGraph
Created by cdegroc on 1/12/2024 in #questions
TreeStep and MultiQuery support
10 replies
ATApache TinkerPop
Created by cdegroc on 1/17/2024 in #questions
LazyBarrierStrategy/NoOpBarrierStep incompatible with path-tracking
When you say "path tracking is enabled", you mean a step with TraverserRequirement.PATH is part of the path, right? Just to make sure I'm not misunderstanding.
12 replies
ATApache TinkerPop
Created by cdegroc on 1/17/2024 in #questions
LazyBarrierStrategy/NoOpBarrierStep incompatible with path-tracking
Gremlin Spark was failing due to my corp VPN (i.e. https://stackoverflow.com/questions/52133731/how-to-solve-cant-assign-requested-address-service-sparkdriver-failed-after). But now that I turned it off, all tests are passing. My local diff (compared to the master branch) is just:
diff --git a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java b/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java
index 1a51ea0685..c8b96d88cd 100644
--- a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java
+++ b/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java
@@ -82,7 +82,7 @@ public final class LazyBarrierStrategy extends AbstractTraversalStrategy<Travers
// which made it so that a Property is equal if the key/value is equal. as a result, they bulk together which
// is fine for almost all cases except when you wish to drop the property.
if (TraversalHelper.onGraphComputer(traversal) ||
- traversal.getTraverserRequirements().contains(TraverserRequirement.PATH) ||
+// traversal.getTraverserRequirements().contains(TraverserRequirement.PATH) ||
TraversalHelper.hasStepOfAssignableClass(DropStep.class, traversal)||
TraversalHelper.hasStepOfAssignableClass(ElementStep.class, traversal))
return;
So it could also be that this is not tested or that another factor prevents us from hitting this condition.
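For context on what a barrier step does with traversers, here is a toy sketch of bulking in plain Python (not TinkerPop code, and the data is made up). It only illustrates that traversers sitting on the same element can be merged into one entry with a bulk count, whereas traversers that also carry distinct paths cannot. Whether that is the reason for the PATH check in LazyBarrierStrategy is an assumption, not something confirmed in this thread.

```python
from collections import Counter

# Three hypothetical traversers arrive at vertex "c" via different routes.
# Each is modelled as (current_element, path_taken).
arrivals = [("c", ("a", "c")), ("c", ("b", "c")), ("c", ("a", "c"))]

def bulk(traversers, track_path):
    """Merge equal traversers into one entry with a bulk count.

    Without path tracking, identity is the current element only;
    with path tracking, the path is part of the identity.
    """
    key = (lambda t: t) if track_path else (lambda t: t[0])
    return Counter(key(t) for t in traversers)

without_path = bulk(arrivals, track_path=False)
with_path = bulk(arrivals, track_path=True)
```

In this model `without_path` collapses to a single entry with bulk 3, while `with_path` keeps two entries (bulk 2 and bulk 1), so bulking saves less work once paths are tracked.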
12 replies
ATApache TinkerPop
Created by cdegroc on 1/17/2024 in #questions
LazyBarrierStrategy/NoOpBarrierStep incompatible with path-tracking
Thanks Stephen. I actually ran unit tests for more than that, incl. Gremlin Test and TinkerGraph Gremlin.
[INFO] Apache TinkerPop ................................... SUCCESS [ 2.470 s]
[INFO] Apache TinkerPop :: Gremlin Language ............... SUCCESS [ 8.322 s]
[INFO] Apache TinkerPop :: Gremlin Shaded ................. SUCCESS [ 0.784 s]
[INFO] Apache TinkerPop :: Gremlin Core ................... SUCCESS [ 24.712 s]
[INFO] Apache TinkerPop :: Gremlin Annotations ............ SUCCESS [ 3.797 s]
[INFO] Apache TinkerPop :: Gremlin Test ................... SUCCESS [ 9.908 s]
[INFO] Apache TinkerPop :: TinkerGraph Gremlin ............ SUCCESS [01:43 min]
[INFO] Apache TinkerPop :: Gremlin Groovy ................. SUCCESS [01:09 min]
[INFO] Apache TinkerPop :: Gremlin Util ................... SUCCESS [ 3.524 s]
[INFO] Apache TinkerPop :: Gremlin Tools .................. SUCCESS [ 0.052 s]
[INFO] Apache TinkerPop :: Gremlin Socket Server .......... SUCCESS [ 5.887 s]
[INFO] Apache TinkerPop :: Gremlin Driver ................. SUCCESS [ 27.232 s]
[INFO] Apache TinkerPop :: Gremlin Server ................. SUCCESS [01:37 min]
[INFO] Apache TinkerPop :: Gremlin Python ................. SUCCESS [ 0.095 s]
[INFO] Apache TinkerPop :: Gremlin.Net .................... SUCCESS [ 0.565 s]
[INFO] Apache TinkerPop :: Gremlin.Net - Source ........... SUCCESS [ 0.571 s]
[INFO] Apache TinkerPop :: Gremlin.Net - Tests ............ SUCCESS [ 0.193 s]
[INFO] Apache TinkerPop :: Gremlin Go ..................... SUCCESS [ 0.062 s]
[INFO] Apache TinkerPop :: Hadoop Gremlin ................. SUCCESS [01:11 min]
[INFO] Apache TinkerPop :: Spark Gremlin .................. FAILURE [ 21.574 s]
...
12 replies
JJanusGraph
Created by cdegroc on 1/12/2024 in #questions
TreeStep and MultiQuery support
Thanks for your quick answer. I can see this requirement was added long ago. I will review TinkerPop tests, and then ask questions on the TinkerPop discord.
10 replies
JJanusGraph
Created by cdegroc on 1/12/2024 in #questions
TreeStep and MultiQuery support
Even though unit tests are green, I imagine this could be breaking some traversal types I haven't tried or am not used to. @rngcntr, since you're the original author of this change (https://github.com/JanusGraph/janusgraph/pull/2516/files#diff-e1f91b256e6c63d882f9b043cbfa4d264c15299c52bae1b845dcd90b8beadabbR239-R252), would you remember why MultiQuery optimizations were disabled for Path-based traversals by any chance? 🙇🏻
10 replies
JJanusGraph
Created by cdegroc on 1/12/2024 in #questions
TreeStep and MultiQuery support
👋🏻 Hey. This worked and the traversal now leverages multiQuery, resulting in a nice performance improvement in my tests.
10 replies