LazyBarrierStrategy/NoOpBarrierStep incompatible with path-tracking

👋🏻 Hi all! In this JanusGraph post (https://discord.com/channels/981533699378135051/1195313165278388334/1195313165278388334), we were investigating if TreeStep could be used jointly with bulked traversers so as to improve traversal time. Based on answers there, it looks like TinkerPop's LazyBarrierStrategy explicitly excludes "path-tracking" traversals (https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java#L85) and won't insert NoOpBarrierSteps in those cases, preventing us from bulking traversers. At a high level, I imagine this could be needed if traversers history was lost on bulking. Though, I'm not familiar enough with the code to be able to tell if this could concretely be a reason. I have observed some "path-tracking" traversals (with TreeStep) to run properly with bulking on JanusGraph (https://discord.com/channels/981533699378135051/1195313165278388334/1196486987474022400). In addition, commenting out the check on TraverserRequirement.PATH (https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java#L85) does not break gremlin-core unit tests locally. So my question is: would anyone have context on why "path-tracking" traversals can't use bulking? Is there any chance this exclusion could be scoped down to a subset of "path-tracking" traversals maybe? Any pointer appreciated! 🙇🏻‍♂️
10 Replies
spmallette
spmallette13mo ago
does not break gremlin-core unit tests locally.
i don't think the unit tests will tell you much. you will need to run the full build to find out what it breaks. mvn clean install the entire project and see what fails. essentially, you need gremlin-test to be built so that tinkergraph-gremlin can execute those tests over multiple configurations.
cdegroc
cdegrocOP13mo ago
Thanks Stephen. I actually ran unit tests for more than that, incl. Gremlin Test and TinkerGraph Gremlin.
[INFO] Apache TinkerPop ................................... SUCCESS [ 2.470 s]
[INFO] Apache TinkerPop :: Gremlin Language ............... SUCCESS [ 8.322 s]
[INFO] Apache TinkerPop :: Gremlin Shaded ................. SUCCESS [ 0.784 s]
[INFO] Apache TinkerPop :: Gremlin Core ................... SUCCESS [ 24.712 s]
[INFO] Apache TinkerPop :: Gremlin Annotations ............ SUCCESS [ 3.797 s]
[INFO] Apache TinkerPop :: Gremlin Test ................... SUCCESS [ 9.908 s]
[INFO] Apache TinkerPop :: TinkerGraph Gremlin ............ SUCCESS [01:43 min]
[INFO] Apache TinkerPop :: Gremlin Groovy ................. SUCCESS [01:09 min]
[INFO] Apache TinkerPop :: Gremlin Util ................... SUCCESS [ 3.524 s]
[INFO] Apache TinkerPop :: Gremlin Tools .................. SUCCESS [ 0.052 s]
[INFO] Apache TinkerPop :: Gremlin Socket Server .......... SUCCESS [ 5.887 s]
[INFO] Apache TinkerPop :: Gremlin Driver ................. SUCCESS [ 27.232 s]
[INFO] Apache TinkerPop :: Gremlin Server ................. SUCCESS [01:37 min]
[INFO] Apache TinkerPop :: Gremlin Python ................. SUCCESS [ 0.095 s]
[INFO] Apache TinkerPop :: Gremlin.Net .................... SUCCESS [ 0.565 s]
[INFO] Apache TinkerPop :: Gremlin.Net - Source ........... SUCCESS [ 0.571 s]
[INFO] Apache TinkerPop :: Gremlin.Net - Tests ............ SUCCESS [ 0.193 s]
[INFO] Apache TinkerPop :: Gremlin Go ..................... SUCCESS [ 0.062 s]
[INFO] Apache TinkerPop :: Hadoop Gremlin ................. SUCCESS [01:11 min]
[INFO] Apache TinkerPop :: Spark Gremlin .................. FAILURE [ 21.574 s]
...
[INFO] Apache TinkerPop ................................... SUCCESS [ 2.470 s]
[INFO] Apache TinkerPop :: Gremlin Language ............... SUCCESS [ 8.322 s]
[INFO] Apache TinkerPop :: Gremlin Shaded ................. SUCCESS [ 0.784 s]
[INFO] Apache TinkerPop :: Gremlin Core ................... SUCCESS [ 24.712 s]
[INFO] Apache TinkerPop :: Gremlin Annotations ............ SUCCESS [ 3.797 s]
[INFO] Apache TinkerPop :: Gremlin Test ................... SUCCESS [ 9.908 s]
[INFO] Apache TinkerPop :: TinkerGraph Gremlin ............ SUCCESS [01:43 min]
[INFO] Apache TinkerPop :: Gremlin Groovy ................. SUCCESS [01:09 min]
[INFO] Apache TinkerPop :: Gremlin Util ................... SUCCESS [ 3.524 s]
[INFO] Apache TinkerPop :: Gremlin Tools .................. SUCCESS [ 0.052 s]
[INFO] Apache TinkerPop :: Gremlin Socket Server .......... SUCCESS [ 5.887 s]
[INFO] Apache TinkerPop :: Gremlin Driver ................. SUCCESS [ 27.232 s]
[INFO] Apache TinkerPop :: Gremlin Server ................. SUCCESS [01:37 min]
[INFO] Apache TinkerPop :: Gremlin Python ................. SUCCESS [ 0.095 s]
[INFO] Apache TinkerPop :: Gremlin.Net .................... SUCCESS [ 0.565 s]
[INFO] Apache TinkerPop :: Gremlin.Net - Source ........... SUCCESS [ 0.571 s]
[INFO] Apache TinkerPop :: Gremlin.Net - Tests ............ SUCCESS [ 0.193 s]
[INFO] Apache TinkerPop :: Gremlin Go ..................... SUCCESS [ 0.062 s]
[INFO] Apache TinkerPop :: Hadoop Gremlin ................. SUCCESS [01:11 min]
[INFO] Apache TinkerPop :: Spark Gremlin .................. FAILURE [ 21.574 s]
...
spmallette
spmallette13mo ago
well, perhaps the change causes trouble for OLAP given the spark failure? what were the test failures there?
cdegroc
cdegrocOP13mo ago
Gremlin Spark was failing due to my corp VPN (i.e. https://stackoverflow.com/questions/52133731/how-to-solve-cant-assign-requested-address-service-sparkdriver-failed-after). But now that I turned it off, all tests are passing. My local diff (compared to master branch) is just
diff --git a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java b/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java
index 1a51ea0685..c8b96d88cd 100644
--- a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java
+++ b/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java
@@ -82,7 +82,7 @@ public final class LazyBarrierStrategy extends AbstractTraversalStrategy<Travers
// which made it so that a Property is equal if the key/value is equal. as a result, they bulk together which
// is fine for almost all cases except when you wish to drop the property.
if (TraversalHelper.onGraphComputer(traversal) ||
- traversal.getTraverserRequirements().contains(TraverserRequirement.PATH) ||
+// traversal.getTraverserRequirements().contains(TraverserRequirement.PATH) ||
TraversalHelper.hasStepOfAssignableClass(DropStep.class, traversal)||
TraversalHelper.hasStepOfAssignableClass(ElementStep.class, traversal))
return;
diff --git a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java b/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java
index 1a51ea0685..c8b96d88cd 100644
--- a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java
+++ b/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java
@@ -82,7 +82,7 @@ public final class LazyBarrierStrategy extends AbstractTraversalStrategy<Travers
// which made it so that a Property is equal if the key/value is equal. as a result, they bulk together which
// is fine for almost all cases except when you wish to drop the property.
if (TraversalHelper.onGraphComputer(traversal) ||
- traversal.getTraverserRequirements().contains(TraverserRequirement.PATH) ||
+// traversal.getTraverserRequirements().contains(TraverserRequirement.PATH) ||
TraversalHelper.hasStepOfAssignableClass(DropStep.class, traversal)||
TraversalHelper.hasStepOfAssignableClass(ElementStep.class, traversal))
return;
So it could also be that this is not tested or that another factor prevents us from hitting this condition.
spmallette
spmallette13mo ago
hmm interesting. that bit of code tends to be tricky. you move it one way and X breaks then you fix that and it makes Y break. i'm surprised that removing that condition had no effect at all as virtually every change to that strategy has come as a resul tof a bug that had a related test. i guess we'd want to see the origin of that change to see if ther is any particular hint as to why that is there. there may also be something in obvous i'm currently not thinking about as to why that strategy shouldn't execute if path tracking is enabled
cdegroc
cdegrocOP13mo ago
when you say "path tracking is enabled", you mean a step with TraverserRequirement.PATH is part of the path, right? Just to make sure I'm not misunderstanding.
spmallette
spmallette13mo ago
yes
cdegroc
cdegrocOP13mo ago
fi: thought I'd try mvn clean install -DskipIntegrationTests=false -DincludeNeo4j as well and it also succeeds 100% with the change
spmallette
spmallette13mo ago
that line of code goes back to the inception of the strategy itself. as a result, it makes me feel like there is some specific need for it and that we simply don't have the test that triggers the problem.
Lonnie VanZandt
Lonnie VanZandt13mo ago
No! Say it ain't so: as conceptually beautiful as is Gremlin on the outside, it cannot be just like real code on the inside. I am crushed.

Did you find this page helpful?